I. Character Encoding
1.1 What is a character encoding
Computers run on electricity, so they only recognize binary numbers (0 and 1). For a computer to understand human language, every character has to be converted to a number. The standard that specifies which number corresponds to which character is called a character encoding.
1.2 History of character encoding
1. ASCII code table
ASCII represents one English character with one byte (8 binary digits). One byte can hold at most 256 values (0-255 / 00000000-11111111); standard ASCII defines 128 of them (0-127).
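As a small illustration of the character-to-number mapping described above, the sketch below uses Python's built-in ord, chr, and format functions. The variable names are only for illustration.

```python
# Every ASCII character maps to a small integer that fits in one byte.
ch = 'A'
code = ord(ch)                 # character -> number
bits = format(code, '08b')     # the same number written as 8 binary digits
encoded = ch.encode('ascii')   # the single byte that would be stored

print(code, bits, encoded, chr(code))
```

chr reverses ord, so the number can always be turned back into the character.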
2. Countries developed their own code tables
To support both Chinese and English, China developed GBK
GBK: 2 bytes represent a Chinese character, 1 byte represents an English character
Japan developed Shift_JIS
Korea developed EUC-KR
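The byte counts above can be checked with Python's built-in codecs. This is a minimal sketch; the sample characters are arbitrary choices for illustration and do not come from the original notes.

```python
# Number of bytes each national encoding uses for one character.
gbk_chinese = len('中'.encode('gbk'))        # one Chinese character in GBK
gbk_english = len('a'.encode('gbk'))         # one English character in GBK
sjis_kana = len('あ'.encode('shift_jis'))    # one Japanese kana in Shift_JIS
euckr_hangul = len('한'.encode('euc_kr'))    # one Korean syllable in EUC-KR

print(gbk_chinese, gbk_english, sjis_kana, euckr_hangul)
```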
3. A unified standard
Unicode: a unified representation that stores every character in 2 bytes (this is the fixed-width form described here; characters beyond the first 65,536 code points need more)
Unicode disadvantages:
1. Wastes storage space
2. More I/O operations, which lowers program efficiency (fatal)
Unicode advantages:
1. Compatible with the characters of all countries
2. Data stored in another country's encoding can still be read from the hard disk into memory, because Unicode has a mapping to each national encoding
For a text written entirely in English, two-byte Unicode doubles the storage space compared with ASCII, which is too wasteful, so UTF-8 appeared. In UTF-8 an English character takes 1 byte and a Chinese character takes 3 bytes.
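The space difference can be measured directly. In this sketch the 'utf-16-le' codec stands in for the fixed two-byte form described above; that substitution is an assumption made for illustration.

```python
# Compare how many bytes each character needs in UTF-8 and in a 2-byte form.
sizes = {}
for ch in ('a', '上'):
    utf8_len = len(ch.encode('utf-8'))
    two_byte_len = len(ch.encode('utf-16-le'))  # no byte-order mark is added
    sizes[ch] = (utf8_len, two_byte_len)
    print(ch, utf8_len, two_byte_len)

english = 'hello world'
english_utf8 = len(english.encode('utf-8'))
english_two_byte = len(english.encode('utf-16-le'))
print(english_utf8, english_two_byte)
```

For English-only text the two-byte form is exactly twice the size of UTF-8, which is the waste the notes refer to.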
Current practice:
Memory uses Unicode
The hard disk uses UTF-8
4. Encoding and decoding process
Encoding (encode):
Data is written from memory to the hard disk
1. Unicode binary data in memory >>>> encode >>>> UTF-8 binary data on the hard disk
Decoding (decode):
Data is read from the hard disk into memory
1. UTF-8 binary data on the hard disk >>>> decode >>>> Unicode binary data in memory
ps: 1. To keep text from being garbled, decode a file with the same encoding that was used to write it
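A short sketch of what goes wrong when the rule above is broken: the text is encoded as UTF-8 but decoded as GBK. The sample string is arbitrary, and errors='replace' is used only so the example never stops on an undecodable byte.

```python
original = '上下'
data = original.encode('utf-8')               # written with UTF-8

wrong = data.decode('gbk', errors='replace')  # read back with the wrong encoding
right = data.decode('utf-8')                  # read back with the matching encoding

print(repr(wrong))   # garbled text
print(repr(right))   # the original text
```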
2. The difference between Python 2 and Python 3
The Python 2 interpreter reads source files as ASCII by default (Unicode was not yet prevalent when it was designed)
The Python 3 interpreter reads source files as UTF-8 by default
File header: # coding: (character encoding), e.g. # coding: utf-8
Written at the beginning of the file, it tells the interpreter which character encoding to use when reading the file
x = '上'
res1 = x.encode('utf-8')  # encode the Unicode string into UTF-8 binary data for storage and transmission
print(res1)  # b'\xe4\xb8\x8a'; the bytes type can be treated as binary data
res2 = res1.decode('utf-8')  # decode UTF-8 binary data from the hard disk back into Unicode
print(res2)
II. File Handling
2.1 What is a file
A file is a simple interface that the operating system exposes so that we can operate the complex hardware (the hard disk)
2.2 Why manipulate files
People and applications need to store data permanently
2.3 How to operate on a file
f=open()
f.read()
f.close()
2.4 How to operate on files in Python code
Use the open function, for example:
# the r prefix makes a raw string, so backslashes in the path are not treated as escapes
f = open(r'D:\Python项目\day07\a.txt', encoding='utf-8')  # the application asks the operating system to open the file; hardware can only be operated through the operating system
print(f)  # f is a file object
print(f.read())  # on Chinese-language Windows the default encoding is gbk, while this file is UTF-8, so encoding is passed explicitly
f.read()  # sends another read request to the operating system; the cursor is already at the end, so nothing more is returned
f.close()  # tell the operating system to close the open file
print(f)  # the file object still exists, but it is closed
print(f.read())  # raises ValueError: I/O operation on closed file
ps: a.txt can be opened with an absolute path, which is the complete path from the drive root to the file, or with a relative path. For example, if the running script is in the 'day07' folder and that folder is the current working directory, a file in the same folder can be opened directly with open(r'a.txt', encoding='utf-8')
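The following sketch shows that an absolute path and a relative path can reach the same file. It uses a temporary folder instead of the D: drive paths above so it can run anywhere; the folder and the file contents are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    absolute = os.path.join(folder, 'a.txt')
    with open(absolute, 'w', encoding='utf-8') as f:
        f.write('hello')

    old_cwd = os.getcwd()
    os.chdir(folder)  # pretend the script is run from inside this folder
    try:
        with open('a.txt', encoding='utf-8') as f:  # relative path
            via_relative = f.read()
        resolved = os.path.abspath('a.txt')  # what the relative path expands to
    finally:
        os.chdir(old_cwd)  # restore the working directory

    with open(absolute, encoding='utf-8') as f:  # absolute path
        via_absolute = f.read()

print(via_relative == via_absolute)
```

A relative path is resolved against the current working directory, so it only works when the program is started from the expected folder.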
2.5 Context manager for file operations
with open(r'D:\Python项目\day07\a.txt', encoding='utf-8') as f, \
        open(r'D:\Python项目\day07\b.txt', encoding='utf-8') as f1:
    # f is just a variable name; think of it as a remote control for the file
    print(f)
    print(f.read())
    print(f1)
    print(f1.read())
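One reason to prefer the with statement is that it closes the files automatically when the block ends, so f.close() is not needed. A minimal sketch, using temporary files rather than the paths above (an assumption so that the example is self-contained):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path_a = os.path.join(folder, 'a.txt')
    path_b = os.path.join(folder, 'b.txt')
    for path, text in ((path_a, 'file a'), (path_b, 'file b')):
        with open(path, 'w', encoding='utf-8') as out:
            out.write(text)

    with open(path_a, encoding='utf-8') as f, \
            open(path_b, encoding='utf-8') as f1:
        contents = (f.read(), f1.read())
        closed_inside = (f.closed, f1.closed)   # both still open here

    closed_after = (f.closed, f1.closed)        # both closed automatically

print(contents, closed_inside, closed_after)
```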
2.6 File open modes: t and b
t: file contents are handled as strings; Python encodes and decodes automatically, so the encoding parameter should be specified
b: file contents are handled as bytes (binary); whatever is stored on the hard disk is returned unchanged, and the encoding parameter must not be specified
ps: t and b cannot be used alone; they must be combined with r, w, or a, as in open(..., mode='rt'). t mode works only for text files, while b mode works for any file
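The difference between t and b can be seen by reading the same file in both modes. This sketch writes a temporary file first; the file name and its contents are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'a.txt')
    with open(path, 'wt', encoding='utf-8') as f:
        f.write('上')

    with open(path, 'rt', encoding='utf-8') as f:
        as_text = f.read()     # decoded automatically into a str

    with open(path, 'rb') as f:
        as_bytes = f.read()    # raw bytes, exactly as stored

    try:
        open(path, 'rb', encoding='utf-8')  # encoding is not allowed in b mode
        encoding_rejected = False
    except ValueError:
        encoding_rejected = True

print(repr(as_text), as_bytes, encoding_rejected)
```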
2.7 Ways to open a file
r: read-only mode
w: write-only mode
a: append mode
r mode: read-only. If the file exists, it is opened with the cursor at the beginning of the file; if the file does not exist, an error is raised
with open(r'D:\python\python练习\a.txt', mode='rt', encoding='utf-8') as f:
    print(f.readable())  # is it readable? True
    print(f.writable())  # is it writable? False
    print(f.read())  # read the entire contents of the file at once
ps: the mode parameter can be omitted; the default is rt (read-only text). The t can also be left out, so r means the same as rt
with open(r'D:\python\python练习\a.txt') as f: pass
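The two points about r mode, the error for a missing file and the default mode, can be checked with a temporary folder. The file names here are hypothetical.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    missing = os.path.join(folder, 'no_such_file.txt')
    try:
        open(missing)                # r mode never creates the file
        error_raised = False
    except FileNotFoundError:
        error_raised = True

    path = os.path.join(folder, 'a.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('abc')

    with open(path) as f:            # no mode given
        is_text = isinstance(f, io.TextIOWrapper)   # opened in text mode
        can_read, can_write = f.readable(), f.writable()

print(error_raised, is_text, can_read, can_write)
```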
Relative path:
with open(r'data type classification.jpg', mode='rb') as f:
    pass
rb mode:
with open(r'C:\Users\Xiaodong\Desktop\theme class\data type classification.jpg', mode='rb') as f:
    print(f.readable())  # is it readable? True
    print(f.writable())  # is it writable? False
    print(f.read())  # read the entire contents of the file at once, as bytes
with open(r'D:\python\python练习\a.txt', mode='rt', encoding='utf-8') as f:
    print(f.readable())  # is it readable?
    print(f.writable())  # is it writable?
    print('>>> 1:')
    print(f.read())  # read the entire contents of the file at once
    print('>>> 2:')
    print(f.read())  # after one full read the cursor is at the end of the file, so nothing more is read
    print(f.readlines())  # returns the lines of the file as a list, one element per line; the cursor is at the end, so this returns []
    print(f.readline())  # reads only one line of the file
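The cursor behaviour in the example above can be reproduced with a small temporary file. The three-line content is an assumption for illustration; the file is reopened to put the cursor back at the start.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'a.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('line 1\nline 2\nline 3\n')

    with open(path, 'rt', encoding='utf-8') as f:
        first_read = f.read()         # everything; cursor moves to the end
        second_read = f.read()        # nothing left to read
        lines_at_end = f.readlines()  # also empty at the end of the file

    with open(path, 'rt', encoding='utf-8') as f:   # cursor is at the start again
        one_line = f.readline()       # only the first line
        remaining = f.readlines()     # the rest, one list element per line

print(repr(second_read), lines_at_end, repr(one_line), remaining)
```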
w mode: write-only. If the file does not exist, a new file is created; if it exists, it is opened and its contents are emptied before writing (use with caution)
with open(r'D:\python\python练习\a.txt', mode='wt', encoding='utf-8') as f:
    print(f.readable())  # is it readable?
    print(f.writable())  # is it writable?
    f.write('learning to make me happy, I love to learn\n')
    f.write('learning to make me happy, I love to learn\n')
    f.write('learning to make me happy, I love to learn\n')
    f.write('learning to make me happy, I love to learn\n')
    f.write('learning to make me happy, I love to learn\n')
    l = ['learn music, I learned\n', 'learning to make me happy, I Xi\n', 'learn music, I love to learn\n']
    f.writelines(l)  # write multiple lines at once
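The warning about w mode emptying an existing file can be demonstrated safely on a temporary file. The contents below are placeholders, not the text from the example above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'a.txt')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('old content\n')

    # Opening an existing file in w mode empties it before anything is written.
    with open(path, 'wt', encoding='utf-8') as f:
        f.write('new line\n')
        f.writelines(['second\n', 'third\n'])   # writelines adds no newlines itself

    with open(path, encoding='utf-8') as f:
        result = f.read()

print(repr(result))
```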
a mode: append. If the file does not exist, it is created automatically; if it exists, it is opened without emptying its contents and the cursor moves to the end
with open(r'D:\python\python练习\a.txt', mode='a', encoding='utf-8') as f:
    print(f.readable())  # is it readable? False
    print(f.writable())  # is it writable? True
    f.write('I love learning\n')
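A minimal sketch of both a mode behaviours, creating a missing file and keeping existing contents, using a temporary file whose name is only illustrative:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'log.txt')
    existed_before = os.path.exists(path)

    with open(path, 'a', encoding='utf-8') as f:   # file is created here
        f.write('first\n')

    with open(path, 'a', encoding='utf-8') as f:   # existing content is kept
        f.write('second\n')

    with open(path, encoding='utf-8') as f:
        result = f.read()

print(existed_before, repr(result))
```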