Character encoding review and file handling basics (b)

1. Character encoding
A character encoding is a lookup table that maps characters to numbers.
ASCII: covers English characters only; 1 byte = 1 English character
Hello
8bit | 8bit | 8bit | 8bit | 8bit

GBK: covers Chinese and English characters; 2 bytes = 1 Chinese character, 1 byte = 1 English character
你a好
8bit | 8bit | 8bit | 8bit | 8bit
你
1111 1111 | 1111 1111   a leading 1 in the first byte marks a Chinese character

0111 1111 | 0111 1111   an English character is a single byte whose leading bit is 0
When the bytes are read back, the leading bit tells the decoder which case applies; what is stored in memory is the encoded form.
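The byte counts above can be checked directly in Python. This is only a minimal sketch: the sample string `你a好` and the helper name `leading_bit` are assumptions made for illustration, not something taken from the notes.

```python
# Sketch: count bytes per character in ASCII and GBK.
# The sample strings and the helper name are illustrative assumptions.

def leading_bit(byte):
    """Return the most significant bit (0 or 1) of a single byte."""
    return byte >> 7

ascii_bytes = 'Hello'.encode('ascii')   # 5 English characters
gbk_bytes = '你a好'.encode('gbk')        # 2 Chinese characters + 1 English character

print(len(ascii_bytes))                 # one byte per English character
print(len(gbk_bytes))                   # 2 + 1 + 2 bytes
print([leading_bit(b) for b in gbk_bytes])
```

Indexing a `bytes` object yields integers, which is why the helper can shift each byte directly.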


Unicode
1. Compatible with the characters of all nations
2. Has a mapping to the binary codes of every other encoding
Unicode -> GBK: encoding
GBK -> Unicode: decoding
UTF-8: a compact transformation format of Unicode; 3 bytes = 1 common Chinese character, 1 byte = 1 English character
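The encode and decode directions can be sketched as follows, with Python's `str` standing in for Unicode. The sample text `你好` is an assumption chosen for illustration.

```python
# Sketch: Unicode (str) acts as the hub between byte encodings.
gbk_data = '你好'.encode('gbk')      # Unicode -> GBK bytes (encoding)
text = gbk_data.decode('gbk')        # GBK bytes -> Unicode (decoding)
utf8_data = text.encode('utf-8')     # Unicode -> UTF-8 bytes

print(len(gbk_data))                 # 2 bytes per Chinese character in GBK
print(len(utf8_data))                # 3 bytes per Chinese character here in UTF-8
print(len('a'.encode('utf-8')))      # 1 byte for an English character
```

Converting between GBK and UTF-8 always goes through the decoded string; there is no direct bytes-to-bytes step in this sketch.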

2. Character encoding in Python applications
1. To keep the first two stages of running a Python program free of garbled text, add a file header
Write on the first line of the file: #coding: <the encoding the file was saved with>

2. Evolution of the string type:
Python 2 has two "string" types:
Type one
#coding: gbk
x = '上'  # stored as the GBK-encoded binary

Type two
x = u'上'  # stored as Unicode

Python 3 has two related "string" types
x = '上'  # stored as Unicode

x.encode('gbk')  # converts the string to the bytes type
A string can be encoded into the binary (bytes) type
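In Python 3 the two types can be inspected directly. This is a small sketch; the variable name `encoded` is an illustrative assumption.

```python
# Sketch: str holds Unicode text, bytes holds an encoded form of it.
x = '上'                       # str
encoded = x.encode('gbk')      # bytes

print(type(x).__name__)
print(type(encoded).__name__)
print(encoded.decode('gbk') == x)   # decoding with the same codec restores the text
```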

3. Conclusion:
1. To avoid garbled text: read data with the same encoding it was saved with
2. Strings defined in Python 2 should carry the u prefix
3. The coding header must be added to the file when writing Python
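The first conclusion can be seen by decoding the same bytes with a matching and a mismatched codec. This is a sketch under assumed sample data; `errors='replace'` is used only so the mismatched decode cannot raise.

```python
# Sketch: decoding with a different encoding than the one used to save.
text = '你好'
stored = text.encode('utf-8')                      # saved as UTF-8

correct = stored.decode('utf-8')                   # same encoding: intact
garbled = stored.decode('gbk', errors='replace')   # wrong encoding: garbled

print(correct == text)
print(garbled == text)
```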

Basic file-handling flow:
1. Open the file to get a file object (file object ====> file opened by the operating system ==> hard disk)
f = open(r'file path', mode='file open mode', encoding='character encoding')  # the r prefix marks a raw string

2. Operate on the file: read / write
f.read()
f.readlines()
f.readline()
f.readable()


3. Close the file: this tells the operating system to reclaim its resources
f.close()
f = open(r'file location', mode='rt', encoding='utf-8')
f.close()
print(f)  # the variable f still exists, but the file is closed
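The three steps can be run end to end against a throwaway file. This is a minimal sketch: the temporary directory and the file name `sample.txt` are assumptions so the example is self-contained.

```python
import os
import tempfile

# Create a small sample file first (path and contents are illustrative).
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, mode='wt', encoding='utf-8') as tmp:
    tmp.write('first\nsecond\n')

f = open(path, mode='rt', encoding='utf-8')   # 1. open
can_read = f.readable()                       # 2. operate
first_line = f.readline()                     #    one line, newline included
rest = f.readlines()                          #    remaining lines as a list
f.close()                                     # 3. close

print(can_read, repr(first_line), rest, f.closed)
```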


II: Context management
with open(r'file location', mode='rt', encoding='utf-8') as f:
    data = f.read()
    print(data)
    print('=' * 100)
    for line in f:  # reads nothing: f.read() above already moved the file pointer to the end
        print(line)
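The effect of the file pointer can be confirmed by iterating after `read()` and again after moving the pointer back with `seek(0)`. This is a sketch; the temporary file and its contents are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'pointer.txt')
with open(path, mode='wt', encoding='utf-8') as tmp:
    tmp.write('a\nb\n')

with open(path, mode='rt', encoding='utf-8') as f:
    data = f.read()                     # pointer is now at the end
    after_read = [line for line in f]   # nothing left to read
    f.seek(0)                           # move the pointer back to the start
    after_seek = [line for line in f]

print(repr(data), after_read, after_seek)
print(f.closed)                         # the with block closed the file
```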

Today's summary
1. File open modes
2. File operation methods
3. Moving the pointer within a file (active control, not passive triggering)

File open modes: there are three pure modes: r (default), w, a
Two modes control how file content is handled: t (default), b
Premise: t and b cannot be used alone; they must be combined with a pure mode
t text mode:
1. Reads and writes are in units of strings
2. Only for text files
3. Should always specify the encoding parameter (otherwise the platform default is used)
b binary mode:
1. Reads and writes are in units of bytes
2. Works for all files
3. Must not specify the encoding parameter
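The difference between t and b can be sketched by reading one file both ways. The temporary file and sample text are assumptions; the `ValueError` is what `open()` raises when a binary mode is combined with `encoding`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')
with open(path, mode='wt', encoding='utf-8') as tmp:
    tmp.write('hi 你好')

with open(path, mode='rt', encoding='utf-8') as f:
    as_text = f.read()       # str

with open(path, mode='rb') as f:
    as_bytes = f.read()      # bytes

binary_error = None
try:
    open(path, mode='rb', encoding='utf-8')   # b mode rejects encoding
except ValueError as exc:
    binary_error = str(exc)

print(type(as_text).__name__, type(as_bytes).__name__)
print(binary_error)
```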

Two: file open modes in detail
1. r read-only mode: raises an error if the file does not exist; if it exists, the file pointer goes straight to the beginning of the file
with open('file location', mode='rt', encoding='utf-8') as f:
    print(f.readlines())
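The missing-file case can be handled by catching `FileNotFoundError`. This is a sketch; the path `no_such_file.txt` is a hypothetical name inside a fresh temporary directory.

```python
import os
import tempfile

missing = os.path.join(tempfile.mkdtemp(), 'no_such_file.txt')

try:
    with open(missing, mode='rt', encoding='utf-8') as f:
        f.read()
    opened = True
except FileNotFoundError:
    opened = False       # r mode never creates the file

print(opened)
```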


2. w write-only mode: creates an empty file if the file does not exist; if it exists, the file is emptied and the pointer goes to the beginning of the file
with open('b.txt', mode='wt', encoding='utf-8') as f:
    print(f.writable())  # True: the file is writable
    print(f.readable())  # False: the file is not readable
    f.write('Hello\n')
    f.write("I am fine\n")  # note: while the file stays open it is not emptied again; each write follows the previous one
    f.write('Hello everyone\n')
    f.write('111\n22222\n3333\n')
    lines = ['1111', '2222', '33333']
    for line in lines:
        f.write(line)
    f.writelines(lines)
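The truncating behavior of w mode, and the way successive writes follow each other, can be sketched like this. The temporary file and its contents are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'b.txt')

with open(path, mode='wt', encoding='utf-8') as f:
    f.write('old line\n')

# Reopening in w mode empties the file before anything is written.
with open(path, mode='wt', encoding='utf-8') as f:
    f.write('1111\n')
    f.write('2222\n')                     # follows the previous write
    f.writelines(['3333\n', '4444\n'])    # writes each string in order, adds no newlines

with open(path, mode='rt', encoding='utf-8') as f:
    content = f.read()

print(repr(content))
```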

# User authentication
inp_name = input('Please enter your name: ').strip()
inp_pwd = input('Please enter your password: ').strip()
with open(r'.txt', mode='rt', encoding='utf-8') as f:
    for line in f:  # compare the name and password entered by the user with each record read from the file
        u, p = line.strip('\n').split(':')
        if inp_name == u and inp_pwd == p:
            print('Login successful')
            break
    else:  # runs only if the loop finished without a break
        print('Incorrect account or password')

# Registration
name = input('username >>>: ').strip()
pwd = input('password >>>: ').strip()
with open(r'.txt', mode='at', encoding='utf-8') as f:
    info = '%s:%s\n' % (name, pwd)
    f.write(info)
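The registration and authentication snippets can be wrapped in functions so they run without keyboard input. This is only a sketch: the function names `register` and `login`, the file name `users.txt`, and the sample accounts are all hypothetical, and passwords are stored in plain text purely to mirror the notes.

```python
import os
import tempfile

def register(db_path, name, pwd):
    """Append one 'name:password' record to the file."""
    with open(db_path, mode='at', encoding='utf-8') as f:
        f.write('%s:%s\n' % (name, pwd))

def login(db_path, name, pwd):
    """Return True if a matching 'name:password' record exists."""
    with open(db_path, mode='rt', encoding='utf-8') as f:
        for line in f:
            u, p = line.strip('\n').split(':')
            if name == u and pwd == p:
                return True
    return False

db_path = os.path.join(tempfile.mkdtemp(), 'users.txt')
register(db_path, 'alice', '123')
register(db_path, 'bob', '456')     # a mode keeps the earlier record

print(login(db_path, 'bob', '456'))
print(login(db_path, 'bob', 'wrong'))
```

Like the original snippet, this assumes neither the name nor the password contains a colon.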
3. a append-only mode: creates an empty file if the file does not exist; if it exists, the file pointer moves straight to the end of the file
with open(r'.txt', mode='at', encoding='utf-8') as f:
    f.write('4444\n555\n')

# r+ w+ a+
with open(r'a.txt', mode='r+t', encoding='utf-8') as f:  # r+t: read-write mode
    print(f.readline())
    f.write('Hello')
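A small sketch of r+ mode: the file is both readable and writable, it is not emptied on open, and a write at the initial pointer position overwrites existing characters. The temporary file and its contents are assumptions, and the sketch writes before reading to keep the pointer position unambiguous.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'a.txt')
with open(path, mode='wt', encoding='utf-8') as tmp:
    tmp.write('hello\n')

with open(path, mode='r+t', encoding='utf-8') as f:
    can_read, can_write = f.readable(), f.writable()
    f.write('HE')        # pointer starts at 0, so this overwrites 'he'
    f.seek(0)            # go back to the start before reading
    content = f.read()

print(can_read, can_write, repr(content))
```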

b: reads and writes are in units of bytes
with open(r'.txt', mode='rb') as f:
    print(f.readline())
    # f.write('Hello')  # raises an error: rb mode is read-only
with open(r'.txt', mode='rb') as f:
    print(f.readline())
    data = f.read()
    print(data.decode('utf-8'))  # decode using utf-8
with open(r'1.png', mode='rb') as f:
    print(f.readline())
    data = f.read()
    print(data)  # image data cannot be decoded as text; prints the raw bytes

with open(r'.txt', mode='wb') as f:
    f.write('Hello'.encode('utf-8'))
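The binary round trip can be sketched as follows: bytes go in with wb, come back with rb, and are decoded by hand. The temporary file and sample text are assumptions; the `TypeError` branch shows that b mode rejects `str`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'binary.txt')

str_rejected = False
with open(path, mode='wb') as f:
    f.write('Hello 你好'.encode('utf-8'))   # b mode accepts bytes
    try:
        f.write('Hello')                   # str is rejected in b mode
    except TypeError:
        str_rejected = True

with open(path, mode='rb') as f:
    data = f.read()                        # bytes

text = data.decode('utf-8')                # decode manually to get a str
print(str_rejected, data, text)
```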
How copy and paste works
with open(r'1.png', mode='rb') as f:
    data = f.read()

with open(r'2.png', mode='wb') as f:
    f.write(data)
# Copy tool
src_file = input('Source file path: ').strip()
dst_file = input('Destination file path: ').strip()
with open(r'%s' % src_file, mode='rb') as read_f, open(r'%s' % dst_file, mode='wb') as write_f:
    for line in read_f:
        # print(line)
        write_f.write(line)
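Iterating line by line works for any file, but a binary file with few newline bytes can yield very long "lines". One possible variant, offered only as a sketch, reads fixed-size chunks instead. The function name `copy_file`, the default chunk size, and the temporary file names are all assumptions.

```python
import os
import tempfile

def copy_file(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path in fixed-size binary chunks."""
    with open(src_path, mode='rb') as read_f, open(dst_path, mode='wb') as write_f:
        while True:
            chunk = read_f.read(chunk_size)
            if not chunk:            # b'' means end of file
                break
            write_f.write(chunk)

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'src.bin')
dst = os.path.join(workdir, 'dst.bin')
with open(src, mode='wb') as f:
    f.write(bytes(range(256)) * 1000)    # sample binary data

copy_file(src, dst)

with open(src, mode='rb') as a, open(dst, mode='rb') as b:
    same = a.read() == b.read()
print(same)
```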

Origin www.cnblogs.com/liugangjiayou/p/11616119.html