Reading and writing files with Python's codecs.open()

with codecs.open() as f compared with with open() as f

Reading and writing files in Python is usually done with the built-in open function.

A file is generally opened like this: with open(file_name, access_mode, buffering) as f, where access_mode defaults to 'r' and buffering defaults to -1. file_name is the path of the file plus the file name. Without a path, the name is resolved relative to the current working directory, which is often, but not always, the directory containing the Python program.

access_mode is the mode in which the file is opened, mainly r, w, rb, wb, and so on; the modes are documented in detail elsewhere. buffering controls how access to the file is buffered: 0 means no buffering, 1 means line buffering, and a value n greater than 1 requests a buffer of roughly n bytes (not n lines). If it is omitted or negative, the system default buffering is used.
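As a rough illustration of the modes and the buffering argument, here is a minimal sketch. The file name demo.txt and the temporary directory are assumptions made for the demo, not part of the original article.

```python
import os
import tempfile

# Hypothetical file name, placed in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# 'w' creates or truncates the file; buffering=1 requests line buffering.
with open(path, 'w', 1) as f:
    f.write('first line\n')

# 'a' appends to the end of the existing file.
with open(path, 'a') as f:
    f.write('second line\n')

# 'r' reads the file back as text, one list item per line.
with open(path, 'r') as f:
    lines = f.readlines()

# 'rb' with buffering=0 reads the raw bytes without buffering.
with open(path, 'rb', 0) as f:
    raw = f.read()

print(lines)
```

Passing the mode and buffering positionally keeps the call valid regardless of how the parameters are named in a given Python version.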

Once the file is open, you read from it or write to it. However, files opened with open have a limitation: in Python 2, you can only write str (byte string) objects to them, whatever encoding those bytes happen to be in.

That works fine in itself. But when we crawl pages or obtain data in other ways, the sources often use inconsistent encodings, so we generally convert everything to unicode first. At that point, writing to a file opened with open becomes a problem. For example:

>>> f = open('test1.txt', 'a')
>>> line = u'我'
>>> f.write(line)


UnicodeEncodeError: 'ascii' codec can't encode characters
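The failure comes from an implicit conversion with the ascii codec. A minimal sketch that reproduces the same error explicitly, on any Python version, might look like this (the variable names failed and message are illustrative only):

```python
line = u'\u6211'  # the character 我

try:
    # Python 2 performs this conversion implicitly when a unicode
    # object is written to a file opened with the built-in open.
    line.encode('ascii')
    failed = False
    message = ''
except UnicodeEncodeError as exc:
    failed = True
    message = str(exc)

print(failed)
```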

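The write fails because Python 2 implicitly encodes the unicode object with the default ascii codec. What can we do? We could encode line to str ourselves, but that is tedious: we would have to decode everything we receive into unicode and then encode it back to str before writing.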

input file (gbk, utf-8...) ----decode-----> unicode -------encode------> output file (gbk, utf-8... )
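The manual pipeline in the diagram can be sketched as follows. The file names and the choice of gbk for input and utf-8 for output are assumptions for the demo; binary mode is used so that every decode and encode step is explicit.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'input_gbk.txt')    # hypothetical name
dst_path = os.path.join(tmp_dir, 'output_utf8.txt')  # hypothetical name

# Prepare a small GBK-encoded input file for the demo.
with open(src_path, 'wb') as f:
    f.write(u'\u6211'.encode('gbk'))  # the character 我

# Step 1: read the raw bytes and decode them to unicode.
with open(src_path, 'rb') as f:
    raw = f.read()
text = raw.decode('gbk')

# Step 2: encode the unicode text and write the bytes out.
with open(dst_path, 'wb') as f:
    f.write(text.encode('utf-8'))

with open(dst_path, 'rb') as f:
    written = f.read()
```

Every read needs a decode and every write needs an encode, which is the bookkeeping the article calls cumbersome.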

codecs.open removes the need for this cumbersome procedure: it takes the encoding as an argument and encodes unicode for you when writing. For example:

>>> import codecs
>>> with codecs.open('test1.txt', 'a', 'utf-8') as f:
...     f.write(line)
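A self-contained version of the codecs.open approach, with a read-back to check the result, could look like the sketch below. The temporary directory is an assumption for the demo; the file name and encoding follow the article's example.

```python
import codecs
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'test1.txt')
line = u'\u6211'  # the character 我

# codecs.open encodes unicode to utf-8 on write.
with codecs.open(path, 'a', 'utf-8') as f:
    f.write(line)

# Reading through codecs.open decodes the bytes back to unicode.
with codecs.open(path, 'r', 'utf-8') as f:
    text = f.read()

# The bytes on disk are the utf-8 encoding of the text.
with open(path, 'rb') as f:
    raw = f.read()

print(repr(raw))
```

In Python 3, the built-in open accepts an encoding argument and covers the same need, so codecs.open mainly matters for Python 2 code.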
 

Origin blog.csdn.net/Growing_hacker/article/details/107957570