File Operations
1. Common file IO operations
1.1 open operation
open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
Opens a file and returns a file object (stream object) that wraps a file descriptor. If the file cannot be opened, an exception is raised. Basic use: create a file named test, open it, then close it.
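The basic use described above can be sketched as follows. This is a minimal sketch: the temporary directory and the file name test are assumptions chosen so the example does not touch real files.

```python
import os
import tempfile

# Hypothetical scratch location for the example file.
path = os.path.join(tempfile.mkdtemp(), "test")

# Create the file first; opening a missing file in the default mode fails.
with open(path, "w", encoding="utf-8") as f:
    f.write("hello\n")

f = open(path)       # default mode, read-only text
print(f)             # a text stream object (io.TextIOWrapper)
print(f.fileno())    # the underlying file descriptor, an integer
content = f.read()
f.close()            # release the descriptor
print(f.closed)

# A failed open raises an exception instead of returning a value.
try:
    open(path + ".missing")
except FileNotFoundError as exc:
    missing_error = exc
    print("open failed:", exc)
```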
Reading and writing are the most common file operations. A file can be accessed in one of two modes, text mode or binary mode. The functions available and the results they return differ between the two.
Note: Windows uses code pages, and each code page can be thought of as one encoding. cp936 is equivalent to GBK.
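To make the code page note concrete, here is a small sketch showing that Python treats cp936 as an alias of its gbk codec, and why the encoding used for reading must match the one used for writing. The file name and sample text are assumptions for illustration.

```python
import codecs
import locale
import os
import tempfile

# The encoding open() falls back to when none is given. On a
# Chinese-language Windows system this is often cp936.
print(locale.getpreferredencoding(False))

# Python registers cp936 as an alias of the gbk codec.
codec_name = codecs.lookup("cp936").name
print(codec_name)

path = os.path.join(tempfile.mkdtemp(), "gbk.txt")  # hypothetical file
text = "中文"

with open(path, "w", encoding="gbk") as f:
    f.write(text)

# Reading with the alias decodes the same bytes correctly.
with open(path, encoding="cp936") as f:
    decoded = f.read()

# Reading with a different encoding fails on these bytes.
try:
    with open(path, encoding="utf-8") as f:
        f.read()
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
```

Passing encoding explicitly avoids depending on the platform default.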
1.2 open parameters
file: the name of the file to open or create. If no path is given, the current working directory is used.
mode: the mode in which the file is opened.
By default a file is opened in text mode and read-only, that is, with mode r.
r: opens an existing file read-only. Calling a write method raises an exception. If the file does not exist, FileNotFoundError is raised.
w: opens the file write-only. Reading raises an exception. If the file does not exist, it is created; if it exists, its contents are cleared.
x: creates a new file and opens it write-only. If the file already exists, FileExistsError is raised.
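a: opens the file write-only for appending. If the file exists, new content is appended to its end; if it does not exist, it is created first and then opened write-only for appending.
In summary, r is read-only, while w, x and a are all write-only. w, x and a can all create a new file. w always produces a file with entirely new content, whether or not the file existed. a always appends to the end of the opened file, whether or not the file existed. x requires that the file not exist beforehand and creates the new file itself.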
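The behavior of the four base modes can be checked with a short script. This is a sketch using a hypothetical file name in a temporary directory; io.UnsupportedOperation is the exception Python raises for a read on a write-only stream or a write on a read-only one.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file

# r: the file must already exist.
try:
    open(path, "r")
except FileNotFoundError:
    print("r: file does not exist")

# x: creates the file, and fails if it is already there.
with open(path, "x") as f:
    f.write("x")
try:
    open(path, "x")
except FileExistsError:
    print("x: file already exists")

# w: clears the existing content, and cannot read.
with open(path, "w") as f:
    f.write("w")
    try:
        f.read()
    except io.UnsupportedOperation:
        print("w: not readable")

# a: keeps the content and writes at the end.
with open(path, "a") as f:
    f.write("a")

# r: read-only, so writing fails.
with open(path, "r") as f:
    result = f.read()
    try:
        f.write("!")
    except io.UnsupportedOperation:
        print("r: not writable")

print(result)
```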
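Text mode t: a character stream. The bytes of the file are interpreted using a character encoding and handled as characters. The default mode of open is rt. Binary mode b: a byte stream. The file is treated purely as bytes, independent of any character encoding. In binary mode, reads and writes use the bytes type.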
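The difference between the two modes shows up in the types involved. The following sketch writes the same content once and reads it back both ways; the file name and the sample string are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.bin")  # hypothetical file

# Binary mode takes bytes and returns the number of bytes written.
with open(path, "wb") as f:
    written = f.write("café".encode("utf-8"))

# Text mode decodes the bytes into str using the given encoding.
with open(path, "rt", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode returns the raw bytes, with no decoding.
with open(path, "rb") as f:
    as_bytes = f.read()

print(type(as_text), len(as_text))
print(type(as_bytes), len(as_bytes))

# Passing str to a binary stream is a TypeError.
try:
    with open(path, "ab") as f:
        f.write("text")
    str_rejected = False
except TypeError:
    str_rejected = True
```

The character count and the byte count differ here because é takes two bytes in UTF-8.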
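+ adds the missing read or write capability to r, w, a and x, but the file object is still obtained according to the rules of r, w, a or x itself. + cannot be used on its own; think of it as an enhancement to the mode character in front of it.
The + modes are read-write modes. In r+, the + adds writing. In w+, the + adds reading, but nothing can be read right after opening because w has already cleared the file. In every case, the character before the + determines how the file is opened.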
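A brief sketch of how the character before the + still decides what happens at open time. The file name is hypothetical, and seek(0) is used to move back to the start before reading.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "plus.txt")  # hypothetical file

with open(path, "w") as f:
    f.write("abc")

# r+: the file must exist, nothing is cleared, writes overwrite in place.
with open(path, "r+") as f:
    before = f.read()
    f.seek(0)
    f.write("X")
    f.seek(0)
    after_r_plus = f.read()

# w+: the file is cleared on open, so an immediate read returns nothing.
with open(path, "w+") as f:
    empty = f.read()
    f.write("new")
    f.seek(0)
    after_w_plus = f.read()

# a+: content is kept and writes go to the end of the file.
with open(path, "a+") as f:
    f.write("!")
    f.seek(0)
    after_a_plus = f.read()

print(before, after_r_plus, repr(empty), after_w_plus, after_a_plus)
```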
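1.3 File pointer
The file pointer points to the current byte position. With mode=r the pointer starts at 0; with mode=a it starts at EOF. tell() returns the current position. seek(offset[, whence]) moves the pointer: offset is the number of bytes to move and whence is the position to move from. In text mode:
whence 0 (the default): from the start of the file; offset must be a non-negative integer
whence 1: from the current position; offset must be 0
whence 2: from EOF; offset must be 0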
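The pointer rules can be tried out directly. This sketch uses a hypothetical file holding six ASCII characters; the final binary-mode part is an addition not covered in the text above, included to show that the text-mode restrictions on whence 1 and 2 do not apply to binary streams.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pointer.txt")  # hypothetical file

with open(path, "w", encoding="utf-8") as f:
    f.write("abcdef")

# Append mode starts at EOF.
with open(path, "a", encoding="utf-8") as f:
    start_a = f.tell()

with open(path, "r", encoding="utf-8") as f:
    start_r = f.tell()          # read mode starts at 0
    f.seek(2)                   # whence defaults to 0: from the start
    third = f.read(1)
    f.seek(0, io.SEEK_CUR)      # whence 1 with offset 0 is allowed
    f.seek(0, io.SEEK_END)      # whence 2 with offset 0 is allowed
    end = f.tell()
    try:
        f.seek(-1, io.SEEK_END) # a nonzero offset is rejected in text mode
        nonzero_rejected = False
    except io.UnsupportedOperation:
        nonzero_rejected = True

# Binary mode accepts nonzero offsets relative to the current position or EOF.
with open(path, "rb") as f:
    f.seek(-2, io.SEEK_END)
    tail = f.read()

print(start_a, start_r, third, end, tail)
```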
2. Buffering
-1: use the default buffer size. In binary mode the size is io.DEFAULT_BUFFER_SIZE, commonly 4096 or 8192 bytes. In text mode, a terminal device is line buffered; anything else follows the binary-mode policy.
0: binary mode only; turns buffering off.
1: text mode only; uses line buffering, meaning the buffer is flushed when a newline is seen.
Greater than 1: specifies the size of the buffer in bytes.
(Storing many small files wastes space and makes them harder to find.)
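The effect of each buffering value can be observed by checking the file size on disk while the file is still open. This is a sketch with a hypothetical file name; exactly when a full buffer is written out is an implementation detail of the io module, so the comments only claim what is safe to rely on.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "buffer.bin")  # hypothetical file
print(io.DEFAULT_BUFFER_SIZE)

# buffering=0 is rejected in text mode.
try:
    open(path, "w", buffering=0)
    text_unbuffered_ok = True
except ValueError:
    text_unbuffered_ok = False

# buffering=0 in binary mode: every write goes straight to the file.
f = open(path, "wb", buffering=0)
f.write(b"a")
size_unbuffered = os.path.getsize(path)
f.close()

# buffering=1 in text mode: data is held until a newline is written.
f = open(path, "w", buffering=1)
f.write("partial")
size_before_newline = os.path.getsize(path)
f.write(" line\n")
size_after_newline = os.path.getsize(path)
f.close()

# buffering=16: a 16-byte buffer; small writes stay in memory.
f = open(path, "wb", buffering=16)
f.write(b"12345678")
size_small_write = os.path.getsize(path)
f.write(b"9" * 16)              # no longer fits, so buffered data is written out
size_after_overflow = os.path.getsize(path)
f.close()
size_final = os.path.getsize(path)

print(size_unbuffered, size_before_newline, size_after_newline)
print(size_small_write, size_after_overflow, size_final)
```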
A buffer is essentially a queue (FIFO), while a cache is essentially a dictionary. A buffer makes it convenient to process data in batches but is not suited to lookups; a cache is more efficient for lookups. A dictionary cache finds a value quickly by key: the key is hashed, the same key always produces the same hash, and the hash leads directly to where the value is stored, so the time complexity is O(1).
A buffer is a region of memory, generally organized as a FIFO (First In First Out) queue. When the buffer is full or reaches a threshold, its data is flushed to disk.
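The contrast between a buffer and a cache can be illustrated with a toy model. Everything here is hypothetical: ThresholdBuffer is not part of the io module, and a plain list stands in for the disk.

```python
from collections import deque


class ThresholdBuffer:
    """Toy FIFO buffer that flushes to a sink once a threshold is reached."""

    def __init__(self, threshold, sink):
        self.threshold = threshold
        self.sink = sink            # a list standing in for the disk
        self.queue = deque()

    def write(self, item):
        self.queue.append(item)
        if len(self.queue) >= self.threshold:
            self.flush()

    def flush(self):
        # First in, first out: the oldest items leave the buffer first.
        while self.queue:
            self.sink.append(self.queue.popleft())


disk = []
buf = ThresholdBuffer(threshold=3, sink=disk)
buf.write("a")
buf.write("b")
pending = list(disk)    # nothing has reached the sink yet
buf.write("c")          # threshold reached, the whole batch is flushed
flushed = list(disk)

# A cache is a lookup structure: the key leads directly to the value.
cache = {"a": 1, "b": 2}
value = cache["b"]

print(pending, flushed, value)
```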
flush(): writes the buffered data to disk
close(): calls flush() before closing the file
io.DEFAULT_BUFFER_SIZE: the default buffer size, in bytes
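As a quick check of flush() and close(), the following sketch watches the on-disk size of a hypothetical file while data is written with the default buffering.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "flush.txt")  # hypothetical file

f = open(path, "w", encoding="utf-8")
f.write("data")
size_before_flush = os.path.getsize(path)   # still held in the buffer

f.flush()
size_after_flush = os.path.getsize(path)    # now handed to the operating system

f.write("more")
f.close()                                   # close() flushes what is left
size_after_close = os.path.getsize(path)

print(size_before_flush, size_after_flush, size_after_close)
```

Using a with block gives the same guarantee, because leaving the block closes the file.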