Learning Python, module 2: file operations

File manipulation in Python

1. File reading, the difference between r and rb

See the following code:

file = open('test','r',encoding='UTF-8')
data = file.read()
file.close()

Look at the first line. open is the function that opens a file, and here it is called with three arguments. The first, 'test', is the file name, which can be a relative or an absolute path. The second, 'r', is the mode in which the file is opened; here it is read-only, and other modes include 'w', 'r+' and 'a'. The third, encoding='UTF-8', says that the text in the file is encoded as UTF-8.

The second line reads the whole file in one go and assigns the result to data, so at this point the entire content is held in memory.

The third line closes the file. Every file you open should be closed once you have finished with it.
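It is easy to forget close, especially when an error occurs between open and close. As a minimal sketch, the same read can be written with a with statement, which closes the file automatically. The scratch file created with tempfile is an assumption, used only so that the snippet runs on its own.

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "test")
with open(path, "w", encoding="utf-8") as f:
    f.write("hello\nworld\n")

# The with statement closes the file when the block ends,
# even if read() raises an exception.
with open(path, "r", encoding="utf-8") as file:
    data = file.read()

print(data)
print(file.closed)  # True
```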

See the following code:

file = open('test','rb')
data = file.read()
file.close()

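The biggest difference from the code above is r versus rb. With rb the file is opened in binary mode: read returns the raw bytes exactly as they are stored on disk, with no decoding, which is also why no encoding argument is passed. With r the file is opened in text mode: the bytes are decoded into str using the encoding you pass, or the platform default if you pass none.

Printing data from each of the two snippets makes the difference obvious.

Read with rb: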

b'\xe7\xbe\x8e\xe5\xa5\xb3\xe5\x95\x8a\r\n\xe5\xa4\xa7\xe7\xbe\x8e\xe5\xa5\xb3\xe5\x95\x8a\r\n\xe5\x93\x88\xe5\x93\x88\xe5\x93\x88\xe5\x93\x88\r\n'

The result is a bytes object. Bytes outside the printable ASCII range are shown as \x hexadecimal escapes.

Read with r:

美女啊
大美女啊
哈哈哈哈

Beyond the difference between these two modes, remember that both r and rb open the file for reading only. Calling write on a file opened this way raises an error.
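The relationship between the two modes can be checked directly. The sketch below assumes a scratch file holding the first line of the sample above; it shows that decoding the rb result by hand gives the same text as r (apart from the \r\n to \n translation that text mode performs), and that writing to a read-only file fails.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test")  # hypothetical scratch file
raw = b"\xe7\xbe\x8e\xe5\xa5\xb3\xe5\x95\x8a\r\n"
with open(path, "wb") as f:
    f.write(raw)

with open(path, "rb") as f:
    as_bytes = f.read()   # bytes, exactly as stored on disk
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()    # str, decoded from UTF-8

print(type(as_bytes), type(as_text))
# Decoding by hand matches text mode once \r\n is normalised to \n.
print(as_bytes.decode("utf-8").replace("\r\n", "\n") == as_text)  # True

# Both r and rb are read-only: write() is rejected.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.write("x")
    write_failed = False
except io.UnsupportedOperation as exc:
    write_failed = True
    print("cannot write:", exc)
```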

2. Looping over a file

See the following code:

file = open('test','r',encoding='utf-8')
data = file.readlines()
for i in data:
    print(i)
file.close()

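This is just a for loop over the list that readlines returns. Each line keeps its trailing newline, so print(i) shows a blank line between lines. The difference between read and readlines is covered in section 7.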
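A file object can also be iterated directly, which yields one line at a time without building the whole list first. The following is a small sketch with an assumed scratch file; end="" stops print from adding a second newline.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("line 1\nline 2\nline 3\n")

lines = []
with open(path, "r", encoding="utf-8") as file:
    for line in file:                    # one line per iteration
        print(line, end="")              # the line already ends with \n
        lines.append(line.rstrip("\n"))  # keep a copy without the newline
```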

3. File writing, the difference between w and wb

See the following code:

file = open('test2','w',encoding='utf-8') 
file.write("I am writing a file, now")
file.close()

Compared with the first snippet, the main difference in the first line is the second argument: w stands for write.

The second line writes my test sentence to the file.

The third line closes the file as usual.

Two points need attention when writing. First, w empties the file as soon as it is opened and then writes from scratch, so any existing content is lost; use it with caution. Second, if the file does not exist, it is created. This behaviour is sometimes relied on to make sure a file exists, but keep in mind that an existing file is emptied in the process.

As for w versus wb: with w you write str, which is encoded into bytes using the given or default encoding before it reaches the disk; with wb you write bytes objects, and they go to the disk unchanged.
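A short sketch of these points, using assumed scratch file names: writing a str with w and writing the manually encoded bytes with wb produce identical files, reopening with w empties the file, and the x mode (not covered above, but part of the built-in open) refuses to touch a file that already exists.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                    # hypothetical scratch folder
text_path = os.path.join(folder, "test2")
bin_path = os.path.join(folder, "test2_bin")

text = "I am writing a file, now"
with open(text_path, "w", encoding="utf-8") as f:
    f.write(text)                              # str, encoded by open()
with open(bin_path, "wb") as f:
    f.write(text.encode("utf-8"))              # bytes, encoded by hand

with open(text_path, "rb") as f1, open(bin_path, "rb") as f2:
    same_bytes = f1.read() == f2.read()
print(same_bytes)  # True

# Opening with w truncates immediately, even if nothing is written.
with open(text_path, "w", encoding="utf-8"):
    pass
size_after = os.path.getsize(text_path)
print(size_after)  # 0

# x creates a new file but fails instead of overwriting an existing one.
try:
    with open(text_path, "x", encoding="utf-8"):
        pass
    refused = False
except FileExistsError:
    refused = True
```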

4. Appending to a file

Appending means adding data to the end of an existing file.

See the following code:

file = open('test2','a',encoding='utf-8')
file.write("\nI am writing a file, now2")
file.close()

The main difference from the previous snippet is that the second argument is now a. Before appending, the file contains:

I am writing a file, now

After appending:

I am writing a file, now
I am writing a file, now2

There is also a difference between a and ab, and it is the same as before: a encodes the str you write using the specified encoding, while ab writes bytes directly.
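As a minimal sketch with an assumed scratch file, the snippet below appends once with a and once with ab; the third line is an addition made here only to show the binary variant.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test2")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("I am writing a file, now")

# a: pass str, it is encoded for you.
with open(path, "a", encoding="utf-8") as f:
    f.write("\nI am writing a file, now2")

# ab: pass bytes, they are written unchanged.
with open(path, "ab") as f:
    f.write("\nI am writing a file, now3".encode("utf-8"))

with open(path, "r", encoding="utf-8") as f:
    content = f.read()
print(content)
```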

5. The difference between r+ and w+

Besides the modes above, there are two more: w+ and r+.

w+ opens the file for both writing and reading.

The code is as follows:

file = open('test2','w+',encoding='utf-8') 
file.write("I am writing a file, 1\n")
file.write("I am writing a file, 2\n")
file.write("I am writing a file, 3\n")
file.close()

Text document before code execution:

I am writing a file, now
I am writing a file, now2

Text document after code execution:

I am writing a file, 1
I am writing a file, 2
I am writing a file, 3

Note that although the file can be read after opening it with w+, the behaviour of w still applies: the file is emptied on open. For that reason w+ is rarely used in practice.
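The emptying is easy to observe. In the sketch below (scratch file assumed), a read right after opening with w+ returns an empty string, and reading back what was written requires moving the cursor to the start with seek(0).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test2")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("old content\n")

with open(path, "w+", encoding="utf-8") as f:
    before = f.read()                  # '' because w+ has already emptied the file
    f.write("I am writing a file, 1\n")
    f.seek(0)                          # go back to the start before reading
    after = f.read()

print(repr(before))
print(repr(after))
```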

r+ opens the file for both reading and writing without emptying it. The file must already exist.

See the following code:

file = open('test2','r+',encoding='utf-8') 
file.write("I am writing a file, 5\n")
file.close()

So a file opened with r+ can be written as well as read. Where the text ends up depends on the cursor. Right after open the cursor is at the start of the file, so the write above overwrites the beginning of the file; here the new line has the same length as the old first line and simply replaces it. If you call read first, the cursor is at the end of the file and the write is appended. r+ is used fairly often. Given this behaviour, a natural idea is to move the cursor to a chosen position and write there in order to modify the file in place. The code is as follows:

file = open('test2','r+',encoding='utf-8')
file.seek(8)
file.write("我在写\n")
file.close()

The result is as follows (with \n line endings):

I am wri我在写
e, 5
I am writing a file, 2
I am writing a file, 3

The reason is that a write in the middle of a file overwrites the bytes already there; it never inserts. On disk a file is a fixed sequence of bytes, so unless the new text has exactly the same byte length as the text it replaces, leftovers of the old content remain, and if the cursor lands in the middle of a multi-byte character the result is garbled. Use this technique with caution; in general it is best avoided.
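The cursor behaviour of r+ can be sketched as follows, with an assumed scratch file and a small helper named content that exists only for this example. The last step shows one common alternative to seeking into the middle of a file: read everything, change the text in memory, write it back from the start and call truncate to cut off any leftovers.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test2")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("I am writing a file, 1\nI am writing a file, 2\n")

def content():
    """Helper for this example: return the whole file as str."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

# 1. Write immediately: the cursor is at 0, so the first line is overwritten.
with open(path, "r+", encoding="utf-8") as f:
    f.write("I am writing a file, 5\n")
after_overwrite = content()

# 2. Read first: the cursor is at the end, so the write is appended.
with open(path, "r+", encoding="utf-8") as f:
    f.read()
    f.write("I am writing a file, 6\n")
after_append = content()

# 3. Safer edit: change the text in memory, rewrite, then cut the leftovers.
with open(path, "r+", encoding="utf-8") as f:
    text = f.read().replace("I am writing a file, 2", "我在写")
    f.seek(0)
    f.write(text)
    f.truncate()   # the new text is shorter, so drop the old tail
after_edit = content()
print(after_edit)
```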

6. Other operations

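def fileno(self, *args, **kwargs): # real signature unknown
    """Return the index of the file handle in the kernel (the file descriptor);
    useful later for I/O multiplexing."""
def flush(self, *args, **kwargs): # real signature unknown
    """Force the contents of the in-memory buffer to be written to disk.
    Worth remembering: it is used fairly often, because some files have to be
    written out in real time, and that requires calling this method."""
def readable(self, *args, **kwargs): # real signature unknown
    """Return whether the file can be read."""
def readline(self, *args, **kwargs): # real signature unknown
    """Read a single line, stopping after the line ending."""
def seek(self, *args, **kwargs): # real signature unknown
    """Move the file cursor to the given position.
    Note: the offset is counted in bytes, and different encodings use a
    different number of bytes per character.
    Take '路飞学城': stored as gbk each character takes 2 bytes, as utf-8 it
    takes 3. With a gbk file, seek(4) puts the cursor between '飞' and '学'.
    With a utf-8 file, seek(4) lands inside the bytes of '飞'; printing then
    raises an error, because the remaining bytes are one byte short of a
    whole character and cannot be decoded as utf-8."""
def seekable(self, *args, **kwargs): # real signature unknown
    """Return whether seek can be used on the file."""
def tell(self, *args, **kwargs): # real signature unknown
    """Return the current position of the file cursor."""
def truncate(self, *args, **kwargs): # real signature unknown
    """Truncate the file to the given length.
    With a length, the file is cut to that many bytes counted from the start
    of the file; without one, everything from the current position to the end
    of the file is removed."""
def writable(self, *args, **kwargs): # real signature unknown
    """Return whether the file can be written."""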
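The byte-offset behaviour of seek and truncate described above can be tried out with the same example string. This is a minimal sketch on an assumed scratch file; it opens the file in binary mode so that the offsets are plainly byte offsets.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("路飞学城")          # 4 characters, 3 bytes each in UTF-8
    f.flush()                    # push the buffer to disk right away
    size_on_disk = os.path.getsize(path)

with open(path, "rb") as f:
    flags = (f.readable(), f.writable(), f.seekable())
    f.seek(3)                    # exactly after the first character
    position = f.tell()
    rest = f.read().decode("utf-8")

# seek(4) lands inside the bytes of the second character.
with open(path, "rb") as f:
    f.seek(4)
    try:
        f.read().decode("utf-8")
        broken = False
    except UnicodeDecodeError:
        broken = True

# truncate(6) keeps the first 6 bytes, i.e. the first two characters.
with open(path, "r+b") as f:
    f.truncate(6)
with open(path, "r", encoding="utf-8") as f:
    kept = f.read()

print(size_on_disk, flags, position, rest, broken, kept)
```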

7. The difference between read, readline and readlines

read: reads the entire file into memory in one call and returns it as a single str. This is fine for small files but not recommended for large ones.

readline: reads a single line, up to and including the newline that ends it, and returns it as a str. Other whitespace such as a tab does not end the line. At the end of the file it returns an empty string.

readlines: reads the entire file line by line, puts the lines into a list and returns that list. The code is as follows:

file = open('test','r+',encoding='utf-8')
data = file.readlines()
for i in data:
    print(i)
file.close()
print(type(data))

Result: <class 'list'>

You can clearly see that readlines returns a list.
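The three methods can be compared side by side. The sketch below assumes a small scratch file whose first line contains a tab, to show that only the newline ends a line.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test")  # hypothetical scratch file
with open(path, "w", encoding="utf-8") as f:
    f.write("a\tb\nc\nd\n")

with open(path, "r", encoding="utf-8") as f:
    whole = f.read()            # one str holding the entire file

with open(path, "r", encoding="utf-8") as f:
    first = f.readline()        # 'a\tb\n': the tab does not end the line
    second = f.readline()       # 'c\n'
    f.readline()                # 'd\n'
    at_end = f.readline()       # '' signals the end of the file

with open(path, "r", encoding="utf-8") as f:
    all_lines = f.readlines()   # ['a\tb\n', 'c\n', 'd\n']

print(type(whole), type(first), type(all_lines))
```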

8. Exercise questions and answers

Exercise 1 - Global Replacement Program:

  • Write a script that allows users to globally replace the contents of a specified file when executed as follows

      `python your_script.py old_str new_str filename`
    
  • After the replacement is completed, print how many places have been replaced

The code is as follows:
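One possible solution is sketched below. It is not a reference answer: the function names replace_in_file and main are made up for this sketch, the file is assumed to be UTF-8 text small enough to read into memory, and an empty old_str is rejected because it would match everywhere.

```python
import sys

def replace_in_file(filename, old_str, new_str):
    """Replace every old_str in the file and return the number of replacements."""
    if not old_str:
        raise ValueError("old_str must not be empty")
    with open(filename, "r", encoding="utf-8") as f:   # assumption: UTF-8 text
        text = f.read()
    count = text.count(old_str)
    if count:
        # Rewrite the file only when something actually changes.
        with open(filename, "w", encoding="utf-8") as f:
            f.write(text.replace(old_str, new_str))
    return count

def main(argv):
    if len(argv) != 4:
        print("usage: python your_script.py old_str new_str filename")
        return 1
    old_str, new_str, filename = argv[1:]
    count = replace_in_file(filename, old_str, new_str)
    print(f"replaced {count} occurrence(s)")
    return 0

if __name__ == "__main__":
    main(sys.argv)
```

Reading the whole file and writing it back with w keeps the sketch short; for very large files a line-by-line rewrite into a temporary file would be the more careful choice.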

Exercise 2 - Mock Login:

  • The user enters an account name and password to log in
  • User information is stored in a file
  • After three wrong passwords the account is locked, and on the next login attempt the program detects the lock and refuses the login.

The code is as follows:
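A minimal sketch of one way to approach this exercise follows. Everything about the storage layout is an assumption: a users file with one name,password pair per line, a separate lock file with one locked name per line, and plain-text passwords, which is acceptable for an exercise but not for real systems. The password prompt is passed in as a function so that the logic can be tried without typing.

```python
import os

MAX_TRIES = 3

def load_users(users_path):
    """Read 'name,password' lines into a dict (assumed file layout)."""
    users = {}
    with open(users_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                name, password = line.split(",", 1)
                users[name] = password
    return users

def load_locked(lock_path):
    """Return the set of locked names; a missing lock file means nobody is locked."""
    if not os.path.exists(lock_path):
        return set()
    with open(lock_path, "r", encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

def login(name, ask_password, users_path, lock_path):
    """Return 'ok', 'locked' or 'unknown'. ask_password() returns one attempt."""
    if name in load_locked(lock_path):
        return "locked"
    users = load_users(users_path)
    if name not in users:
        return "unknown"
    for _ in range(MAX_TRIES):
        if ask_password() == users[name]:
            return "ok"
    # Three wrong attempts: record the lock by appending to the lock file.
    with open(lock_path, "a", encoding="utf-8") as f:
        f.write(name + "\n")
    return "locked"

# Interactive use could look like this:
# login(input("user: "), lambda: input("password: "), "users.txt", "locked.txt")
```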

 

