Basic usage of file operation technology

1. Text files and binary files:

Text file: stores ordinary character text. Python 3 handles text as Unicode strings and converts them to and from bytes with a character encoding such as UTF-8 when writing and reading.
Binary file: the content is stored as raw bytes and cannot be read meaningfully in a text editor such as Notepad. Common examples: video, audio, image, and DOC files.
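
As a rough illustration of the difference, the sketch below writes one file in text mode and reads it back in both text and binary mode. The file name demo.txt and the temporary directory are assumptions made for the example, not part of the original notes.

```python
import os
import tempfile

# Hypothetical scratch file; the name is for illustration only.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: str in, str out; the encoding turns characters into bytes.
with open(path, "w", encoding="utf-8") as f:
    f.write("hi 小可爱")

with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()     # a str

# Binary mode: the raw bytes stored on disk, with no decoding.
with open(path, "rb") as f:
    as_bytes = f.read()    # a bytes object

print(type(as_text), len(as_text))    # length counted in characters
print(type(as_bytes), len(as_bytes))  # length counted in bytes
```

The same content has different lengths in the two views because each Chinese character occupies several bytes in UTF-8.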

2. File operation related modules

io: input and output of file streams
os: basic operating system functions
glob: finds file path names that match a given pattern
fnmatch: matches file names against a pattern
fileinput: processes multiple input files
filecmp: compares files
csv: processes CSV files
pickle: serialization and deserialization (cPickle exists only in Python 2)
xml: processes XML data
bz2, gzip, zipfile, zlib, tarfile: compress and decompress files in the corresponding formats
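
Two of these modules are not used later in this article, so here is a small sketch of how glob and fnmatch might be used. The directory and file names are made up for the example.

```python
import fnmatch
import glob
import os
import tempfile

# Hypothetical scratch directory with a few empty files.
folder = tempfile.mkdtemp()
for name in ("a.py", "b.py", "notes.txt"):
    open(os.path.join(folder, name), "w").close()

# glob: search the file system for paths that match a pattern.
py_paths = sorted(glob.glob(os.path.join(folder, "*.py")))
py_names = [os.path.basename(p) for p in py_paths]

# fnmatch: test names you already have against a pattern.
txt_names = fnmatch.filter(os.listdir(folder), "*.txt")

print(py_names)
print(txt_names)
```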

3. Create a file object

open(): open(filename[, mode]), for example f = open(r"d:\b.txt", "a")

If only a file name is given, it refers to a file in the current directory; a full path may also be given. The r prefix makes a raw string, so backslashes in the path do not need to be escaped.

Open modes:
- r: read mode (the default)
- w: write mode; creates the file if it does not exist, and overwrites the existing content if it does
- a: append mode; creates the file if it does not exist, and adds new content at the end if it does
- b: binary mode; without "b", the file is opened as a text file
- +: read and write mode
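
The effect of the modes listed above can be sketched as follows. The file modes.txt lives in a temporary directory and is only an assumption for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file

with open(path, "w", encoding="utf-8") as f:   # "w" creates the file
    f.write("first\n")

with open(path, "w", encoding="utf-8") as f:   # "w" again: old content is discarded
    f.write("second\n")

with open(path, "a", encoding="utf-8") as f:   # "a": new content goes to the end
    f.write("third\n")

with open(path, "r", encoding="utf-8") as f:   # "r": read the text back
    content = f.read()

with open(path, "rb") as f:                    # adding "b" returns raw bytes
    raw = f.read()

print(repr(content))
print(type(raw))
```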

4. Common encoding methods

ASCII: 7 bits per character; can represent only 128 characters
ISO8859-1: 8 bits (1 byte) per character; can represent 256 characters; compatible with ASCII
GBK, GB2312, GB18030: Chinese national standard encodings; a Chinese character usually takes 2 bytes
Unicode: the character set that Python 3 strings use by default; it assigns a number (code point) to each character, and encodings such as UTF-8 and UTF-16 define how those numbers are stored as bytes
UTF-8: variable-length encoding of Unicode, 1 to 4 bytes per character; an English letter takes 1 byte and a Chinese character usually takes 3 bytes
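
The byte counts above can be checked with str.encode(). The helper encoded_size below is a hypothetical name introduced only for this sketch.

```python
def encoded_size(text, encoding):
    """Return how many bytes text needs in encoding, or None if it cannot be encoded."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


for encoding in ("ascii", "iso-8859-1", "gbk", "utf-8"):
    for text in ("A", "中"):
        size = encoded_size(text, encoding)
        if size is None:
            print(f"{encoding:12} {text!r}: cannot be encoded")
        else:
            print(f"{encoding:12} {text!r}: {size} byte(s)")
```

ASCII and ISO8859-1 have no code for Chinese characters at all, which is why encoding fails there rather than producing more bytes.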

5. File writing steps

Create a file object
Write the data
Close the file object

Three common approaches are shown below:

filename = "my01.txt"
f = open(filename,'w', encoding='UTF-8')
s = "OLG\nHello\n小可爱\n"
f.write(s)
f.close()  # close the file stream

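f = None
try:
    f = open(r"test/my02.txt","a")
    text = "hello,my02"  # avoid naming a variable str, which would shadow the built-in
    f.write(text)  # goes to the buffer first, then to the file on flush or close
except OSError as e:
    print(e)
finally:
    if f is not None:  # open() may have failed, leaving nothing to close
        f.close()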

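# with statement (context manager): resources are managed automatically; however the with block is exited, the file is closed properly
s = ["小可爱\n","hello\n"]
with open(r"my03.txt","a",encoding="UTF-8") as f:
    f.writelines(s)  # write every string in the list; no newlines are added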

6. Reading of files

read([size]): reads size characters from the file and returns them. If size is omitted, the whole file is read. At the end of the file, an empty string is returned.
readline(): reads one line and returns it. At the end of the file, an empty string is returned.
readlines(): for a text file, returns a list in which each line is stored as a string.
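
A short sketch of the three methods working on the same file handle. The file lines.txt and its content are assumptions for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical file
with open(path, "w", encoding="utf-8") as f:
    f.write("one\ntwo\nthree\n")

with open(path, "r", encoding="utf-8") as f:
    head = f.read(2)              # the first 2 characters
    rest_of_line = f.readline()   # the remainder of the first line, including "\n"
    remaining = f.readlines()     # every line that is left, as a list
    at_end = f.read()             # already at the end of the file

print(repr(head), repr(rest_of_line), remaining, repr(at_end))
```

Each call continues from where the previous one stopped, because all of them share the same file pointer.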

with open("test/my01.txt",'r',encoding='UTF-8') as f:
    for a in f:
        print(a,end=" ")

Note: enumerate() pairs each item with its index, which can be used to add line numbers

a = ['a\n','b\n','c\n']
b = enumerate(a)
print(a)
print(list(b))  # [(0, 'a\n'), (1, 'b\n'), (2, 'c\n')]

# append the line number to each line
c = [temp.rstrip() + " # " + str(index) for index,temp in enumerate(a)]  # rstrip() removes the trailing \n
# print(c)

with open('caotest/my01.txt','r',encoding='utf-8') as f:
    lines = f.readlines()
    lines = [line.rstrip() + '#' + str(index) + '\n' for index,line in enumerate(lines)]
    print(lines)

7. Reading and writing of binary files

Open the file with mode wb, rb, or ab; data is then read and written as bytes rather than str. Everything else is the same as for text files.
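
As an illustration, a binary file can be copied chunk by chunk with rb and wb. The file names and the chunk size of 64 bytes are arbitrary choices for this sketch.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
src = os.path.join(folder, "source.bin")   # hypothetical names
dst = os.path.join(folder, "copy.bin")

payload = bytes(range(256))                # every possible byte value once
with open(src, "wb") as f:
    f.write(payload)

# Copy in fixed-size chunks so that large files need not fit in memory.
with open(src, "rb") as fin, open(dst, "wb") as fout:
    while True:
        chunk = fin.read(64)
        if not chunk:                      # empty bytes object: end of file
            break
        fout.write(chunk)

with open(dst, "rb") as f:
    copied = f.read()

print(copied == payload, os.path.getsize(dst))
```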

8. Common attributes and methods of file objects

Attributes:

name: the file name
mode: the mode the file was opened with
closed: whether the file has been closed

Common methods for file objects:

read([size])
readline()
readlines()
write(str): writes the string to the file
writelines(lines): writes a list of strings to the file; does not add newlines
seek(offset[, whence]): moves the file pointer to a new position; offset is measured relative to whence
- offset:
  - positive moves toward the end of the file; negative moves toward the start
- whence:
  - 0: from the beginning of the file (the default)
  - 1: from the current position
  - 2: from the end of the file
tell(): returns the current position of the file pointer
truncate([size]): wherever the pointer is, keeps only the first size bytes of the file and deletes the rest; if size is omitted, the current pointer position is used
flush(): writes the contents of the buffer to the file, but does not close the file
close(): writes the contents of the buffer to the file, closes the file, and releases the related resources
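
The pointer methods above can be sketched as follows. Binary mode is used here because text files only allow seeking relative to the start of the file (apart from a zero offset); the file name seek.bin is an assumption for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek.bin")  # hypothetical file

with open(path, "wb+") as f:       # binary read/write mode
    f.write(b"0123456789")
    end_pos = f.tell()             # pointer sits after the last byte written

    f.seek(3)                      # whence=0: 3 bytes from the start
    from_start = f.read(2)         # pointer is now at 5

    f.seek(2, 1)                   # whence=1: 2 bytes forward from the current position
    from_current = f.read(1)

    f.seek(-3, 2)                  # whence=2: 3 bytes before the end
    from_end = f.read()

    f.truncate(4)                  # keep only the first 4 bytes
    f.flush()                      # push buffered data to the file without closing it
    f.seek(0)
    after_truncate = f.read()

print(end_pos, from_start, from_current, from_end, after_truncate)
```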

with open('caotest/my01.txt','r',encoding='utf-8') as f:
    print('filename:{0}'.format(f.name))
    print(f.tell())  # current position of the file pointer
    print('content read: {0}'.format(str(f.readline())))
    print(f.tell())
    f.seek(5)  # move the file pointer
    print('content read: {0}'.format(str(f.readline())))

9. Serialize using pickle

In Python, everything is an object, which is essentially a "memory block for storing data".

Serialization: converts an object into a serialized (byte stream) form so that it can be stored on disk or sent over the network
Deserialization: the reverse process; serialized data that has been read is converted back into an object

pickle.dump(obj, file): obj is the object to serialize, and file is the file it is stored in (opened in binary write mode)

pickle.load(file): reads data from file and deserializes it into an object

import pickle

# serialize
with open(r'caotest/my07.txt','wb') as f:
    a1 = 'caoyh'
    a2 = 235
    a3 = [20,30,50]
    a4 = '小可爱'
    pickle.dump(a1,f)
    pickle.dump(a2, f)
    pickle.dump(a3, f)
    pickle.dump(a4,f)
# deserialize: each load() returns the next object, in the order they were dumped
with open('caotest/my07.txt','rb') as f:
    for _ in range(4):
        a = pickle.load(f)
        print(a)

10. CSV file operation

CSV (comma-separated values) is a delimited text format often used for data exchange and for importing and exporting Excel files and database data.
Unlike an Excel file, a CSV file:
- Has no value types; all values are strings
- Cannot specify styles such as font color
- Cannot specify cell height and width, and cannot merge cells
- Has no multiple worksheets
- Cannot embed images
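
The first point, that every value comes back as a string, can be seen in a small round trip. This sketch uses an in-memory io.StringIO buffer in place of a real file, and the column names are made up.

```python
import csv
import io

buffer = io.StringIO()                # in-memory stand-in for a CSV file
writer = csv.writer(buffer)
writer.writerow(["id", "name", "age"])
writer.writerow([5, "bb", 18])        # numbers go in...

buffer.seek(0)
rows = list(csv.reader(buffer))       # ...but strings come out

# Convert explicitly wherever a number is needed.
ages = [int(row[2]) for row in rows[1:]]

print(rows)
print(ages)
```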

import csv
with open('example-write.csv','r',encoding='utf-8',newline='') as f:
    a_csv = csv.reader(f)
    # print(a_csv)
    # print('*'*20)
    # print(list(a_csv))
    # print('*' * 20)
    for row in a_csv:
        print(row)

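# write a CSV file (newline='' leaves line endings to the csv module, which avoids blank rows on Windows)
with open('example.csv','w',newline='') as f: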
    b_csv = csv.writer(f)
    b_csv.writerow(["005","bb","18","1000"])

11. os and os.path modules

os.system() runs a system command directly, for example os.system("notepad.exe")

os.system() can also run the Windows ping command: os.system("ping www.baidu.com")

os module

import os
os.system('cmd')
# open an application directly (os.startfile is available on Windows only)
os.startfile(r'C:\Program Files (x86)\Sangfor\SSL\SangforCSClient.exe')

'''
Get information about files and directories
'''
print(os.name) # name of the operating system: Windows --> nt, Linux/Unix --> posix
print(os.sep)  # path separator: Windows --> \, Linux/Unix --> /
print(repr(os.linesep))  # line separator: Windows --> \r\n, Linux/Unix --> \n

print(os.stat('my01.txt')) # file information

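'''
Create directories, create nested directories, delete directories
'''
# return the current working directory
print(os.getcwd())
# create a subdirectory
# os.mkdir("bookcaoyh")
# change to a directory first, then create a subdirectory in it
os.chdir("D:/")
os.mkdir("caoyh")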

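'''
Create directories, create nested directories, delete directories
'''
os.mkdir("小可爱") # create a directory
os.rmdir("cao") # delete a directory (it must be empty)
os.makedirs("cao/y/h") # create nested directories
os.removedirs("cao/y/h") # delete nested directories; each one must be empty

os.makedirs("../cao/y") # ../ refers to the parent directory

os.rename("小可爱","cao") # rename a directory

dirs = os.listdir("caoyh") # list the immediate subdirectories and files
print(dirs)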
print(dirs)

os.path module

import os.path

#### relative paths are resolved against the current working directory
############ Checks: is absolute path, is directory, is file, exists ###################
print(os.path.isabs("d:/onedrive")) # True
print(os.path.isdir("d:/onedrive")) # True
print(os.path.isfile("d:/a.txt"))   # False
print(os.path.exists("d:/onedrive"))# True

############## Basic file information #############################
print(os.path.getsize("my01.txt")) # file size in bytes
print(os.path.abspath("my01.txt")) # absolute path of the file
print(os.path.dirname("my01.txt")) # directory part of the path (an empty string here)

################ Path operations ########################
path = os.path.abspath("my01.txt")
path2list = os.path.split(path)  # (directory, file name)
print(path2list)

print(os.path.splitext(path))  # (path without extension, extension)

print(os.path.join('caoyh','join'))

'''
Exercise: list all .py files in a given directory and print their names
'''
import os
path = os.getcwd()
file_list = os.listdir(path)
for filename in file_list:
    if filename.endswith(".py"):  # include the dot so that names such as "copy" do not match
        print(filename,end='\t')

print('\n'+"*"*20)

# the same result with a list comprehension
file_list2 = [filename for filename in os.listdir(path) if filename.endswith(".py")]
for f in file_list2:
    print(f,end='\t')

os.walk(): traverse all subdirectories and files recursively

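'''
Test os.walk(): recursively traverse all subdirectories and files
'''
path = os.getcwd()  # absolute path of the current working directory
print(path+'\n')
list_files = os.walk(path)

for dirpath,dirnames,filenames in list_files:
    for dirname in dirnames:  # avoid the name dir, which would shadow the built-in
        print(os.path.join(dirpath,dirname))
    # print('*'*20)
    for filename in filenames:
        print(os.path.join(dirpath,filename))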

12. shutil module (copy and compression)

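## shutil: copy and compress

import shutil

# copy a file
shutil.copyfile('1.txt','1_copy.txt')

# copy a directory and its contents; works only once, and raises an error if the destination already exists
shutil.copytree('movie','example')

# compress and decompress
shutil.make_archive('example/haha','zip','movie') # zip the contents of the movie directory into example/haha.zip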

import zipfile
# z1 = zipfile.ZipFile('z1.zip','w')
# z1.write('1.txt')
# z1.write('1_copy.txt')
# z1.close()

z2 = zipfile.ZipFile('z1.zip','r')
z2.extractall('z2')
z2.close()
