Python office automation (1)

The code in this article is implemented with reference to other tutorial books.

file read and write

open function

The open function has 8 parameters, of which the first 4 are the most commonly used. Every parameter except file has a default value. file specifies the file to open and should include its path; a relative path (or a bare file name) is resolved against the current working directory, which is often, but not always, the folder containing the script. buffering sets the buffering policy; the default value -1 means the system's default buffering mechanism is used. Reading and writing files means interacting with the disk, and buffering reduces the number of physical disk operations by collecting data in memory and transferring it in larger chunks. encoding specifies the text encoding of the file, such as GBK or UTF-8. If encoding is omitted, Python uses the platform's locale encoding (for example GBK on a Chinese-language Windows system), which is not necessarily UTF-8, so it is safest to pass it explicitly. When an opened file shows garbled characters, the usual cause is that the encoding argument differs from the encoding used when the file was created.
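
To make the encoding point concrete, here is a minimal sketch. It assumes a throwaway file in a temporary directory (the name encoding_demo.txt is made up for illustration): the file is written as GBK, then read back once with the matching encoding and once with a mismatched one.

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory (not part of the article's files)
path = os.path.join(tempfile.mkdtemp(), 'encoding_demo.txt')

text = '中文'
with open(path, mode='w', encoding='gbk') as f:
    f.write(text)

# Reading with the encoding the file was written in round-trips correctly
with open(path, mode='r', encoding='gbk') as f:
    same = f.read()

# Reading with a different encoding fails here; for other byte sequences
# it can also succeed silently and return garbled text
try:
    with open(path, mode='r', encoding='utf-8') as f:
        f.read()
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

print(same == text, mismatch_failed)
```

Passing encoding explicitly on both the writing and the reading side avoids depending on the platform default.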

mode specifies how the file is opened. The basic modes are r, w, and a, for read, write, and append. The modifiers b, t, and + select binary mode, text mode, and read-write mode. A modifier must be combined with a basic mode: for example, "rb" opens a file in binary read-only mode, and "rb+" opens a file in binary read-write mode.

Be very careful with any mode that contains w: it empties an existing file as soon as the file is opened, without any prompt. For any mode that contains r, the file must already exist; otherwise a FileNotFoundError is raised.
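
The behaviour of the three basic modes can be checked with a small sketch. The file name mode_demo.txt and the temporary directory are assumptions made for this illustration only.

```python
import os
import tempfile

# Hypothetical demo file; it does not exist yet
path = os.path.join(tempfile.mkdtemp(), 'mode_demo.txt')

# 'r' on a missing file raises FileNotFoundError
try:
    open(path, mode='r', encoding='utf-8')
    missing_raises = False
except FileNotFoundError:
    missing_raises = True

with open(path, mode='w', encoding='utf-8') as f:   # 'w' creates the file
    f.write('first\n')
with open(path, mode='a', encoding='utf-8') as f:   # 'a' keeps the old content
    f.write('second\n')
with open(path, mode='r', encoding='utf-8') as f:
    after_append = f.read()

with open(path, mode='w', encoding='utf-8') as f:   # 'w' silently discards the old content
    f.write('third\n')
with open(path, mode='r', encoding='utf-8') as f:
    after_rewrite = f.read()

print(missing_raises, repr(after_append), repr(after_rewrite))
```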

Create a new text file python_zen.txt and paste in the text of the Zen of Python (the text printed by import this). Save it as UTF-8 without a BOM.

Common file object methods and what they do

Method       Description
read         Read the file into a string; an optional size limits how much is read (characters in text mode, bytes in binary mode)
readline     Read one line from the file into a string
readlines    Read the entire file into a list, one item per line
write        Write a string to the file
writelines   Write a list of strings to the file (no line breaks are added)
close        Close the file
flush        Write the contents of the buffer to disk
tell         Return the current position in the file, measured from the beginning of the file
__next__     Return the next line and advance the position; in Python 3 it is called through next(f) or a for loop (Python 2 exposed it as the next method)
seek         Move the file pointer to the specified position
truncate     Cut the file off at the given size (by default, at the current position)
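
Several of these methods can be tried together in a short sketch. Two assumptions: the file name methods_demo.bin is invented for the example, and binary mode is used so that tell and seek work with plain byte offsets (in text mode the value returned by tell is an opaque number).

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'methods_demo.bin')

with open(path, mode='wb') as f:
    # writelines does not add line breaks, so each item carries its own
    f.writelines([b'one\n', b'two\n', b'three\n'])

with open(path, mode='rb+') as f:
    first = f.readline()      # b'one\n'
    pos = f.tell()            # 4: the position just after the first line
    rest = f.readlines()      # [b'two\n', b'three\n']
    f.seek(pos)               # go back to the end of the first line
    f.truncate()              # cut the file off at the current position
    f.seek(0)
    remaining = f.read()      # only b'one\n' is left

print(first, pos, rest, remaining)
```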

read text file

# Open the file with the open function
f=open('./python_zen.txt',mode='r',encoding='utf-8')
type(f)  # check the type
_io.TextIOWrapper
# Read the whole file into a string with the read method
texts=f.read()
print(texts)  # print the entire content of the file

f.seek(0)  # move the file pointer back to the start of the file
0
# Read one line of the file into a string with the readline method
texts=f.readline()
print(texts)

# Keep reading with the readline method
texts=f.readline()
print(texts)  # the second line, which is empty

# Keep reading with the readline method
texts=f.readline()
print(texts)  # the third line

# readline reads only one line per call; to process every line,
# iterate over the file object with a for loop, which yields one line at a time
f.seek(0)
for line in f:
    print(line,end='')

# Effect of the readlines method
f.seek(0)
texts=f.readlines()
print(texts)

readlines reads the entire file at once and splits its content into a list of lines, each keeping its trailing newline.
After reading, call the close method to close the file.

f.close()

Whenever a file is read or written in Python, the close method should be called once the work is done. For reading, closing releases the system resources (such as the file handle) held by the open file; for writing, it ensures that all content is actually written to the target file. Sometimes we forget to call close, or the code raises an error partway through so that close never runs. To avoid this, a try...finally... structure can be used.

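# Open the file before entering try: if open itself fails, there is nothing to close
f=open(r'./python_zen.txt','r',encoding='utf-8')
try:
    ...
finally:
    f.close()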

In short: whether or not an exception occurs, the statements in the finally block are executed when the try block is left.

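# Alternatively, use the with statement (a context manager), which ensures that the
# necessary cleanup runs and the resource is released whether or not an exception occurs.
with open(r'./python_zen.txt','r',encoding='utf-8') as f:
    texts=f.read()
    ...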
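
That the with statement closes the file even when an error occurs can be verified with a small sketch. The file with_demo.txt and the simulated ValueError are assumptions made purely for the demonstration.

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, mode='w', encoding='utf-8') as f:
    f.write('Readability counts.\n')

try:
    with open(path, mode='r', encoding='utf-8') as f:
        first_line = f.readline()
        # Simulate a failure in the middle of processing
        raise ValueError('simulated failure while processing the file')
except ValueError:
    pass

# The with block was left through an exception, yet the file is closed
print(f.closed)
```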

Count the frequency of words

from collections import Counter
import string
lists=[]
# Characters stripped from both ends of every word (ASCII and Chinese punctuation)
punctuation=string.punctuation+',。!?、()【】《》—“”'
with open('./python_zen.txt',mode='r',encoding='utf-8') as f:
    for line in f:
        for word in line.split():  # to count letters instead, loop over line directly
            word=word.strip(punctuation)
            if word:  # skip items that consisted only of punctuation
                lists.append(word)
counter=Counter(lists)
print(counter)

write to text file

# Write a text file
f=open(r'./python_zen_write.txt',mode='w',encoding='utf-8')
# Placing the first and last lines of text right next to the quotes avoids extra blank lines
f.write(
'''The Zen of Python, by me

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.'''
)
f.close()


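# Sometimes content has to be written step by step, one sentence at a time.
# Mode w cannot be used for this because it overwrites the earlier text; use append mode a instead.
f=open(r'./python_zen_write.txt',mode='a',encoding='utf-8')
f.write('这是python之禅的内容')
# Let's see what happens if f.close() is not run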

Open python_zen_write.txt to check: the content we wanted to append has not been written.
When a file is written, the data is usually not sent to the hard disk immediately. It is first held in an in-memory buffer and written out later in larger pieces. Calling close (or flush) forces all data still in the buffer to be written out. If close is never called, the file may exist while some or all of the data is missing from it.
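
The effect of the buffer can be observed by checking the file size before and after flushing. This is a sketch under assumptions: buffer_demo.txt is a made-up file name, and the size seen before flush is typically 0 for a short write with default buffering but is not guaranteed.

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'buffer_demo.txt')

f = open(path, mode='w', encoding='utf-8')
f.write('Readability counts.')
size_before_flush = os.path.getsize(path)   # typically 0: the text is still in the buffer

f.flush()                                   # hand the buffered text over to the operating system
size_after_flush = os.path.getsize(path)    # 19: all 19 bytes are now in the file

f.close()
print(size_before_flush, size_after_flush)
```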

# Use the flush method to force the buffered data to be written to the file
f.flush()

The appended content has now been written, but it does not start on a new line. To put it on its own line, add \n at the beginning of the string being written.

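# Insert a sentence at the beginning of the file
# file.seek(off, whence)
# moves off bytes relative to whence (0 = start of file, 1 = current position, 2 = end of file)
# r+ is the read-write mode introduced at the beginning of the article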
with open('./python_zen_write.txt',mode='r+',encoding='utf-8') as f:
    content=f.read()
    f.seek(0,0)
    f.write('开始位置:python之禅\n'+content)
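
The three whence values are easiest to see in binary mode, where offsets are plain byte counts. In text mode, seeking relative to the current position or the end only accepts an offset of 0, as in the seek(0,2) call used in this article. The file seek_demo.bin below is a made-up name for this sketch.

```python
import os
import tempfile

# Hypothetical demo file containing ten known bytes
path = os.path.join(tempfile.mkdtemp(), 'seek_demo.bin')
with open(path, mode='wb') as f:
    f.write(b'0123456789')

with open(path, mode='rb') as f:
    f.seek(3, 0)          # whence=0: 3 bytes from the start of the file
    a = f.read(2)         # b'34', the position is now 5
    f.seek(2, 1)          # whence=1: 2 bytes forward from the current position
    b = f.read(1)         # b'7'
    f.seek(-3, 2)         # whence=2: 3 bytes before the end of the file
    c = f.read()          # b'789'
    end = f.tell()        # 10, the end of the file

print(a, b, c, end)
```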

Try appending at the end:

with open('./python_zen_write.txt',mode='r+',encoding='utf-8') as f:
    f.seek(0,2)
    f.write('\n末尾位置:结束语')


File and Directory Operations

use os library

import os

Commonly used functions

Function     Description
getcwd       Return the current working directory (the directory the process is running in, which is not necessarily where the script is stored)
listdir      List all files and subdirectories in the specified directory, including hidden files
mkdir        Create a directory
unlink       Delete a file
remove       Delete a file (same as unlink)
rmdir        Delete an empty directory
removedirs   Delete an empty directory, then move up and delete each parent directory that has become empty
rename       Rename a file or directory
stat         Get the attributes and status information of a file
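
A few of these functions can be combined in a short sketch that works entirely inside a temporary directory. The names demo_dir, draft.txt, and final.txt are assumptions chosen for the example.

```python
import os
import tempfile

base = tempfile.mkdtemp()
work = os.path.join(base, 'demo_dir')    # hypothetical directory name
os.mkdir(work)                           # create the directory

old = os.path.join(work, 'draft.txt')    # hypothetical file names
new = os.path.join(work, 'final.txt')
with open(old, mode='w', encoding='utf-8') as f:
    f.write('hello')

os.rename(old, new)                      # rename draft.txt to final.txt
listing = os.listdir(work)               # ['final.txt']
size = os.stat(new).st_size              # 5 bytes

os.remove(new)                           # delete the file
os.rmdir(work)                           # the directory is empty now, so rmdir succeeds
gone = not os.path.exists(work)

print(listing, size, gone)
```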

On Windows, os.path is implemented by the ntpath module (ntpath.py); on Unix-like systems it is implemented by posixpath.

os.path

Commonly used functions

Function     Description
abspath      Return the normalized absolute path
basename     Return the final component of the path (the file name)
dirname      Return the directory part of the path
split        Split the path into a (directory, file name) pair
splitext     Split off the extension
join         Join several path components; if a component is an absolute path, everything before it is discarded and joining restarts from that component
getctime     On Windows, the time the file or directory was created (for a copied file, the time of copying); on Unix, the time of the last metadata change
getatime     Last access time; reading the content of the file may update it
getmtime     Last modification time; updated whenever the content of the file is modified
getsize      Return the file size in bytes
isabs        Return True if path is an absolute path
exists       Return True if path exists, otherwise False
isdir        Return True if path is an existing directory, otherwise False
isfile       Return True if path is an existing file, otherwise False

os.getcwd()  # current working directory


# Change the working directory
os.chdir('D:\\Anaconda3\\AnacondaProjects')
print(os.getcwd())
os.chdir('D:\\Anaconda3\\AnacondaProjects\\python自动化办公')
print(os.getcwd())

os.listdir()  # list all files and subdirectories in the current working directory

# Walk a directory tree
# os.listdir() does not return the files inside subdirectories;
# to reach them, use os.walk.
path=r'D:\Anaconda3\AnacondaProjects\python自动化办公'
for foldName,subfolders,filenames in os.walk(path):
    for filename in filenames:
        print(foldName,filename)  # foldName: the directory, filename: the file name

# Split an absolute file path into its parts
path=r'D:\Anaconda3\AnacondaProjects\python自动化办公\python_zen.txt'
print(os.path.split(path))
print(os.path.dirname(path))
print(os.path.basename(path))
print(os.path.splitext(path))

# Build a file path from its parts
print(os.path.join(os.getcwd(),os.path.basename(path)))
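
The path functions can also be exercised without depending on a particular drive layout. The sketch below assumes a made-up file report.final.txt in a temporary directory and shows how splitext treats a name with several dots, what the is* checks return, and how join behaves when a later component is an absolute path.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
file_path = os.path.join(folder, 'report.final.txt')   # hypothetical file name
with open(file_path, mode='w', encoding='utf-8') as f:
    f.write('x')

directory, name = os.path.split(file_path)    # (folder, 'report.final.txt')
stem, extension = os.path.splitext(name)      # only the last dot starts the extension

checks = (
    os.path.isabs(file_path),    # True: the temporary path is absolute
    os.path.exists(file_path),   # True
    os.path.isfile(file_path),   # True
    os.path.isdir(folder),       # True
    os.path.isdir(file_path),    # False: it is a file, not a directory
)

# An absolute component makes join discard everything before it
rejoined = os.path.join('ignored', 'also_ignored', file_path)

print(name, stem, extension, checks, rejoined == file_path)
```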


# Get file attributes
path=r'D:\Anaconda3\AnacondaProjects\python自动化办公\python_zen.txt'
print(os.path.getctime(path))  # creation time
print(os.path.getmtime(path))  # modification time
print(os.path.getatime(path))  # access time

These values are timestamps: the number of seconds that have passed since January 1, 1970. To convert them into a readable time, use the time module.

import time
print(time.ctime(os.path.getctime(path)))  # creation time
print(time.ctime(os.path.getmtime(path)))  # modification time
print(time.ctime(os.path.getatime(path)))  # access time

The creation time here is not necessarily the time when the content of the file was first written. If the file was copied from somewhere else, it is the time of copying.
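
If the fixed layout of time.ctime is not what you want, the timestamp can be formatted explicitly. The helper format_timestamp below is a hypothetical name introduced for this sketch; it relies only on time.localtime, time.gmtime, and time.strftime.

```python
import time

def format_timestamp(seconds, utc=False):
    """Format seconds since January 1, 1970 as 'YYYY-MM-DD HH:MM:SS'."""
    parts = time.gmtime(seconds) if utc else time.localtime(seconds)
    return time.strftime('%Y-%m-%d %H:%M:%S', parts)

# Timestamp 0 is the starting point itself when shown in UTC
epoch_utc = format_timestamp(0, utc=True)
print(epoch_utc)

# The current time in the local time zone
print(format_timestamp(time.time()))
```

The same helper could be applied to the values returned by os.path.getctime, os.path.getmtime, and os.path.getatime.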

print(os.path.getsize(path))  # file size in bytes

# The stat function returns the attributes and status information of a file
print(os.stat(path))

# Print the names of files that are larger than 0 bytes and have the .txt extension
for file in os.listdir():
    path=os.path.abspath(file)
    filesize=os.path.getsize(path)
    if filesize>0 and os.path.splitext(path)[-1]=='.txt':
        print(os.path.basename(path))

Similarly, files that meet certain conditions can be deleted with os.remove(file).
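
As an illustration of deleting by condition, the sketch below removes empty .txt files. It runs in a temporary directory with three made-up files, so nothing outside that directory is touched; the condition itself is an assumption chosen for the example.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
# Hypothetical test files: (name, content)
samples = [('empty.txt', ''), ('notes.txt', 'some text'), ('empty.log', '')]
for name, content in samples:
    with open(os.path.join(folder, name), mode='w', encoding='utf-8') as f:
        f.write(content)

for name in os.listdir(folder):
    path = os.path.join(folder, name)
    is_empty_txt = (
        os.path.isfile(path)
        and os.path.getsize(path) == 0
        and os.path.splitext(name)[1] == '.txt'
    )
    if is_empty_txt:
        os.remove(path)          # deletion is permanent, so test the condition first

remaining = sorted(os.listdir(folder))
print(remaining)
```

Because os.remove does not move files to a recycle bin, it is worth printing the matching names first and only then switching the print call to os.remove.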

# Create a new text file
with open('new.txt','w',encoding='utf-8') as f:
    f.write('一个新的txt文件')
for foldName,subfolders,filenames in os.walk(os.getcwd()):
    print('foldName:',foldName,'\n','subfolders:',subfolders,'\n','filenames:',filenames)

# Rename every new.txt in the current directory and its subdirectories to new2023.txt
for foldName,subfolders,filenames in os.walk(os.getcwd()):
    for filename in filenames:
        # Without this condition every file would be renamed; other conditions can be added as well
        if filename=='new.txt':
            abspath=os.path.join(foldName,filename)
            extension=os.path.splitext(abspath)[1]
            new_name=filename.replace(extension,'2023'+extension)
            os.rename(abspath,os.path.join(foldName,new_name))
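
One caveat if the filter is removed: str.replace changes every occurrence of the extension text, so a name such as notes.txt.txt would be altered in two places. A possible alternative, sketched here with a hypothetical helper called add_suffix, builds the new name from the two parts returned by os.path.splitext.

```python
import os

def add_suffix(filename, suffix):
    """Insert suffix between the base name and the extension."""
    stem, extension = os.path.splitext(filename)
    return stem + suffix + extension

print(add_suffix('new.txt', '2023'))          # new2023.txt
print(add_suffix('notes.txt.txt', '2023'))    # notes.txt2023.txt
print(add_suffix('README', '2023'))           # README2023
```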

Use the shutil library

This topic is continued in the next article, Python office automation (2).

Origin blog.csdn.net/weixin_46322367/article/details/129467780