[Study Notes] Python Office Automation - Task 01: File Automation & Automatic Email Sending

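In this round of study I tried to practice rather than copy: writing code to solve concrete problems, and getting fluent with the basic operations along the way.
These notes record my approach to each exercise, the problems I ran into, and how I solved them.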

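Problem 1: Generating random quiz files
Suppose you are a geography teacher with 35 students in your class, and you want to give a quiz on US state capitals. Unfortunately, there are a few troublemakers in the class, and you can't trust the students not to cheat. You'd like to randomize the order of the questions so that every quiz is unique, which makes it impossible for anyone to copy answers from anyone else. Of course, doing this by hand would be slow and boring. Fortunately, you know some Python.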

import os
import random

# Dictionary mapping each US state to its capital
capitals = {
    
    'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois':
'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas':
'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine':
'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan':
'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri':
'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada':
'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'New Mexico': 'Santa Fe', 'New York': 'Albany', 'North Carolina': 'Raleigh',
'North Dakota': 'Bismarck', 'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City',
'Oregon': 'Salem', 'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee':
'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont':
'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia', 'West Virginia': 'Charleston', 'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

# Get the list of state names and the list of their capitals
questions = list(capitals.keys())
answers = list(capitals.values())

# Create the folders that hold the papers and the answer keys
os.makedirs('./papers', exist_ok=True)
os.makedirs('./keys', exist_ok=True)

# Generate a paper for student number stu (1-35)
for stu in range(1, 36):
    random.shuffle(questions)
    # Create the paper file paper[stu].txt and the matching answer file key[stu].txt
    paper_name = 'paper[' + str(stu) + '].txt'
    key_name = 'key[' + str(stu) + '].txt'
    paper = open(os.path.join('./papers', paper_name), 'w', encoding='utf-8')
    key = open(os.path.join('./keys', key_name), 'w', encoding='utf-8')
    # Write the 50 questions and their options to the paper
    for i in range(1, 51):
        paper.write(str(i) + '. What is the capital of ' + questions[i-1] + '?\n')  # question i
        sele_num = ['A', 'B', 'C', 'D']             # option labels
        selections = []                             # option contents
        right_answer = capitals[questions[i-1]]     # correct answer
        selections.append(right_answer)             # put the correct answer in the option list

        # # Print the question and the correct answer to the console
        # print(str(i) + '. What is the capital of ' + questions[i-1] + '?')
        # print('answer is:' + right_answer + '\n')

        # Shuffle the answer list
        random.shuffle(answers)
        # Add 3 random wrong answers to the option list
        for candidate in answers:
            if len(selections) == 4:
                break
            if candidate != right_answer:
                selections.append(candidate)
        random.shuffle(selections)  # shuffle the order of the options
        right_answer_num = selections.index(right_answer)   # position of the correct answer
        # Write the 4 options
        for n in range(4):
            paper.write(sele_num[n] + '.' + selections[n] + '\n')
        # Write the answer to question i to the answer file
        key.write('Answer to question ' + str(i) + ': ' + sele_num[right_answer_num] + '\t' + right_answer + '\n')
    # Close the paper and answer files
    paper.close()
    key.close()

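1. Approach:

(1) First build two lists: the questions (state names) and the answers (capital names).
(2) Before generating each paper, shuffle the question list, then write out the 50 questions in that order.
(3) For each question, shuffle the answer list and build an option list of 4 answers: the correct one is looked up in the dictionary, and 3 wrong ones are taken from the answer list. Then shuffle the option list.
(4) Find the position of the correct answer in the option list and write it to the answer file for that paper.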
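
Step (3) can also be written more compactly with random.sample. The following is only a sketch of that alternative, not the code used above; the function name build_options is made up for illustration, and it assumes the dictionary holds at least 4 distinct capitals.

```python
import random

def build_options(capitals, state):
    """Return (options, letter): 4 shuffled options and the letter of the correct one.

    Hypothetical helper; assumes capitals has at least 4 distinct values.
    """
    correct = capitals[state]
    # Every capital except the correct one is a candidate wrong answer
    wrong_pool = [c for c in capitals.values() if c != correct]
    # random.sample draws 3 distinct wrong answers without replacement
    options = random.sample(wrong_pool, 3) + [correct]
    random.shuffle(options)
    return options, 'ABCD'[options.index(correct)]
```

Compared with shuffling the whole answer list for every question, this draws only the 3 wrong answers that are needed, and the returned letter can be written directly to the answer file.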

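2. Forward slashes and backslashes as separators in local paths:

(1) Forward slash: `/` (think of a line with positive slope) works as a path separator on both Windows and Unix.
Backslash: `\` (think of a line with negative slope) is a path separator only on Windows.
(2) In Python you can write every path with forward slashes, e.g. `D:/Python/data`, and the same style works on both Windows and Unix. If you write a Windows path with backslashes in an ordinary string, you have to double each one, e.g. `D:\\Python\\data`, where the first backslash of each pair escapes the second.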
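
A small check of these rules, using the standard-library pathlib "pure" path classes so that it runs on any platform without touching the disk. The variable names are only illustrative, and the raw-string form r'...' is an extra option not covered in the note above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# A doubled backslash in a normal string is one backslash in the actual value;
# a raw string (r'...') spells the same value without doubling
escaped = 'D:\\Python\\data'
raw = r'D:\Python\data'
same_string = (escaped == raw)

# Under Windows path rules, forward and backward slashes both separate components
win_forward = PureWindowsPath('D:/Python/data')
win_backward = PureWindowsPath(raw)
same_windows_path = (win_forward == win_backward)

# Under Unix path rules, a backslash is an ordinary character, not a separator
posix_parts = PurePosixPath(raw).parts
```

Here same_string and same_windows_path are both True, while posix_parts contains a single component, which is why backslash paths do not carry over to Unix.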

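Problem 2: Copying files with a given extension
Write a program that walks a directory tree and looks for files with a particular extension (such as .pdf or .jpg). Wherever these files are located, copy them into a new folder.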

import os
import shutil

rootPath = 'F:\\pycharm\\work\\OfficeAutomation'    # root directory to walk
fileType = 'txt'                                    # extension of the files to copy
destination = 'F:\\copy'                            # folder the files are copied into
os.makedirs(destination, exist_ok=True)             # make sure the destination folder exists
for curFolder, subFolder, files in os.walk(rootPath):
    # print('The current folder is ' + curFolder)
    for file in files:
        if file.split('.')[-1] == fileType:
            print("copy " + os.path.join(curFolder, file) + " to " + destination + "...")
            shutil.copy(os.path.join(curFolder, file), destination)

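1. Approach:

(1) Use os.walk() to visit every file under the root directory and its subdirectories.
(2) Use the string method split to separate out the extension; if it is the one we want, copy the file to the destination with shutil.copy.

2. The destination folder must exist before shutil.copy is called, whether it is created by hand or with os.makedirs(destination, exist_ok=True). If it does not exist, shutil.copy treats destination as a file name, and the copies end up as a single unknown file with no extension.

3. Do not put destination under rootPath. Otherwise the files that were just copied in are walked as well, so the program tries to copy a file onto itself and raises shutil.SameFileError.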
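
If the destination does have to live under the root, one way around note 3 is to remove it from the walk. This is a sketch under my own assumptions, not the code used above: the function name copy_by_extension is made up, it uses os.path.splitext instead of split, and it relies on the documented behaviour that editing the directory list in place prunes os.walk (top-down mode). Files with the same name in different folders overwrite each other in the destination.

```python
import os
import shutil

def copy_by_extension(root, extension, destination):
    """Copy every file under root with the given extension (e.g. '.txt') into destination.

    Hypothetical helper; returns the sorted names of the copied files.
    """
    os.makedirs(destination, exist_ok=True)
    dest_abs = os.path.abspath(destination)
    copied = []
    for cur_folder, sub_folders, files in os.walk(root):
        # Editing sub_folders in place tells os.walk not to descend into the
        # destination, so freshly copied files are never visited again
        sub_folders[:] = [d for d in sub_folders
                          if os.path.abspath(os.path.join(cur_folder, d)) != dest_abs]
        for name in files:
            # splitext returns (stem, extension); the extension keeps its leading dot
            if os.path.splitext(name)[1].lower() == extension.lower():
                shutil.copy(os.path.join(cur_folder, name), dest_abs)
                copied.append(name)
    return sorted(copied)
```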

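Problem 3: It is not unusual for a few unneeded but enormous files or folders to take up most of the space on a hard drive. If you are trying to free up space on your computer, deleting the unwanted huge files gives the best result. But first you have to find them. Write a program that walks a directory tree and looks for exceptionally large files or folders, say, files larger than 100MB (recall that you can get a file's size with os.path.getsize() from the os module). Print the absolute paths of these files to the screen.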

import os
import send2trash

rootPath = 'F:\\pycharm\\work\\OfficeAutomation'    # root directory to walk
threshold = 100 * 2**20                             # size threshold in bytes (100 MB); larger files are deleted

for curFolder, subFolder, files in os.walk(rootPath):
    for file in files:
        filePath = os.path.join(curFolder, file)
        fileSize = os.path.getsize(filePath)        # size in bytes
        if fileSize > threshold:
            print("The size of " + filePath + " is " + str(fileSize) + ", which > " + str(threshold) + ", deleting...")
            send2trash.send2trash(filePath)         # move to the recycle bin instead of deleting permanently

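1. Approach:

(1) Use os.walk() to visit every file under the root directory and its subdirectories.
(2) Use os.path.getsize() to compare each file's size with the threshold; if the file is larger, print its path and delete it.

2. os.path.getsize() returns the file size in bytes. With the unit we normally use, 1 MB = 2^10 KB = 2^20 bytes, so pay attention to the conversion when comparing.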
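
Since the exercise itself only asks for the paths to be printed, a report-only variant is safer to run first. The sketch below is one possible version under my own assumptions: the function name find_large_files and the constant MB are made up for illustration, and files whose size cannot be read are simply skipped.

```python
import os

MB = 2 ** 20  # bytes per megabyte, using the 1 MB = 2^20 bytes convention above

def find_large_files(root, threshold_bytes):
    """Return (absolute_path, size_in_bytes) for files larger than the threshold, largest first.

    Hypothetical helper; nothing is deleted.
    """
    large = []
    for cur_folder, _sub_folders, files in os.walk(root):
        for name in files:
            path = os.path.abspath(os.path.join(cur_folder, name))
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if size > threshold_bytes:
                large.append((path, size))
    return sorted(large, key=lambda item: item[1], reverse=True)
```

Calling find_large_files(rootPath, 100 * MB) and printing the result gives a list to review before anything is sent to the recycle bin.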

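Problem 4: Write a program that finds all files with a given prefix in a single folder, such as spam001.txt, spam002.txt, and so on, and locates any gaps in the numbering (for example, spam001.txt and spam003.txt exist but spam002.txt does not). Have the program rename all the later files to close the gap. As an added challenge, write another program that opens up gaps in a run of consecutively numbered files so that new files can be added.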

import os
import shutil

rootPath = 'F:\\pycharm\\work\\OfficeAutomation\\files'    # folder to scan
prefix = 'spam'                                            # file name prefix
fileList = []                                              # old file names
fileList_New = []                                          # new file names

# # Generate test files
# for i in range(11,21):
#     file = open(os.path.join(rootPath,'spam{:03d}.txt'.format(i)),'w')
#     file.write('Name of this file is spam{:03d}.txt'.format(i))
#     file.close()

count = 1

for name in os.listdir(rootPath):                           # every file or folder directly under rootPath
    if os.path.isfile(os.path.join(rootPath,name)):         # keep regular files only (skip folders)
        if name.startswith(prefix):                         # add every file name with the prefix to the old list
            fileList.append(name)
            newName = prefix+'{:03d}'.format(count)+'.txt'  # add the gap-free name for the count-th file to the new list
            fileList_New.append(newName)
            count+=1
fileList.sort()                                             # sort the old names so they pair up with the new names in order

# Rename every file after a wrong number to its correct name
for old,new in zip(fileList,fileList_New):
    if old != new:
        print("change the name ["+old+"] to ["+new+"]")
        shutil.move(os.path.join(rootPath,old),os.path.join(rootPath,new))

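1. Approach:

(1) Use os.listdir() to get every file in the folder (folders are skipped).
(2) If a file name has the prefix, add it to the old-name list, and add the correct name for that file to the new-name list.
(3) Go through the old and new lists in parallel; wherever the two names differ, fix the name with shutil.move.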
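
The same three steps can be wrapped in a function that is easier to test on a scratch folder. This is only a sketch under stated assumptions: the name close_gaps is made up, only names of the exact form prefix + three digits + extension are considered, numbering is assumed to start at 001 or higher, and the function refuses to overwrite an existing file.

```python
import os
import re
import shutil

def close_gaps(folder, prefix, extension='.txt'):
    """Rename prefixNNN.ext files in folder so the numbers run 001, 002, ... without gaps.

    Hypothetical helper; returns the list of (old_name, new_name) renames performed.
    """
    pattern = re.compile(re.escape(prefix) + r'(\d{3})' + re.escape(extension))
    numbered = []
    for name in os.listdir(folder):
        match = pattern.fullmatch(name)
        if match and os.path.isfile(os.path.join(folder, name)):
            numbered.append((int(match.group(1)), name))
    numbered.sort()  # ascending by number, so a rename never lands on an unprocessed file
    renamed = []
    for new_number, (_, old) in enumerate(numbered, start=1):
        new = f'{prefix}{new_number:03d}{extension}'
        if new == old:
            continue
        target = os.path.join(folder, new)
        if os.path.exists(target):
            raise FileExistsError(target)  # safety net: never clobber an existing file
        shutil.move(os.path.join(folder, old), target)
        renamed.append((old, new))
    return renamed
```

Matching the digits with a regular expression, rather than only checking the prefix, keeps files such as spam_notes.txt out of the renaming.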