Build it with Python: an Excel file merge/split tool (VBA version included)

Foreword:

What do you do when you have collected Excel files from n people and need to combine them into a single summary table? Without some technical help, opening each file and copying and pasting by hand is far too tedious. What you need is a tool that completes the merge in a few seconds.

One, merging Excel files

1. Merging with VBA

No tricks here. The VBA code below was found online and then modified:

Sub 合并当前目录下所有工作簿的全部工作表()
    ' Merge every worksheet of every workbook in the current directory
    Dim MyPath, MyName, AWbName
    Dim Wb As Workbook, WbN As String
    Dim G As Long
    Dim Num As Long
    Application.ScreenUpdating = False
    MyPath = ActiveWorkbook.Path
    MyName = Dir(MyPath & "\" & "*.xls")
    AWbName = ActiveWorkbook.Name
    Num = 0
    Do While MyName <> ""
        If MyName <> AWbName Then ' skip the summary workbook itself
            Set Wb = Workbooks.Open(MyPath & "\" & MyName)
            Num = Num + 1
            With Workbooks(1).ActiveSheet ' assumes the summary workbook was the first one opened
                ' write the source file name (minus its last 4 characters) two rows below the existing data
                .Cells(.Range("B200000").End(xlUp).Row + 2, 1) = Left(MyName, Len(MyName) - 4)
                For G = 1 To Wb.Sheets.Count
                    Wb.Sheets(G).UsedRange.Copy .Cells(.Range("B200000").End(xlUp).Row + 1, 1)
                Next
                WbN = WbN & Chr(13) & Wb.Name
                Wb.Close False
            End With
        End If
        MyName = Dir
    Loop
    Range("B1").Select
    Application.ScreenUpdating = True
    MsgBox "Merged all worksheets from " & Num & " workbook(s):" & Chr(13) & WbN, vbInformation, "Info"
End Sub

It doesn't matter if you don't follow the code. You don't need to learn VBA; just run it. So how do you use it?

Suppose a directory contains 3 Excel files to be merged, each holding different data.

Create a new Excel file in the same directory and open it; it will hold the merged data.

Open the VBA editor with the shortcut Alt + F11.

Open Sheet1, paste the code above into it, and press F5 to run it.

The data from the other 3 Excel files in the same directory is merged into this workbook.

There is more than one way to automate office work. Let's see how the all-powerful Python implements the same function.

2. Merging with Python

Straight to the code; the comments explain each step:

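import os

import pandas as pd

def merge_excel(dir):
    print('--- Merging ---')
    output_name = "merged_result.xlsx"  # name of the summary workbook
    filename_excel = []  # names of the workbooks found
    frames = []  # contents of each workbook
    print("Directory:", dir, "\nFiles found:")
    for files in os.listdir(path=dir):  # walk the files in the directory
        print(files)
        # keep only .xlsx/.xls files, and skip the result of a previous merge
        if files.endswith(('.xlsx', '.xls')) and files != output_name:
            filename_excel.append(files)
            # os.path.join builds the path, so no manual slash conversion is needed
            df = pd.read_excel(os.path.join(dir, files))  # first sheet of the workbook -> one DataFrame
            frames.append(df)
    if len(frames) != 0:  # merge and save only if at least one workbook was found
        result = pd.concat(frames)  # stack the DataFrames vertically
        result.to_excel(os.path.join(dir, output_name))

merge_excel("D:/some_folder")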
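
How the workbooks in the folder are picked out is worth isolating. The following is a minimal sketch rather than the author's code: the helper names `is_excel_file` and `list_excel_files` are hypothetical, and it assumes that only names ending in `.xls` or `.xlsx` should be merged and that the `~$` lock files Excel creates while a workbook is open should be skipped.

```python
import os
import tempfile

EXCEL_SUFFIXES = ('.xls', '.xlsx')  # assumption: the only extensions to merge


def is_excel_file(name):
    """True if `name` ends in an Excel extension and is not a '~$' lock file."""
    return not name.startswith('~$') and name.lower().endswith(EXCEL_SUFFIXES)


def list_excel_files(directory, exclude=()):
    """Return the sorted workbook names in `directory`, minus any in `exclude`.

    Hypothetical helper: `exclude` is meant for names such as the output of a
    previous merge, so that it is not merged into itself on the next run.
    """
    return sorted(name for name in os.listdir(directory)
                  if is_excel_file(name) and name not in exclude)


# Demo on a throwaway folder of empty files, so no real workbooks are needed.
with tempfile.TemporaryDirectory() as folder:
    for name in ('a.xlsx', 'b.xls', 'notes_xls.txt', '~$a.xlsx', 'merged.xlsx'):
        open(os.path.join(folder, name), 'w').close()
    print(list_excel_files(folder, exclude=('merged.xlsx',)))  # ['a.xlsx', 'b.xls']
```

Matching on the extension rather than on a substring keeps a file such as `notes_xls.txt` out of the merge.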

Two, splitting Excel files

As the saying goes, what has long been united must divide (admittedly, that is not how the quote is normally used). So if you want to hand out work, for example, how do you divide a large table into several small tables by number of rows? Let's look at the VBA version first.

1. Splitting with VBA

Sub ChaiFenSheet()
    Dim r As Long, c As Long, i As Long
    Dim WJhangshu As Long, WJshu As Long, bt As Long
    Dim b, fs
    r = Range("A" & Rows.Count).End(xlUp).Row ' last used row in column A
    b = InputBox("Enter the number of rows per split file")
    If IsNumeric(b) Then
        WJhangshu = Int(b)
    Else
        MsgBox "Invalid input", vbOKOnly, "Error"
        Exit Sub
    End If
    c = Cells(1, Columns.Count).End(xlToLeft).Column ' last used column in row 1
    bt = 1 ' number of header rows
    'WJhangshu = 50 ' rows per file
    ' number of files: round up when the data rows do not divide evenly
    WJshu = IIf((r - bt) Mod WJhangshu, Int((r - bt) / WJhangshu) + 1, Int((r - bt) / WJhangshu))

    '------
    Set fs = CreateObject("Scripting.FileSystemObject")
    For i = 0 To WJshu - 1
        Workbooks.Add
        Application.DisplayAlerts = False
        ' name each file by its zero-padded number, keeping the source workbook's extension
        ActiveWorkbook.SaveAs Filename:=ThisWorkbook.Path & "\" & Format(i + 1, String(Len(CStr(WJshu)), "0")) & "." & fs.GetExtensionName(ThisWorkbook.FullName)
        Application.DisplayAlerts = True
        ThisWorkbook.ActiveSheet.Range("A1").Resize(bt, c).Copy ActiveSheet.Range("A1") ' copy the header
        ThisWorkbook.ActiveSheet.Range("A" & bt + i * WJhangshu + 1).Resize(WJhangshu, c).Copy _
        ActiveSheet.Range("A" & bt + 1) ' copy this file's block of rows
        ActiveWorkbook.Close True
    Next
End Sub

The steps are similar to merging: open the large workbook to be split, press Alt + F11 to enter the VBA editor, paste in the code above, and press F5 to run it. In the author's example, a table of 15 tasks is split into 3 new tables.
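
The macro is meant to name its output files by number, padded with zeros to the number of digits in the file count. For readers more comfortable in Python, that naming rule can be sketched as follows; the function name `numbered_names` and the default `.xlsx` extension are assumptions made for illustration, not part of the original tool.

```python
def numbered_names(count, extension='.xlsx'):
    """Return `count` file names, zero-padded to the width of `count`."""
    width = len(str(count))  # e.g. 12 files -> width 2 -> "01" ... "12"
    return [f"{i:0{width}d}{extension}" for i in range(1, count + 1)]


print(numbered_names(3))       # ['1.xlsx', '2.xlsx', '3.xlsx']
print(numbered_names(12)[:2])  # ['01.xlsx', '02.xlsx']
```

Zero padding keeps the files in the right order when a file manager sorts them by name.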

2. Splitting with Python

The source code for the splitting part was written by my colleague yang:

import math
import os

import pandas as pd

def split_excel(path, num):
    # print("--- Splitting ---")
    dir = os.path.dirname(path)  # directory of the workbook being split, without the file name
    sheetname = os.path.splitext(os.path.basename(path))[0]  # file name without its extension
    data = pd.read_excel(path)  # the data
    nrows = data.shape[0]  # number of data rows
    split_rows = num  # how many rows go into each piece
    count = math.ceil(nrows / split_rows)  # number of pieces, rounded up so no empty piece is written
    # print("Should be split into %d pieces" % count)
    for i in range(1, count + 1):
        sheetname_temp = sheetname + str(i) + '.xlsx'  # name of each output workbook
        begin = (i - 1) * split_rows
        end = min(begin + split_rows, nrows)  # the last piece may be shorter
        print(sheetname_temp)
        data_temp = data.iloc[begin:end, :]  # [row range, column range]
        data_temp.to_excel(os.path.join(dir, sheetname_temp))
    # print('Splitting finished')

split_excel("test.xlsx", 5)
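
The row ranges for each output file can be worked out separately from the Excel I/O, which makes them easy to check. Here is a minimal sketch, assuming a hypothetical helper `chunk_bounds` that is not part of the original code. It uses ceiling division so that a row count that is an exact multiple of the chunk size does not produce an empty final piece.

```python
import math


def chunk_bounds(nrows, rows_per_file):
    """Return (begin, end) row ranges that together cover `nrows` rows.

    Each range holds at most `rows_per_file` rows. `end` is exclusive,
    which matches how `DataFrame.iloc[begin:end]` slices.
    """
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    count = math.ceil(nrows / rows_per_file)  # ceiling division
    return [(i * rows_per_file, min((i + 1) * rows_per_file, nrows))
            for i in range(count)]


print(chunk_bounds(15, 5))  # [(0, 5), (5, 10), (10, 15)]
print(chunk_bounds(16, 5))  # [(0, 5), (5, 10), (10, 15), (15, 16)]
```

With 15 rows and 5 rows per file this gives exactly three ranges, and a single leftover row gets a fourth, shorter range.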

As a fan of PyQt5 and drawing, I wrapped these two pieces of code in a GUI and packaged them as a small tool. The icon of the exe file combines the letters X and L, drawn by the author (because "XL" read quickly sounds like "Excel"), and the icons for the merge and split functions are also quite expressive (yes, I am praising my own wares). The executable and the complete source code have been uploaded to GitHub; you are welcome to download and use them!


Thanks for reading!


Origin blog.csdn.net/zohan134/article/details/107291303