A small getting-started example with pandas

A colleague in testing needed a small utility. An Excel sheet is created in advance with only a header row. The program is then given a list of names and checks each header against it: if the header appears in the list, 'ok' is written under that column; if it does not, 'X' is written. It is a very simple example.

Installation

pip install pandas==1.0.0

Original sheet:

[Image: the workbook before processing, containing only the header row]

Sheet after processing:

[Image: the workbook after processing, with ok / X marks under the header]

Read the Excel file

data = pd.read_excel(path, sheet_name=0)
  • At first every column is empty apart from the header row, so each empty column is assigned an empty list:
header = set(data.keys())
for i in header:
    if not len(data[i]):
        data[i] = []
    else:
        data[i] = list(data[i])
  • A column that is not empty must also be converted to a list; otherwise what is read is not a list but a pandas Series (table-like data). Printing the whole table shows:
账户 Dashboard 订单列表 账户花费 系列花费 产品审核 SKU列表 广告账户 用户管理 部门管理 店铺管理 部门配置 团队ROI报表 异常数据 SKU销量 广告素材 个人中心
0  X         X   ok   ok   ok   ok    ok   ok   ok   ok   ok   ok      ok   ok    ok   ok   ok

Printing each column in a loop gives:

>>> for i in data.keys():
...     print(i, data[i])
...
账户 0    X
Name: 账户, dtype: object
Dashboard 0    X
Name: Dashboard, dtype: object
订单列表 0    ok
Name: 订单列表, dtype: object
账户花费 0    ok
Name: 账户花费, dtype: object
系列花费 0    ok
Name: 系列花费, dtype: object
产品审核 0    ok
Name: 产品审核, dtype: object
SKU列表 0    ok
Name: SKU列表, dtype: object
广告账户 0    ok
Name: 广告账户, dtype: object
...
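
The list conversion described above can be tried without a workbook. The following is a minimal sketch, assuming an in-memory DataFrame stands in for the sheet; `columns_as_lists` is a hypothetical helper name, not part of pandas or the original script.

```python
import pandas as pd


def columns_as_lists(df):
    """Return {column name: plain Python list}, also for empty columns."""
    # Series.tolist() yields [] for an empty column, so no special case is needed.
    return {name: df[name].tolist() for name in df.columns}


# Hypothetical header-only table standing in for the workbook.
empty = pd.DataFrame(columns=["账户", "Dashboard", "订单列表"])
columns = columns_as_lists(empty)
columns["订单列表"].append("ok")  # a plain list accepts a string directly
```

Because every value is now an ordinary list, the later marking step can use `append` without caring whether the sheet already had rows.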
  • If you try to append to a column without first converting it to a list, the following error is raised:
Traceback (most recent call last):
  File "test.py", line 36, in <module>
    main()
  File "test.py", line 20, in main
    data[key].append('ok')
  File "E:\test\venv\lib\site-packages\pandas\core\series.py", line 2582, in append
    return concat(
  File "E:\test\venv\lib\site-packages\pandas\core\reshape\concat.py", line 271, in concat
    op = _Concatenator(
  File "E:\test\venv\lib\site-packages\pandas\core\reshape\concat.py", line 357, in __init__
    raise TypeError(msg)
TypeError: cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid
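
The traceback shows that a bare string cannot be concatenated onto a Series. If a column should stay a Series rather than become a list, `pd.concat` is one alternative. This is a sketch of that swapped-in approach, not what the original script does, and the sample values are made up.

```python
import pandas as pd

col = pd.Series(["X"], name="账户")  # made-up column with one existing value

# Wrap the new value in a Series before concatenating; a bare str is rejected.
extended = pd.concat([col, pd.Series(["ok"])], ignore_index=True)
print(extended.tolist())  # ['X', 'ok']
```

The original column is left unchanged; `pd.concat` returns a new Series.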
  • Take the intersection of the headers
new_list = set(list1) & set(list2)
or
new_list = set(list1).intersection(list2)
  • Take the difference of the headers
new_list = set(list1) - set(list2)
or
new_list = set(list1).difference(list2)
  • Take the union of the headers
new_list = set(list1) | set(list2)
or
new_list = set(list1).union(list2)
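
To make the three set operations concrete, here is a small sketch with made-up header names and a made-up input list (neither is taken from the real workbook):

```python
header = {"账户", "Dashboard", "订单列表"}  # made-up header names
nb = ["订单列表", "店铺管理"]               # made-up input list

common = header & set(nb)       # names present in both
only_header = header - set(nb)  # names in the header but not in nb
either = header | set(nb)       # names present in at least one

# The method forms accept any iterable, so nb needs no explicit set() there.
assert common == header.intersection(nb)
assert only_header == header.difference(nb)
assert either == header.union(nb)

print(sorted(common))  # ['订单列表']
```

Note that the operators `&`, `-` and `|` require both operands to be sets, while the methods only require the left-hand side to be one.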
  • Then mark the headers in the intersection with 'ok' and the headers in the difference with 'X':
for key in new_list:
    if len(data[key]):
        data[key].append('ok')
    else:
        data[key] = ['ok']
cj_hd = header - set(nb)
for key in cj_hd:
    n_data = 'X' if key != '账户' else user_name  # special-case a particular column if needed ('账户' is the account column)
    if len(data[key]):
        data[key].append(n_data)
    else:
        data[key] = [n_data]
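
The two marking loops above can also be written as a single pass over the columns. This is only a sketch under the assumption that every column is already a plain list; `mark_row` and its parameter names are hypothetical and do not appear in the original script.

```python
def mark_row(columns, nb, account_col="账户", user_name=""):
    """Append one result cell to every column list, in place."""
    wanted = set(nb)
    for name, cells in columns.items():
        if name in wanted:
            cells.append("ok")       # header was found in the input list
        elif name == account_col:
            cells.append(user_name)  # special-cased column gets the user name
        else:
            cells.append("X")        # header missing from the input list
    return columns


cols = {"账户": [], "Dashboard": [], "订单列表": []}  # made-up columns
mark_row(cols, ["订单列表", "店铺管理"], user_name="1221")
```

As in the original loops, a column whose header appears in the input list gets 'ok' even if it is the account column.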
  • After each run you will probably open the file to check the result. Close it before running the program again to update it; otherwise the write fails with the following error:
Traceback (most recent call last):
  File "test.py", line 37, in <module>
    main()
  File "test.py", line 32, in main
    df.to_excel('3.xlsx', index=False)
  File "E:\test\venv\lib\site-packages\pandas\core\generic.py", line 2174, in to_excel
    formatter.write(
  File "E:\test\venv\lib\site-packages\pandas\io\formats\excel.py", line 738, in write
    writer.save()
  File "E:\test\venv\lib\site-packages\pandas\io\excel\_openpyxl.py", line 43, in save
    return self.book.save(self.path)
  File "E:\test\venv\lib\site-packages\openpyxl\workbook\workbook.py", line 392, in save
    save_workbook(self, filename)
  File "E:\test\venv\lib\site-packages\openpyxl\writer\excel.py", line 291, in save_workbook
    archive = ZipFile(filename, 'w', ZIP_DEFLATED, allowZip64=True)
  File "c:\users\user\appdata\local\programs\python\python38\lib\zipfile.py", line 1216, in __init__
    self.fp = io.open(file, filemode)
PermissionError: [Errno 13] Permission denied: 'xxx.xlsx'
Full source code
import os

import pandas as pd


def main(path, user_name, nb):
    # Read the first sheet; dict() turns the DataFrame into {header: Series}.
    data = dict(pd.read_excel(path, sheet_name=0))
    header = set(data.keys())
    # Convert every column to a plain list so that list.append() works.
    for i in header:
        if not len(data[i]):
            data[i] = []
        else:
            data[i] = list(data[i])
    # Headers that also appear in nb are marked 'ok'.
    new_b = header & set(nb)
    for key in new_b:
        if len(data[key]):
            data[key].append('ok')
        else:
            data[key] = ['ok']
    # Headers missing from nb are marked 'X'; the '账户' (account) column gets the user name.
    cj_hd = header - set(nb)
    for key in cj_hd:
        n_data = 'X' if key != '账户' else user_name
        if len(data[key]):
            data[key].append(n_data)
        else:
            data[key] = [n_data]
    df = pd.DataFrame(data=data)
    # The workbook must not be open in Excel here, or PermissionError is raised.
    df.to_excel(path, index=False)


if __name__ == '__main__':
    path = os.path.join(os.getcwd(), 'test.xlsx')
    nb = ["A", "Dashb", "订单列表", "店铺管理", "产品审核", "SKU列表", "广告账户", "用户管理", "部门配置", "团队ROI报表", "异常数据",
          "SKU销量", "广告素材", "个人中心"]
    user_name = '1221'
    main(path, user_name, nb)
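
As a variation on the script above, the result can be built as one new row instead of appending to each column list. This is a hedged sketch of an alternative, not the author's method; `append_result_row` and its parameters are hypothetical names, and an in-memory DataFrame stands in for the workbook.

```python
import pandas as pd


def append_result_row(df, nb, account_col="账户", user_name=""):
    """Return a new DataFrame with one extra row of ok / X marks."""
    wanted = set(nb)
    row = {}
    for name in df.columns:
        if name in wanted:
            row[name] = "ok"
        elif name == account_col:
            row[name] = user_name
        else:
            row[name] = "X"
    records = df.to_dict(orient="records")  # existing rows as a list of dicts
    records.append(row)
    # Passing columns= keeps the original header order.
    return pd.DataFrame(records, columns=list(df.columns))


sheet = pd.DataFrame(columns=["账户", "Dashboard", "订单列表"])  # made-up header
result = append_result_row(sheet, ["订单列表"], user_name="1221")
```

Building a whole row at once keeps all columns the same length by construction, which `pd.DataFrame` requires when it is given a dict of lists.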

Origin blog.csdn.net/Lin_Hv/article/details/106143498