Use Python to process data in Excel

1. Read data from Excel

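First import the pandas library. If it is not installed, install it from the console with pip install pandas. pandas reads .xlsx files through the openpyxl package, so install that as well if it is missing (pip install openpyxl).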

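import pandas as pd     # import the pandas library under the alias pd

# read_excel reads data from an Excel file; only two common parameters are shown here
# (the file path, and header=0, which uses the first row as the column names)
data = pd.read_excel('excel_path', header = 0)
print(data)  # print the data to check what was read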

read_excel accepts many other parameters; choose them according to your actual needs.
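As a rough illustration of how a few read_excel parameters change the result, the sketch below writes a small throwaway workbook to a temporary folder and reads it back three ways. The file name sample.xlsx, the sheet name Sheet1, and the columns name and score are made up for this example, and it assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical workbook created only for this demonstration
    path = os.path.join(tmp, "sample.xlsx")
    pd.DataFrame({"name": ["a", "b"], "score": [1, 2]}).to_excel(
        path, index=False, sheet_name="Sheet1"
    )

    # header=0: the first row becomes the column names
    with_header = pd.read_excel(path, sheet_name="Sheet1", header=0)

    # header=None: the first row is treated as data, columns are numbered 0, 1, ...
    no_header = pd.read_excel(path, header=None)

    # usecols: read only the listed columns
    one_col = pd.read_excel(path, usecols=["score"])

print(with_header.shape, no_header.shape, list(one_col.columns))
```

With header=None the header row is counted as data, which is why no_header has one more row than with_header.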

2. Transform and process data

If you want to process the data read from Excel, it is easiest to convert it into a list first and then put the results back into a DataFrame for output.

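import numpy as np

# After reading, convert the data to a one-dimensional list
values = data.values  # take only the cell values, not the index
data = list(np.concatenate(values.reshape((-1, 1), order="F")))  # flatten column by column
print(data)  # inspect the list

# Other processing code……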

# For output, put the processed data into a DataFrame
# (DataFrame.append was removed in pandas 2.0, so build the DataFrame from the list directly)
df = pd.DataFrame(data)  # one row per list item, in a single column
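To make the flattening step above more concrete, here is a minimal sketch on a tiny in-memory table instead of an Excel file. The DataFrame named table and its columns a and b are invented for the illustration. It shows that order="F" walks the values column by column, and that flatten(order="F") should give the same result in one call.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x2 table standing in for data read from Excel
table = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
values = table.values  # 2-D array: [[1, 3], [2, 4]]

# Same expression as in the text: reshape column-major, then join the pieces
col_major = list(np.concatenate(values.reshape((-1, 1), order="F")))

# Default (row-major) flattening, for comparison
row_major = values.flatten().tolist()

# A shorter equivalent of the column-major version
simple = values.flatten(order="F").tolist()

print(col_major)  # values of column a first, then column b
print(row_major)  # values row by row
```

For a single-column sheet both orders give the same list, so the choice only matters when the sheet has several columns.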

3. Export data to Excel

Output also takes only one line of code: to_excel writes the DataFrame to a spreadsheet.

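# to_excel writes a DataFrame to an Excel file; again only two common parameters are shown
# (the output file path, and index=False, which leaves out the row index)
df.to_excel('output_path', index = False)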

to_excel accepts many other parameters; choose them according to your actual needs.
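As one possible illustration of other to_excel parameters, the sketch below writes a single-column DataFrame with a custom sheet name and without the header row, then reads the file back to check what was written. The file name out.xlsx, the sheet name result, and the sample values are assumptions made for this example, and it assumes openpyxl is installed.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame(["x", "y"])  # hypothetical processed data, one column

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.xlsx")

    # index=False drops the row index, header=False drops the column-name row,
    # sheet_name sets the name of the worksheet
    df.to_excel(path, index=False, header=False, sheet_name="result")

    # Read the sheet back without treating any row as a header
    back = pd.read_excel(path, sheet_name="result", header=None)

print(back)
```

Leaving header at its default would add a first row containing the column name 0, which is what the one-line call in the text produces.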

4. A simple example of using Python to process Excel data (with detailed comments)

The Excel file in this example has only one column of data. The processing reads that column and reduces each value to a string containing only its Chinese characters, using a regular expression.

You can adapt the conversion and processing to your own needs.

import pandas as pd
import numpy as np
import re  # regular expressions

datas = pd.read_excel('old.xlsx', header = 0)  # read the data from Excel (a relative path is used here)

data = datas.values  # take only the cell values, not the index

resource = list(np.concatenate(data.reshape((-1, 1), order="F")))  # convert the data to a list

# print(resource)  # print the list

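# Function that extracts the Chinese characters from a string
def chinese(s):
    # res = re.findall('[^0-9]', s)  # alternative: match every non-digit character
    res = re.findall('[\u4e00-\u9fa5]', s)      # match Chinese characters with a regular expression
    return ''.join(res)     # join the characters into one string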

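results = []  # collect the processed strings

# Go through the list and reduce each item to a pure-Chinese string
for i in resource:
    i = str(i)      # some cells are not strings, so convert everything to str
    ch = chinese(i)     # extract the Chinese characters
    results.append(ch)   # store the result

# DataFrame.append was removed in pandas 2.0, so build the DataFrame from the list
df = pd.DataFrame(results)

df.to_excel('new.xlsx', index = False)  # write the processed data to an Excel file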
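To try the extraction logic without an Excel file, the following sketch applies the same chinese function to a small in-memory column. The column name raw and the sample values are invented for the illustration and stand in for the contents of old.xlsx.

```python
import re

import pandas as pd


def chinese(s):
    """Return only the Chinese characters found in s."""
    return ''.join(re.findall('[\u4e00-\u9fa5]', s))


# Hypothetical sample column with mixed content, including a non-string cell
datas = pd.DataFrame({"raw": ["张三123", "abc李四", 456, "hello"]})

# Convert each cell to str first, then keep only the Chinese characters
cleaned = [chinese(str(v)) for v in datas["raw"]]

df = pd.DataFrame(cleaned)  # ready to be written with df.to_excel(...)
print(cleaned)
```

Cells that contain no Chinese characters come out as empty strings, so the output keeps the same number of rows as the input.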

Origin blog.csdn.net/weixin_49851451/article/details/129255265