2020 Python Data Analysis Study Notes: Reading and Saving Data with Pandas (6)

Table of Contents

1. Data reading (CSV file)

2. Data reading (Excel file)

3. Data storage

4. Explanation of the na_values parameter:

5. Introduction to parameters related to data reading:

 


  

1. Data reading (CSV file)

(To read an Excel file, call pd.read_excel instead of pd.read_csv. Parameters such as nrows, header and na_values are used the same way; encoding applies only to text formats such as CSV.)

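import os
import pandas as pd
print(os.getcwd())      # print the current working directory
# >>>  F:\Python\自学部分

# Read the file
df = pd.read_csv('预测结果.csv', encoding='utf-8', nrows=10)
# nrows=10 reads only the first 10 data rows
# For an Excel file, call pd.read_excel instead; nrows, header and na_values work the same way

print(df)    # print the data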

Output: (screenshot not included)
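
To try nrows without the original file, here is a minimal sketch that reads from an in-memory string instead. The column names and values are made up for illustration; only pd.read_csv and its nrows parameter come from the notes above.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file
csv_text = "name,score\nA,90\nB,85\nC,118\nD,77\n"

# nrows=2 stops after the first two data rows (the header line is not counted)
df_head = pd.read_csv(io.StringIO(csv_text), nrows=2)

print(df_head)
print(df_head.shape)
```

Because the header line is consumed separately, nrows counts data rows only, so the result here has two rows and two columns.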

2. Data reading (Excel file)

import os
import pandas as pd
print(os.getcwd())      # print the current working directory
# >>>  F:\Python\自学部分

# Read the file
# read_excel takes no encoding argument; recent pandas versions reject it
df = pd.read_excel('score.xlsx')

print(df)

Output: (screenshot not included)

Reading multiple worksheets in a batch:

import os
import pandas as pd
print(os.getcwd())      # print the current working directory
# >>>  F:\Python\自学部分

# Sheet names to read: score1, score2, score3
sheet_name = ['score' + str(i) for i in range(1, 4)]
print(sheet_name)

# Read each sheet and append its rows to one DataFrame
data_all = pd.DataFrame()
for name in sheet_name:
    data = pd.read_excel('score.xlsx', sheet_name=name)
    data_all = pd.concat([data_all, data], axis=0, ignore_index=True)

print(data_all)

Output: (screenshot not included)
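
The loop above grows data_all one sheet at a time. An alternative sketch collects the frames first and calls pd.concat once. Passing sheet_name=None to pd.read_excel returns a dict that maps each sheet name to its DataFrame. So that the example does not depend on score.xlsx, that dict is built by hand here, and its columns and values are made up.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel('score.xlsx', sheet_name=None),
# which would return {sheet name: DataFrame}
sheets = {
    "score1": pd.DataFrame({"name": ["A", "B"], "score": [90, 85]}),
    "score2": pd.DataFrame({"name": ["C"], "score": [77]}),
    "score3": pd.DataFrame({"name": ["D", "E"], "score": [64, 99]}),
}

# Concatenate every sheet in one call; ignore_index renumbers the rows from 0
data_all = pd.concat(list(sheets.values()), axis=0, ignore_index=True)

print(data_all)
```

Concatenating once avoids copying the accumulated rows on every iteration, which matters more as the number of sheets grows.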

3. Data storage

import os
import pandas as pd
print(os.getcwd())      # print the current working directory
# >>>  F:\Python\自学部分

# Read the three sheets and combine them into one DataFrame
sheet_name = ['score' + str(i) for i in range(1, 4)]
print(sheet_name)
data_all = pd.DataFrame()
for name in sheet_name:
    data = pd.read_excel('score.xlsx', sheet_name=name)
    data_all = pd.concat([data_all, data], axis=0, ignore_index=True)


# Save the data as a CSV file
# (to_csv returns None when given a path, so there is nothing to print)
data_all.to_csv('data_all.csv', index=False, encoding='utf-8')

# Save the data as an Excel file (no encoding argument is needed)
data_all.to_excel('data_all.xlsx', index=False)

Output: (screenshot not included)
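
A quick way to check a saved file is to read it back. The sketch below does this for the CSV case in a temporary directory, using a small made-up DataFrame in place of the combined sheets.

```python
import os
import tempfile
import pandas as pd

# Made-up data standing in for the combined sheets
data_all = pd.DataFrame({"name": ["A", "B"], "score": [90, 85]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data_all.csv")
    # index=False leaves the row index out of the file
    result = data_all.to_csv(path, index=False, encoding="utf-8")
    # Read the file back to see what was actually written
    restored = pd.read_csv(path, encoding="utf-8")

print(result)                    # None: the data went to the file
print(list(restored.columns))
print(restored)
```

With index=False the file holds only the name and score columns. Without it, the row index would be written as an extra unnamed first column and would come back as a column when the file is read.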

4. Explanation of the na_values parameter:

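import os
import pandas as pd
print(os.getcwd())      # print the current working directory
# >>>  F:\Python\自学部分

# Read the file
df = pd.read_csv('预测结果.csv', encoding='utf-8', nrows=10, na_values=118.0, header=0)
# nrows=10 reads only the first 10 data rows
# na_values=118.0 treats every value equal to 118.0 as missing (NaN)
# header=0 uses the first row of the file as the column names

print(df)    # print the data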

Output: (screenshot not included)
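
The effect of na_values is easiest to see by reading the same data twice, once without and once with the parameter. This is a minimal sketch on made-up in-memory data, not the original file.

```python
import io
import pandas as pd

# Made-up CSV content; the second row holds the sentinel value 118.0
csv_text = "name,score\nA,90.5\nB,118.0\nC,77.0\n"

plain = pd.read_csv(io.StringIO(csv_text))
marked = pd.read_csv(io.StringIO(csv_text), na_values=118.0)

print(plain["score"].isna().sum())    # missing values without na_values
print(marked["score"].isna().sum())   # missing values with na_values=118.0
print(marked)
```

In the second read the 118.0 entry becomes NaN, so methods such as isna, dropna and fillna treat it like any other missing value.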

5. Introduction to parameters related to data reading:

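import os
import pandas as pd
print(os.getcwd())      # print the current working directory
# >>>  F:\Python\自学部分

# Read the file
df = pd.read_csv('预测结果.csv', encoding='utf-8', nrows=10, na_values=118.0)
# nrows=10 reads only the first 10 data rows
# na_values=118.0 treats every value equal to 118.0 as missing (NaN)

# print(df)    # print all of the data

print(df.head(5))     # print the first 5 rows

print(df.tail(5))     # print the last 5 rows

print(df.dtypes)      # print the data type of each column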

Output: (screenshot not included)
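
The same three inspection calls can be tried on a small DataFrame built in memory. The names and scores below are made up for illustration.

```python
import pandas as pd

# Made-up data with 7 rows so that head(5) and tail(5) differ
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F", "G"],
    "score": [90.5, 85.0, 77.0, 64.0, 99.0, 58.5, 71.0],
})

first_rows = df.head(5)   # rows with index 0 to 4
last_rows = df.tail(5)    # rows with index 2 to 6

print(first_rows)
print(last_rows)
print(df.dtypes)          # one dtype per column
```

Both head and tail return new DataFrames that keep the original index labels, which makes it easy to see which part of the data is being shown.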

 

 

 


Origin blog.csdn.net/weixin_44940488/article/details/106794977