Goal
- Download data for 801 Pokémon and analyze its 41 attributes to pick out 10 Pokémon that are worth training
Dependencies
- python3
- pandas
- matplotlib
- seaborn
Download the dataset
!wget -O pokemon_data.csv https://pai-public-data.oss-cn-beijing.aliyuncs.com/pokemon/pokemon.csv
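Where `wget` is not available (for example when the code runs outside a notebook shell), the same download could be sketched with the Python standard library. The helper name `download_dataset` and its skip-if-present behaviour are assumptions made for illustration; they are not part of the original notebook.

```python
import os
import urllib.request

DATA_URL = "https://pai-public-data.oss-cn-beijing.aliyuncs.com/pokemon/pokemon.csv"


def download_dataset(url=DATA_URL, dest="pokemon_data.csv"):
    """Download `url` to `dest` unless `dest` already exists; return `dest`."""
    # Hypothetical convenience: skip the network call if the file is already present.
    if os.path.exists(dest):
        return dest
    urllib.request.urlretrieve(url, dest)
    return dest
```

Calling `download_dataset()` with its defaults writes `pokemon_data.csv` to the current directory, which is the path the later `pd.read_csv` call expects.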
Import the data
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv("./pokemon_data.csv")
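Since the goal states 801 records and 41 attributes, a quick shape check after loading can catch a truncated or wrong download early. This is a minimal sketch; the helper name `check_shape` is hypothetical and the expected counts are taken from the goal above.

```python
import pandas as pd


def check_shape(frame, expected_rows=801, expected_cols=41):
    """Return True if `frame` has the expected number of rows and columns."""
    rows, cols = frame.shape
    return rows == expected_rows and cols == expected_cols
```

After `df = pd.read_csv(...)`, `check_shape(df)` should return `True` if the file matches the description.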
Data analysis
df.head()
df.info()
# Compute the percentage of missing values for each feature
percent_missing = df.isnull().sum() * 100 / len(df)
missing_value_df = pd.DataFrame({
    'column_name': df.columns,
    'percent_missing': percent_missing
})
# Show the 10 attributes with the highest percentage of missing values
missing_value_df.sort_values(by='percent_missing', ascending=False).head(10)
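To make the missing-value calculation concrete, here is the same expression applied to a tiny made-up table. The column names and values are invented for illustration and are not taken from the Pokémon dataset.

```python
import pandas as pd

# Toy data: None becomes a missing value once loaded into the DataFrame.
toy = pd.DataFrame({
    "a": [1.0, None, 3.0, None],    # 2 of 4 values missing
    "b": [1, 2, 3, 4],              # nothing missing
    "c": [None, None, None, 4.0],   # 3 of 4 values missing
})

# isnull() flags missing cells, sum() counts them per column,
# and dividing by the row count turns the counts into percentages.
toy_missing = toy.isnull().sum() * 100 / len(toy)

# Sort so the columns with the most missing data come first.
toy_ranked = toy_missing.sort_values(ascending=False)
print(toy_ranked)
```

Here `c` ranks first at 75%, followed by `a` at 50% and `b` at 0%, which is the ordering the top-10 view above produces for the real columns.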