Replacing values
import pandas as pd
import numpy as np
df = pd.DataFrame({'one':[10,20,30,40,50,2000],'two':[1000,0,30,40,50,60]})
print(df)
print(df.replace({1000:10,2000:60}))
Output:
    one   two
0    10  1000
1    20     0
2    30    30
3    40    40
4    50    50
5  2000    60
   one  two
0   10   10
1   20    0
2   30   30
3   40   40
4   50   50
5   60   60
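The dictionary passed to replace above is applied to every column. As a minimal sketch, assuming you only want to touch one column, pandas also accepts a nested dictionary keyed by column name. The data reuses the example above; the variable name only_two is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'one': [10, 20, 30, 40, 50, 2000],
                   'two': [1000, 0, 30, 40, 50, 60]})

# Nested dict: {column: {old_value: new_value}} limits the replacement
# to the named column; column 'one' is left untouched.
only_two = df.replace({'two': {1000: 10}})
print(only_two)
```

Here the 2000 in column one stays as it is, because only column two is named in the mapping.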
pip install missingno
pip install quilt
quilt install ResidentMario/missingno_data  # download the dataset
pip install geoplot
pip install geopandas
Example
import numpy as np
from quilt.data.ResidentMario import missingno_data
from matplotlib import pyplot as plt
collisions=missingno_data.nyc_collision_factors()
# Convert the literal string "nan" into a real missing value
collisions=collisions.replace("nan",np.nan)
print(collisions)
import missingno as msno
# %matplotlib inline is a Jupyter magic; it is not available in PyCharm
msno.matrix(collisions.sample(250))  # plot a random sample of 250 rows
plt.show()
Output:
Unnamed: 0 DATE … VEHICLE TYPE CODE 4 VEHICLE TYPE CODE 5
0 0 11/10/2016 … NaN NaN
1 1 11/10/2016 … NaN NaN
2 2 04/16/2016 … NaN NaN
… … … … … …
7302 7302 01/02/2016 … NaN NaN
[7303 rows x 27 columns]
In the matrix plot, the white areas are missing values.
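The plot shows where values are missing. To get the counts as numbers, a minimal sketch with plain pandas follows. The small DataFrame is a hypothetical stand-in for the collisions data (its column names and values are illustrative only), so the snippet runs without the quilt dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the collisions data; values are illustrative only.
sample = pd.DataFrame({
    'DATE': ['11/10/2016', '11/10/2016', '04/16/2016'],
    'BOROUGH': ['nan', 'QUEENS', 'nan'],
})

# Turn the literal string "nan" into a real missing value, as in the example.
sample = sample.replace('nan', np.nan)

# isna() marks missing cells; sum() counts them per column.
missing_counts = sample.isna().sum()
print(missing_counts)
```

Columns with a high count here are the ones that would show mostly white in the matrix plot.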