By default, dropna drops any row that contains a missing value.
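import numpy as np
from pandas import DataFrame, Series
NA = np.nan  # shorthand used throughout the examples
date = DataFrame([[1.,2.,3.],[NA,NA,NA],
[1.,3.,NA],[1.,5.,NA]])
clean = date.dropna()
print(clean)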
You may want to discard only the rows or columns that are entirely NA. Passing how='all' drops only the rows in which every value is NA.
date = DataFrame([[1.,2.,3.],[NA,NA,NA],
[1.,3.,NA],[1.,5.,NA]])
clean = date.dropna(how='all')
print(clean)
To drop columns in the same way, just pass axis=1.
date = DataFrame([[1.,2.,NA],[NA,NA,NA],
[1.,3.,NA],[1.,5.,NA]])
clean = date.dropna(axis=1,how='all')
print(clean)
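To see the three variants side by side, the sketch below compares the shapes they return on one small frame. The frame and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative frame: row 1 is entirely NaN and column 2 is entirely NaN.
frame = pd.DataFrame([[1., 2., np.nan],
                      [np.nan, np.nan, np.nan],
                      [1., 3., np.nan],
                      [1., 5., np.nan]])

drop_any_rows = frame.dropna()                   # default how='any': every row holds a NaN
drop_all_rows = frame.dropna(how='all')          # only the all-NaN row is removed
drop_all_cols = frame.dropna(axis=1, how='all')  # only the all-NaN column is removed

print(drop_any_rows.shape)
print(drop_all_rows.shape)
print(drop_all_cols.shape)
```

Because every row of this frame contains at least one NaN, the default call leaves no rows at all, which is why how='all' is often the safer choice on sparse data.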
Another way of filtering out DataFrame rows tends to come up with time series data. Suppose you want to keep only the rows that contain at least a certain number of observations; the thresh argument serves this purpose.
df = DataFrame(np.random.randn(6,3))
# .loc replaces the removed .ix indexer; label slices include the end label
df.loc[:4,1]=NA
df.loc[:2,2]=NA
print(df)
# keep only the rows that have at least 2 non-NA values
print(df.dropna(thresh=2))
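As a rough check of what thresh counts, the sketch below rebuilds the same selection by hand on a small fixed frame. The frame and the names kept, observed and manual are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Illustrative frame with 3, 2, 1 and 0 observed values per row.
frame = pd.DataFrame([[1.0, 2.0, 3.0],
                      [1.0, np.nan, 3.0],
                      [np.nan, np.nan, 3.0],
                      [np.nan, np.nan, np.nan]])

kept = frame.dropna(thresh=2)  # keep rows with at least 2 non-NaN values

# The same selection written out: count observed values per row.
observed = frame.notna().sum(axis=1)
manual = frame[observed >= 2]

print(kept)
print(kept.equals(manual))
```

So thresh is a minimum count of non-missing values per row, not a maximum count of missing ones.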
Rather than filtering out missing data, you may want to fill the holes in some other way. Calling fillna with a constant replaces the missing values with that constant.
df = DataFrame(np.random.randn(5,3))
df.loc[:4,1]=NA;df.loc[:2,2]=NA
print(df.fillna(0))
If you call fillna with a dictionary, you can use a different fill value for each column.
df = DataFrame(np.random.randn(5,3))
df.loc[:4,1]=NA;df.loc[:2,2]=NA
print(df.fillna({1:0.5,2:1}))
By default fillna returns a new object, but you can also modify the existing object in place.
df = DataFrame(np.random.randn(5,3))
df.loc[:4,1]=NA;df.loc[:2,2]=NA
_ = df.fillna(0, inplace=True)
print(df)
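The difference between the two calling styles can be checked with a sketch like the following, which counts the remaining NaNs after each call. The frame and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})

# Default call: the result is a new object and `original` keeps its NaNs.
filled = original.fillna(0)
nans_after_copy = int(original.isna().sum().sum())

# In-place call: `original` itself is modified.
original.fillna(0, inplace=True)
nans_after_inplace = int(original.isna().sum().sum())

print(nans_after_copy, nans_after_inplace)
```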
The interpolation methods that are available for reindexing can also be used with fillna.
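df = DataFrame(np.random.randn(6,3))
df.loc[2:,1]=NA;df.loc[4:,2]=NA
print(df)
# recent pandas releases deprecate fillna(method=...); df.ffill() is the equivalent there
print(df.fillna(method='ffill'))
df = DataFrame(np.random.randn(6,3))
df.loc[2:,1]=NA;df.loc[4:,2]=NA
print(df)
# limit=2 fills at most two consecutive missing values
print(df.fillna(method='ffill',limit=2))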
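Since the random frames above make the effect hard to read off, here is a small deterministic sketch of forward and backward filling. It uses the ffill() and bfill() methods, which recent pandas releases recommend over fillna(method=...); the Series and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

series = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

forward_all = series.ffill()         # carry the last observed value forward
forward_two = series.ffill(limit=2)  # fill at most 2 consecutive gaps
backward = series.bfill()            # pull the next observed value backward

print(forward_all.tolist())
print(forward_two.tolist())
print(backward.tolist())
```

With limit=2 the third gap in the run stays missing, which is useful when a stale value should not be carried too far.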
fillna can do a lot more. For example, you can pass in the mean or median of a Series.
date = Series([1,NA,2,NA,3])
print(date.fillna(date.mean()))
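The same idea appears to carry over to a DataFrame. In the sketch below, which uses a made-up frame, passing the Series returned by median() fills each column with its own median.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({"x": [1.0, np.nan, 3.0],
                      "y": [10.0, 20.0, np.nan]})

# frame.median() is a Series indexed by column name, so fillna
# matches each fill value to the column with the same label.
filled = frame.fillna(frame.median())
print(filled)
```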
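Parameter  Description
value      Scalar value or dict-like object used to fill missing values
method     Interpolation method; defaults to 'ffill' if the function is called with no other arguments
axis       Axis to fill on; default axis=0
inplace    Modify the calling object without producing a copy
limit      For forward and backward filling, the maximum number of consecutive values to fill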