The previous chapter introduced the two main pandas data types and their attributes. This chapter describes the basic ways of working with the data in a Series or DataFrame.
Reindexing
An important method on pandas objects is reindex, which creates a new object whose data conforms to a new index:
import pandas as pd
obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])
print(obj)
d    4.5
b    7.2
a   -5.3
c    3.6
dtype: float64
Calling reindex on this Series rearranges the data according to the new index. If an index value was not already present, a missing value (NaN) is introduced for it:
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
print(obj2)
a   -5.3
b    7.2
c    3.6
d    4.5
e    NaN
dtype: float64
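As a small sketch of the behaviour described above, the following checks that reindex returns a new object and leaves the original untouched, and that the label absent from the original index ends up as NaN. The variable names are illustrative only. The last step uses fill_value, an optional reindex parameter not covered in the text above, to substitute a chosen value for the missing entry.

```python
import pandas as pd

obj = pd.Series([4.5, 7.2, -5.3, 3.6], index=['d', 'b', 'a', 'c'])

# reindex returns a new Series; obj itself keeps its original order
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
original_order = list(obj.index)

# 'e' was not in the original index, so its value is missing
missing_labels = list(obj2[obj2.isna()].index)

# Optional (an addition to the text): fill missing entries with a value
obj_filled = obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0.0)

print(original_order)
print(missing_labels)
print(obj_filled['e'])
```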
For ordered data such as time series, you may want to interpolate or fill in values when reindexing. The method option does this; for example, method='ffill' fills forward, carrying the last valid value into the new labels:
obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
print(obj3)
0      blue
2    purple
4    yellow
dtype: object
print(obj3.reindex(range(6), method='ffill'))
0      blue
1      blue
2    purple
3    purple
4    yellow
5    yellow
dtype: object
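To make the direction of the fill concrete, here is a minimal sketch that contrasts forward filling with backward filling on the same Series. The method='bfill' option is not mentioned in the text above; it is included only for comparison, and the variable names are illustrative.

```python
import pandas as pd

obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# Forward fill: each new label takes the last valid value before it
forward = obj3.reindex(range(6), method='ffill')

# Backward fill: each new label takes the next valid value after it;
# label 5 has nothing after it, so it stays missing
backward = obj3.reindex(range(6), method='bfill')

print(forward.tolist())
print(backward.tolist())
```

Both options assume the original index is sorted (monotonic), as it is here.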
With a DataFrame, reindex can alter the (row) index, the columns, or both. When passed only a sequence, it reindexes the rows:
import numpy as np
frame = pd.DataFrame(np.arange(9).reshape(3, 3), index=['a', 'c', 'd'], columns=['Ohio', 'Texas', 'California'])
print(frame)
   Ohio  Texas  California
a     0      1           2
c     3      4           5
d     6      7           8
frame2 = frame.reindex(['a', 'b', 'c', 'd'])
print(frame2)
   Ohio  Texas  California
a   0.0    1.0         2.0
b   NaN    NaN         NaN
c   3.0    4.0         5.0
d   6.0    7.0         8.0
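The values in the result print as 0.0, 1.0 and so on rather than 0, 1, because integer columns are converted to float in order to hold NaN. The following sketch checks this; it rebuilds the same frame so that it can run on its own, and the helper variable names are illustrative.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape(3, 3),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])

frame2 = frame.reindex(['a', 'b', 'c', 'd'])

# The new row 'b' did not exist before, so every value in it is missing
row_b_all_missing = bool(frame2.loc['b'].isna().all())

# dtype "kind" is 'i' for integer and 'f' for float
dtype_kinds_before = [dt.kind for dt in frame.dtypes]
dtype_kinds_after = [dt.kind for dt in frame2.dtypes]

print(row_b_all_missing)
print(dtype_kinds_before, dtype_kinds_after)
```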
The columns can be reindexed with the columns keyword:
states = ['Texas', 'Utah', 'California']
print(frame.reindex(columns=states))
   Texas  Utah  California
a      1   NaN           2
c      4   NaN           5
d      7   NaN           8
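Since reindex can change both rows and columns, the two can also be combined in a single call by passing the index and columns keywords together. This is a minimal sketch using the same frame and states as above, rebuilt here so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(9).reshape(3, 3),
                     index=['a', 'c', 'd'],
                     columns=['Ohio', 'Texas', 'California'])
states = ['Texas', 'Utah', 'California']

# Reindex rows and columns together: 'b' is a new row, 'Utah' a new column,
# and 'Ohio' is dropped because it is not in the requested columns
both = frame.reindex(index=['a', 'b', 'c', 'd'], columns=states)

print(both)
```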