pandas Data Analysis - Hierarchical Index

Original link: http://www.cnblogs.com/li98/p/10991709.html

Hierarchical indexing is an important feature of pandas. It lets you have multiple (two or more) index levels on a single axis. Put more abstractly, it lets you work with high-dimensional data in a low-dimensional form. Let's look at a simple example: create a Series whose index is a list of lists (or arrays).

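import numpy as np
from pandas import Series, DataFrame

# Outer level: letters; inner level: integers
date = Series(np.random.randn(10),
              index=[['a','a','a','b','b','b','c','c','d','d'],
                     [1,2,3,1,2,3,1,2,2,3]])
print(date)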

This is the formatted output of a Series whose index is a MultiIndex. The gaps in the index display mean "use the label directly above".

print (date.index)
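To make the structure of the MultiIndex more concrete, here is a minimal sketch that inspects its levels. It uses the same labels as the Series above, but the variable name `data` and the fixed values from `np.arange` are assumptions made only so the example is reproducible.

```python
import numpy as np
import pandas as pd

# Same index labels as in the article; values are fixed (assumption)
# so the result does not depend on random numbers.
data = pd.Series(
    np.arange(10.0),
    index=[list("aaabbbccdd"), [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]],
)

n_levels = data.index.nlevels                       # how many index levels
outer_labels = data.index.levels[0].tolist()        # unique outer labels
inner_labels = data.index.levels[1].tolist()        # unique inner labels
outer_per_row = data.index.get_level_values(0).tolist()  # outer label of each row

print(n_levels, outer_labels, inner_labels)
print(outer_per_row)
```

Each row is identified by one label per level, so the index stores the unique labels of every level once and maps rows onto them.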

For a hierarchically indexed object, selecting a subset of the data is straightforward:

print(date['b'])

print(date['b':'c'])

# .ix was removed in pandas 1.0; .loc is the label-based equivalent
print(date.loc[['b','d']])

You can even select on the inner level:

print(date[:,2])
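Because the Series above holds random values, it is hard to see exactly which rows each selection returns. The sketch below repeats the same four selections on a Series with known values (the name `s` and the values 0 to 9 are assumptions for illustration) and uses `.loc` throughout.

```python
import pandas as pd

# Same labels as in the article, with values 0..9 so the results are easy to follow.
s = pd.Series(
    range(10),
    index=[list("aaabbbccdd"), [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]],
)

outer = s.loc["b"]           # one outer label: the inner labels remain as the index
span = s.loc["b":"c"]        # label slice on the outer level, both ends included
picked = s.loc[["b", "d"]]   # a list of outer labels
inner = s.loc[:, 2]          # inner label 2 from every outer group that has it

print(outer)
print(span)
print(picked)
print(inner)
```

Selecting a single outer label drops that level from the result, while slices and lists keep both levels.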

Hierarchical indexing plays an important role in reshaping data and in group-based operations such as building pivot tables. For example, this data can be rearranged into a DataFrame with its unstack method:

print (date.unstack())

The inverse operation of unstack is stack:

print (date.unstack().stack())
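The round trip is easier to follow with fixed values. In the sketch below, the names `s`, `wide` and `restored` are assumptions for illustration. Label pairs that do not exist in the Series, such as ('c', 3), become NaN after unstack. The explicit `dropna()` is there because, as far as I know, whether stack discards those NaN cells by itself depends on the pandas version.

```python
import pandas as pd

# Same labels as in the article, with values 0..9.
s = pd.Series(
    range(10),
    index=[list("aaabbbccdd"), [1, 2, 3, 1, 2, 3, 1, 2, 2, 3]],
)

# Outer labels become rows, inner labels become columns.
wide = s.unstack()
print(wide)

# Back to a Series with a two-level index; drop the cells that
# were only created by unstack.
restored = wide.stack().dropna()
print(restored)
```

Note that the values come back as floats, because the intermediate DataFrame needed NaN to mark the missing combinations.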

For a DataFrame, either axis can have a hierarchical index:

frame = DataFrame(np.arange(12).reshape((4,3)),
index = [['a','a','b','b'],[1,1,3,4]],
columns=[['OH','OH','DH'],
['RE','CE','FE']])
print(frame)

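Each level can have a name (a string or any other Python object). If names are specified, they appear in the console output (don't confuse the index names with the axis labels!):
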
frame.index.names = ['key1','key2']
frame.columns.names = ['state','color']
print(frame)
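Level names are more than display decoration: once set, a level can be referred to by name instead of by position. The following is a small sketch under that assumption, rebuilding the same frame so it runs on its own; the variables `key2_labels` and `by_key2` are hypothetical names used only here.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the article.
frame = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=[["a", "a", "b", "b"], [1, 1, 3, 4]],
    columns=[["OH", "OH", "DH"], ["RE", "CE", "FE"]],
)
frame.index.names = ["key1", "key2"]
frame.columns.names = ["state", "color"]

# Look up the labels of a row level by its name.
key2_labels = frame.index.get_level_values("key2").tolist()

# Group rows by a named level and sum within each group.
by_key2 = frame.groupby(level="key2").sum()

print(key2_labels)
print(by_key2)
```

The two rows that share key2 == 1 are added together, while the rows with key2 == 3 and key2 == 4 pass through unchanged.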

With partial column indexing, you can easily select groups of columns:

print(frame['OH'])
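As a brief illustration of what such a selection returns, the sketch below rebuilds the same frame and selects first by the outer column label alone and then by a full (outer, inner) pair. The variable names `oh` and `oh_ce` are assumptions for this example.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the article.
frame = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=[["a", "a", "b", "b"], [1, 1, 3, 4]],
    columns=[["OH", "OH", "DH"], ["RE", "CE", "FE"]],
)

# Outer label only: a DataFrame whose columns are the remaining inner labels.
oh = frame["OH"]

# Full (outer, inner) tuple: a single column, returned as a Series.
oh_ce = frame[("OH", "CE")]

print(oh)
print(oh_ce)
```

Selecting with only the outer label drops that level from the columns, which is why the result shows just RE and CE.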

A MultiIndex can be created on its own and then reused. The columns of the DataFrame above could also be created like this:

from pandas import MultiIndex

MultiIndex.from_arrays([['OH','OH','DH'],['RE','CE','FE']],
                       names=['state','color'])
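To show what reusing the index could look like in practice, here is a minimal sketch that builds the MultiIndex once and passes it as the columns of two DataFrames. The second frame, `other`, and its zero values are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd

# Build the column index once.
columns = pd.MultiIndex.from_arrays(
    [["OH", "OH", "DH"], ["RE", "CE", "FE"]],
    names=["state", "color"],
)

# The DataFrame from the article, now using the prebuilt index.
frame = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=[["a", "a", "b", "b"], [1, 1, 3, 4]],
    columns=columns,
)

# A second, hypothetical frame that reuses the same columns.
other = pd.DataFrame(np.zeros((2, 3)), columns=columns)

print(frame)
print(other)
```

Both frames end up with identical column labels and level names, without repeating the nested lists.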





 
 

