pandas 19 - Creating a Hierarchical Index (MultiIndex) (tcy)

Creating a hierarchical index (MultiIndex)  2018/12/14

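Purpose: store and manipulate data with an arbitrary number of dimensions in lower-dimensional data structures such as Series (1d) and DataFrame (2d).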

Functions:

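pd.MultiIndex.from_tuples(tuples, sortorder=None, names=None) # build a MultiIndex from a list of tuples
  # Parameters: tuples : list / sequence of tuples; each tuple is the index of one row/column. sortorder : int or None
pd.MultiIndex.from_arrays(arrays, sortorder=None, names=None) # build a MultiIndex from arrays, one array per level
  # Parameters: arrays : list / sequence of array-likes, all of the same length
pd.MultiIndex.from_product(iterables, sortorder=None, names=None) # build a MultiIndex from the Cartesian product of the iterables
  # Parameters: iterables : list / sequence of iterables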
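
To see how the three constructors relate, the sketch below builds the same index three ways and compares the results. The labels ('a', 'b', 'x', 'y') and the level names 'outer' and 'inner' are made up for illustration and do not come from the examples in this post.

```python
import pandas as pd

# Illustrative labels; any hashable values would work.
tuples = [('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')]
arrays = [['a', 'a', 'b', 'b'], ['x', 'y', 'x', 'y']]
iterables = [['a', 'b'], ['x', 'y']]

names = ['outer', 'inner']
idx_tuples = pd.MultiIndex.from_tuples(tuples, names=names)
idx_arrays = pd.MultiIndex.from_arrays(arrays, names=names)
idx_product = pd.MultiIndex.from_product(iterables, names=names)

# All three describe the same four (outer, inner) pairs in the same order.
same = idx_tuples.equals(idx_arrays) and idx_arrays.equals(idx_product)
print(same)
```

Which constructor to use mostly depends on the shape of the data at hand: row-wise pairs suit from_tuples, one column per level suits from_arrays, and a full grid of combinations suits from_product.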

Examples:

Example 1:
import numpy as np
import pandas as pd
arrays = [['s1', 's1', 's2', 's2', 's3', 's3', 's4', 's4'],['ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2']]
tuples = list(zip(*arrays))# [('s1', 'ss1'),('s1', 'ss2'),('s2', 'ss1'),('s2', 'ss2'),('s3', 'ss1'),('s3', 'ss2'),('s4', 'ss1'),('s4', 'ss2')]

index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second']) 

Example 2:
arrays = [['s1', 's1', 's2', 's2', 's3', 's3', 's4', 's4'], ['ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2']]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second')) 

Example 3: # pair every element of the first iterable with every element of the second
iterables = [['s1', 's2', 's3', 's4'], ['ss1', 'ss2']]
index=pd.MultiIndex.from_product(iterables, names=['first', 'second'])

# MultiIndex(levels=[['s1', 's2', 's3', 's4'], ['ss1', 'ss2']],
#            labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1]],
#            names=['first', 'second'])
# Note: pandas 0.24 renamed `labels` to `codes`.
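
The structure of the index from Example 3 can also be inspected programmatically rather than read off the repr. A minimal sketch, using only attributes that MultiIndex exposes (the variable names on the left are just for illustration):

```python
import pandas as pd

iterables = [['s1', 's2', 's3', 's4'], ['ss1', 'ss2']]
index = pd.MultiIndex.from_product(iterables, names=['first', 'second'])

n_entries = len(index)                 # 4 * 2 combinations
n_levels = index.nlevels               # one level per input iterable
first_values = list(index.levels[0])   # unique labels of level 'first'
pairs = list(index)                    # iterating yields one tuple per entry
print(n_entries, n_levels, first_values, pairs[:3])
```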
 

Applications:

Example 4: # use the MultiIndex as the index of a Series
s = pd.Series(np.arange(8), index=index)
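
Once a Series carries a MultiIndex, values can be selected by outer label, by full key, or by inner label. The sketch below rebuilds the Series from Example 4 so that it runs on its own; the names outer, single and inner are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [['s1', 's2', 's3', 's4'], ['ss1', 'ss2']], names=['first', 'second'])
s = pd.Series(np.arange(8), index=index)

outer = s['s1']                        # sub-Series for first == 's1', indexed by 'second'
single = s.loc[('s2', 'ss1')]          # one value addressed by the full key
inner = s.xs('ss2', level='second')    # every row whose second level is 'ss2'
print(outer, single, inner, sep='\n')
```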

Example 5: # build a MultiIndex automatically by passing a list of arrays directly to Series or DataFrame
arrays = [np.array(['s1', 's1', 's2', 's2', 's3', 's3', 's4', 's4']),
np.array(['ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2'])]
s = pd.Series(np.arange(8), index=arrays)
s.index.names=['first','second']

# first second
# s1 ss1       0
#    ss2       1
# s2 ss1       2
#    ss2       3
# s3 ss1       4
#    ss2       5
# s4 ss1       6
#    ss2       7
# dtype: int32 
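
A quick way to confirm that Example 5 really produced a MultiIndex, and to view the same data as a 2d table, is sketched below. Using unstack here is an addition for illustration and is not part of the original example.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['s1', 's1', 's2', 's2', 's3', 's3', 's4', 's4']),
          np.array(['ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2', 'ss1', 'ss2'])]
s = pd.Series(np.arange(8), index=arrays)
s.index.names = ['first', 'second']

is_multi = isinstance(s.index, pd.MultiIndex)   # the list of arrays became a MultiIndex
table = s.unstack('second')                     # rows: 'first', columns: 'second'
print(is_multi)
print(table)
```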
Example 6: # use the MultiIndex from Example 3 as the column index of a DataFrame
df = pd.DataFrame(np.arange(24).reshape(3, 8), index=['A', 'B', 'C'], columns=index)
'''
first   s1      s2      s3      s4
second ss1 ss2 ss1 ss2 ss1 ss2 ss1 ss2
A       0   1   2   3   4   5   6   7
B       8   9  10  11  12  13  14  15
C      16  17  18  19  20  21  22  23
'''
# A MultiIndex can be used on both axes at once
pd.DataFrame(np.arange(36).reshape(6, 6), index=index[:6], columns=index[:6])
'''
first        s1      s2      s3
second      ss1 ss2 ss1 ss2 ss1 ss2
first second
s1    ss1     0   1   2   3   4   5
      ss2     6   7   8   9  10  11
s2    ss1    12  13  14  15  16  17
      ss2    18  19  20  21  22  23
s3    ss1    24  25  26  27  28  29
      ss2    30  31  32  33  34  35
''' 
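
With a MultiIndex on the columns, a whole group of columns, a single column, or a single cell can be selected. The sketch below rebuilds the first DataFrame from Example 6 so that it runs on its own; the names group, column and cell are illustrative.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [['s1', 's2', 's3', 's4'], ['ss1', 'ss2']], names=['first', 'second'])
df = pd.DataFrame(np.arange(24).reshape(3, 8), index=['A', 'B', 'C'], columns=index)

group = df['s2']                       # DataFrame with the columns ss1 and ss2 under 's2'
column = df[('s3', 'ss1')]             # a single column as a Series
cell = df.loc['B', ('s4', 'ss2')]      # one value: row 'B', column ('s4', 'ss2')
print(group, column, cell, sep='\n')
```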


Reposted from blog.csdn.net/tcy23456/article/details/84999681