pandas related operations

import pandas as pd
import numpy as np
'''
I. Creating a DataFrame
    1. Define a DataFrame by passing a dictionary
    1.1 Each key is a column name and its value is an array: {key: array}
    1.2 A nested dictionary also produces a DataFrame: the outer keys become column names and the inner keys become row labels
    1.3 Select columns at creation: pd.DataFrame(dict, columns=['key1', 'key2'])
    1.4 Specify the row labels (index): pd.DataFrame(dict, columns=['key1', 'key2'], index=['one', 'two', ...])
    
    2. Define a DataFrame from a data matrix + index + columns
    2.1 Pass the values, row labels, and column names directly: pd.DataFrame(np.arange(16).reshape(4, 4), index=['one', 'two', ...], columns=['object', '', ...])
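
    The construction routes listed above can be sketched as follows. The column names and values are made-up sample data, not taken from the notes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; names and values are illustrative only.
data = {
    'color': ['blue', 'green', 'yellow'],
    'object': ['ball', 'pen', 'pencil'],
    'price': [1.2, 1.0, 0.6],
}

# 1.1 dict of arrays: each key becomes a column name
frame = pd.DataFrame(data)

# 1.3 / 1.4 keep only some columns and set the row labels
frame2 = pd.DataFrame(data, columns=['object', 'price'],
                      index=['one', 'two', 'three'])

# 1.2 nested dict: outer keys -> columns, inner keys -> row labels;
# combinations that are missing are filled with NaN
nested = {'red': {2012: 22, 2013: 33},
          'white': {2011: 13, 2012: 22, 2013: 16}}
frame3 = pd.DataFrame(nested)

# 2.1 data matrix + index + columns
frame4 = pd.DataFrame(np.arange(16).reshape(4, 4),
                      index=['one', 'two', 'three', 'four'],
                      columns=['ball', 'pen', 'pencil', 'paper'])

print(frame2)
print(frame3)
print(frame4)
```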
    
II. Selecting and modifying elements
    1. Reading values
    1.1 View the column names, row labels, and values: frame.columns  frame.index  frame.values
    1.2 Select columns: frame[['object1', 'object2']]   frame.object
    1.3 Select rows: frame[1:3]
    1.4 Select a single value: frame['object'][2]   (DataFrame -> Series -> index -> value; use .iloc[2] for a positional lookup on a labelled index)
    
    2. Assignment
    
    2.1 Name the row and column axes: frame.index.name = 'id'   frame.columns.name = 'item'
    2.2 Add a column: frame['new'] = 12, or frame['new'] = [12, 12, 12, 12]
    2.3 Replace a column: frame['new'] = Series
    2.4 Modify a single value (DataFrame -> column Series -> index): frame['new'][2] = 3   (frame.loc[row, 'new'] = 3 avoids chained assignment)
    
    
    3. Test whether elements are in the DataFrame
    
    3.1 frame.isin([1.0, 'pen'])
    
    4. Delete a column
    4.1 del frame['new']
    
    5. Filtering
    5.1 Filter all elements: frame[frame < 12]
    5.2 Filter on one column: frame[frame.new < 12]
    
    6. Transposition
    frame.T
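
    A small sketch of the selection, assignment, and filtering steps above, run on a made-up 4x4 frame. The labels are assumptions chosen for illustration, and single values are read and written with .loc rather than chained indexing.

```python
import numpy as np
import pandas as pd

# Illustrative frame; the labels are not from the original notes.
frame = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['one', 'two', 'three', 'four'],
                     columns=['ball', 'pen', 'pencil', 'paper'])

cols = frame[['ball', 'pen']]       # 1.2 two columns -> DataFrame
rows = frame[1:3]                   # 1.3 rows at positions 1 and 2
value = frame.loc['three', 'pen']   # 1.4 single value by label

frame.index.name = 'id'             # 2.1 name the two axes
frame.columns.name = 'item'
frame['new'] = 12                   # 2.2 scalar is broadcast to every row
frame.loc['three', 'new'] = 3       # 2.4 single assignment without chaining

mask = frame.isin([1, 12])          # 3.1 boolean frame of matches
small = frame[frame < 12]           # 5.1 values >= 12 become NaN
subset = frame[frame['new'] < 12]   # 5.2 rows where column 'new' < 12

del frame['new']                    # 4.1 remove the column in place
transposed = frame.T                # 6. rows and columns exchanged
print(subset)
```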
    
III. Index objects
    1. Check whether an index has duplicates: ser.index.is_unique   frame.index.is_unique
    2. With a duplicate index: ser['duplicate label'] returns a Series, frame['duplicate label'] returns a DataFrame (frame[...] looks up column labels; use frame.loc[...] for row labels)
    3. Reindexing: ser.reindex([index array], method='ffill')   frame.reindex([index array], method='ffill', columns=['', '', ...])
    
    4. Removing with drop(): returns a new object without the dropped labels. ser.drop(['', '']) deletes several labels at once when given a list
    delete rows: frame.drop(['', '', ''])
    delete columns: frame.drop(['', '', ''], axis=1)
    
    5. Arithmetic and data alignment (addition)
    
    5.1 Two Series with different labels: only labels present in both are added; labels found in just one object are filled with NaN
    5.2 Two DataFrames: elements are added only where both the index and the column match; everything else is filled with NaN
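
    The index behaviour described above, sketched with made-up labels and values (all sample data here is an assumption for illustration):

```python
import pandas as pd

# 1-2. duplicate labels: is_unique is False and a lookup returns a Series
ser = pd.Series([2, 5, 7, 4], index=['white', 'white', 'blue', 'green'])
print(ser.index.is_unique)
print(ser['white'])

# 3. forward-fill reindexing (requires a monotonic index)
s = pd.Series([1, 5, 6, 3], index=[0, 3, 5, 6])
filled = s.reindex(range(7), method='ffill')

# 4. drop returns a new object; the original is left unchanged
letters = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
dropped = letters.drop(['a', 'c'])

frame = pd.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['x', 'y'])
no_row = frame.drop(['x'])          # delete a row
no_col = frame.drop(['b'], axis=1)  # delete a column

# 5. addition aligns on labels; labels present in only one operand give NaN
s1 = pd.Series([3, 2, 5], index=['white', 'yellow', 'green'])
s2 = pd.Series([1, 4, 7], index=['white', 'yellow', 'brown'])
total = s1 + s2
print(total)
```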
    
IV. Operations between data structures
    1. Arithmetic (element-wise), following the broadcasting rules:
    a.add(b)
    a.sub(b)
    a.div(b)
    a.mul(b)
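
    A minimal sketch of the flexible arithmetic methods and how broadcasting applies to them. The frame, the Series, and the use of fill_value are illustrative assumptions rather than part of the notes.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['red', 'blue', 'yellow', 'white'],
                     columns=['ball', 'pen', 'pencil', 'paper'])
ser = pd.Series(np.arange(4), index=['ball', 'pen', 'pencil', 'paper'])

diff = frame.sub(ser)                        # ser matched on columns, repeated for every row
col_diff = frame.sub(frame['ball'], axis=0)  # matched on the row index instead
doubled = frame.mul(2)                       # a scalar is broadcast to every element

# A smaller frame: cells missing from `other` give NaN with plain add,
# or are treated as 0 when fill_value=0 is passed.
other = pd.DataFrame(np.ones((2, 2)), index=['red', 'blue'], columns=['ball', 'pen'])
plain = frame.add(other)
summed = frame.add(other, fill_value=0)
print(summed)
```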
    
V. Applying and mapping functions [library functions]
    1. Element-wise functions (universal functions)
        np.sqrt(frame)  ## square root of each element
    2. Functions by row or column
    f = lambda x: x.max() - x.min()
    frame.apply(f)          ### f is applied to each column
    frame.apply(f, axis=1)  ### f is applied to each row
    
    2.1 Returning multiple values
    def f(x):
        return pd.Series([x.min(), x.max()], index=['min', 'max'])
    frame.apply(f)
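
    The apply patterns above, sketched on a made-up frame. The function names spread and min_max are hypothetical helpers introduced only for this example.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['red', 'blue', 'yellow', 'white'],
                     columns=['ball', 'pen', 'pencil', 'paper'])

roots = np.sqrt(frame)                  # universal function, element by element

def spread(x):
    # x is one column (axis=0) or one row (axis=1), passed as a Series
    return x.max() - x.min()

per_column = frame.apply(spread)        # one result per column
per_row = frame.apply(spread, axis=1)   # one result per row

def min_max(x):
    # returning a Series yields several values per column
    return pd.Series([x.min(), x.max()], index=['min', 'max'])

summary = frame.apply(min_max)          # rows 'min' and 'max'
print(summary)
```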
    
    3. Statistical functions
    frame.sum()
    frame.mean()
    frame.describe()
    
    
    4. Sorting
    
    4.1 ser.sort_index()
    4.2 ser.sort_index(ascending=False)   (a Series has a single axis, so axis=1 is not valid)
    4.3 frame.sort_index()
    4.4 frame.sort_index(axis=1)
    
    4.5 frame.sort_values(by=['columns1', 'columns2'])   (sort_index no longer accepts by=)
    
    5. Ranking
    
    ser.rank()
    ser.rank(method='first')
    ser.rank(ascending=False)
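
    A short sketch of the sorting and ranking calls above on made-up data. It uses sort_values for sorting by column contents, on the assumption of a reasonably recent pandas.

```python
import pandas as pd

ser = pd.Series([5, 0, 3, 8, 4],
                index=['red', 'blue', 'yellow', 'white', 'green'])
by_label = ser.sort_index()                      # alphabetical by label
by_label_desc = ser.sort_index(ascending=False)
by_value = ser.sort_values()

frame = pd.DataFrame({'pen': [2, 1, 2], 'ball': [9, 7, 8]},
                     index=['c', 'a', 'b'])
rows_sorted = frame.sort_index()                 # order the row labels
cols_sorted = frame.sort_index(axis=1)           # order the column labels
values_sorted = frame.sort_values(by=['pen', 'ball'])

ranks = ser.rank()                               # 1 = smallest value
ranks_desc = ser.rank(ascending=False)           # 1 = largest value
# method='first' breaks ties by order of appearance
ranks_first = pd.Series([7, 7, 1]).rank(method='first')
print(ranks)
```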
    
    6. Correlation corr() and covariance cov()
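
    As a rough illustration, corr() and cov() can be called between two Series or on a whole DataFrame. The three Series below are made-up so that the results are easy to check by hand.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0, 4.0])
s2 = pd.Series([2.0, 4.0, 6.0, 8.0])   # exactly 2 * s1
s3 = pd.Series([4.0, 3.0, 2.0, 1.0])   # decreases as s1 increases

print(s1.corr(s2))   # perfectly correlated
print(s1.corr(s3))   # perfectly anti-correlated
print(s1.cov(s2))    # sample covariance

# On a DataFrame both methods return a column-by-column matrix
frame = pd.DataFrame({'a': s1, 'b': s2, 'c': s3})
corr_matrix = frame.corr()
cov_matrix = frame.cov()
print(corr_matrix)
```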
    
    
    7. NaN data
    7.1 Create NaN values: pd.Series([1, 2, 3, np.nan, 4], index=['', '', ...])
    7.2 Filter out NaN: ser.dropna() or ser[ser.notnull()]
        frame.dropna() deletes every row (or column) that contains a NaN
        frame.dropna(how='all') deletes only rows (or columns) whose elements are all NaN
    7.3 Fill NaN with a value
    frame.fillna(0)  fills every NaN with 0
    frame.fillna({'ball': 1, 'mug': 0, 'pen': 99})  replaces NaN with a different value for each column
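
    The NaN handling above can be tried on a small frame like the following sketch. The labels and values are invented, with the column names borrowed from the fillna example.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1, 2, 3, np.nan, 4], index=['a', 'b', 'c', 'd', 'e'])
clean = ser.dropna()
same = ser[ser.notnull()]              # equivalent boolean filter

frame = pd.DataFrame([[6, np.nan, 6],
                      [np.nan, np.nan, np.nan],
                      [2, np.nan, 5]],
                     index=['blue', 'green', 'red'],
                     columns=['ball', 'mug', 'pen'])

any_dropped = frame.dropna()           # every row holds a NaN, so nothing is left
all_dropped = frame.dropna(how='all')  # only the all-NaN row 'green' is removed
zeros = frame.fillna(0)
per_column = frame.fillna({'ball': 1, 'mug': 0, 'pen': 99})
print(per_column)
```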
    
    
    8. Hierarchical indexing and levels
    8.1 Hierarchical index:
    mser = pd.Series(np.random.rand(8), index=[['white', 'white', 'white', 'blue', 'blue', 'red', 'red', 'red'],
                                          ['up', 'down', 'right', 'up', 'down', 'up', 'down', 'left']])
    white  up       0.322237
           down     0.093246
           right    0.181997 
    blue   up       0.887448
           down     0.032504
    red    up       0.612139
           down     0.125961
           left     0.030511
    dtype: float64
    
    print(mser['white'])
    print(mser[:,'up'])
      
    up       0.256720
    down     0.849860
    right    0.581021
    dtype: float64
    white    0.256720
    blue     0.412591
    red      0.893404
    dtype: float64
    
    print('select specific elements:', mser['white', 'up'])
    select specific elements: 0.9149258487509073
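
    Because the values above come from np.random.rand, every run prints different numbers. The sketch below swaps in np.arange(8) (an assumption made only so the results are reproducible) and uses xs and .loc as explicit alternatives for the same selections.

```python
import numpy as np
import pandas as pd

# Same two-level index as mser above, with fixed values instead of random ones.
mser = pd.Series(np.arange(8),
                 index=[['white', 'white', 'white', 'blue', 'blue', 'red', 'red', 'red'],
                        ['up', 'down', 'right', 'up', 'down', 'up', 'down', 'left']])

outer = mser['white']                # every inner label under 'white'
inner = mser.xs('up', level=1)       # 'up' under every outer label
single = mser.loc[('white', 'up')]   # one element addressed by both levels

print(mser.index.nlevels)
print(outer)
print(inner)
print(single)
```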
    
    
'''
mser = pd.Series(np.random.rand(8),index=[['white','white','white','blue','blue','red','red','red'],
                                          ['up','down','right','up','down','up','down','left']])
print(mser)
print(mser['white'])
print(mser[:, 'up'])
print('select specific elements:', mser['white', 'up'])
a = mser.unstack()
print('converted to DF:\n', a)
'''
converted to DF:
           down      left     right        up
blue   0.241679       NaN       NaN  0.025439
red    0.225099  0.410451       NaN  0.180735
white  0.266825       NaN  0.536098  0.900275
'''
print('DF converted into Series:\n', a.stack())
'''
DF converted into Series:
blue   down     0.241679
       up       0.025439
red    down     0.225099
       left     0.410451
       up       0.180735
white  down     0.266825
       right    0.536098
       up       0.900275
dtype: float64

'''
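
The round trip between a two-level Series and a DataFrame can be checked with fixed values. This is a sketch under two assumptions: np.arange replaces the random data, and dropna() is called explicitly after stack() because whether stack() discards NaN cells by itself depends on the pandas version.

```python
import numpy as np
import pandas as pd

mser = pd.Series(np.arange(8.0),
                 index=[['white', 'white', 'white', 'blue', 'blue', 'red', 'red', 'red'],
                        ['up', 'down', 'right', 'up', 'down', 'up', 'down', 'left']])

wide = mser.unstack()              # inner level becomes the columns, gaps become NaN
long = wide.stack().dropna()       # back to a two-level Series without the gaps

print(wide)
print(long)
```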


### Define hierarchical indexes on both the rows and the columns
mframe = pd.DataFrame(np.random.randn(16).reshape(4, 4),
                      index=[['white', 'white', 'red', 'red'], ['up', 'down', 'up', 'down']],
                      columns=[['pen','pen','paper','paper'],[1,2,1,2]])
print(mframe)
'''
                 pen               paper          
                   1         2         1         2
white up    1.729195 -0.451135 -0.497403 -0.938851
      down -1.267124  0.422545  0.069564 -0.735792
red   up    0.298684 -0.442771  1.301070  0.234371
      down  0.108434  2.266180 -0.549653 -0.394364
'''

### Naming the levels and reordering the hierarchy
mframe.columns.names = ['object', 'id']      ## names for the column levels
mframe.index.names = ['colors', 'status']    ## names for the row levels
print(mframe)
'''
object              pen               paper          
id                    1         2         1         2
colors status                                        
white  up      0.288562 -0.519511  0.516333  0.643500
       down    1.759466 -1.194383 -0.624583  1.027694
red    up     -0.660548  1.074917  0.425757 -1.028554
       down    0.242714 -0.550235 -0.749478 -0.015347

'''
## swaplevel exchanges the positions of the colors and status levels
print(mframe.swaplevel('colors', 'status'))
'''
object              pen               paper          
id                    1         2         1         2
status colors                                        
up     white   0.621721  1.227554 -1.051002 -0.937241
down   white   0.951904  0.585412 -0.315780 -0.336806
up     red    -1.824083  0.284429  0.310883  0.031538
down   red     0.851415  0.598169  1.967784 -0.421712
'''
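
A deterministic sketch of the level operations: np.arange stands in for the random values, and sort_index(level=...) is shown as one possible way to put the rows in order by a named level. Both choices are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

mframe = pd.DataFrame(np.arange(16).reshape(4, 4),
                      index=[['white', 'white', 'red', 'red'],
                             ['up', 'down', 'up', 'down']],
                      columns=[['pen', 'pen', 'paper', 'paper'], [1, 2, 1, 2]])
mframe.columns.names = ['object', 'id']
mframe.index.names = ['colors', 'status']

# swaplevel returns a new frame; the data rows stay in the same order
swapped = mframe.swaplevel('colors', 'status')

# sort_index(level=...) orders the rows by the named level
by_color = mframe.sort_index(level='colors')

print(swapped)
print(by_color)
```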

 


Origin www.cnblogs.com/liuhuacai/p/11588243.html