Python for Data Analysis, Chapter 8: Data Wrangling: Join, Combine, and Reshape.md

Study period: started Sunday night, 2019/11/03 23:30; planned completion 11/10

Study goal: pages 218-249, 32 pages in total; planned completion in 6 days (at 20 min per page and 1 hour a day, i.e. 3 pages a day, 10 days would be required)

Actual outcome: completed on XXX; took X days, X hours, X minutes per page on average.

 

In many applications, data may be spread across a number of files or databases, or be stored in a form that is not convenient to analyze. This chapter covers methods for combining, joining, and reshaping data.

 

8.1 Hierarchical Indexing

Hierarchical indexing is an important feature of pandas that lets you have multiple (two or more) index levels on one axis. Put more abstractly, it provides a way to work with higher-dimensional data in a lower-dimensional form.

Consider a simple example: a Series created with a list of lists (or arrays) as its index.
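A minimal sketch of such a Series follows. The values (np.arange(9.0)) and the index labels are illustrative assumptions, not the data from the original notes.

```python
import numpy as np
import pandas as pd

# Outer level: letters; inner level: integers (illustrative values only).
data = pd.Series(
    np.arange(9.0),
    index=[
        ["a", "a", "a", "b", "b", "c", "c", "d", "d"],
        [1, 2, 3, 1, 3, 1, 2, 2, 3],
    ],
)

print(data)        # outer labels are shown once per group
print(data.index)  # a MultiIndex with two levels
```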

 

  P.S.: when such a Series is printed, the output is a prettified view of a Series with a MultiIndex as its index.

  

 

1) With a hierarchically indexed object, so-called partial indexing is possible, which makes selecting subsets of the data concise.
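Partial indexing can be sketched as follows, assuming a small two-level Series with made-up values (the variable names are hypothetical):

```python
import numpy as np
import pandas as pd

# Illustrative two-level Series (values are assumptions, not the notes' data).
data = pd.Series(
    np.arange(9.0),
    index=[
        ["a", "a", "a", "b", "b", "c", "c", "d", "d"],
        [1, 2, 3, 1, 3, 1, 2, 2, 3],
    ],
)

b_rows = data.loc["b"]            # one outer label; result is indexed by the inner level
b_to_c = data.loc["b":"c"]        # slice of outer labels, both ends inclusive
b_and_d = data.loc[["b", "d"]]    # list of outer labels

print(b_rows)
print(b_to_c)
print(b_and_d)
```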

 

Selection is also possible from an "inner" level.
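As a sketch of inner-level selection, again on an illustrative Series with assumed values, both .loc with a leading slice and xs with level=1 pick the rows whose inner label is 2:

```python
import numpy as np
import pandas as pd

# Illustrative two-level Series (values are assumptions, not the notes' data).
data = pd.Series(
    np.arange(9.0),
    index=[
        ["a", "a", "a", "b", "b", "c", "c", "d", "d"],
        [1, 2, 3, 1, 3, 1, 2, 2, 3],
    ],
)

inner = data.loc[:, 2]           # all outer labels, inner label 2
same = data.xs(2, level=1)       # equivalent cross-section on the inner level

print(inner)
```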

 

2) Hierarchical indexing plays an important role in reshaping data and in group-based operations such as building a pivot table. For example, the data can be rearranged into a DataFrame using the unstack method.
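A minimal sketch of unstack, assuming the same kind of illustrative two-level Series: the inner level becomes the columns, and combinations that do not exist in the Series show up as NaN.

```python
import numpy as np
import pandas as pd

# Illustrative two-level Series (values are assumptions, not the notes' data).
data = pd.Series(
    np.arange(9.0),
    index=[
        ["a", "a", "a", "b", "b", "c", "c", "d", "d"],
        [1, 2, 3, 1, 3, 1, 2, 2, 3],
    ],
)

# Move the inner index level into the columns.
wide = data.unstack()

print(wide)  # rows a..d, columns 1..3, NaN where no (outer, inner) pair exists
```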

 

The inverse operation of unstack is stack.
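The round trip can be sketched like this, on an illustrative Series with assumed values. Whether stack drops the NaN cells by itself depends on the pandas version, so the sketch calls dropna explicitly.

```python
import numpy as np
import pandas as pd

# Illustrative two-level Series (values are assumptions, not the notes' data).
data = pd.Series(
    np.arange(9.0),
    index=[
        ["a", "a", "a", "b", "b", "c", "c", "d", "d"],
        [1, 2, 3, 1, 3, 1, 2, 2, 3],
    ],
)

wide = data.unstack()             # Series -> DataFrame
restored = wide.stack().dropna()  # DataFrame -> Series; drop the filler NaN cells

print(restored)
```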

 

 

 

 

8.1.1 Reordering and Sorting Levels

8.1.2 Summary Statistics by Level

8.1.3 Indexing with a DataFrame's Columns

 

8.2 Combining and Merging Datasets

8.2.1 Database-Style DataFrame Joins

 

 

8.2.2 Merging on Index

 

 

8.2.3 Concatenating Along an Axis

 

 

8.2.4 Combining Data with Overlap

 

8.3 Reshaping and Pivoting

8.3.1 Reshaping with Hierarchical Indexing

 

 

 

8.3.2 Pivoting "Long" to "Wide" Format

 

8.3.3 Pivoting "Wide" to "Long" Format

 

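8.4 Summary

At this point, the basics of importing, cleaning, and reshaping data with pandas have been covered, so the next step is data visualization with matplotlib. Later we will return to pandas to learn more advanced analysis.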

 


Origin www.cnblogs.com/ElonJiang/p/11789939.html