[Data Analysis - Basic Introduction to Pandas ①] - Introduction to pandas

Foreword

1. Introduction to pandas

pandas is a core data analysis support library for Python. It provides fast, flexible, and expressive data structures designed to make working with relational and labeled data simple and intuitive.

pandas aims to be the essential high-level tool for practical, real-world data analysis in Python. Its long-term goal is to become the most powerful and flexible open source data analysis tool available in any language. After years of sustained effort, pandas is getting closer and closer to this goal.

When it comes to data analysis with Python, pandas is practically a household name. Put simply, pandas is the Excel of the Python programming world.

The official pandas website can be slow to access without a VPN.

The pandas Chinese website can be accessed normally and is more user-friendly.

2. Advantages of pandas

Why has pandas become such a powerful tool and core support library for Python data analysis? I think the answer can be found in the following points.

2.1 Powerful data structure support

The main data structures of pandas are Series (one-dimensional data) and DataFrame (two-dimensional data), which are sufficient to handle most typical use cases in finance, statistics, social science, engineering, and other fields.

For R users, DataFrame provides richer functionality than R's data.frame. pandas is built on NumPy and integrates well with other third-party scientific computing libraries.
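As a rough illustration of these two structures, the following minimal sketch builds a Series and a DataFrame and passes them to NumPy. The variable names and sample values (prices, table, and so on) are made up for this example and do not come from the pandas documentation.

```python
import numpy as np
import pandas as pd

# A Series is a one-dimensional labeled array.
prices = pd.Series([10.5, 11.0, 9.8], index=["mon", "tue", "wed"], name="price")

# A DataFrame is a two-dimensional labeled table; each column is a Series.
table = pd.DataFrame(
    {"price": [10.5, 11.0, 9.8], "volume": [100, 150, 120]},
    index=["mon", "tue", "wed"],
)

# Because pandas is built on NumPy, NumPy functions can be applied directly.
log_prices = np.log(prices)   # the result is still a Series with the same labels
values = table.to_numpy()     # a plain NumPy array with 3 rows and 2 columns

print(prices["tue"])          # look up a value by its label
print(table.shape)
print(type(log_prices).__name__, values.shape)
```

Note that the labels ("mon", "tue", "wed") stay attached to the data through the NumPy call, which is the main thing a Series adds on top of a bare array.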

2.2 Advantages

  • 1. Easy handling of missing data, represented as NaN, in both floating-point and non-floating-point data

  • 2. Size mutability

Columns can be inserted into or deleted from multidimensional objects such as DataFrame;

  • 3. Automatic and explicit data alignment

Objects can be explicitly aligned to a set of labels, or the labels can be ignored, in which case Series and DataFrame align the data automatically during computations;

  • 4. Powerful and flexible group by functionality

Perform split-apply-combine operations on data sets to aggregate and transform data;

Easily convert ragged, differently indexed data in other Python and NumPy data structures into DataFrame objects;

  • 5. Intelligent label-based slicing, fancy indexing, and subsetting of large data sets;

  • 6. Hierarchical axis labeling: each tick can carry multiple labels;

  • 7. Mature IO tools

Read data from text files (CSV and other delimited files), Excel files, databases, and other sources, and save/load data using the ultra-fast HDF5 format;

  • 8. Time series

Supports date range generation, frequency conversion, moving window statistics, date shifting, and other time series functionality. (Moving window linear regression appeared in older versions of this feature list, but it has since been removed from pandas.)
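Several of the advantages listed above can be tried out in a few lines. The sketch below is only an illustration under assumed sample data: every variable name and value is invented for this example, and CSV text held in memory stands in for a real file. HDF5 and Excel are left out because they need extra optional dependencies.

```python
import io

import numpy as np
import pandas as pd

# 1. Missing data is represented as NaN and skipped by most reductions.
scores = pd.Series([90.0, np.nan, 70.0], index=["a", "b", "c"])
mean_score = scores.mean()            # NaN is ignored: (90 + 70) / 2
filled = scores.fillna(0.0)           # or replace NaN with a chosen value

# 2. Size mutability: columns can be inserted and deleted.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])
df["total"] = df["x"] + df["y"]       # insert a column
df = df.drop(columns="y")             # delete a column

# 3. Automatic alignment: arithmetic matches values by label, not position.
left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["b", "c", "d"])
aligned_sum = left + right            # "a" and "d" have no partner -> NaN

# 4. Group by: split rows by key, apply an aggregation, combine the results.
sales = pd.DataFrame({"city": ["X", "Y", "X", "Y"], "amount": [10, 20, 30, 40]})
per_city = sales.groupby("city")["amount"].sum()

# 5. Label-based slicing with .loc (both end labels are included).
subset = df.loc["a":"b", ["x", "total"]]

# 6. Hierarchical labels: a MultiIndex puts several labels on one axis.
idx = pd.MultiIndex.from_tuples(
    [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1")], names=["year", "quarter"]
)
revenue = pd.Series([1.0, 2.0, 3.0], index=idx)
revenue_2023 = revenue.loc["2023"]    # rows for 2023, indexed by quarter

# 7. IO: read delimited text (an in-memory string stands in for a CSV file).
people = pd.read_csv(io.StringIO("name,age\nAnn,30\nBob,25\n"))

# 8. Time series: date ranges, frequency conversion, moving windows, shifting.
days = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=days)
two_day_sum = ts.resample("2D").sum()       # daily data -> 2-day totals
rolling_mean = ts.rolling(window=3).mean()  # 3-day moving average
shifted = ts.shift(1)                       # move values one day later

print(mean_score)
print(aligned_sum)
print(per_city)
print(two_day_sum)
```

In step 3 the result has the union of both indexes, and labels present on only one side come out as NaN, which ties the alignment feature back to the missing-data handling in step 1.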

3. pandas learning route

Learn Series first, and then DataFrame.


Epilogue

You are bound to run into many difficulties while learning pandas. It reminds me of when I first learned the Java framework Spring and felt I could not take it anymore. Practicing as you learn and refusing to procrastinate are good ways to stay motivated.

Related reading

Review of the last issue: [Data Analysis - Basic Introduction to NumPy ⑥] - NumPy Case Consolidation and Strengthening
Next issue preview: [Data Analysis - Basic Introduction to Pandas ②] - pandas Data Structures: Series

Origin blog.csdn.net/qq_62592360/article/details/131607994