Introduction to Pandas
Pandas is a toolkit for analyzing structured data in Python. It is built on NumPy, which provides high-performance array and matrix operations, and it works with the plotting library matplotlib, which provides data visualization.
IPython tools
Start IPython from the command line by running the ipython command.
Pandas core data structures
Series creation
A Series is a one-dimensional labeled array that can hold any data type (integers, floating-point numbers, strings, Python objects).
Basic format:
s = pd.Series(data, index=index)
Here index is a list of labels for the data, and data can be a Python dictionary, an ndarray object, or a scalar value.
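As a minimal sketch of the three kinds of data mentioned above, the following builds one Series from each. The variable names and sample values are illustrative and are not taken from the original article.

```python
import numpy as np
import pandas as pd

# From a Python dictionary: the keys become the labels.
s_dict = pd.Series({"a": 1, "b": 2, "c": 3})

# From an ndarray: index must have the same length as the data.
s_array = pd.Series(np.array([0.5, 1.5, 2.5]), index=["x", "y", "z"])

# From a scalar: the value is repeated once for every label in index.
s_scalar = pd.Series(5.0, index=["a", "b", "c"])

print(s_dict)
print(s_array)
print(s_scalar)
```

If index is omitted for an ndarray, pandas falls back to the default integer labels 0, 1, 2, and so on.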
Properties of Series objects:
A Series behaves like an ndarray and like a dict, and operations between Series align automatically on labels.
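The three properties can be illustrated with a short sketch. The sample Series and variable names below are assumptions made for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"])

# ndarray-like: positional slicing and vectorized arithmetic.
first_two = s.iloc[:2]
doubled = s * 2
root = np.sqrt(s)

# dict-like: look up by label, test membership, use get() with a default.
value_b = s["b"]
has_e = "e" in s
missing = s.get("e", 0.0)

# Label alignment: operands are matched on labels, not on positions.
# A label present in only one operand produces NaN in the result.
aligned = s.iloc[1:] + s.iloc[:-1]
print(aligned)
```

Here the labels a and d each appear in only one operand, so their results are NaN, while b and c are added to themselves.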
DataFrame creation
A DataFrame is a two-dimensional array with row and column labels. It can be thought of as an Excel sheet, a SQL database table, or a dictionary of Series objects. It is the most commonly used data structure in Pandas.
Basic format:
df = pd.DataFrame(data, index=index, columns=columns)
Here index holds the row labels and columns holds the column labels. data can be a dictionary of one-dimensional NumPy arrays, lists, or Series; a two-dimensional NumPy array; a Series; or another DataFrame.
1. Create a one-dimensional date range for the row labels, then create the DataFrame from a two-dimensional array
2. Create the DataFrame from a dictionary
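The original code for these two steps did not survive, so the following is a sketch of what they might look like. The dates, column names, and values are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# 1. A one-dimensional date range used as row labels for a two-dimensional array.
dates = pd.date_range("2024-01-01", periods=6)
rng = np.random.default_rng(0)  # seeded so the example is reproducible
df = pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list("ABCD"))

# 2. A dictionary: each key becomes a column; scalar values are
#    broadcast to every row.
df2 = pd.DataFrame(
    {
        "A": 1.0,
        "B": pd.Timestamp("2024-01-02"),
        "C": pd.Series(1, index=range(4), dtype="float32"),
        "D": np.array([3] * 4, dtype="int32"),
        "E": ["test", "train", "test", "train"],
    }
)

print(df)
print(df2.dtypes)
```

Unlike a plain two-dimensional array, the dictionary form lets every column have its own data type.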
Pandas basic operations
View element
- View the data in the first few rows. Selecting an interval of rows by slicing the data directly is the less efficient method; using the indexers that pandas provides for selection is the efficient one.
- View the data of a certain column
- View the row labels, column labels and attributes of the data
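The viewing operations listed above might look like the sketch below. The sample frame is hypothetical, and because the source does not name the efficient selection method, the use of iloc and loc here is an assumption.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2024-01-01", periods=6)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list("ABCD"))

first_rows = df.head(3)      # the first three rows
by_slice = df[0:3]           # direct slicing of a row interval
by_position = df.iloc[0:3]   # position-based indexer (assumed alternative)
by_label = df.loc["2024-01-01":"2024-01-03"]  # label-based; end label is included

col_a = df["A"]              # a single column, returned as a Series

print(df.index)              # row labels
print(df.columns)            # column labels
print(df.dtypes)             # data type of each column
print(df.shape)              # (number of rows, number of columns)
```

All four row selections return the same three rows here; they differ in whether the interval is given by position or by label.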
Data conversion
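The source does not say which conversions it had in mind. As an assumption, the sketch below shows three common ones: converting to a NumPy array, transposing, and changing the element type. The sample frame is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape(2, 3), index=["r1", "r2"], columns=list("ABC")
)

as_array = df.to_numpy()         # DataFrame to ndarray; the labels are dropped
transposed = df.T                # swap rows and columns
as_float = df.astype("float64")  # change the element type of every column

print(as_array)
print(transposed)
print(as_float.dtypes)
```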
Sort
Data can be sorted by row labels, by column labels, or by the values under a specific label.
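A minimal sketch of the three kinds of sorting, assuming sort_index and sort_values are the methods intended. The small frame below is made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"B": [3, 1, 2], "A": [9, 8, 7]},
    index=["y", "z", "x"],
)

by_row_labels = df.sort_index(axis=0)   # rows ordered by their labels
by_col_labels = df.sort_index(axis=1)   # columns ordered by their labels
by_values = df.sort_values(by="B")      # rows ordered by the values in column B

print(by_row_labels)
print(by_col_labels)
print(by_values)
```

Each call returns a new sorted DataFrame and leaves df unchanged; passing ascending=False reverses the order.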
Numerical judgment
Test each element of the data against a numerical condition; the result is a Boolean value for every element.
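The following sketch shows a few element-wise tests and how their Boolean results can be used for filtering. The conditions and sample values are assumptions, since the original examples are missing.

```python
import pandas as pd

df = pd.DataFrame({"A": [-1.0, 2.0, 3.0], "B": [4.0, -5.0, 6.0]})

positive = df > 0                  # Boolean DataFrame, one value per element
only_positive = df[df > 0]         # same shape; failing elements become NaN
rows_a_positive = df[df["A"] > 0]  # keep the rows where column A passes
in_set = df["A"].isin([2.0, 3.0])  # membership test for each element

print(positive)
print(only_positive)
print(rows_a_positive)
print(in_set)
```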
Copy data and modify elements
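As a sketch of copying a DataFrame and then modifying elements of the copy, assuming the usual copy, loc, iloc, and at methods are what the heading refers to. The sample values are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

df_copy = df.copy()          # independent copy; df itself is left untouched
df_copy.loc[0, "A"] = 100    # set one element by label
df_copy.iloc[1, 1] = 200     # set one element by position
df_copy.at[2, "B"] = 300     # scalar setter by label
df_copy[df_copy > 150] = 0   # conditional assignment with a Boolean mask

print(df)
print(df_copy)
```

Working on a copy keeps the original data available for comparison after the modifications.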