Chapter 4 Introduction to the pandas Series type

Getting to know pandas

pandas is a third-party Python library that provides high-performance, easy-to-use data types and analysis tools.
It introduces two data types, Series and DataFrame.
Series: index + one-dimensional data
DataFrame: index + two-dimensional data
pandas was originally designed to establish a correspondence between data and an index. By manipulating the index you indirectly manipulate the data, without having to think about the dimensionality of the data, which reduces the mental burden.
pandas expects users to treat a Series or DataFrame object as if it were a single piece of data.
When a Series is printed, it shows two columns: the index on the left and the data on the right.
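The original screenshot is not available, so here is a minimal sketch of what such a printout looks like. The variable name a and the values are illustrative assumptions.

```python
import pandas as pd

# Build a Series from a plain Python list (illustrative values).
# No index is given, so pandas generates the labels 0, 1, 2, 3.
a = pd.Series([9, 8, 7, 6])

# Printing shows the index in the left column and the data in the right column.
print(a)
```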
Comparing NumPy and pandas: NumPy pays more attention to the structural representation of a set of data, that is, its dimensionality, while pandas is more concerned with how the data is used, that is, the relationship between data and index.

Creating a Series

By default pandas generates an integer index (0, 1, 2, ...). A custom index can be specified with the index=[...] argument.
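A short sketch of a custom index, assuming illustrative values and the labels 'a' to 'd':

```python
import pandas as pd

# Pass the labels through the index= argument; one label per value.
b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

print(b)
```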
A Series can be created from Python lists, Python dictionaries, scalar values (a single value), ndarrays, and the results of other functions.
A Series can be created from a scalar value, which is repeated for every label in the index.
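A minimal sketch of scalar creation. The value 25 and the labels are assumptions chosen for illustration.

```python
import pandas as pd

# The scalar is repeated once per index label, so an index must be supplied
# to get more than a single element.
s = pd.Series(25, index=['a', 'b', 'c'])

print(s)
```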
Creating a Series from a dictionary is very similar: the dictionary keys become the index.
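For example, under the assumption of a small three-entry dictionary:

```python
import pandas as pd

# Keys become the index labels, values become the data.
d = pd.Series({'a': 9, 'b': 8, 'c': 7})

print(d)
```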
When an index is specified that differs from the dictionary's keys, the Series follows the specified index and looks each label up in the dictionary. A label such as 'd' that has no corresponding value is marked NaN. The other values change from int64 to float64, because NaN is a floating-point value and an int64 array cannot hold it.
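A sketch of this behavior. The dictionary contents are assumed, and only the 'd' label is taken from the text above.

```python
import pandas as pd

# 'd' is requested in the index but is not a key of the dictionary.
e = pd.Series({'a': 9, 'b': 8, 'c': 7}, index=['c', 'a', 'b', 'd'])

print(e)        # rows appear in the order given by index=; 'd' holds NaN
print(e.dtype)  # float64, because NaN cannot be stored in an integer array
```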
A Series can be created from an ndarray.
Not only the values but also the index can be created from an ndarray.
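Both cases can be sketched as follows. The use of np.arange and the specific ranges are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Values from an ndarray, automatic index 0..4.
n = pd.Series(np.arange(5))

# Values and index both from ndarrays: labels 9, 8, 7, 6, 5.
m = pd.Series(np.arange(5), index=np.arange(9, 4, -1))

print(n)
print(m)
```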

Basic operations

Because a Series consists of two parts, the index and the values, its operations can also be grouped into these two parts, and they resemble those of the ndarray and dictionary types.
When creating a Series, the index= keyword can be omitted if the index is passed as the second positional argument.
For a Series, .index returns the index, whose type is Index, and .values returns the data. The array shown in the output indicates that the data is a NumPy type.
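A brief sketch of both attributes, using an assumed Series b built with the index passed positionally:

```python
import numpy as np
import pandas as pd

# The second positional argument is the index, so index= is omitted here.
b = pd.Series([9, 8, 7, 6], ['a', 'b', 'c', 'd'])

print(b.index)   # a pandas Index object holding the labels
print(b.values)  # the underlying NumPy array holding the data

print(isinstance(b.index, pd.Index))
print(isinstance(b.values, np.ndarray))
```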
This test reveals the nature of a Series: the values are a NumPy array, the index is a separate object, and associating the two yields the Series type.
Alternatively, you can think of a Series as a kind of "new dictionary": the keys are the index labels and the values are held in a NumPy array.
It is therefore natural to obtain a value directly through its key.
Note that even when the user defines a custom index, the automatic positional index still exists, so a value can also be obtained through b[1] (recent pandas versions deprecate this form in favor of b.iloc[1]). The two kinds of index cannot be mixed in one lookup. To get several values at once, separate the keys with commas and wrap them in a second pair of [].
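A minimal sketch of the three kinds of lookup, on an assumed Series b. Positional access is written with .iloc here, which is a choice made for this example so that it also runs on recent pandas versions.

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

by_label = b['b']             # lookup by custom label
by_position = b.iloc[1]       # lookup by automatic (positional) index
several = b[['c', 'd', 'a']]  # several labels need an inner list: [[...]]

print(by_label, by_position)
print(several)
```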

Slicing operation:

A Series can be sliced by its automatic (positional) index; if there is a custom index, it is sliced along with the values. The difference is that a slice written with custom labels includes the right endpoint: a slice ending at 'c' still contains the 'c' entry.
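The difference between the two kinds of slice can be sketched like this, on an assumed Series b. The positional slice is written with .iloc for clarity.

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

# Positional slice: the right end is excluded, so only positions 0 and 1.
by_position = b.iloc[0:2]

# Label slice: the right end is included, so 'a', 'b' and 'c'.
by_label = b['a':'c']

print(by_position)
print(by_label)
```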
To check whether a label is in the custom index of a Series, use the in keyword. Note that in does not check the automatic index.
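A small sketch with an assumed Series b:

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

has_c = 'c' in b  # 'c' is a custom label, so this is True
has_0 = 0 in b    # 0 is only an automatic position, so this is False

print(has_c, has_0)
```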
You can also use the get() method to get a value.
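As with a dictionary, get() accepts a default that is returned when the label is missing. The labels and the default value 100 below are illustrative assumptions.

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

found = b.get('a')         # label exists, returns its value
missing = b.get('f', 100)  # label does not exist, returns the default

print(found, missing)
```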
NumPy operations and functions can be applied to a Series.
When two Series are added (series + series), they are automatically aligned by index.
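A few NumPy-style operations on a Series, as a sketch. The particular operations chosen here are assumptions and do not come from the original post.

```python
import numpy as np
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

squared = b ** 2                  # element-wise arithmetic keeps the index
roots = np.sqrt(b)                # NumPy functions return a Series
above_median = b[b > b.median()]  # boolean filtering, as with an ndarray

print(squared)
print(roots)
print(above_median)
```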
Values that share an index label are operated on together, and a label that appears in only one of the two Series gets NaN.
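A sketch of automatic alignment with two assumed Series whose indexes only partly overlap:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['c', 'd', 'e'])
b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

# Only 'c' and 'd' exist in both; 'a', 'b' and 'e' become NaN.
total = a + b

print(total)
```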
Both the Series object and its index can have a name, which is stored in the .name attribute. By default b has no name, so b.name displays nothing. You can assign a name to the Series object and to its index.
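A minimal sketch of naming. The two name strings are made up for illustration.

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

default_name = b.name  # None until a name is assigned

b.name = 'Series object'       # name of the Series itself
b.index.name = 'index column'  # name of the index

print(b)
```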
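Note that when modifying several values, the example here needs only one pair of [] around the labels, unlike retrieval, which needs a second pair. Newer pandas versions may reject the single-bracket form; assigning through a list of labels, with two pairs of [], is the safer choice.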
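A sketch of modifying values by label, with assumed values. The list form is used for the multi-label assignment so that the example does not depend on the pandas version.

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

b['a'] = 15          # modify a single value
b[['b', 'c']] = 20   # modify several values through a list of labels

print(b)
```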
How can the index of a Series be modified?
For the time being, I only know that the whole index can be reassigned through .index.
That still feels inconvenient. If I only want to change one label, for example 'h' to 't' while the others stay unchanged, do I need to reassign all of them? I wonder whether a single label can be modified on its own.
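One possible answer, which is not from the original post: Series.rename accepts a mapping from old labels to new ones and returns a new Series, so a single label can be changed. The sketch below uses assumed data and shows rename next to the full .index reassignment.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['f', 'g', 'h'])

# Change only 'h' to 't'; rename returns a new Series and leaves s untouched.
renamed = s.rename({'h': 't'})

# Reassigning .index replaces every label and needs one label per value.
s.index = ['x', 'y', 'z']

print(renamed)
print(s)
```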

