Data Analysis with Python: Pandas

  The two most important data types in the Pandas library are Series and DataFrame. The discussion below is organized around these two types.

  In data analysis, two basic third-party libraries are particularly important for data processing: NumPy and Pandas. The previous section gave a detailed introduction to NumPy; this chapter is a systematic summary of the Pandas library. A small digression: I used to learn mostly by watching videos online. While watching, I could follow everything and it all felt simple, so I took no notes, and later I gradually forgot much of the content. As the saying goes, a good memory is no match for a bad pen, so from now on I will try to record what I learn on this blog to make it easy to look up later.

Environment used for this article:

Python: 3.6.6; IDE: PyCharm; operating system: Windows 10

A. Series

1. Structure of a Series:

A Series consists of a set of data values and an associated index:

For example:

import pandas as pd
a = pd.Series([9, 8, 7, 6])
print(a)
0    9
1    8
2    7
3    6
dtype: int64

From the result above, we can see that Pandas automatically adds an index to the data. We can also supply a custom index through the index parameter:

import pandas as pd
b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])
print(b)
a    9
b    8
c    7
d    6
dtype: int64

As shown above, the automatic index has been replaced by the index values we provided.
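One detail worth noting about a custom index: when the data is a list, the index is expected to have the same length as the data. The following is a minimal sketch of that behaviour; the variable names values, labels and mismatch_ok are illustrative and not from the original examples:

```python
import pandas as pd

values = [9, 8, 7, 6]
labels = ['a', 'b', 'c', 'd']

# Matching lengths: each value is paired with the label at the same position.
b = pd.Series(values, index=labels)
print(b['c'])

# Mismatched lengths: pandas raises a ValueError instead of guessing.
try:
    pd.Series(values, index=['a', 'b'])
    mismatch_ok = True
except ValueError as err:
    mismatch_ok = False
    print('ValueError:', err)
```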

2. Creating a Series:

  1. From a Python list;
  2. From a scalar value (a single value);
  3. From a Python dictionary;
  4. From an ndarray;
  5. From other functions.

For example:

Creating from a scalar value: the scalar is repeated for every label in the index, so the index should be supplied (without it, the result is a Series with a single element):

import pandas as pd
s = pd.Series(25, index=['a', 'b', 'c'])
print(s)
a    25
b    25
c    25
dtype: int64

Creating from a dictionary: the keys become the index and the values become the data:

import pandas as pd
d = pd.Series({'a': 9, 'b': 8, 'c': 7})
print(d)
a    9
b    8
c    7
dtype: int64

In addition, when creating from a dictionary, the index parameter can be used to select and order the entries:

import pandas as pd
e = pd.Series({'a': 9, 'b': 8, 'c': 7}, index=['c', 'a', 'b', 'd'])
print(e)
c    7.0
a    9.0
b    8.0
d    NaN
dtype: float64

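When a label in the index has no corresponding key in the dictionary, its value is shown as NaN. Because NaN is a floating-point value, the dtype of the Series becomes float64.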
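To make the NaN behaviour concrete, the sketch below checks which labels are missing. It uses isna() and dropna(), which are standard pandas methods but are not used elsewhere in this article, so treat this as a supplementary illustration:

```python
import pandas as pd

e = pd.Series({'a': 9, 'b': 8, 'c': 7}, index=['c', 'a', 'b', 'd'])

# 'd' has no matching key in the dictionary, so its value is NaN.
missing = e.isna()
print(missing['d'])

# NaN is a float, so the whole Series is stored as float64.
print(e.dtype)

# dropna() returns a new Series without the missing entries.
clean = e.dropna()
print(list(clean.index))
```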

Creating from an ndarray: when no index is provided, the Series generates one automatically:

import pandas as pd
import numpy as np

n = pd.Series(np.arange(5))
print(n)
0    0
1    1
2    2
3    3
4    4
dtype: int32

Creating from an ndarray: when an index is provided, the Series uses it:

import pandas as pd
import numpy as np


m = pd.Series(np.arange(5), index=np.arange(9, 4, -1))
print(m)
9    0
8    1
7    2
6    3
5    4
dtype: int32
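The list of creation methods also mentions Python lists and other functions, which the examples above do not cover. A minimal sketch of both follows; range and np.linspace are chosen here only as illustrations of "other functions", and the variable names are hypothetical:

```python
import pandas as pd
import numpy as np

# From a Python list: the index defaults to 0, 1, 2, ...
from_list = pd.Series([9, 8, 7, 6])

# From a range object, with a custom index.
from_range = pd.Series(range(3), index=['x', 'y', 'z'])

# From a NumPy function that returns an ndarray.
from_linspace = pd.Series(np.linspace(0, 1, 5))

print(list(from_list.index))
print(from_range['z'])
print(from_linspace.values)
```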

3. Basic operations on a Series

A Series has two parts, the index and the values, so operations on a Series are in effect operations on its index and on its values:

For example, first create a Series:

import pandas as pd
b = pd.Series([9, 8, 7, 6], ['a', 'b', 'c', 'd'])
print(b)
a    9
b    8
c    7
d    6
dtype: int64

To get the index of a Series, use the .index attribute:

print(b.index)
Index(['a', 'b', 'c', 'd'], dtype='object')

To get the data of a Series, use the .values attribute:

print(b.values)
[9 8 7 6]

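In addition, individual values can be accessed either by index label or by integer position (in recent pandas versions, b.iloc[1] is the preferred way to write positional access):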

print(b['b'])
8
print(b[1])
8
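Beyond single values, a Series can also be sliced. The sketch below is an illustration rather than part of the original examples; it uses .iloc for positions and .loc for labels so that the intent is explicit:

```python
import pandas as pd

b = pd.Series([9, 8, 7, 6], index=['a', 'b', 'c', 'd'])

# Positional slice: like a Python list, the end position is excluded.
by_position = b.iloc[1:3]

# Label slice: unlike positional slicing, the end label is included.
by_label = b.loc['b':'c']

# A list of labels selects several entries at once.
picked = b[['a', 'd']]

print(by_position.tolist())
print(by_label.tolist())
print(picked.tolist())
```

Each result is itself a Series that keeps the matching index labels, so the labels travel with the values.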

  

Origin www.cnblogs.com/lsyb-python/p/11938662.html