The DataFrame structure in detail: DataFrame (1)

1. The DataFrame data structure explained

[Figure: DataFrame data structure diagram]
  index is the row index, columns is the column index, and values holds the data. Both the row index and the column index are Index objects. Viewed row by row, a DataFrame is a set of Series stacked vertically, and each row Series is indexed by the column index [0,1,2,3]. Viewed column by column, it is a set of Series placed side by side, and each column Series is indexed by the row index [0,1,2].
  The usual way to think of a DataFrame is as a collection of column Series, each of which may have a different data type. The DataFrame in the figure consists of the following four Series, all of which share the row index [0,1,2].
[Figure: the four column Series that make up the DataFrame]
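
The row view and the column view described above can be checked with a small sketch. The values below are made up for illustration and are not the ones shown in the figure; only the shape (3 rows, 4 columns, default integer labels) follows the text.

```python
import pandas as pd

# Hypothetical 3x4 frame with default integer labels
# (rows 0-2, columns 0-3), as in the description above.
df = pd.DataFrame([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

row = df.loc[0]   # one row, returned as a Series
col = df[0]       # one column, returned as a Series

# A row Series is indexed by the column labels,
# a column Series is indexed by the row labels.
print(type(row).__name__, list(row.index))  # Series [0, 1, 2, 3]
print(type(col).__name__, list(col.index))  # Series [0, 1, 2]
```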
  A DataFrame can be compared to a MySQL table:
  a MySQL table has many columns (fields), and the data type generally differs from one column to the next;
  if each column of a MySQL table is viewed as a Series with its own data type, then the table is made up of many Series of different data types, which is much the same as the description above.
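
As a rough illustration of the table analogy, the sketch below builds a DataFrame whose columns have different data types. The column names and values are hypothetical and only stand in for the fields of a table.

```python
import pandas as pd

# Hypothetical table: each column is a Series with its own dtype,
# much like the typed fields of a MySQL table.
table = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "score": [90.5, 82.0, 77.25],
})

print(table.dtypes)                   # one dtype per column
print(type(table["score"]).__name__)  # each column is a Series
```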

2. The index and columns attributes of a DataFrame

1) Constructing a DataFrame
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(70,100,(3,5)), 
                  index=["地区1", "地区2", "地区3"], 
                  columns=["北京","天津", "上海","沈阳", "广州"])
display(df)

The results are as follows:
[Figure: output of the code above]

2) The index and columns attributes
df = pd.DataFrame(np.random.randint(70,100,(3,5)), 
                  index=["地区1", "地区2", "地区3"], 
                  columns=["北京","天津", "上海","沈阳", "广州"])
display(df)

x = df.index
display(x)
list(df.index)

y = df.columns
display(y)
list(df.columns)

The results are as follows:
[Figure: output of the code above]

① Modify the row index: df.index
df = pd.DataFrame(np.random.randint(70,100,(3,5)), 
                  index=["地区1", "地区2", "地区3"], 
                  columns=["北京","天津", "上海","沈阳", "广州"])
display(df)

df.index = ["a","b","c"]
display(df)

The results are as follows:
[Figure: output of the code above]

② Modify the column index: df.columns
df = pd.DataFrame(np.random.randint(70,100,(3,5)), 
                  index=["地区1", "地区2", "地区3"], 
                  columns=["北京","天津", "上海","沈阳", "广州"])
display(df)

# The new list must contain one label per column (5 here)
df.columns = ["a","b","c","d","e"]
display(df)

The results are as follows:
[Figure: output of the code above]
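
Two points about replacing labels may be worth a short sketch: an assigned list must match the axis length, and `rename()` can change selected labels only. The new label strings below ("Beijing", "region1") are hypothetical and chosen just for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(70, 100, (3, 5)),
                  index=["地区1", "地区2", "地区3"],
                  columns=["北京", "天津", "上海", "沈阳", "广州"])

# Assigning 3 labels to a 5-column frame is rejected with ValueError.
try:
    df.columns = ["a", "b", "c"]
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)

# rename() changes only the labels named in the mapping
# and returns a new DataFrame; df itself is left as it was.
renamed = df.rename(columns={"北京": "Beijing"}, index={"地区1": "region1"})
print(list(renamed.columns))
print(list(renamed.index))
```

Assigning to df.columns replaces every label at once, so `rename()` tends to be the more convenient choice when only a few labels need to change.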

3) The Index object of a DataFrame

  Looking at the DataFrame data structure diagram, every df has a row index (index) and a column index (columns). Both are Index objects. The only difference is the parameter name used when creating df: to tell the two apart, the Index object for the rows is called index and the Index object for the columns is called columns.
  Remember: the elements of an Index object cannot be modified.

# pd.Index() creates an Index object
x = pd.Index([1,2,3])
display(x)
display(type(x))

# Item assignment fails: Index objects are immutable, so this raises TypeError
x[0] = 1

The results are as follows:
[Figure: output of the code above; the final assignment raises TypeError: Index does not support mutable operations]
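
The following sketch puts the two claims of this subsection side by side: both axes are Index objects, and an Index rejects item assignment while still allowing the whole Index to be replaced. The small frame and its labels are hypothetical.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=["r1", "r2"], columns=["c1", "c2"])

# Both axes are held as pandas Index objects.
print(isinstance(df.index, pd.Index), isinstance(df.columns, pd.Index))

# Item assignment on an Index raises TypeError.
try:
    df.index[0] = "x"
    raised = False
except TypeError:
    raised = True
print(raised)

# Replacing the whole Index is allowed, because no existing
# Index object is mutated; a new one is attached to df.
df.index = ["x", "r2"]
print(list(df.index))
```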

3. The name attribute

1) Understanding the name attribute in a DataFrame

[Figure: the DataFrame with each row and column marked in one of eight colors, numbered 1-8]
  Every row and every column taken out of a DataFrame is a Series, and each of these Series has a name: the row label or column label it corresponds to. In the figure above, the eight colors, numbered 1-8, each mark one Series. Series 1 has the name "地区1" (region 1), Series 2 has the name "地区2" (region 2) ... and Series 8 has the name "广州" (Guangzhou).
  Next, we test this with code.

df = pd.DataFrame(np.random.randint(70,100,(3,5)), 
                  index=["地区1", "地区2", "地区3"], 
                  columns=["北京","天津", "上海","沈阳", "广州"])
display(df)

display(df.loc["地区1"].name)
display(df.loc["地区2"].name)
# ......
display(df["广州"].name)

The results are as follows:
[Figure: output of the code above]
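
The same check can be written for all rows and columns at once. This is a minimal sketch on a hypothetical 2x2 frame with fixed values, so the output is reproducible.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=["r1", "r2"], columns=["c1", "c2"])

# The name of each row Series is its row label;
# the name of each column Series is its column label.
row_names = [df.loc[label].name for label in df.index]
col_names = [df[label].name for label in df.columns]

print(row_names)  # ['r1', 'r2']
print(col_names)  # ['c1', 'c2']
```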

2) Setting a name for the row index and column index: df.index.name and df.columns.name
df = pd.DataFrame(np.random.randint(70,100,(3,5)), 
                  index=["地区1", "地区2", "地区3"], 
                  columns=["北京","天津", "上海","沈阳", "广州"])
display(df)

df.index.name = "index_name"
df.columns.name = "columns_name"
display(df)

The results are as follows:
[Figure: output of the code above]
To sum up: each row and each column of a DataFrame has a name, and the row index and column index of the DataFrame can each be given a name as well.
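
One place where these index names become visible is sketched below, on a hypothetical 2x2 frame: a row Series inherits the name of the column index, and `reset_index()` uses the row index name as the name of the new column.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], index=["r1", "r2"], columns=["c1", "c2"])
df.index.name = "index_name"
df.columns.name = "columns_name"

print(df.index.name, df.columns.name)

# A row Series is indexed by the column Index, so it carries that Index's name.
print(df.loc["r1"].index.name)

# reset_index() moves the row labels into an ordinary column
# named after the row index.
flat = df.reset_index()
print(list(flat.columns))
```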


Origin blog.csdn.net/weixin_41261833/article/details/104162585