[Pandas] Building a DataFrame

A DataFrame is a two-dimensional data structure in which data is arranged in rows and columns.

The basic signature for constructing a DataFrame is as follows:

df = pd.DataFrame(data=None, index=None, columns=None)

Parameter Description

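data: the data to store, for example a dict, a list, or an ndarray

index: row labels; if not specified and the data carries no labels of its own, a RangeIndex (0, 1, 2, ..., n-1) is generated automatically

columns: column labels (header); if not specified and the data carries no labels of its own, a RangeIndex (0, 1, 2, ..., n-1) is generated automatically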
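To make the default labels concrete, here is a minimal sketch. The array `values` and the labels `r1`..`r3`, `c1`, `c2` are made up for illustration and do not come from the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 3 rows, 2 columns
values = np.arange(6).reshape(3, 2)

# Neither index nor columns is given, so both default to a RangeIndex
df_default = pd.DataFrame(data=values)
print(df_default.index)    # RangeIndex(start=0, stop=3, step=1)
print(df_default.columns)  # RangeIndex(start=0, stop=2, step=1)

# Explicit row and column labels
df_labeled = pd.DataFrame(data=values,
                          index=['r1', 'r2', 'r3'],
                          columns=['c1', 'c2'])
print(df_labeled)
```

Note that the automatically generated labels stop at n-1, because counting starts at 0.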

Calling pd.DataFrame() with no arguments creates an empty DataFrame:

import pandas as pd

# Create an empty DataFrame
df = pd.DataFrame()
print(df)
'''
Empty DataFrame
Columns: []
Index: []
'''

The following are commonly used ways of constructing a DataFrame.

Method 1: Build a DataFrame from a dictionary (dict)

Each key in the dictionary becomes a column name, and each value, typically a list, tuple, or NumPy ndarray, holds that column's data. All values must have the same length.

import pandas as pd
import numpy as np

data = {'a': [1, 2, 3, 4],  # list
        'b': (4, 5, 6, 7),  # tuple
        'c': np.array([8, 9, 10, 11])  # ndarray
}
# Create the DataFrame
df1 = pd.DataFrame(data)

df1

The DataFrame is created successfully. By default pandas generates the row index for us, and the column labels are the keys of the dictionary. We can also specify the row index manually when creating a DataFrame by passing the index parameter:

import pandas as pd
import numpy as np

data = {'a': [1, 2, 3, 4],  # list
        'b': (4, 5, 6, 7),  # tuple
        'c': np.array([8, 9, 10, 11])  # ndarray
}
# Create the DataFrame with explicit row labels
df1 = pd.DataFrame(data, index=['one', 'two', 'three', 'four'])

df1

We can also build a DataFrame from a dictionary of Series.

Each key-value pair in the dictionary is one column of data: the key is the column name and the value is a Series. The index of the Series becomes the row index of the DataFrame.

import pandas as pd

data = {"x": pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd']), 
        "y": pd.Series([5, 6, 7, 8], index=['a', 'b', 'c', 'd'])}

# Create the DataFrame
df2 = pd.DataFrame(data)

df2
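The example above uses two Series with identical indexes. As a hedged illustration of what happens when the indexes differ (the data here is invented for this sketch), pandas aligns the Series on the union of their labels and fills the gaps with NaN:

```python
import pandas as pd

# Hypothetical data: the two Series only share the labels 'b' and 'c'
data = {"x": pd.Series([1, 2, 3], index=['a', 'b', 'c']),
        "y": pd.Series([4, 5, 6], index=['b', 'c', 'd'])}

# Rows are aligned by label; labels missing from a Series become NaN
df_aligned = pd.DataFrame(data)
print(df_aligned)
```

Because NaN is a floating-point value, both columns in this sketch are stored as float64 rather than int64.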

Method 2: Build a DataFrame from a list

We can build a DataFrame from a list of dictionaries, where each dictionary is one row of data and its keys are the column names.

import pandas as pd

# Define a list of dictionaries
data = [{'x':1, 'y':2, 'z':3},
        {'x':4, 'y':5, 'z':6}]

# Create the DataFrame
df3 = pd.DataFrame(data, index=['a','b'])

df3
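In the example above every dictionary has the same keys. The following sketch, with made-up data, shows the case where the keys differ between rows: the columns are the union of all keys, and missing entries become NaN.

```python
import pandas as pd

# Hypothetical rows: the first has no 'z', the second has no 'y'
data = [{'x': 1, 'y': 2},
        {'x': 4, 'z': 6}]

# Columns are collected from all dictionaries; gaps are filled with NaN
df_missing = pd.DataFrame(data)
print(df_missing)
```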

We can also create a DataFrame from a two-dimensional list, where each inner list is one row and the column names are passed through the columns parameter.

import pandas as pd
data = [['Alex',10],['Bob',12],['Clarke',13]]
df4 = pd.DataFrame(data,columns=['Name','Age'])

df4

Tips

In real-world work we rarely need to build the data by hand. Usually a dataset has already been collected and can be loaded directly into a DataFrame.
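As one possible illustration of loading existing data, the sketch below uses pd.read_csv. The original article does not name a loading function, so this choice is an assumption. The CSV text is held in memory through io.StringIO so that the example runs without a file on disk. With a real dataset you would pass a file path instead.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a collected dataset
csv_text = "Name,Age\nAlex,10\nBob,12\nClarke,13\n"

# read_csv accepts a path or any file-like object; the first line is used as the header
df_loaded = pd.read_csv(io.StringIO(csv_text))
print(df_loaded)
```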

Origin blog.csdn.net/Hudas/article/details/130466113