Series object: how to create a Series.
DataFrame object: how to create a DataFrame.
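As a minimal sketch of the two notes above, the following builds a Series and a DataFrame from ordinary Python containers. The values and labels are made up for illustration only.

```python
import pandas as pd

# A Series from a list; pandas supplies a default integer index 0..n-1.
s1 = pd.Series([4, 7, -5, 3])

# A Series with explicit index labels.
s2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

# A Series from a dict: the keys become the index.
s3 = pd.Series({"Ohio": 35000, "Texas": 71000})

# A DataFrame from a dict of equal-length lists: the keys become column names.
df = pd.DataFrame({
    "state": ["Ohio", "Ohio", "Nevada"],
    "year": [2000, 2001, 2001],
    "pop": [1.5, 1.7, 2.4],
})
print(df)
```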
df.head() and df.tail() return the first and last rows (five by default); df.loc[] selects data by index label.
del df['eastern'], the del statement deletes a column in place.
df.T, the transpose of df (an attribute, so no parentheses)
df.values, returns the data of the DataFrame as a NumPy array
df.index, returns the index of the DataFrame
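The basic inspection tools listed so far can be tried on a small frame. This is only a sketch; the column names (including 'eastern') and values are invented for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "year": [2000, 2001, 2002],
        "pop": [1.5, 1.7, 3.6],
        "eastern": [True, True, False],
    },
    index=["a", "b", "c"],
)

first_two = df.head(2)   # first 2 rows
last_one = df.tail(1)    # last row

del df["eastern"]        # removes the column from df itself

transposed = df.T        # rows and columns swapped
data = df.values         # 2-D NumPy array holding the data
labels = df.index        # the row labels a, b, c
```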
df.reindex(), returns a new object whose data is conformed to a new index; labels that are not in the original index get missing values
df.drop(labels, axis=0), deletes rows or columns by label. The default is axis=0, which deletes rows; pass axis=1 or axis='columns' to delete columns. By default it returns a new object with the labels removed and leaves the original data unchanged. With inplace=True the object is modified in place and None is returned.
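A short sketch of reindex and drop, assuming a toy frame with made-up labels, shows which calls return a new object and which modify the original.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]}, index=["a", "b", "c"])

# reindex returns a new object conformed to the requested labels;
# "d" is not in the original index, so its row is filled with NaN.
reindexed = df.reindex(["c", "a", "d"])

# drop returns a new object by default; df itself is unchanged.
no_row_a = df.drop("a")            # axis=0: drop a row
no_col_y = df.drop("y", axis=1)    # same as axis="columns"

# With inplace=True the object is modified and the call returns None.
df2 = df.copy()
result = df2.drop("b", inplace=True)
```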
df.loc[], selects data by index label
df.iloc[], selects data by integer position
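The difference between label-based and position-based selection is easiest to see side by side. The frame below is an invented example.

```python
import pandas as pd

df = pd.DataFrame(
    {"x": [10, 20, 30], "y": [1.5, 2.5, 3.5]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "x"]      # row label "b", column label "x"
by_position = df.iloc[1, 0]      # second row, first column

# Label slices include the end label; positional slices exclude the end.
label_slice = df.loc["a":"b"]    # rows "a" and "b"
pos_slice = df.iloc[0:1]         # only the row at position 0
```

Note that both accessors use square brackets, not parentheses.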
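np.abs(df), NumPy functions can be applied directly to pandas objects
df.apply(f), applies function f to each column of df (or to each row with axis=1)
df.applymap(f), applies function f to every element of df (pandas 2.1 and later rename this to df.map(f) and deprecate applymap)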
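The three ways of applying functions noted above can be sketched as follows. The data is invented, and the hasattr check is just one assumed way to stay compatible with both older pandas (applymap) and newer pandas (DataFrame.map).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [-1.0, 2.0, -3.0], "b": [4.0, -5.0, 6.0]})

# NumPy ufuncs work element-wise and return a DataFrame.
abs_df = np.abs(df)

# apply passes each column (as a Series) to the function.
col_range = df.apply(lambda col: col.max() - col.min())

# With axis=1 the function receives each row instead.
row_range = df.apply(lambda row: row.max() - row.min(), axis=1)

# Element-wise formatting: use DataFrame.map where it exists,
# otherwise fall back to applymap on older pandas versions.
elementwise = df.map if hasattr(df, "map") else df.applymap
formatted = elementwise(lambda v: f"{v:.2f}")
```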
df.sort_index(axis=0, ascending=True), sorts by the index on one axis. The default is axis=0, the row index; set axis=1 to sort by the column index. ascending is True by default, which sorts in ascending order; set ascending=False to sort in descending order.
df.sort_values(), sorts by value. Missing values are placed at the end by default. The by parameter names the column or columns to sort on; to sort on several columns, pass a list of names.
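A small sketch of both sorting methods, using an invented frame with a shuffled index and one missing value:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"b": [4.0, np.nan, -3.0, 4.0], "a": [0, 1, 0, 1]},
    index=[3, 1, 2, 0],
)

by_row_index = df.sort_index()                # rows ordered 0, 1, 2, 3
by_col_index = df.sort_index(axis=1)          # columns ordered a, b
descending = df.sort_index(ascending=False)   # rows ordered 3, 2, 1, 0

# Sort by one column; the row holding NaN goes last by default.
by_b = df.sort_values(by="b")

# Sort by several columns: pass a list of column names.
by_a_then_b = df.sort_values(by=["a", "b"])
```

Like drop, both methods return a new sorted object and leave df unchanged.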
df.index.is_unique, an attribute that tells whether the index labels are all unique
df.sum(axis=0), returns a Series containing the sum of each column. The default is axis=0; with axis=1 the sum is taken across each row instead.
df.mean(axis=0, skipna=True), returns a Series containing the mean of each column, skipping missing values.
df.idxmax(), returns the index label of the maximum value of each column
df.idxmin(), returns the index label of the minimum value of each column
df.cumsum(), the cumulative sum of each column
df.describe(), generates several summary statistics at once: count, mean, standard deviation, minimum, quartiles, and maximum.
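The reductions and summaries above can be sketched on a frame with a few missing values. The numbers are invented for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"one": [1.4, 7.1, np.nan, 0.75], "two": [np.nan, -4.5, np.nan, -1.3]},
    index=["a", "b", "c", "d"],
)

col_sums = df.sum()                    # axis=0: one value per column, NaN skipped
row_sums = df.sum(axis=1)              # one value per row
col_means = df.mean()                  # skipna=True by default
strict_means = df.mean(skipna=False)   # NaN for any column containing NaN

max_labels = df.idxmax()               # index label of each column's maximum
min_labels = df.idxmin()               # index label of each column's minimum
running = df.cumsum()                  # cumulative sum down each column
summary = df.describe()                # count, mean, std, min, quartiles, max
print(summary)
```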
There are also: count, max, min, argmax and argmin (integer position of the maximum and minimum value), quantile (computes a quantile between 0 and 1), sum, mean, median, mad (mean absolute deviation; removed in pandas 2.0), var, std, skew, kurt, cumsum, cummin and cummax (cumulative minimum and maximum), cumprod (cumulative product), and pct_change (percentage change, for example to compute stock returns).
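As a sketch of the stock-return use mentioned above, pct_change can be applied to a price series. The prices and day labels here are hypothetical, not real market data.

```python
import pandas as pd

# Hypothetical closing prices for three consecutive days.
prices = pd.Series([100.0, 110.0, 99.0], index=["day1", "day2", "day3"])

# Change relative to the previous row; the first entry has no predecessor.
returns = prices.pct_change()

highest_so_far = prices.cummax()       # running maximum
growth = (1 + returns).cumprod()       # cumulative growth factor
```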
ser1.corr(ser2), computes the correlation coefficient of two Series over their overlapping, non-NA values, aligned by index
df.corr(), returns the correlation matrix of the DataFrame
df.cov(), returns the covariance matrix of the DataFrame
df.corrwith(), when a Series is passed in, computes the correlation coefficient between each column of the DataFrame and that Series. When a DataFrame is passed in, columns are matched by name and the correlation coefficient is computed for each matching pair.
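The correlation and covariance methods can be checked on data where the answer is known in advance. In this invented example, y is exactly twice x and z falls as x rises, so the correlations should be 1 and -1.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],   # exactly 2 * x
    "z": [4.0, 3.0, 2.0, 1.0],   # decreases as x increases
})

r_xy = df["x"].corr(df["y"])     # correlation of two Series
corr_matrix = df.corr()          # pairwise correlations, 3 x 3
cov_matrix = df.cov()            # pairwise covariances, 3 x 3

# Correlate every column of df with a single Series.
with_x = df.corrwith(df["x"])
print(with_x)
```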
ser.unique(), returns an array of the unique values in the Series
ser.value_counts(), counts how often each value occurs in the Series; the result is sorted by count in descending order
pd.value_counts(ser.values), value_counts is also available as a top-level function (deprecated since pandas 2.1 in favor of the Series method)
ser.isin(), returns a boolean Series marking which values of the Series are in a given list
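A brief sketch of the last three Series methods, using an invented Series of letters:

```python
import pandas as pd

ser = pd.Series(["c", "a", "d", "a", "a", "b", "b", "c", "c"])

uniques = ser.unique()          # distinct values, in order of first appearance
counts = ser.value_counts()     # occurrences per value, most frequent first
mask = ser.isin(["b", "c"])     # boolean Series: True where the value is b or c
subset = ser[mask]              # keep only the matching entries
```

The boolean mask from isin is typically used this way, as a filter on the Series or on a DataFrame column.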