pandas Learning 11: DataFrame data processing (grouping and aggregation, windows, correlation, statistics)

The purpose of data processing is data analysis. This post covers the functions that are commonly used in that analysis.

First, grouping and aggregation

groupby() splits the data into groups, after which an aggregation function can be called directly on the grouped object. agg() applies one or more aggregation functions in a single call:

DataFrame.groupby(self, by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, observed=False, **kwargs)
DataFrame.agg(self, func, axis=0, *args, **kwargs)
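The grouping and aggregation calls above can be sketched as follows. This is a minimal illustration, not part of the original post: the column names (team, score, hours) and the sample values are invented for the example.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration
df = pd.DataFrame({
    "team": ["a", "a", "b", "b", "b"],
    "score": [1, 3, 2, 4, 6],
    "hours": [10, 20, 30, 40, 50],
})

# Group by one column, then call an aggregation function directly
mean_by_team = df.groupby("team")["score"].mean()

# agg() with a list applies several aggregations to the same column
score_stats = df.groupby("team")["score"].agg(["min", "max", "sum"])

# agg() with a dict chooses a different aggregation for each column
per_column = df.groupby("team").agg({"score": "sum", "hours": "mean"})

print(mean_by_team)
print(score_stats)
print(per_column)
```

Passing a dict to agg() is convenient when different columns need different summaries; the result is indexed by the group keys because as_index defaults to True.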

Second, windows

rolling() evaluates a function over a sliding window of fixed size; expanding() evaluates it cumulatively over all rows from the start up to the current row; ewm() computes exponentially weighted statistics, such as the exponentially weighted moving average:

DataFrame.rolling(self, window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None)
DataFrame.expanding(self, min_periods=1, center=False, axis=0)
DataFrame.ewm(self, com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0)
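A small sketch of how the three window types differ on the same data. The column name x, the window size, and the alpha value are assumptions chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0]})

# Sliding window of 3 rows: the first two rows have too few
# observations, so they are NaN (min_periods defaults to the window size)
rolling_mean = df.rolling(window=3).mean()

# Expanding window: every row uses all rows from the start up to itself
expanding_sum = df.expanding().sum()

# Exponentially weighted mean; with adjust=False the recursion is
# y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
ewm_mean = df.ewm(alpha=0.5, adjust=False).mean()

print(rolling_mean)
print(expanding_sum)
print(ewm_mean)
```

Setting min_periods=1 on rolling() would fill the first rows with partial-window results instead of NaN.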

For more information, see: pandas Learning 4: Series processing (apply, aggregate, transform, map, group, rolling, expanding, exponentially weighted moving average)

Third, correlation

Compute the pairwise correlation between the columns of a DataFrame:

DataFrame.corr(self, method='pearson', min_periods=1)

method: the method used to compute the correlation; valid values are 'pearson', 'kendall', 'spearman', or a callable.

min_periods: the minimum number of observations required for each pair of columns to produce a valid result; currently only available for Pearson and Spearman correlation.
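As a rough illustration of corr() and its two parameters, the sketch below builds three columns with known relationships. The column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],  # increases linearly with a
    "c": [5, 4, 3, 2, 1],   # decreases linearly with a
})

# Default method is Pearson; the result is a column-by-column matrix
pearson = df.corr()

# Spearman works on ranks instead of raw values
spearman = df.corr(method="spearman")

# Only 5 observations exist, so requiring 6 leaves every entry NaN
too_few = df.corr(min_periods=6)

print(pearson)
print(spearman)
print(too_few)
```

The returned matrix is symmetric, so pearson.loc["a", "b"] and pearson.loc["b", "a"] hold the same value.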

Fourth, statistical functions

Commonly used statistical functions:

  • min, max: minimum, maximum
  • mode: mode (most frequent value)
  • var: variance
  • std: standard deviation
  • sum: sum
  • mean: mean
  • mad: mean absolute deviation (deprecated in pandas 1.5 and removed in 2.0)
  • median: median
  • quantile: quantile (percentile)
  • count: number of non-missing values
  • cumsum: cumulative sum
  • cumprod: cumulative product
  • cummin, cummax: cumulative minimum, cumulative maximum
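A few of the functions listed above, applied to a small DataFrame. The data is invented for the example, and the manual mean absolute deviation is shown only as one possible substitute for mad() on pandas versions where that method is no longer available.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"x": [1, 2, 2, 3, 4], "y": [10, 20, 30, 40, 50]})

totals = df.sum()            # one value per column
means = df.mean()
medians = df.median()
variances = df.var()         # sample variance (ddof=1 by default)
counts = df.count()          # non-missing values per column
halves = df.quantile(0.5)    # the 0.5 quantile equals the median
x_mode = df["x"].mode()      # a Series, since ties are possible

# Cumulative functions return an object of the same shape as the input
running_sum = df.cumsum()
running_max = df.cummax()

# Mean absolute deviation computed by hand
mean_abs_dev = (df - df.mean()).abs().mean()

print(totals, means, medians, variances, counts, halves, sep="\n")
print(x_mode)
print(running_sum)
print(running_max)
print(mean_abs_dev)
```

The reducing functions operate column by column (axis=0) by default; passing axis=1 computes them across each row instead.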

 

Reference documents:

pandas DataFrame

Origin www.cnblogs.com/ljhdo/p/11599177.html