These DataFrame operations are very similar to their Series counterparts, so they are covered only briefly here.
First, applying functions
apply() applies a function along an axis (to each column or each row), while applymap() applies a function to each individual element:
DataFrame.apply(self, func, axis=0, raw=False, result_type=None, args=(), **kwds)
DataFrame.applymap(self, func)
Define a function f and use apply() to call it on each one-dimensional array (a column or a row) of the DataFrame; f is typically an aggregation function:
f = lambda x: x.max() - x.min()
df.apply(f)
Define a function foo and use applymap() to call it on each individual element of the DataFrame:
foo = lambda x: '%.2f' % x
df.applymap(foo)
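As a minimal runnable sketch of the two calls above: the DataFrame contents and the names `value_range`, `fmt` and `elementwise` are made up for illustration. It also assumes that newer pandas releases (2.1 and later) expose the element-wise operation as `DataFrame.map` and deprecate `applymap`, so the code picks whichever one is available.

```python
import pandas as pd

# Illustrative data, not from the original text
df = pd.DataFrame({"a": [1.0, 4.0, 2.5], "b": [10.0, 7.0, 8.25]})

# apply() hands each column (axis=0) or each row (axis=1) to the function as a Series
value_range = lambda x: x.max() - x.min()
col_range = df.apply(value_range)          # one value per column
row_range = df.apply(value_range, axis=1)  # one value per row

# Element-wise formatting: use DataFrame.map when it exists, else applymap
fmt = lambda x: "%.2f" % x
elementwise = df.map if hasattr(df, "map") else df.applymap
formatted = elementwise(fmt)

print(col_range)
print(row_range)
print(formatted)
```

Here `col_range` and `row_range` are Series, while `formatted` keeps the shape of `df` but holds strings.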
To transform data, transform() calls a function on the data and returns a result with the same shape as the input:
DataFrame.transform(self, func, axis=0, *args, **kwargs)
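A small sketch of transform(), using made-up data and illustrative variable names. The point it tries to show is that the result keeps the index and shape of the input, unlike an aggregation.

```python
import pandas as pd

# Illustrative data, not from the original text
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Each column is passed to the function; the result has the same shape as df
doubled = df.transform(lambda col: col * 2)

# Subtract each column's mean from its values (still one output row per input row)
centered = df.transform(lambda col: col - col.mean())

print(doubled)
print(centered)
```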
Second, grouping and aggregation
groupby() splits the data into groups, after which an aggregation function can be called directly on the groups; agg() applies one or more aggregation functions in a single call:
DataFrame.groupby(self, by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, observed=False, **kwargs)
DataFrame.agg(self, func, axis=0, *args, **kwargs)
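The following sketch groups a toy table and aggregates it in both ways described above. The column names `team` and `score` and all values are assumptions chosen for illustration.

```python
import pandas as pd

# Illustrative data, not from the original text
df = pd.DataFrame({
    "team": ["x", "x", "y", "y", "y"],
    "score": [1, 3, 2, 4, 6],
})

grouped = df.groupby("team")

# Call an aggregation function directly on the grouped column
mean_by_team = grouped["score"].mean()

# agg() evaluates several aggregation functions in one call
summary = grouped["score"].agg(["min", "max", "sum"])

# agg() also works without grouping, on the whole DataFrame
overall = df[["score"]].agg(["min", "max"])

print(mean_by_team)
print(summary)
print(overall)
```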
Third, window functions
rolling() evaluates over a sliding window; expanding() accumulates from the first row up to the current one; ewm() computes exponentially weighted statistics, such as an exponentially weighted moving average:
DataFrame.rolling(self, window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None)
DataFrame.expanding(self, min_periods=1, center=False, axis=0)
DataFrame.ewm(self, com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=0)
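To compare the three window types side by side, here is a short sketch on a single made-up column `v`. The window size of 2 and `alpha=0.5` are arbitrary choices; `adjust=False` is used so that the weighted mean follows the simple recursion y[t] = (1 - alpha) * y[t-1] + alpha * x[t].

```python
import pandas as pd

# Illustrative data, not from the original text
df = pd.DataFrame({"v": [1.0, 2.0, 3.0, 4.0]})

# Sliding window of 2 rows; the first row has too few observations and is NaN
rolling_sum = df.rolling(window=2).sum()

# Expanding window: everything from the first row up to the current row
expanding_sum = df.expanding().sum()

# Exponentially weighted mean using the recursive form (adjust=False)
ewm_mean = df.ewm(alpha=0.5, adjust=False).mean()

print(rolling_sum)
print(expanding_sum)
print(ewm_mean)
```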
Fourth, appending rows
Append rows to the end of the DataFrame:
DataFrame.append(self, other, ignore_index=False, verify_integrity=False, sort=None)
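A note for readers on recent pandas versions: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0. The sketch below therefore swaps in pd.concat, which gives the same result for this simple case. The data and variable names are illustrative.

```python
import pandas as pd

# Illustrative data, not from the original text
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
extra = pd.DataFrame({"a": [5], "b": [6]})

# Equivalent of df.append(extra, ignore_index=True) on versions without append();
# ignore_index=True renumbers the rows 0..n-1 instead of keeping the old labels
combined = pd.concat([df, extra], ignore_index=True)

print(combined)
```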
Fifth, natural join
join() connects two DataFrames on an equality condition, matching rows either by index or by a column name:
DataFrame.join(self, other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
Parameter Notes:
- on: if None, rows are matched by index; if set to a column name, that column of the calling DataFrame is matched against the index of other
- how: the type of join, {'left', 'right', 'outer', 'inner'}, default 'left'
- lsuffix: suffix appended to overlapping column names from the left table
- rsuffix: suffix appended to overlapping column names from the right table
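The parameters above can be tried out with the sketch below. The two tables, their column names and the suffixes `_l` and `_r` are assumptions for illustration. Both tables deliberately share a column named `val` so that the suffixes take effect.

```python
import pandas as pd

# Illustrative data, not from the original text
left = pd.DataFrame({"key": ["k0", "k1", "k2"], "val": [1, 2, 3]})
right = pd.DataFrame({"val": [10, 20]}, index=["k0", "k1"])

# on="key": match the "key" column of left against the index of right
joined = left.join(right, on="key", how="left", lsuffix="_l", rsuffix="_r")

# on=None (the default): match index against index
by_index = left.set_index("key").join(right, how="inner", lsuffix="_l", rsuffix="_r")

print(joined)
print(by_index)
```

With how='left', the row for k2 is kept and its right-hand value is missing; with how='inner', only k0 and k1 remain.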
Sixth, merge
merge() works like a join in a relational database. It serves the same purpose as join(), matching rows on an equality condition, but is more flexible:
DataFrame.merge(self, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
Parameter Notes:
- right: the right table
- how: {'left', 'right', 'outer', 'inner'}, default 'inner'
- on: the join key(s); the named columns must exist in both tables
- left_on, right_on: the key columns of the left and right tables, respectively; when lists are given, the columns are paired in order, so the order is significant
- left_index, right_index: if True, use the index of the left or right table, respectively, as the join key
- suffixes: tuple (str, str), the suffixes appended to overlapping column names from the left and right tables
- indicator: if True, adds a "_merge" column that records whether each row comes from the left table only, the right table only, or both
- validate: checks the type of the merge ("one_to_one" or "1:1", "one_to_many" or "1:m", "many_to_one" or "m:1", and "many_to_many" or "m:m")
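A sketch that exercises several of these parameters together. The tables and the names `id`, `rid` and `v` are made up; the key columns are named differently on purpose so that left_on and right_on are needed.

```python
import pandas as pd

# Illustrative data, not from the original text
left = pd.DataFrame({"id": [1, 2, 3], "v": ["a", "b", "c"]})
right = pd.DataFrame({"rid": [2, 3, 4], "v": ["x", "y", "z"]})

merged = left.merge(
    right,
    how="outer",                   # keep unmatched rows from both sides
    left_on="id",                  # key column in the left table
    right_on="rid",                # key column in the right table
    suffixes=("_left", "_right"),  # disambiguate the shared column "v"
    indicator=True,                # add the "_merge" column
    validate="one_to_one",         # raise if a key repeats on either side
)

print(merged)
```

The `_merge` column takes the values `left_only`, `right_only` or `both`, which makes it easy to see which rows found a match.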
Reference documents: