1. apply() Description
Scope: Series and DataFrame in pandas.
Effect: with the apply() method we can call a function we defined ourselves, which makes the code clearer in structure and more concise.
2. How apply() works
When a Series or DataFrame calls the apply() method with a user-defined function, the first parameter of that function represents the next "dimension" down of the Series or DataFrame.
For example, for a DataFrame the argument is each of its columns; for a Series it is each of its values.
(1) For a DataFrame, each argument is one of its columns
eg: computing the skewness and kurtosis of each field of a DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'key1': [1, 2, 3, 4, 5],
    'key2': [4, 5, 6, 2, 1]
})
def skew_kurt(x):
    # x is one column of df, passed in as a Series
    print(x, type(x))
    skews = x.skew()
    kurts = x.kurt()
    # Returning a Series makes the apply() result a DataFrame; each returned Series becomes one column
    return pd.Series([skews, kurts], index=['skew', 'kurt'])
print(df.apply(skew_kurt))
# Output:
0 1
1 2
2 3
3 4
4 5
Name: key1, dtype: int64 <class 'pandas.core.series.Series'>
0 1
1 2
2 3
3 4
4 5
Name: key1, dtype: int64 <class 'pandas.core.series.Series'>
0 4
1 5
2 6
3 2
4 1
Name: key2, dtype: int64 <class 'pandas.core.series.Series'>
      key1      key2
skew   0.0 -0.235514
kurt  -1.2 -1.963223
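As the output shows, the function is executed several times, and each time one column of the DataFrame is passed in. (key1 is printed twice here because older pandas versions call the function twice on the first column; newer versions do not.)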
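By default DataFrame.apply() uses axis=0, which is why each call above receives a column. As a small sketch (the helper name value_range is made up for illustration), passing axis=1 hands the function one row at a time instead:

```python
import pandas as pd

df = pd.DataFrame({
    'key1': [1, 2, 3, 4, 5],
    'key2': [4, 5, 6, 2, 1]
})

def value_range(s):
    # hypothetical helper: s is a Series (a column or a row, depending on axis)
    return s.max() - s.min()

col_result = df.apply(value_range)          # axis=0 (default): one call per column
row_result = df.apply(value_range, axis=1)  # axis=1: one call per row

print(col_result)
print(row_result)
```

With axis=0 the result is indexed by column name; with axis=1 it is indexed by the row labels of df.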
(2) For a Series, each argument is one of its values
eg: replacing each value of a Series with its first character
import pandas as pd
import numpy as np
s = pd.Series(['wang', 'li', 'zhao'])
def text(x):
    # x is a single value of s (here a str)
    print(x, type(x))
    return x[0]  # return the first character of the value
print(s.apply(text))
# Output:
wang <class 'str'>
li <class 'str'>
zhao <class 'str'>
0 w
1 l
2 z
dtype: object
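For a simple string operation like this one, the .str accessor of a Series gives the same values without a user-defined function. The comparison below is only a sketch to show that the two approaches agree on this data:

```python
import pandas as pd

s = pd.Series(['wang', 'li', 'zhao'])

via_apply = s.apply(lambda x: x[0])  # the function is called once per value
via_str = s.str[0]                   # string indexing through the .str accessor

print(via_apply.tolist() == via_str.tolist())
```

apply() remains the more general tool, since the function can contain arbitrary logic.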
3. Where apply() can be used
(1) As the examples above show, apply() can be used on both Series and DataFrame.
(2) The apply() method can also be used on grouped data, i.e. on the result of groupby(). Here too, the argument represents the next dimension down: each group.
eg:
import pandas as pd
import numpy as np
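df = pd.DataFrame({'data1': np.random.rand(5),
                   'data2': np.random.rand(5),
                   'key1': list('aabba'),
                   'key2': ['one', 'two', 'one', 'two', 'one']})
# x is the sub-DataFrame of one key1 group
print(df.groupby('key1').apply(lambda x: x.describe()))
# Output (the two value columns are data1 and data2; the numbers differ on every run because the data is random):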
a count 3.000000 3.000000
mean 0.693046 0.608697
std 0.257070 0.522231
min 0.396401 0.011814
25% 0.614231 0.422315
50% 0.832060 0.832817
75% 0.841368 0.907138
max 0.850676 0.981459
b count 2.000000 2.000000
mean 0.352287 0.482039
std 0.343271 0.675147
min 0.109558 0.004638
25% 0.230922 0.243339
50% 0.352287 0.482039
75% 0.473651 0.720740
max 0.595016 0.959441
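To make it concrete that each call receives one group, here is a minimal sketch with fixed numbers instead of random ones, so the result is reproducible. The helper name data1_range is made up for illustration, and selecting [['data1']] first is a choice made here so that the function only sees the value column:

```python
import pandas as pd

df = pd.DataFrame({'data1': [1.0, 2.0, 3.0, 4.0, 5.0],
                   'key1': list('aabba')})

def data1_range(group):
    # hypothetical helper: group is a DataFrame holding the data1 values of one key1 group
    return group['data1'].max() - group['data1'].min()

# one call per group: 'a' has the values 1, 2, 5 and 'b' has the values 3, 4
result = df.groupby('key1')[['data1']].apply(data1_range)
print(result)
```

Because the function returns a single number per group, the result is a Series indexed by the key1 values.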
4. Summary
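- By using the apply() method, we can call a function we defined ourselves, which makes the code clearer in structure and more concise.
- The first parameter of the user-defined function represents the next "dimension" down of the calling object: a column for a DataFrame, a value for a Series, a group for groupby().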