pandas: indexing with functions (callable indexers)


import pandas as pd 
import numpy as np
Given a DataFrame variable:
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
Which of the following is a valid way to get data using iloc indexing?
df['col1'].iloc[lambda x: x[[0, 1]]]
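With a callable indexer, pandas calls the function with the object being indexed and then uses the return value as the actual indexer. The following is a minimal sketch of that behavior for iloc, using the same small DataFrame as the question. The variable names (`s`, `first_two`, `gt_one`) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
s = df['col1']

# The callable receives the Series and returns a list of integer positions.
first_two = s.iloc[lambda x: [0, 1]]

# Passing the positions directly selects the same elements.
same = s.iloc[[0, 1]]

# The callable may also return a boolean NumPy array of the same length.
gt_one = s.iloc[lambda x: (x > 1).to_numpy()]

print(first_two.tolist())
print(gt_one.tolist())
```

Whatever the callable returns must itself be a valid iloc indexer, that is, positions or a boolean array rather than labels.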

Given a DataFrame variable:
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})
Which of the following is a valid way to get data using loc indexing?
df.loc[lambda x: (x['col1'] > 1) & (x['col2'] < 6)]
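The loc form above can be read as a boolean mask that is computed lazily from the object being indexed. The sketch below compares it with an explicit mask and shows one case where the callable is convenient: filtering an intermediate result in a method chain that has no variable name. The column name `total` is a made-up example, not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})

# Callable form: x is the DataFrame that loc is applied to.
by_callable = df.loc[lambda x: (x['col1'] > 1) & (x['col2'] < 6)]

# Explicit form: build the boolean mask first, then index with it.
mask = (df['col1'] > 1) & (df['col2'] < 6)
by_mask = df.loc[mask]

# In a chain, the callable sees the result of assign(), including 'total'.
chained = (
    df.assign(total=lambda x: x['col1'] + x['col2'])
      .loc[lambda x: x['total'] > 5]
)

print(by_callable)
print(chained)
```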

Lesson 1: Advanced Pandas for Data Analysis
Section 1: Indexing with Functions
import pandas as pd
import numpy as np
# Confirm that the pandas version is >= 0.18.1 (the release that added callable indexers)
pd.__version__
'0.25.1'
Read the file
raw_df = pd.read_csv('./datasets/2016_happiness.csv')
raw_df.head()
Country	Region	Happiness Rank	Happiness Score	Lower Confidence Interval	Upper Confidence Interval	Economy (GDP per Capita)	Family	Health (Life Expectancy)	Freedom	Trust (Government Corruption)	Generosity	Dystopia Residual
0	Denmark	Western Europe	1	7.526	7.460	7.592	1.44178	1.16374	0.79504	0.57941	0.44453	0.36171	2.73939
1	Switzerland	Western Europe	2	7.509	7.428	7.590	1.52733	1.14524	0.86303	0.58557	0.41203	0.28083	2.69463
2	Iceland	Western Europe	3	7.501	7.333	7.669	1.42666	1.18326	0.86733	0.56624	0.14975	0.47678	2.83137
3	Norway	Western Europe	4	7.498	7.421	7.575	1.57744	1.12690	0.79579	0.59609	0.35776	0.37895	2.66465
4	Finland	Western Europe	5	7.413	7.351	7.475	1.40598	1.13464	0.81091	0.57104	0.41004	0.25492	2.82596

Select the country and rank for countries ranked 10 through 20
raw_df.loc[lambda df: (df['Happiness Rank'] >= 10) & (df['Happiness Rank'] <= 20), ['Country', 'Happiness Rank']]
Country	Happiness Rank
9	Sweden	10
10	Israel	11
11	Austria	12
12	United States	13
13	Costa Rica	14
14	Puerto Rico	15
15	Germany	16
16	Brazil	17
17	Belgium	18
18	Ireland	19
19	Luxembourg	20
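As an alternative to the two chained comparisons, the same range filter can be written with `Series.between`, which is inclusive on both ends by default. This is only a sketch: `toy_df` is a small stand-in built from a few rows shown in this post, since the CSV file itself is not included here.

```python
import pandas as pd

# Stand-in for raw_df, limited to a handful of rows from the output above.
toy_df = pd.DataFrame({
    'Country': ['Denmark', 'Finland', 'Sweden', 'Germany', 'Luxembourg'],
    'Happiness Rank': [1, 5, 10, 16, 20],
})

cols = ['Country', 'Happiness Rank']

# between(10, 20) keeps ranks 10 and 20 as well as everything in between.
by_between = toy_df.loc[lambda df: df['Happiness Rank'].between(10, 20), cols]

# The comparison form used in the post, for reference.
by_compare = toy_df.loc[
    lambda df: (df['Happiness Rank'] >= 10) & (df['Happiness Rank'] <= 20),
    cols,
]

print(by_between)
```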

Select the country and rank from the first 5 rows
raw_df.iloc[:5, lambda df: [0, 2]]
Country	Happiness Rank
0	Denmark	1
1	Switzerland	2
2	Iceland	3
3	Norway	4
4	Finland	5
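Hard-coded column positions such as `[0, 2]` break silently if the column order changes. One possible variation, sketched below on a small stand-in DataFrame (`toy_df` and `wanted` are illustrative names), lets the callable look the positions up by column name with `Index.get_indexer`.

```python
import pandas as pd

# Stand-in for raw_df with the first three columns of the real data.
toy_df = pd.DataFrame({
    'Country': ['Denmark', 'Switzerland', 'Iceland'],
    'Region': ['Western Europe'] * 3,
    'Happiness Rank': [1, 2, 3],
})

wanted = ['Country', 'Happiness Rank']

# get_indexer maps column names to integer positions.
# Note: it returns -1 for a missing name, which iloc would treat as the
# last column, so the names should be checked beforehand in real code.
by_name = toy_df.iloc[:2, lambda df: df.columns.get_indexer(wanted)]

# The positional form used in the post, for reference.
by_position = toy_df.iloc[:2, lambda df: [0, 2]]

print(by_name)
```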

Get the regions whose average happiness score is greater than 7
raw_df.groupby('Region')['Happiness Score'].mean()
Region
Australia and New Zealand          7.323500
Central and Eastern Europe         5.370690
Eastern Asia                       5.624167
Latin America and Caribbean        6.101750
Middle East and Northern Africa    5.386053
North America                      7.254000
Southeastern Asia                  5.338889
Southern Asia                      4.563286
Sub-Saharan Africa                 4.136421
Western Europe                     6.685667
Name: Happiness Score, dtype: float64
raw_df.groupby('Region')['Happiness Score'].mean().loc[lambda s: s>7]
Region
Australia and New Zealand    7.3235
North America                7.2540
Name: Happiness Score, dtype: float64
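The benefit of the callable here is that the grouped means never need a name: the lambda receives the Series produced by `mean()` and filters it in the same expression. The sketch below shows the equivalence with a two-step version. The data is a made-up toy example (regions 'A' and 'B' with invented scores), not the happiness dataset.

```python
import pandas as pd

# Toy data with invented values, used only to illustrate the pattern.
toy_df = pd.DataFrame({
    'Region': ['A', 'A', 'B', 'B'],
    'Happiness Score': [7.5, 7.1, 6.0, 5.0],
})

# One expression: s is the Series of per-region means.
chained = (
    toy_df.groupby('Region')['Happiness Score']
          .mean()
          .loc[lambda s: s > 7]
)

# Two steps: store the means, then filter with a boolean mask.
means = toy_df.groupby('Region')['Happiness Score'].mean()
two_step = means[means > 7]

print(chained)
```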
 


Origin blog.csdn.net/lildn/article/details/114645842