python_data_pandas_3

Other 2019-01-22 21:58:48

pandas: mapping

start

import pandas as pd
import numpy as np
from pandas import Series,DataFrame

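# Draw 40x3 scores from N(100, 30), then clip to the uint8 range and cast
# explicitly; newer pandas versions may refuse to coerce floats to an integer
# dtype inside the DataFrame constructor.
scores = np.random.normal(100, scale=30, size=(40, 3)).clip(0, 255).astype(np.uint8)
df = DataFrame(scores, columns=['yu', 'shu', 'ying'])
df

out:
    yu  shu  ying
0  109  164    90
1  118   94   158
2  105   70   115
…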

df.set_index('yu')

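out: set_index returns a new DataFrame that uses the 'yu' column as its index; df itself is unchanged unless the result is assigned back.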
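To make the "returns a new DataFrame" behaviour concrete, here is a minimal sketch. The three-row table `scores` and the names `indexed` and `restored` are made up for illustration; only the column names follow the post.

```python
import pandas as pd

# Hypothetical three-row table reusing the column names from the post
scores = pd.DataFrame({'yu': [109, 118, 105],
                       'shu': [164, 94, 70],
                       'ying': [90, 158, 115]})

indexed = scores.set_index('yu')    # new DataFrame with 'yu' as row labels
print(scores.columns.tolist())      # the original still has 'yu' as a column
print(indexed.index.tolist())       # the 'yu' values are now the index

restored = indexed.reset_index()    # moves the index back into a column
print(restored.columns.tolist())
```

Assigning the result back (df = df.set_index('yu')) is one way to keep the new index.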

df.replace({70:60,115:60},inplace=True)

Replaces every 70 and 115 in the DataFrame with 60.

df.replace({115:60,np.nan:1024},inplace=True)

inplace=True applies the change to the original DataFrame instead of returning a modified copy.
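As a small sketch of the same dictionary-based replacement without inplace, the example below works on a made-up frame `demo` (float columns, so that a NaN can be present) and stores the result in a new variable `cleaned`. Both names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data containing the values 70 and 115 and one missing value
demo = pd.DataFrame({'shu': [70.0, 94.0, np.nan],
                     'ying': [115.0, 158.0, 90.0]})

# Without inplace=True, replace returns a modified copy and leaves demo as it was
cleaned = demo.replace({70: 60, 115: 60, np.nan: 1024})

print(cleaned)
print(demo)   # still holds 70, 115 and NaN
```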

df['Java'] = df['ying'].map(lambda x :int(((x + 10) / 3) * 2))

Creates a 'Java' column derived from the 'ying' column.
out:
    yu  shu  ying  Java
0  109  164    90    66
1  118   94   158   112
…
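For simple arithmetic like this, the same result can usually be obtained with vectorized column operations instead of map. The sketch below compares the two on three sample 'ying' values taken from the output above; the variable names are illustrative.

```python
import pandas as pd

ying = pd.Series([90, 158, 115])

# Element-wise mapping, as in the post
mapped = ying.map(lambda x: int(((x + 10) / 3) * 2))

# Vectorized arithmetic; astype(int) truncates toward zero, like int()
vectorized = ((ying + 10) / 3 * 2).astype(int)

print(mapped.tolist())
print(vectorized.tolist())
```

map remains the more flexible choice when the transformation cannot be written as column arithmetic.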

# Map a numeric score to a grade label
def cover(x):
    if x < 60:
        return 'fail'
    elif x < 80:
        return 'pass'
    elif x < 100:
        return 'fair'
    elif x < 120:
        return 'good'
    else:
        return 'excellent'

df['Lever'] = df['ying'].map(cover)
df

out:
    yu  shu  ying  Java      Lever
0  109  164    90    66       fair
1  118   94   158   112  excellent
2  105   70   115    83       good
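The if/elif chain above sorts a score into fixed ranges. One possible alternative, shown only as a sketch and not as the author's method, is pd.cut with explicit bin edges. The sample scores and the English labels here are made up for illustration.

```python
import pandas as pd

ying = pd.Series([90, 158, 115, 59, 60])

# Bin edges matching the thresholds 60, 80, 100, 120
bins = [-float('inf'), 60, 80, 100, 120, float('inf')]
labels = ['fail', 'pass', 'fair', 'good', 'excellent']

# right=False makes every bin [low, high), like the x < 60, x < 80, ... tests
level = pd.cut(ying, bins=bins, labels=labels, right=False)
print(level.tolist())
```

pd.cut returns a categorical Series, which keeps the labels ordered from lowest to highest grade.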

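Renaming index labels

df2 is not defined in this post; judging from the output below, it appears to be a copy of df in which the 'shu' and 'ying' columns have also been mapped to grade labels.

df2.rename(mapper={
    0:'A',
    1:'B',
    2:'C',
},axis=0,inplace=True)
df2

out:
    yu        shu       ying  Java      Lever
A  109  excellent       fair    66       fair
B  118       fair  excellent   112  excellent
C  105       pass       good    83       good
3   78       fair       good    86       good
4  109       fair       fair    62       fair
5   73       pass       good    73       good
6   65       good       pass    50       pass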

# With axis=1 the mapper is applied to the column names instead of the index
df2.rename(mapper={
    'yu':'Chinese',
    'shu':'Math',
    'ying':'English',
},axis=1,inplace=True)
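rename also accepts index= and columns= keywords, which lets both renames be written in one call. A minimal sketch on a made-up two-row frame `grades`; the new labels are illustrative.

```python
import pandas as pd

grades = pd.DataFrame({'yu': [109, 118], 'shu': [164, 94]})

# index= renames row labels, columns= renames column labels; a copy is returned
renamed = grades.rename(index={0: 'A', 1: 'B'},
                        columns={'yu': 'Chinese', 'shu': 'Math'})

print(renamed)
print(grades.columns.tolist())   # the original frame keeps its names
```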

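Outlier detection and filtering

# Same setup as before: clip and cast explicitly, since newer pandas versions
# may refuse to coerce floats to an integer dtype inside the constructor.
scores = np.random.normal(100, scale=30, size=(40, 3)).clip(0, 255).astype(np.uint8)
df = DataFrame(scores, columns=['yu', 'shu', 'ying'])
df

out:
    yu  shu  ying
0   85   64    53
1   70   90   126
2  110   93   128
3   94  108    73
4   67  132   154
5   81  158    69
6   95  157    90
7  123  112   105
8  108   78   157
…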

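m = df.mean()           # per-column mean
std = df.std()          # per-column standard deviation
df.iloc[8,2] = 200      # plant an outlier in row 8, column 'ying'
# Flag values more than 3 standard deviations above the column mean.
# Values far below the mean are not flagged; (df - m).abs() > 3 * std would catch both sides.
cond = df - m > 3 * std
df[cond.any(axis = 1)]  # rows with at least one flagged value

out:
    yu  shu  ying  (8, 2)
8  104   98   200     299

The extra (8, 2) column is not created by the code shown here; it is presumably left over from an earlier cell in the author's session.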

index = df[cond.any(axis = 1)].index
df.drop(labels=index,axis=0,inplace=True)

This removes the rows flagged as outliers.
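The steps above can be condensed into a boolean mask. The sketch below is a two-sided variant (it uses the absolute deviation, which is a change from the one-sided test in the post) on made-up float data with a fixed seed; the planted values 400 and -200 are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(100, 30, size=(40, 3)),
                    columns=['yu', 'shu', 'ying'])

# Statistics are computed before the outliers are planted, as in the post
m, std = data.mean(), data.std()
data.iloc[8, 2] = 400     # planted high outlier
data.iloc[5, 0] = -200    # planted low outlier

# True for rows where any column deviates from its mean by more than 3 std
outlier = ((data - m).abs() > 3 * std).any(axis=1)

filtered = data[~outlier]   # keep only the rows that were not flagged
print(data[outlier])
```

Filtering with ~outlier returns a new DataFrame, so no explicit drop call is needed.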

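# Draw 10 random row positions (with replacement). len(df) is used as the upper
# bound because rows were dropped above, so df may have fewer than 40 rows.
index = np.random.randint(0,len(df),size = 10)
df1 = df.take(index)      # random sampling by position

Combining take with np.random.randint gives the effect of random sampling.
out:
     yu  shu  ying
18   81  132    84
2    85   74    84
4   118  112   103
5   106   49    66
30  130   91    66
6    81  114   152
34  104  119    61
19  110   72    39
22  159  147    81
7   132  106   111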
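pandas also provides DataFrame.sample, which covers the same use case directly. A minimal sketch on a made-up 40-row frame `table`; the random_state value is arbitrary and only makes the draw repeatable.

```python
import numpy as np
import pandas as pd

table = pd.DataFrame(np.arange(120).reshape(40, 3),
                     columns=['yu', 'shu', 'ying'])

# 10 distinct rows (sampling without replacement is the default)
picked = table.sample(n=10, random_state=0)

# replace=True allows the same row to be drawn more than once,
# which is closer to the randint + take approach
picked_dup = table.sample(n=10, replace=True, random_state=0)

print(picked.index.tolist())
```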

Data aggregation

df = DataFrame({'color':['white','black','white','white','black','black'],
               'status':['up','up','down','down','down','up'],
               'value1':[12.33,14.55,22.34,27.84,23.40,18.33],
               'value2':[11.23,31.80,29.99,31.18,18.25,22.44]})

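# numeric_only=True skips the non-numeric 'status' column, which newer pandas
# versions would otherwise reject when computing the mean
ret = df.groupby(by = ['color']).mean(numeric_only=True)
ret

out:
          value1     value2
color
black  18.760000  24.163333
white  20.836667  24.133333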

ret = df.groupby(by = ['color','status']).mean()
ret

out:
              value1  value2
color status
black down     23.40  18.250
      up       16.44  27.120
white down     25.09  30.585
      up       12.33  11.230

ret = df.groupby(by = ['color','status'])

# For each group and column, return (mean rounded to 1 decimal, min, max)
def covert(x):
    return (np.round(x.mean(),1),x.min(),x.max())

ret.agg(covert)

out:
                            value1                value2
color status
black down      (23.4, 23.4, 23.4)  (18.2, 18.25, 18.25)
      up      (16.4, 14.55, 18.33)   (27.1, 22.44, 31.8)
white down    (25.1, 22.34, 27.84)  (30.6, 29.99, 31.18)
      up      (12.3, 12.33, 12.33)  (11.2, 11.23, 11.23)
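Packing three statistics into a tuple makes them awkward to use afterwards. As an alternative sketch, agg can take a list of function names so that each statistic gets its own column. The data is the same as in the post; the names `colors` and `stats` are illustrative.

```python
import pandas as pd

colors = pd.DataFrame({
    'color': ['white', 'black', 'white', 'white', 'black', 'black'],
    'status': ['up', 'up', 'down', 'down', 'down', 'up'],
    'value1': [12.33, 14.55, 22.34, 27.84, 23.40, 18.33],
    'value2': [11.23, 31.80, 29.99, 31.18, 18.25, 22.44],
})

# One column per (value column, statistic) pair instead of one tuple per cell
stats = (colors.groupby(['color', 'status'])[['value1', 'value2']]
               .agg(['mean', 'min', 'max']))

print(stats)
print(stats.loc[('black', 'up'), ('value1', 'mean')])
```

The result has a two-level column index, so a single statistic is selected with a (column, statistic) pair as in the last line.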


Reposted from blog.csdn.net/sinat_39045958/article/details/86524968
