Python pandas: A Collection of Common Errors When Reading Excel File Data


Category: Other | Published: 2020-05-14 15:23:05

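Problem 1
Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.

Solution
This error is raised by the C parser that pandas uses for CSV files. It is usually caused by stray \r (carriage return) characters in the rows of the .csv file.
The fix is to pass lineterminator="\n" to read_csv so that only "\n" is treated as the end of a row. Because test.csv is a CSV file, the call must be pd.read_csv; pd.read_excel does not accept this parameter.

df = pd.read_csv(r'test.csv', lineterminator="\n")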
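As a minimal sketch of what lineterminator changes, the snippet below parses a small in-memory CSV whose second row contains a stray "\r" inside a field. The sample text and column names are made up for illustration; they are not from the original file.

```python
import io

import pandas as pd

# Hypothetical sample data: the first data row has a stray "\r" inside field "b".
raw = "a,b\n1,x\ry\n2,z\n"

# With lineterminator="\n", only "\n" ends a row, so the "\r" stays
# inside the field instead of starting a new row.
df = pd.read_csv(io.StringIO(raw), lineterminator="\n")

print(df.shape)
print(repr(df.loc[0, "b"]))
```

Note that the stray "\r" is kept as part of the value, so it may still need to be stripped afterwards, for example with df["b"].str.replace("\r", "").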

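Problem 2
"pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in line 3, s"

Solution
A row has more fields than the header. Tell read_csv to skip such rows. In pandas 1.3 and later the parameter is on_bad_lines='skip'; older versions used error_bad_lines=False, which was deprecated in 1.3 and removed in 2.0. Keep in mind that skipped rows are silently dropped from the result.

df = pd.read_csv(r'test.csv', on_bad_lines='skip')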
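The following sketch reproduces this kind of error on a small in-memory CSV and then skips the malformed row. It assumes pandas 1.3 or later (for on_bad_lines); the sample data is invented for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: line 3 has three fields, the header has two.
raw = "a,b\n1,2\n3,4,5\n6,7\n"

# Default behaviour: the extra field makes the parser raise ParserError.
try:
    pd.read_csv(io.StringIO(raw))
    failed = False
except pd.errors.ParserError as exc:
    failed = True
    print("parse failed:", exc)

# on_bad_lines="skip" drops the malformed row and keeps the rest.
df = pd.read_csv(io.StringIO(raw), on_bad_lines="skip")
print(df)
```

If dropping rows without notice is not acceptable, on_bad_lines="warn" reports each skipped line instead.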

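Problem 3
'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte

Solution
The file is not UTF-8 encoded. For Chinese-language data, try the national standard encoding, encoding='gb2312':

df = pd.read_csv(r'test.csv', encoding='gb2312')

If that still fails, or 'gb2312' itself raises an error, switch to 'gb18030', which covers a wider range of characters:

df = pd.read_csv(r'test.csv', encoding='gb18030')

If that does not help either, the file contains bytes that even 'gb18030' cannot decode. You can open the file yourself with errors='ignore' to drop the invalid bytes and hand the file object to pandas (path is the file path):

df = pd.read_csv(open(path, encoding='gb18030', errors='ignore'))

Or decode the raw bytes by hand and parse the resulting text (this needs import io):

text = open(path, 'rb').read().decode('gb18030', 'ignore')
df = pd.read_csv(io.StringIO(text))

Passing errors directly to read_csv fails with "parser_f() got an unexpected keyword argument 'errors'", because read_csv has no such parameter. Upgrade to pandas 1.3 or later and use encoding_errors='ignore' instead.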
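The encoding steps above can be combined into one helper. This is only a sketch: read_csv_with_fallback is a hypothetical name, the list of encodings is an assumption that suits Chinese-language files, and the last-resort branch relies on encoding_errors, which needs pandas 1.3 or later.

```python
import os
import tempfile

import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "gb2312", "gb18030")):
    """Try each encoding in turn; as a last resort, drop undecodable bytes."""
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError:
            continue  # wrong encoding, try the next candidate
    # Nothing decoded cleanly: ignore invalid bytes (pandas >= 1.3).
    return pd.read_csv(path, encoding=encodings[-1], encoding_errors="ignore")


# Demo with a temporary file written in gb18030 (made-up sample data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.csv")
    with open(path, "wb") as fh:
        fh.write("name,city\n张三,北京\n".encode("gb18030"))
    df = read_csv_with_fallback(path)

print(df)
```

Trying 'utf-8' first keeps the helper safe for files that were already encoded correctly.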

Problem 4
index 1 is out of bounds for axis 1 with size 1


Solution
The data contains empty values. Check the contents of the file, delete any unrelated data, and fill in the missing values.
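The original does not show the code that raised this error, so the following is only a sketch of how the file could be checked from pandas before fixing it by hand. The sample data and column names are hypothetical. Printing df.shape is useful here because "axis 1 with size 1" in the message means only one column was available.

```python
import io

import pandas as pd

# Hypothetical sample data: the second data row has no score.
raw = "id,score\n1,90\n2,\n3,75\n"

df = pd.read_csv(io.StringIO(raw))

# Check how many rows and columns were actually parsed.
print(df.shape)

# Count the empty values in each column.
missing = df.isna().sum()
print(missing)

# One possible repair: fill the empty values with a default (0 is an assumption).
filled = df.fillna({"score": 0})
print(filled)
```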

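Problem 5
Error: 'gbk' codec can't decode byte 0x80 in position 205: illegal multibyte sequence

Solution
Open the file in binary mode ('rb') and pass the file object to pandas, so that Python does not try to decode it with the system default encoding (GBK on Chinese-language Windows). Note that 'rb' is a mode for open(), not an argument of read_csv:

df = pd.read_csv(open(r'test.csv', 'rb'))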
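A slightly fuller sketch of the binary-mode approach is shown below, using a with block so the file is closed afterwards. It assumes the file is actually UTF-8 encoded, which is the default pandas uses to decode a binary handle; the file contents are made up for the demo.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.csv")
    # Made-up UTF-8 sample file.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("name,city\n张三,北京\n")

    # Binary mode: Python does no decoding, pandas decodes the bytes itself.
    with open(path, "rb") as fh:
        df_binary = pd.read_csv(fh)

    # Alternative: let pandas open the file and state the encoding explicitly.
    df_explicit = pd.read_csv(path, encoding="utf-8")

print(df_binary)
```

Stating the encoding explicitly is usually the clearer option when the encoding of the file is known.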

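Problem 6
One row of data is always missing after reading.

Solution
By default (header=0) pandas treats the first row of the file as column names, so it is not counted as data. If the file has no header row, pass header=None so that the first row is kept as data.

df = pd.read_csv(r'test.csv', header=None)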
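The difference is easy to see on a two-row file without a header. In this sketch the data and the column names "x" and "y" are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: two rows of data, no header row.
raw = "1,2\n3,4\n"

# Default (header=0): the first row is consumed as column names.
df_default = pd.read_csv(io.StringIO(raw))

# header=None keeps every row as data; names= supplies the column labels.
df_no_header = pd.read_csv(io.StringIO(raw), header=None, names=["x", "y"])

print(len(df_default), list(df_default.columns))
print(len(df_no_header), list(df_no_header.columns))
```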

Author: 君琴


Reposted from blog.csdn.net/weixin_44436677/article/details/106089238
