pandas 21: Reading CSV files with read_csv (12. Iteration and chunking) (detailed, tcy)

Examples: iteration, 2018/12/26

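# To iterate over a large file without reading all of it into memory, pass chunksize and read the text file chunk by chunk.
# With chunksize set, read_csv or read_table returns an iterable TextFileReader object instead of a DataFrame.
# Passing iterator=True also returns a TextFileReader object.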
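Contents:
Part 1: Reading and writing CSV text files

    pandas: reading CSV files with read_csv (1. Text I/O overview) https://mp.csdn.net/postedit/85289371
    pandas: reading CSV files with read_csv (2. read_csv parameters) https://mp.csdn.net/postedit/85289928
    pandas: reading CSV files with read_csv (3. Specifying column data types with dtypes) https://mp.csdn.net/postedit/85290575
    pandas: reading CSV files with read_csv (4. Writing text data with to_csv) https://mp.csdn.net/postedit/85290962
    pandas: reading CSV files with read_csv (5. Text data read/write examples) https://mp.csdn.net/postedit/85291123
    pandas: reading CSV files with read_csv (6. Naming and using columns) https://mp.csdn.net/postedit/85291430
    pandas: reading CSV files with read_csv (7. Indexes) https://mp.csdn.net/postedit/85291658
    pandas: reading CSV files with read_csv (8. Dialects and delimiters) https://mp.csdn.net/postedit/85291994
    pandas: reading CSV files with read_csv (9. Float conversion and NA values) https://mp.csdn.net/postedit/85292391
    pandas: reading CSV files with read_csv (10. Comments and blank lines) https://mp.csdn.net/postedit/85292609
    pandas: reading CSV files with read_csv (11. Date and time handling) https://mp.csdn.net/postedit/85292925
    pandas: reading CSV files with read_csv (12. Iteration and chunking) https://mp.csdn.net/postedit/85293639
    pandas: reading CSV files with read_csv (13. Reading fixed-width data with read_fwf) https://mp.csdn.net/postedit/85294010

Part 2:
    pandas HDF file reading and writing in brief https://mp.csdn.net/postedit/85294299
    pandas Excel reading and writing in brief https://mp.csdn.net/postedit/85294545

Part 3:
    Using the csv module in Python (tcy) https://mp.csdn.net/postedit/85228189
    7 fixes for errors when reading CSV files with pandas read_csv https://mp.csdn.net/postedit/85228808
    Using pandas to_string https://mp.csdn.net/postedit/85294935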

Example 1: read a given number of rows with nrows

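from io import StringIO
import pandas as pd

data=' a b c key\n' \
     '0 0 1 2 k1\n' \
     '1 3 4 5 k1\n' \
     '2 6 7 8 k2\n' \
     '3 9 10 11 k3\n' \
     '4 12 13 14 k3\n' \
     '5 15 16 17 k3'

# The header has one field fewer than the data rows, so the first column becomes the index
pd.read_csv(StringIO(data), sep=r'\s+', nrows=2, engine='python')  # read only the first 2 data rows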

  a b c key
0 0 1 2 k1
1 3 4 5 k1  

Example 2: read the file chunk by chunk with chunksize (number of rows per chunk)

chunker = pd.read_csv(StringIO(data), sep=r'\s+', engine='python', chunksize=2)

# Each iteration yields a DataFrame with at most 2 rows
for i in chunker:
    print(i)

   a  b  c key
0  0  1  2 k1
1  3  4  5 k1
   a  b  c key
2  6  7  8 k2
3  9 10 11 k3
   a  b  c key
4 12 13 14 k3
5 15 16 17 k3
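
The point of reading in chunks is usually to reduce each chunk to a small result and then discard it. The following is a minimal sketch of that pattern, assuming the same sample data as above. The names key_counts and total_c are illustrative only and do not come from the original post.

```python
from collections import Counter
from io import StringIO

import pandas as pd

# Same sample data as in the examples above.
data = (' a b c key\n'
        '0 0 1 2 k1\n'
        '1 3 4 5 k1\n'
        '2 6 7 8 k2\n'
        '3 9 10 11 k3\n'
        '4 12 13 14 k3\n'
        '5 15 16 17 k3')

key_counts = Counter()  # hypothetical accumulator: rows seen per key
total_c = 0             # hypothetical accumulator: running sum of column c

# Only `chunksize` rows are held as a DataFrame at any one time.
for chunk in pd.read_csv(StringIO(data), sep=r'\s+', engine='python', chunksize=2):
    key_counts.update(chunk['key'].value_counts().to_dict())
    total_c += int(chunk['c'].sum())

print(dict(key_counts))
print(total_c)
```

Each chunk is an ordinary DataFrame, so any per-chunk computation works here. Only the accumulators need to survive between iterations.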

# Example 2.2: get_chunk(n) reads the next n rows, overriding chunksize for that call
chunker = pd.read_csv(StringIO(data), sep=r'\s+', engine='python', chunksize=2)
chunker.get_chunk(3)

  a b c key
0 0 1 2 k1
1 3 4 5 k1
2 6 7 8 k2

chunker.get_chunk(3)

   a  b  c key
3  9 10 11 k3
4 12 13 14 k3
5 15 16 17 k3

chunker.get_chunk(3)  # raises StopIteration: no rows are left

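
Because get_chunk raises StopIteration once the data is exhausted, a manual read loop has to catch it. A minimal sketch, assuming the same sample data and the python engine used above (the list name sizes is illustrative only):

```python
from io import StringIO

import pandas as pd

# Same sample data as in the examples above.
data = (' a b c key\n'
        '0 0 1 2 k1\n'
        '1 3 4 5 k1\n'
        '2 6 7 8 k2\n'
        '3 9 10 11 k3\n'
        '4 12 13 14 k3\n'
        '5 15 16 17 k3')

reader = pd.read_csv(StringIO(data), sep=r'\s+', engine='python', chunksize=2)

sizes = []  # number of rows returned by each get_chunk call
while True:
    try:
        chunk = reader.get_chunk(4)  # ask for up to 4 rows at a time
    except StopIteration:
        break  # nothing left to read
    sizes.append(len(chunk))

print(sizes)
```

The last chunk can be shorter than the requested size. The exception is raised only when a call finds no rows at all.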
# Example 3: iterate over the file with iterator=True

reader = pd.read_table(StringIO(data), sep=r'\s+', engine='python', iterator=True)
reader.get_chunk(2)  # fetch the next 2 rows

  a b c key
0 0 1 2 k1
1 3 4 5 k1

# No chunksize was set, so iterating yields all remaining rows as a single DataFrame
for i in reader:
    print(i)

   a  b  c key
2  6  7  8 k2
3  9 10 11 k3
4 12 13 14 k3
5 15 16 17 k3  
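
A TextFileReader can also be used as a context manager, which closes the underlying file handle when the block exits. The sketch below assumes pandas 1.2 or later, where that support was added, and reuses the sample data from above. The names head, rest and full are illustrative only.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the examples above.
data = (' a b c key\n'
        '0 0 1 2 k1\n'
        '1 3 4 5 k1\n'
        '2 6 7 8 k2\n'
        '3 9 10 11 k3\n'
        '4 12 13 14 k3\n'
        '5 15 16 17 k3')

# Assumes pandas >= 1.2, where TextFileReader supports the `with` statement.
with pd.read_csv(StringIO(data), sep=r'\s+', engine='python', iterator=True) as reader:
    head = reader.get_chunk(2)  # first 2 rows
    rest = reader.read()        # every row that is still unread

# Stitch the pieces back together into one DataFrame.
full = pd.concat([head, rest])
print(full.shape)
```

Concatenating all chunks gives back the whole file in memory, so this only makes sense when the combined result is known to fit.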

Reposted from blog.csdn.net/tcy23456/article/details/85293639