Files and Streams in Python

Opening Files

open(name[,mode[,buffering]])
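The file name is the only required argument.
Note that when a file is opened for reading (the default mode) and it does not exist, an exception is raised (IOError in Python 2, FileNotFoundError in Python 3).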
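
As a minimal sketch of handling the missing-file case, assuming Python 3 (where IOError is an alias of OSError, so it also catches FileNotFoundError). The helper name read_or_default and the file name are hypothetical, chosen only for illustration:

```python
import os
import tempfile


def read_or_default(path, default=""):
    """Return the contents of path, or default if the file cannot be opened."""
    try:
        f = open(path)  # default mode is 'r'
    except IOError:  # also catches FileNotFoundError on Python 3
        return default
    try:
        return f.read()
    finally:
        f.close()


# A path inside a fresh temporary directory is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "no_such_file.txt")
result = read_or_default(missing, default="<missing>")
print(result)
```

Catching the exception at the open call keeps the error handling next to the operation that can fail.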

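File Modes

The second argument to open is the file mode. The possible values are:

Value	Description
r	Read mode
w	Write mode
a	Append mode
b	Binary mode (can be added to other modes)
+	Read/write mode (can be added to other modes)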
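
The modes in the table can be tried out with a short sketch like the following, assuming Python 3. The temporary file name is hypothetical:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file

with open(path, "w") as f:   # 'w' creates the file or truncates an existing one
    f.write("first\n")

with open(path, "a") as f:   # 'a' keeps existing content and writes at the end
    f.write("second\n")

with open(path, "r+") as f:  # 'r+' allows both reading and writing
    original = f.read()
    f.seek(0)                # go back to the start before overwriting
    f.write("FIRST")

with open(path, "rb") as f:  # 'b' returns raw bytes instead of text
    raw = f.read()

print(repr(original))
print(repr(raw))
```

Note that 'r+' overwrites in place from the current position, whereas 'w' would have discarded the existing content.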

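Buffering

The third argument to open controls buffering. If it is 0 (or False), the file is unbuffered; if it is 1 (or True), the file is line buffered; a number greater than 1 gives the buffer size in bytes; and any negative number means the default buffer size is used.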
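
The effect of the buffering argument can be observed by checking the file size on disk before and after a flush. This is a sketch assuming Python 3 on CPython, where unbuffered I/O is only allowed in binary mode. The file name is hypothetical:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "buffered.bin")  # hypothetical file

# buffering=0: every write goes straight to the operating system.
# Python 3 only allows this in binary mode.
f = open(path, "wb", 0)
f.write(b"abc")
size_unbuffered = os.path.getsize(path)  # the data is already in the file
f.close()

# Negative buffering (-1 is the default): a small write stays in memory
# until the buffer fills or the file is flushed or closed.
f = open(path, "wb", -1)
f.write(b"abc")
size_before_flush = os.path.getsize(path)
f.flush()
size_after_flush = os.path.getsize(path)
f.close()

print(size_unbuffered, size_before_flush, size_after_flush)
```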

Read/Write Example

>>> f = open('somefile.txt','w')
>>> f.write('Hello, ')
>>> f.write('World!')
>>> f.close()
>>> 
>>> f = open('somefile.txt','r')
>>> f.read(4)
'Hell'
>>> f.read()
'o, World!'

Piping Output

On Linux:

cat somefile.txt | python somescript.py 
Wordcount: 12

somescript.py

# somescript.py
import sys
text = sys.stdin.read()
words = text.split()
wordcount = len(words)
# This form of print works on both Python 2 and Python 3
print('Wordcount: %d' % wordcount)
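
The same counting logic can be wrapped in a function that accepts any file-like object, which makes it easy to try without a pipe. This is a sketch assuming Python 3. The function name count_words is hypothetical, and io.StringIO stands in for sys.stdin:

```python
import io


def count_words(stream):
    """Count whitespace-separated words read from a file-like object."""
    return len(stream.read().split())


# io.StringIO plays the role of piped input so the sketch runs on its own;
# in the actual script you would pass sys.stdin instead.
sample = io.StringIO("Hello, World!")
print("Wordcount: %d" % count_words(sample))
```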

Reading and Writing Lines

Use file.readline() to read a single line. readlines() reads all the lines in the file and returns them as a list.
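
A short sketch of both methods, together with writelines as the writing counterpart, assuming Python 3. The file name is hypothetical:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical file

f = open(path, "w")
# writelines does not add line endings, so each string carries its own '\n'.
f.writelines(["first line\n", "second line\n", "third line\n"])
f.close()

f = open(path)
first = f.readline()   # one line, including the trailing newline
rest = f.readlines()   # every remaining line, as a list
end = f.readline()     # an empty string signals end of file
f.close()

print(repr(first))
print(rest)
```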

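Closing Files

You can close the file in a finally clause,
or you can open it with a with statement, which closes the file automatically so that no explicit close is needed: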

with open("somefile.txt") as somefile:
    do_something(somefile)
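
The two approaches can be compared side by side. This is a sketch assuming Python 3, with a hypothetical temporary file:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "somefile.txt")  # hypothetical location

# Approach 1: close the file in a finally clause.
f = open(path, "w")
try:
    f.write("Hello, World!")
finally:
    f.close()  # runs even if write() raises

closed_after_finally = f.closed

# Approach 2: let the with statement close the file.
with open(path) as somefile:
    content = somefile.read()

closed_after_with = somefile.closed  # the with block has closed the file

print(content, closed_after_finally, closed_after_with)
```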

Iterating over File Contents

First, define a processing function:

>>> def process(string):  # avoid shadowing the built-in str
...     print('Processing: %s' % string)

Processing Byte by Byte (Character by Character)

>>> f = open("somescript.py")
>>> while True:
...     char = f.read(1)
...     if not char: break
...     process(char)

>>> f.close()

Processing Line by Line

This is simple: just replace the read(1) call above with readline().

Or iterate over the lines with readlines:
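
Spelled out, the readline version of the loop might look like the sketch below (Python 3 assumed). io.StringIO stands in for a real file, and the list seen is a hypothetical addition that records what was processed:

```python
import io


def process(line):
    print("Processing: %s" % line)


f = io.StringIO("first\nsecond\n")  # a file opened in text mode behaves the same
seen = []
while True:
    line = f.readline()
    if not line:  # an empty string means end of file
        break
    process(line)
    seen.append(line)
f.close()
```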

f = open(filename)
for line in f.readlines():
    process(line)
f.close()

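Lazy Line Iteration with fileinput

Use this approach when you need to process very large files, where readlines would load every line into memory at once. fileinput reads only the part of the file that is actually needed at each step.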

import fileinput
for line in fileinput.input(filename):
    process(line)
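
A self-contained variant of the fileinput loop, assuming Python 3 and a hypothetical small file in place of a very large one. It also uses fileinput.lineno() to number the lines as they are read:

```python
import fileinput
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "big.txt")  # hypothetical file
with open(path, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

numbered = []
for line in fileinput.input(path):
    # fileinput.lineno() is the cumulative number of the line just read.
    numbered.append((fileinput.lineno(), line.rstrip("\n")))
fileinput.close()  # reset the module-level state so input() can be called again

print(numbered)
```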

File Iterators

In Python, files are iterable, so you can use them directly in a for loop:

f = open(filename)  # read mode is the default
for line in f:
    process(line)
f.close()
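
File iteration combines naturally with the with statement, so the explicit close call can be dropped. A sketch assuming Python 3, with a hypothetical file:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "iter.txt")  # hypothetical file
with open(path, "w") as f:
    f.write("one\ntwo\nthree\n")

lengths = []
with open(path) as f:
    for line in f:  # lines are read one at a time, not all at once
        lengths.append(len(line.rstrip("\n")))

print(lengths)
```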