One, read from standard input
When no file names are given, neither in the files argument of fileinput.input() nor on the command line (sys.argv[1:]), fileinput uses stdin as the input source by default
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import fileinput
for line in fileinput.input():
    # Each line already ends with a newline, so suppress the one print() adds
    print(line, end='')
The effect is as follows: whatever you type, the program reads it and prints it back, like a repeater.
$ python demo.py
hello
hello
python
python
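The fallback to stdin can also be checked without typing anything. The sketch below temporarily swaps sys.stdin for an in-memory stream and empties sys.argv, so fileinput has no file names to work with. The helper name echo_stdin is made up for this illustration.

```python
import fileinput
import io
import sys


def echo_stdin(text):
    """Feed `text` to fileinput as if it had been typed on stdin."""
    old_stdin, old_argv = sys.stdin, sys.argv
    sys.stdin = io.StringIO(text)
    sys.argv = ["demo.py"]  # no file arguments, so fileinput falls back to stdin
    try:
        with fileinput.input() as fh:
            return [line for line in fh]
    finally:
        # Always restore the real stdin and argv
        sys.stdin, sys.argv = old_stdin, old_argv


lines = echo_stdin("hello\npython\n")
print(lines)  # ['hello\n', 'python\n']
```

Each line keeps its trailing newline, which is why the demo script prints with end=''.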
Two, open a single file
To open a single file, pass its name in the files parameter
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import fileinput
with fileinput.input(files=('a.txt',)) as file:
    for line in file:
        print(f'{fileinput.filename()} line {fileinput.lineno()}: {line}', end='')
where a.txt contains
hello
world
After execution, the output is as follows
$ python demo.py
a.txt line 1: hello
a.txt line 2: world
Note that fileinput.input() opens files with mode='r' by default. If your files are binary, use mode='rb'. fileinput supports only these two read modes.
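A small sketch of the two modes, using a throwaway file in a temporary directory (the file name data.bin is only for illustration). With mode='rb' the lines come back as bytes, with the default mode as str, and any other mode is rejected with a ValueError.

```python
import fileinput
import os
import tempfile

mode_error = None
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"hello\nworld\n")

    # Binary mode: every line is a bytes object
    with fileinput.input(files=(path,), mode="rb") as fh:
        binary_lines = list(fh)

    # Default text mode: every line is a str
    with fileinput.input(files=(path,)) as fh:
        text_lines = list(fh)

    # Write modes are not accepted by fileinput
    try:
        fileinput.input(files=(path,), mode="w")
    except ValueError as exc:
        mode_error = str(exc)

print(binary_lines)  # [b'hello\n', b'world\n']
print(text_lines)    # ['hello\n', 'world\n']
print(mode_error)
```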
Three, open multiple files in a batch
As the example above shows, I passed a files parameter to fileinput.input. It accepts a list or tuple of file names: pass one name to read a single file, pass several to read them all, one after another.
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import fileinput
with fileinput.input(files=('a.txt', 'b.txt')) as file:
    for line in file:
        print(f'{fileinput.filename()} line {fileinput.lineno()}: {line}', end='')
The contents of a.txt and b.txt are
$ cat a.txt
hello
world
$ cat b.txt
hello
python
Running it gives the output below. Because the contents of a.txt and b.txt are merged into one file object, file, fileinput.lineno() counts cumulatively across files. It matches the real line number in the original file only when a single file is read.
$ python demo.py
a.txt line 1: hello
a.txt line 2: world
b.txt line 3: hello
b.txt line 4: python
If you want the real line number within each original file when reading multiple files, use the fileinput.filelineno() function
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import fileinput
with fileinput.input(files=('a.txt', 'b.txt')) as file:
    for line in file:
        print(f'{fileinput.filename()} line {fileinput.filelineno()}: {line}', end='')
After running, the output is as follows
$ python demo.py
a.txt line 1: hello
a.txt line 2: world
b.txt line 1: hello
b.txt line 2: python
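To see both counters side by side, the sketch below records lineno() and filelineno() for every line. It recreates a.txt and b.txt in a temporary directory, and it calls the methods on the object returned by fileinput.input() rather than the module-level functions. That is a stylistic choice made here, not something the examples above require.

```python
import fileinput
import os
import tempfile

rows = []
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.txt")
    b = os.path.join(tmp, "b.txt")
    with open(a, "w") as f:
        f.write("hello\nworld\n")
    with open(b, "w") as f:
        f.write("hello\npython\n")

    with fileinput.input(files=(a, b)) as fh:
        for line in fh:
            rows.append((
                os.path.basename(fh.filename()),
                fh.lineno(),      # cumulative across all files
                fh.filelineno(),  # restarts at 1 for each file
                line.rstrip("\n"),
            ))

for row in rows:
    print(row)
# ('a.txt', 1, 1, 'hello')
# ('a.txt', 2, 2, 'world')
# ('b.txt', 3, 1, 'hello')
# ('b.txt', 4, 2, 'python')
```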
This usage pairs perfectly with the glob module
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import fileinput
import glob
for line in fileinput.input(glob.glob("*.txt")):
    if fileinput.isfirstline():
        print('-'*20, f'Reading {fileinput.filename()}...', '-'*20)
    print(str(fileinput.lineno()) + ': ' + line.upper(), end="")
Running it gives the following (glob.glob() does not guarantee any particular order, which is why b.txt is read before a.txt here)
$ python demo.py
-------------------- Reading b.txt... --------------------
1: HELLO
2: PYTHON
-------------------- Reading a.txt... --------------------
3: HELLO
4: WORLD
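If a stable order matters, one option is to sort the glob result before handing it to fileinput. The sketch below does that in a temporary directory. The helper name read_sorted is hypothetical, and the early return guards against an empty match, because an empty file list makes fileinput read from stdin instead.

```python
import fileinput
import glob
import os
import tempfile


def read_sorted(pattern):
    """Read all files matching `pattern` in sorted order, upper-casing each line."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        return []  # an empty file list would make fileinput read from stdin
    result = []
    with fileinput.input(files=paths) as fh:
        for line in fh:
            name = os.path.basename(fh.filename())
            result.append((name, fh.lineno(), line.rstrip("\n").upper()))
    return result


with tempfile.TemporaryDirectory() as tmp:
    # Create b.txt first to show that creation order does not matter
    for name, text in (("b.txt", "hello\npython\n"), ("a.txt", "hello\nworld\n")):
        with open(os.path.join(tmp, name), "w") as f:
            f.write(text)
    sorted_rows = read_sorted(os.path.join(tmp, "*.txt"))
    no_match = read_sorted(os.path.join(tmp, "*.csv"))

print(sorted_rows)
```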
Four, back up files while reading
fileinput.input has a backup parameter that specifies the suffix of the backup file, for example .bak. The backup is only made when inplace=True is also set (see the next section). In a plain read-only pass nothing is written, so no backup is created.
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import fileinput
with fileinput.input(files=("a.txt",), inplace=True, backup=".bak") as file:
    for line in file:
        # With inplace=True, print() writes into a.txt; echo each line back unchanged
        print(line, end='')
After running, there is an extra a.txt.bak file that holds the original content
$ ls a.txt*
a.txt
$ python demo.py
$ ls a.txt*
a.txt a.txt.bak
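The sketch below checks when a backup file actually appears. As far as the standard library implementation goes, the backup suffix is only consulted together with inplace=True, so the first (read-only) pass is expected to leave no a.txt.bak behind, while the second pass does. The file lives in a temporary directory.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.txt")
    with open(path, "w") as f:
        f.write("hello\nworld\n")

    # Read-only pass: backup is given, but inplace is not
    with fileinput.input(files=(path,), backup=".bak") as fh:
        for _ in fh:
            pass
    backup_without_inplace = os.path.exists(path + ".bak")

    # In-place pass: the original is kept as a.txt.bak, a.txt is rewritten
    with fileinput.input(files=(path,), inplace=True, backup=".bak") as fh:
        for line in fh:
            print(line.upper(), end="")  # goes into a.txt, not the terminal
    backup_with_inplace = os.path.exists(path + ".bak")

    with open(path) as f:
        current = f.read()
    with open(path + ".bak") as f:
        original = f.read()

print(backup_without_inplace, backup_with_inplace)  # False True
```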
Five, redirect standard output back into the file
fileinput.input has an inplace parameter that controls whether standard output is written back to the file being read. It defaults to False, so by default nothing is replaced.
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import fileinput
with fileinput.input(files=("a.txt",), inplace=True) as file:
    print("[INFO] task is started...")
    for line in file:
        print(f'{fileinput.filename()} line {fileinput.lineno()}: {line}', end='')
    print("[INFO] task is closed...")
After running, you will find that everything printed inside the for loop is written back to the original file, while the prints outside the for loop still go to the terminal.
$ cat a.txt
hello
world
$ python demo.py
[INFO] task is started...
[INFO] task is closed...
$ cat a.txt
a.txt line 1: hello
a.txt line 2: world
Using this mechanism, text replacement can be easily achieved.
#!/usr/bin/env python
#-*- coding:utf-8 -*-
#name: demo.py
import sys
import fileinput
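for line in fileinput.input(files=('a.txt', ), inplace=True):
    # Convert a Windows/DOS text file to Linux (LF) line endings.
    # Note: in Python 3 text mode "\r\n" is normally already read as "\n",
    # so this branch mostly acts as a safeguard.
    if line[-2:] == "\r\n":
        line = line[:-2] + "\n"
    sys.stdout.write(line)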
Appendix: to convert a file between DOS and UNIX formats for testing, enter the following commands in vim
DOS to UNIX: :setfileformat=unix
UNIX to DOS: :setfileformat=dos
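The same in-place mechanism covers ordinary search and replace. Here is a minimal sketch, assuming a helper named replace_in_file (a made-up name) and a temporary copy of a.txt.

```python
import fileinput
import os
import tempfile


def replace_in_file(path, old, new):
    """Rewrite `path` in place, replacing every occurrence of `old` with `new`."""
    with fileinput.input(files=(path,), inplace=True) as fh:
        for line in fh:
            # While iterating, stdout points at the file, so print() writes into it
            print(line.replace(old, new), end="")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.txt")
    with open(path, "w") as f:
        f.write("hello\nworld\n")
    replace_in_file(path, "hello", "goodbye")
    with open(path) as f:
        replaced = f.read()

print(replaced, end="")
# goodbye
# world
```

Every line has to be printed back, changed or not. A line that is not written inside the loop disappears from the file.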
Six, common methods
If you only want to use fileinput as an alternative to open for reading files, the material above is enough.
- fileinput.filename(): Returns the name of the file currently being read. Before the first line has been read, returns None.
- fileinput.fileno(): Returns the integer "file descriptor" of the current file. When no file is open (before the first line and between files), returns -1.
- fileinput.lineno(): Returns the cumulative line number of the line that has just been read. Before the first line has been read, returns 0. After the last line of the last file has been read, returns the line number of that line.
- fileinput.filelineno(): Returns the line number within the current file. Before the first line has been read, returns 0. After the last line of the last file has been read, returns the line number of that line within its file.
But if you want to build more complex logic on top of fileinput, you may need the following functions
- fileinput.isfirstline(): Returns True if the line just read is the first line of its file, otherwise returns False.
- fileinput.isstdin(): Returns True if the last line was read from sys.stdin, otherwise returns False.
- fileinput.nextfile(): Closes the current file so that the next iteration reads the first line of the next file (if any). Lines not read from the closed file do not count toward the cumulative line count. The file name does not change until the first line of the next file has been read. Before the first line has been read, this function has no effect, so it cannot be used to skip the first file. After the last line of the last file has been read, it has no effect either.
- fileinput.close(): Closes the sequence.
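As an illustration of nextfile(), the sketch below keeps only the first n lines of every file, roughly like the head command. The function name head and the temporary files are assumptions made for this example.

```python
import fileinput
import os
import tempfile


def head(paths, n=1):
    """Return (file name, line number in file, text) for the first n lines of each file."""
    result = []
    with fileinput.input(files=paths) as fh:
        for line in fh:
            # Record first: filename() stays valid until the next file's first line is read
            result.append((os.path.basename(fh.filename()),
                           fh.filelineno(),
                           line.rstrip("\n")))
            if fh.filelineno() >= n:
                fh.nextfile()  # skip the rest of the current file
    return result


with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.txt")
    b = os.path.join(tmp, "b.txt")
    with open(a, "w") as f:
        f.write("hello\nworld\n")
    with open(b, "w") as f:
        f.write("hello\npython\n")
    first_lines = head([a, b], n=1)

print(first_lines)  # [('a.txt', 1, 'hello'), ('b.txt', 1, 'hello')]
```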
Seven, advanced methods
fileinput.input() has an openhook parameter, which lets you pass in your own function for opening files.
If you do not pass in any hook, fileinput uses the open function by default.
fileinput ships with two built-in hooks for you to use
1. fileinput.hook_compressed(filename, mode)
Transparently opens gzip and bzip2 compressed files (recognized by the extensions '.gz' and '.bz2') using the gzip and bz2 modules. If the file name extension is not '.gz' or '.bz2', the file is opened normally (i.e. using open() without any decompression). Example of use: fi = fileinput.FileInput(openhook=fileinput.hook_compressed)
2. fileinput.hook_encoded(encoding, errors=None)
Returns a hook that opens each file with open(), using the given encoding and errors to read the file. Example of use: fi = fileinput.FileInput(openhook=fileinput.hook_encoded("utf-8", "surrogateescape"))
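A short sketch of both built-in hooks against throwaway files. The compressed example reads in mode='rb' and decodes by hand, which is a deliberate choice here: how hook_compressed handles text mode has differed between Python versions, while binary mode behaves the same everywhere. The file names are made up.

```python
import fileinput
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "a.txt.gz")
    plain_path = os.path.join(tmp, "b.txt")
    with gzip.open(gz_path, "wb") as f:
        f.write(b"hello\nworld\n")
    with open(plain_path, "wb") as f:
        f.write("h\u00e9llo\npython\n".encode("utf-8"))

    # hook_compressed: .gz is decompressed on the fly, b.txt is opened normally
    with fileinput.input(files=(gz_path, plain_path), mode="rb",
                         openhook=fileinput.hook_compressed) as fh:
        mixed = [raw.decode("utf-8") for raw in fh]

    # hook_encoded: decode as UTF-8 regardless of the platform default encoding
    with fileinput.input(files=(plain_path,),
                         openhook=fileinput.hook_encoded("utf-8")) as fh:
        decoded = list(fh)

print(mixed)
print(decoded)
```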
If your use case is more specialized and neither of the two hooks above meets your needs, you can also write your own.
3. Custom hooks
If I want to use fileinput to read files on the network, I can define hooks like this.
① First use requests to download the file to the local disk
② Then use open to read it
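def online_open(url, mode):
    import requests
    # Download the remote file and save it under its base name
    r = requests.get(url)
    filename = url.split("/")[-1]
    with open(filename, 'w', encoding='utf-8') as f1:
        f1.write(r.content.decode("utf-8"))
    # Reopen the local copy for reading and hand it to fileinput
    f2 = open(filename, 'r', encoding='utf-8')
    return f2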
③ Pass this function directly to openhook
import fileinput
file_url = 'https://www.csdn.net/robots.txt'
with fileinput.input(files=(file_url,), openhook=online_open) as file:
    for line in file:
        print(line, end="")
④ After running, it prints CSDN's robots file as expected
User-agent: *
Disallow: /scripts
Disallow: /public
Disallow: /css/
Disallow: /images/
Disallow: /content/
Disallow: /ui/
Disallow: /js/
Disallow: /scripts/
Disallow: /article_preview.html*
Disallow: /tag/
Disallow: /*?*
Disallow: /link/
Sitemap: https://www.csdn.net/sitemap-aggpage-index.xml
Sitemap: https://www.csdn.net/article/sitemap.txt
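A hook only has to accept (filename, mode) and return a readable file-like object, so the local file is not strictly necessary. The sketch below illustrates the contract without any network access: a hypothetical dictionary FAKE_REMOTE stands in for the downloaded content, and the hook wraps the text in an in-memory stream. The names and contents are invented for the example.

```python
import fileinput
import io

# Hypothetical stand-in for remote content; names and text are made up
FAKE_REMOTE = {
    "mem://a.txt": "hello\nworld\n",
    "mem://b.txt": "hello\npython\n",
}


def memory_open(name, mode):
    """Custom openhook: return an in-memory stream instead of a file on disk."""
    text = FAKE_REMOTE[name]
    if "b" in mode:
        return io.BytesIO(text.encode("utf-8"))
    return io.StringIO(text)


with fileinput.input(files=tuple(FAKE_REMOTE), openhook=memory_open) as fh:
    seen = [(fh.filename(), fh.filelineno(), line.rstrip("\n")) for line in fh]

for item in seen:
    print(item)
```

In a real downloader the same idea would mean returning io.StringIO(r.text) from the hook instead of writing and reopening a file. Whether that is preferable depends on how large the downloads are.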
Eight, cases
Case 1: Read all lines of a file
#!/usr/bin/env python
#-*- coding:utf-8 -*-
import fileinput
for line in fileinput.input('data.txt'):
    print(line, end="")
Case 2: Read all lines of multiple files
#!/usr/bin/env python
#-*- coding:utf-8 -*-
import fileinput
import glob
for line in fileinput.input(glob.glob("*.txt")):
    if fileinput.isfirstline():
        print('-'*20, f'Reading {fileinput.filename()}...', '-'*20)
    print(str(fileinput.lineno()) + ': ' + line.upper(), end="")
Case 3: Use fileinput to convert CRLF files to LF
#!/usr/bin/env python
#-*- coding:utf-8 -*-
import sys
import fileinput
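for line in fileinput.input(files=('a.txt', ), inplace=True):
    # Convert a Windows/DOS text file to Linux (LF) line endings.
    # Note: in Python 3 text mode "\r\n" is normally already read as "\n",
    # so this branch mostly acts as a safeguard.
    if line[-2:] == "\r\n":
        line = line[:-2] + "\n"
    sys.stdout.write(line)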
Case 4: Use re for log analysis: keep every line that contains a timestamp
#--sample file--: error.log
aaa
1970-01-01 13:45:30 Error: **** Due to System Disk spacke not enough...
bbb
1970-01-02 10:20:30 Error: **** Due to System Out of Memory...
ccc
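#---test script---
#!/usr/bin/env python
#-*- coding:utf-8 -*-
import re
import fileinput
import sys
# Raw string, so the backslashes reach the regex engine untouched
pattern = r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'
for line in fileinput.input('error.log', backup='.bak', inplace=True):
    # Only matching lines are written back; all other lines are dropped
    if re.search(pattern, line):
        sys.stdout.write("=> ")
        sys.stdout.write(line)
#---test result (contents of error.log after the run; the original is kept as error.log.bak)---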
=> 1970-01-01 13:45:30 Error: **** Due to System Disk spacke not enough...
=> 1970-01-02 10:20:30 Error: **** Due to System Out of Memory...
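When the log itself should stay untouched, the same filter can run as a read-only pass that collects the matches instead of rewriting the file. A minimal sketch, using a shortened, made-up sample log and a hypothetical helper named dated_lines:

```python
import fileinput
import os
import re
import tempfile

TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

# Shortened sample data, invented for this illustration
SAMPLE_LOG = (
    "aaa\n"
    "1970-01-01 13:45:30 Error: disk full\n"
    "bbb\n"
    "1970-01-02 10:20:30 Error: out of memory\n"
    "ccc\n"
)


def dated_lines(path):
    """Return (line number, text) for every line that contains a timestamp."""
    with fileinput.input(files=(path,)) as fh:
        return [(fh.filelineno(), line.rstrip("\n"))
                for line in fh if TIMESTAMP.search(line)]


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "error.log")
    with open(log_path, "w") as f:
        f.write(SAMPLE_LOG)
    matches = dated_lines(log_path)
    with open(log_path) as f:
        untouched = f.read() == SAMPLE_LOG

print(matches)
```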
Case 5: Use fileinput to implement something similar to grep
#!/usr/bin/env python
#-*- coding:utf-8 -*-
import sys
import re
import fileinput
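pattern = re.compile(sys.argv[1])
# sys.argv[2:] holds every file name the shell expanded from *.py
for line in fileinput.input(sys.argv[2:]):
    # search() matches anywhere in the line, like grep; match() only matches at the start
    if pattern.search(line):
        print(fileinput.filename(), fileinput.filelineno(), line, end='')
$ ./demo.py 'import.*re' *.py
# Find the lines containing "import re" in all .py files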
addressBook.py 2 import re
addressBook1.py 10 import re
addressBook2.py 18 import re
test.py 238 import re
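The same idea as a reusable function, sketched under the assumption that returning a list of hits is more convenient than printing. The function name grep and the two sample files are made up for the example.

```python
import fileinput
import os
import re
import tempfile


def grep(pattern, paths):
    """Return (file name, line number in file, text) for every matching line."""
    if not paths:
        return []  # an empty file list would make fileinput read from stdin
    regex = re.compile(pattern)
    hits = []
    with fileinput.input(files=paths) as fh:
        for line in fh:
            if regex.search(line):
                hits.append((os.path.basename(fh.filename()),
                             fh.filelineno(),
                             line.rstrip("\n")))
    return hits


with tempfile.TemporaryDirectory() as tmp:
    one = os.path.join(tmp, "one.py")
    two = os.path.join(tmp, "two.py")
    with open(one, "w") as f:
        f.write("import os\nimport re\n")
    with open(two, "w") as f:
        f.write("x = 1\ny = 2\nimport re  # regex\n")
    hits = grep(r"import.*re", [one, two])

for hit in hits:
    print(*hit)
```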