Python: reading, writing, and modifying files

Files can be opened in three basic ways: for reading, for writing, and for appending.

First, read mode r and read-write mode r+

1, read mode r

Characteristics of read mode r: (1) the file can only be read, not written; (2) an error is raised if the file does not exist.

(1) Example: read the file books.txt from the current directory.

   

Explanation:

a, Open the file with open(). Python 3 has only open(); Python 2 offered both open() and file(). Close the file with close(). As a rule, every open should be paired with a close.

b, If the file is in the current directory, the file name alone is enough; otherwise the path must be included.

c, If 'r' is left out, as in f = open('books.txt'), the file is still opened in read mode, because read is the default.

d, read() returns the entire contents of the file.

e, Sometimes the encoding has to be specified as well, in this form:

f = open('books','r',encoding='utf-8')
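Points a to e can be put together in a minimal sketch. The sample file and its contents are made up for illustration, and the file is written to a temporary directory so the example runs anywhere.

```python
import os
import tempfile

# Assumption: this sample file stands in for the article's books.txt.
path = os.path.join(tempfile.mkdtemp(), 'books.txt')
f = open(path, 'w', encoding='utf-8')
f.write('三体\n水浒传\n')
f.close()

# 'r' is the default mode, so open(path) alone opens the file for reading.
f = open(path, encoding='utf-8')
content = f.read()  # the whole file as one string
f.close()           # pair every open with a close

print(content)
```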

(2)

read: reads the entire contents of the file

readline: reads one line

readlines: reads every line and returns the lines in a list
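The difference between the three calls is easiest to see side by side. The sketch below uses a made-up three-line file; note that readlines() returns only the lines that have not been read yet.

```python
import os
import tempfile

# Assumption: a small sample file created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'books.txt')
f = open(path, 'w', encoding='utf-8')
f.write('first\nsecond\nthird\n')
f.close()

f = open(path, encoding='utf-8')
whole = f.read()         # the entire file as one string
f.close()

f = open(path, encoding='utf-8')
one_line = f.readline()  # one line, with its trailing newline kept
rest = f.readlines()     # the remaining lines, as a list of strings
f.close()

print(repr(whole))
print(repr(one_line))
print(rest)
```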

  

  

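2, read-write mode r+

Characteristics of read-write mode r+: (1) an error is raised if the file does not exist; (2) the file can be both read and written. Writing overwrites existing content in place, and because the file pointer starts at the beginning, a write made right after opening replaces the first characters of the file.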

  

For example, writing '水浒传' (Water Margin) right after opening overwrites the first three characters of the file, '平凡的'.
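A small sketch of the overwrite behaviour of r+, using made-up ASCII content so the effect is easy to follow. It also checks that r+ refuses to open a file that does not exist.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'books.txt')  # sample file for illustration only
f = open(path, 'w', encoding='utf-8')
f.write('abcdef\n')
f.close()

f = open(path, 'r+', encoding='utf-8')
f.write('XYZ')  # the pointer starts at 0, so this replaces 'abc'
f.close()

f = open(path, encoding='utf-8')
result = f.read()
f.close()

missing_raises = False
try:
    open(os.path.join(workdir, 'no_such_file.txt'), 'r+')
except FileNotFoundError:
    missing_raises = True  # r+ never creates the file

print(repr(result), missing_raises)
```

The overwrite works on the encoded bytes, so replacing text with text whose encoded length differs (for example, ASCII over Chinese characters in UTF-8) can leave a damaged character behind.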

Second, write mode w and write-read mode w+

1, write mode w

Characteristics of write mode w: (1) the file can only be written, not read; (2) the original contents are cleared as soon as the file is opened; (3) if the file does not exist, a new file is created.

For example, when '水浒传' (Water Margin) is written in this mode, everything the file held before is gone.

   

f.flush(): Sometimes, after calling f.write(), the data has not yet appeared in the file. This is because the content sits in a buffer and is written out only when the buffer fills up or the file is closed. Calling f.flush() forces the buffered data to be written out immediately.
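The following sketch shows both effects on a made-up file: opening in w mode empties the file straight away, and flush() makes written data visible before the file is closed. The file sizes are checked with os.path.getsize.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'books.txt')  # sample file for illustration
f = open(path, 'w', encoding='utf-8')
f.write('old content\n')
f.close()

f = open(path, 'w', encoding='utf-8')     # opening in 'w' already empties the file
size_after_open = os.path.getsize(path)
f.write('new')
f.flush()                                 # push the buffered data out now
size_after_flush = os.path.getsize(path)  # the 3 bytes of 'new' are in the file
f.close()

print(size_after_open, size_after_flush)
```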

2, write-read mode w+

Characteristics of write-read mode w+: (1) the file can be both written and read; (2) the original contents are cleared as soon as the file is opened; (3) if the file does not exist, a new file is created.
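A short sketch of w+ on a made-up file. After writing, the file pointer sits at the end, so the example moves it back with seek(0) before reading.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'books.txt')  # sample file for illustration
f = open(path, 'w', encoding='utf-8')
f.write('previous content, about to be lost\n')
f.close()

f = open(path, 'w+', encoding='utf-8')  # the previous content is cleared here
f.write('hello\n')
f.seek(0)                               # go back to the start before reading
text = f.read()
f.close()

print(repr(text))
```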

Third, append mode a and append-read mode a+

1, append mode a

Characteristics of append mode a: (1) the file cannot be read; (2) the file can be written, and every write is appended after the end of the existing content; (3) if the file does not exist, a new file is created.

  

For example, writing '水浒传' (Water Margin) in this mode adds it after the end of the original content.
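A sketch of append mode on a made-up file: the new text lands after the old content, and an attempt to read from the same file object fails with io.UnsupportedOperation.

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'books.txt')  # sample file for illustration
f = open(path, 'w', encoding='utf-8')
f.write('old\n')
f.close()

f = open(path, 'a', encoding='utf-8')
f.write('水浒传\n')     # appended after the existing content
read_failed = False
try:
    f.read()            # mode 'a' is write-only
except io.UnsupportedOperation:
    read_failed = True
f.close()

f = open(path, encoding='utf-8')
result = f.read()
f.close()

print(repr(result), read_failed)
```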

2, append-read mode a+

Characteristics of append-read mode a+: (1) the file can be both read and written; (2) writes are appended after the end of the existing content; (3) if the file does not exist, a new file is created. The file pointer starts at the end of the file, so call f.seek(0) before reading from the beginning.
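A sketch of a+ on a made-up file. Reading immediately after opening returns an empty string because the pointer starts at the end; after seek(0) the whole file, including the appended text, can be read.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'books.txt')  # sample file for illustration
f = open(path, 'w', encoding='utf-8')
f.write('old\n')
f.close()

f = open(path, 'a+', encoding='utf-8')
at_open = f.read()     # empty: the pointer starts at the end of the file
f.write('new\n')       # appended after 'old\n'
f.seek(0)              # move back to the start for reading
everything = f.read()
f.close()

print(repr(at_open), repr(everything))
```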

These modes can be summarized in the following table:

Mode   Readable   Writable                        If the file does not exist
r      Yes        No                              Error
r+     Yes        Yes, overwrites in place        Error
w      No         Yes, clears the file first      Creates a new file
w+     Yes        Yes, clears the file first      Creates a new file
a      No         Yes, appends                    Creates a new file
a+     Yes        Yes, appends                    Creates a new file
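The table can be checked with a short script. This is only a sketch: it opens an empty scratch file in each mode and asks the file object whether it is readable and writable, then tries each mode on a file name that does not exist. The file names are arbitrary.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
existing = os.path.join(workdir, 'existing.txt')
open(existing, 'w').close()  # an empty file to open in every mode

modes = ['r', 'r+', 'w', 'w+', 'a', 'a+']
capabilities = {}  # mode -> (readable, writable)
on_missing = {}    # mode -> 'error' or 'created'

for index, mode in enumerate(modes):
    f = open(existing, mode)
    capabilities[mode] = (f.readable(), f.writable())
    f.close()

    missing = os.path.join(workdir, 'missing_%d.txt' % index)
    try:
        open(missing, mode).close()
        on_missing[mode] = 'created'
    except FileNotFoundError:
        on_missing[mode] = 'error'

for mode in modes:
    print(mode, capabilities[mode], on_missing[mode])
```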

Fourth, the file pointer

The file pointer records the current position in the file, that is, where the next read or write will take place.

The file pointer matters in practice. Suppose read() is called on books.txt and readline() is then called on the same file object: read() returns the entire contents, but readline() returns nothing. The reason is that read() leaves the file pointer at the end of the file, and readline() starts reading from that position, where no content remains. So the position of the file pointer sometimes has to be adjusted.

   

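seek() moves the file pointer. In append mode the move affects only reads; writes still go to the end of the file.

In addition, the num in seek(num) is an offset into the file, not a line number. For text files, the only offsets that are safe to pass are 0 and values returned by f.tell(); with a multi-byte encoding such as UTF-8, an arbitrary number may land in the middle of a character.

Adding f.seek(0) between the read() and readline() calls moves the pointer back to the beginning of the file, so readline() reads from the start.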
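The behaviour described in this section can be reproduced with a few lines. The two-line sample file is made up for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'books.txt')  # sample file for illustration
f = open(path, 'w', encoding='utf-8')
f.write('first\nsecond\n')
f.close()

f = open(path, encoding='utf-8')
all_text = f.read()        # leaves the pointer at the end of the file
after_read = f.readline()  # nothing left to read, so this is ''
f.seek(0)                  # move the pointer back to the beginning
first_line = f.readline()  # now the first line is available again
f.close()

print(repr(all_text), repr(after_read), repr(first_line))
```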

  

Fifth, closing the file automatically

The with statement closes the file automatically when its block exits. Usage:

with open('books.txt','a+') as f:
    f.write('\n三体')
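To confirm that the file really is closed, the closed attribute of the file object can be inspected inside and after the with block. This sketch writes to a scratch file in a temporary directory rather than to books.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'books.txt')  # scratch file for illustration

with open(path, 'a+', encoding='utf-8') as f:
    f.write('\n三体')
    inside = f.closed   # still open inside the block

outside = f.closed      # closed automatically once the block has exited

print(inside, outside)
```

The file is closed even if the block raises an exception, which is the main reason to prefer with over a manual close().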

Sixth, file modification

1, the simple, brute-force approach: modify the file directly

The simplest and crudest way to modify a file takes these steps:

(1) open the file and read its contents;

(2) modify the contents;

(3) clear the original contents of the file;

(4) write the new contents back.

This method is very simple. Consider a small example: the file username stores names and passwords, and we want to add 'A班_' ("Class A_") in front of every name.

Because 'A班_' contains Chinese characters, encoding='utf-8' must be passed to open(); otherwise the text may come out garbled.
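The four steps can be sketched as follows. The layout of the username file is an assumption (one "name,password" pair per line, with made-up entries), since the original sample is not available. Steps (3) and (4) rely on the fact that opening a file in w mode clears it.

```python
import os
import tempfile

# Assumption: one "name,password" pair per line; the entries are invented.
path = os.path.join(tempfile.mkdtemp(), 'username')
with open(path, 'w', encoding='utf-8') as f:
    f.write('alice,123456\nbob,abcdef\n')

# (1) open the file and read its contents
with open(path, encoding='utf-8') as f:
    lines = f.readlines()

# (2) modify the contents: the name is at the start of each line
new_lines = ['A班_' + line for line in lines]

# (3) and (4): opening in 'w' clears the file, then the new contents are written
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(new_lines)

with open(path, encoding='utf-8') as f:
    result = f.read()
print(result)
```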

   

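2, the backup-file method

When the file is large, the method above reads the entire content in one go and writes the new content in one go, which takes a long time and uses a lot of memory.

The backup-file method creates a second file and writes each line to it as soon as that line has been modified. The steps are:

(1) open two files, the original file a and the backup file b (for example a.txt and b.txt.bak), and write each modified line of a into b;

(2) delete file a and rename file b to the name of file a.

Example: in the file words, replace "花" (the Chinese word for flower) with "flower".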
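A minimal sketch of the backup-file method for this example. The contents of words are made up, and the backup name words.bak is an arbitrary choice. The file is processed one line at a time, so only a single line is held in memory.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'words')  # original file (sample contents are invented)
bak = src + '.bak'                    # backup file that receives the modified lines
with open(src, 'w', encoding='utf-8') as f:
    f.write('一朵花\n两朵花\n')

# (1) open both files and copy line by line, modifying each line on the way
with open(src, encoding='utf-8') as fin, open(bak, 'w', encoding='utf-8') as fout:
    for line in fin:
        fout.write(line.replace('花', 'flower'))

# (2) delete the original and give the backup the original's name
os.remove(src)
os.rename(bak, src)

with open(src, encoding='utf-8') as f:
    result = f.read()
print(result)
```

The two calls in step (2) could also be replaced by a single os.replace(bak, src), which overwrites the destination in one step.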

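Seventh, short exercises

1, Generate phone numbers that share the prefix 1861253 and end in four random digits, and write them to a file.

Analysis: (1) first generate random numbers of up to four digits and pad them with leading zeros when they have fewer than four digits; zfill() pads a string with zeros;

(2) the numbers have to be written to a file, so use w or a mode. Every open needs a matching close.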

import random
f = open('phones.txt','w')
num = int(input('How many phone numbers should be generated: '))
for i in range(num):
    start = '1861253'
    random_num = str(random.randint(0,9999))  # start at 0 so that 0000 is possible
    new_num = random_num.zfill(4)  # pad to 4 digits with leading zeros; zfill works on strings only
    phone_num = start + new_num
    f.write(phone_num+'\n')
f.close()

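2, Monitor a server log: if an IP appears more than 50 times, add it to a blacklist.

Analysis: (1) First, extract every IP from the log. In this log format each line starts with the IP, so the data can be read line by line and split on whitespace; the first element of each line is then the IP. (2) Next, count how many times each IP appears; the most direct tool is count(). (3) Find the IPs that appear more than 50 times and print them.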

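import time

point = 0  # position of the file pointer after the previous pass
while True:
    all_ips = []
    with open(r'D:\access.log', encoding='utf-8') as f:
        f.seek(point)  # resume reading where the previous pass stopped
        for line in f:
            if not line.strip():
                continue  # skip blank lines, which have no IP field
            ip = line.split()[0]  # the first element of each line is the IP
            all_ips.append(ip)  # keep every occurrence, duplicates included
        point = f.tell()  # remember the current position for the next pass
    ips_set = set(all_ips)  # unique IPs
    for i in ips_set:
        if all_ips.count(i) > 50:
            print('IP that should be blacklisted: %s\n' % i)
    time.sleep(60)  # wait 60 s before the next pass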

When run, the script prints one line for each IP that appears more than 50 times within a single pass.
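Calling count() once per distinct IP rescans the whole list each time. As an alternative sketch, collections.Counter tallies every IP in a single pass. The helper name find_frequent_ips and the sample log lines are hypothetical and only serve the illustration.

```python
from collections import Counter


def find_frequent_ips(lines, threshold=50):
    """Return the IPs that appear more than `threshold` times, sorted."""
    # The first whitespace-separated field of each non-blank line is the IP.
    counts = Counter(line.split()[0] for line in lines if line.strip())
    return sorted(ip for ip, n in counts.items() if n > threshold)


# Made-up log lines: one IP appears 51 times, the other exactly 50 times.
sample = ['10.0.0.1 - - "GET /"\n'] * 51 + ['10.0.0.2 - - "GET /"\n'] * 50
frequent = find_frequent_ips(sample)
print(frequent)
```

Only 10.0.0.1 is reported, because the condition is "more than 50", and 10.0.0.2 appears exactly 50 times.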

 

 



Origin www.cnblogs.com/guohu/p/11308023.html