Python: character counts and byte counts in file operations

When I first started learning Python file operations, I ran into a very strange problem when writing to files. It eventually turned out that the cause was byte counts.

Let's look at an example.

Create a new file named 0117utf-8.txt and save it as UTF-8; remember to remove the BOM header.

Now let's try modifying it in r+ mode:

>>> fp=open("d:\\pydelete\\0117utf-8.txt","r+")
>>> fp.write("aa")
2
>>> fp.close()
>>>

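The result is a big pile of garbage characters.

This is where byte counts come in.

In UTF-8, a Chinese character is stored in three bytes, while an English letter is stored in one byte. Writing "aa" replaces only two of the three bytes of the first Chinese character in the file, so the leftover byte is no longer valid UTF-8.

If you write "aaa" instead, exactly three bytes are replaced and the file stays readable.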
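To make the byte arithmetic concrete, here is a minimal sketch that reproduces the effect in memory rather than on disk. The sample string "中文" is a hypothetical stand-in for the file's content; any text that starts with a Chinese character should behave the same way.

```python
# Hypothetical sample text standing in for the file content.
text = "中文"
data = text.encode("utf-8")

char_count = len(text)   # number of characters
byte_count = len(data)   # number of bytes in UTF-8

# Overwriting the first 2 bytes cuts the first 3-byte character apart.
damaged = b"aa" + data[2:]
damaged_text = damaged.decode("utf-8", errors="replace")

# Overwriting exactly 3 bytes replaces the whole character cleanly.
clean = b"aaa" + data[3:]
clean_text = clean.decode("utf-8")

print(char_count, byte_count)
print(repr(damaged_text))   # contains U+FFFD where the stray byte was
print(repr(clean_text))
```

The stray byte shows up as the replacement character U+FFFD here only because of errors="replace"; a strict decode would raise UnicodeDecodeError instead.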

 

So let's cut to the chase and look at which operations count bytes and which count characters.

1.fp.read(arg), fp.write(): the numbers are character counts

Let's read the file we just wrote and see.

>>> fp = open("d:\\pydelete\\0117utf-8.txt", "r+", encoding="utf-8")
>>> fp.read(1)
'a'
>>> fp.read(1)
'a'
>>> fp.read(1)
'a'
>>> fp.read(1)    # the argument is a number of characters: one character
'国'
>>> fp.read(1)
'家'
>>> fp.read(1)
''
>>>
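As a rough illustration of the same point, the sketch below contrasts read(1) in text mode with read(1) in binary mode. The temporary path and the sample text "a国" are assumptions made for this demo, not the file used above.

```python
import os
import tempfile

# Throwaway file; the path and the sample text are assumptions for this demo.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as fp:
    fp.write("a国")

# Text mode: read(1) returns one character, however many bytes it occupies.
with open(path, "r", encoding="utf-8") as fp:
    text_reads = [fp.read(1), fp.read(1), fp.read(1)]

# Binary mode: read(1) returns exactly one byte.
with open(path, "rb") as fp:
    first_byte = fp.read(1)
    rest = fp.read()

print(text_reads)
print(first_byte, len(rest))
```

In binary mode the Chinese character is only complete after three more bytes have been read, which is why the remaining length is 3.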

Let's write something.

>>> fp.seek(0, 0)    # move the cursor back to the start of the file
0
>>> fp.write("新写入2x")
5                            # this is a character count: five characters were written
>>> fp.flush()

 

Let's count. Before the write, the file content was "aaa国家", which is 9 bytes. The "新写入2x" we just wrote is 11 bytes. In r+ mode the write starts at the beginning of the file and overwrites what is there, so the file content is now "新写入2x".

If fewer bytes are written than the file already held, the result may be garbled (when the original file contains Chinese characters), because the leftover bytes can begin in the middle of a character.
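The sketch below replays both cases on a throwaway file: an overwrite that fully covers the old content, and a shorter one that leaves half a character behind. The temporary path, the helper name overwrite, and the short string "新x" are assumptions for illustration; the other sample strings are chosen to match the 9-byte and 11-byte counts quoted above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical throwaway path

def overwrite(original, new_text):
    """Write `original`, then overwrite from the start in r+ mode."""
    with open(path, "w", encoding="utf-8") as fp:
        fp.write(original)
    with open(path, "r+", encoding="utf-8") as fp:
        written = fp.write(new_text)   # return value is a character count
    with open(path, "rb") as fp:
        raw = fp.read()                # raw bytes now on disk
    return written, raw

# 9 bytes overwritten by 11 bytes: nothing of the old content is left.
written, raw = overwrite("aaa国家", "新写入2x")

# 9 bytes overwritten by only 4 bytes: the leftover bytes start mid-character.
short_written, short_raw = overwrite("aaa国家", "新x")
short_text = short_raw.decode("utf-8", errors="replace")

print(written, len(raw), raw.decode("utf-8"))
print(short_written, len(short_raw), repr(short_text))
```

Reading the bytes back in binary mode keeps the demo from failing on the damaged file; opening it as UTF-8 text with strict error handling would raise UnicodeDecodeError.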

 

 

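2.fp.tell(), fp.seek(0,0): the numbers are byte counts

(With a UTF-8 file these values are byte offsets in practice, although the Python documentation only describes the text-mode tell() value as an opaque number.)

Let's move the cursor back to the start of the file and read again.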

>>> fp.seek(0,0)
0
>>> fp.read(1)
'新'
>>> fp.tell()
3                           # tell returns a byte count, so the cursor is now just after 新
>>>

Based on this, let's try changing the cursor position.

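>>> fp.seek(0,0)
0
>>> fp.seek(6,0)    # seek's first argument is also a byte count. Where is the cursor now, and what will reading one character give? It should be just after 写, so reading one character should give "入"
6
>>> fp.read(1)
'入'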
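Here is a small sketch of the same experiment on a throwaway file. Instead of hard-coding the offset 6, it derives the byte offset from the encoded length of the characters to skip, which is one way to avoid landing in the middle of a character. The temporary path is an assumption, and the tell() value shown is what CPython gives for a UTF-8 file, not something the io documentation guarantees.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical throwaway path
with open(path, "w", encoding="utf-8") as fp:
    fp.write("新写入2x")

with open(path, "r", encoding="utf-8") as fp:
    first = fp.read(1)            # one character
    pos_after_first = fp.tell()   # byte offset just after that character

    # Byte offset of the third character = encoded length of the first two.
    offset = len("新写".encode("utf-8"))
    fp.seek(offset)
    after_seek = fp.read(1)

print(first, pos_after_first)
print(offset, after_seek)
```

Seeking to an offset that is not a character boundary (for example 4 here) would make the next read fail to decode, so computing offsets from encoded prefixes is the safer habit.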

 

3.fp.truncate(size)

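Truncates the file to the given size; by default it truncates at the current file position. If size is larger than the file, then depending on the system the file may be left unchanged, padded with zeros up to the requested size, or extended with some arbitrary content. The argument is also a byte count.

Let's try it: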

>>> fp.seek(0,0)
0
>>> fp.truncate(6)    
6
>>>

The file should now contain only the two characters "新写".
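A minimal sketch of the truncation on a throwaway file follows. The temporary path is an assumption, and the second truncate(4) call is an extra step added here to show what happens when the size is not a character boundary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical throwaway path
with open(path, "w", encoding="utf-8") as fp:
    fp.write("新写入2x")               # 11 bytes

# Cut the file down to 6 bytes, i.e. exactly two 3-byte characters.
with open(path, "r+", encoding="utf-8") as fp:
    new_size = fp.truncate(6)
size_on_disk = os.path.getsize(path)
with open(path, "r", encoding="utf-8") as fp:
    remaining = fp.read()

# Cutting at 4 bytes leaves one whole character plus one stray byte.
with open(path, "r+", encoding="utf-8") as fp:
    fp.truncate(4)
with open(path, "rb") as fp:
    broken = fp.read().decode("utf-8", errors="replace")

print(new_size, size_on_disk, remaining)
print(repr(broken))
```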

 

Knowing all this, we can predict exactly what will happen when we write to a file.

For more on byte counts, see:

https://www.cnblogs.com/King-Tong/p/11431561.html


Origin www.cnblogs.com/King-Tong/p/12205194.html