python-day10 (formal learning)

Character Encoding

Computer Basics

Any application that wants to operate the hardware must issue a system call to the operating system, and the operating system then operates the hardware on its behalf.

How a text editor accesses files

  1. Opening the editor starts a process, and that process lives in memory. Everything written in the editor is therefore also held in memory, and it is lost if the power fails.
  2. To save the content permanently, click the Save button: the editor flushes the data from memory to the hard disk.
  3. Writing a py file (without executing it) is no different from writing any other document: we are just writing a bunch of characters.

How the Python interpreter executes a py file

  • Stage one: the Python interpreter starts, which is equivalent to starting a text editor.
  • Stage two: the interpreter, acting like a text editor, opens test.py and reads its content from the hard disk into memory. (Small review: because Python is interpreted, the interpreter cares only about the content of the file, not about its file extension.)
  • Stage three: the interpreter interprets the test.py code that was just loaded into memory. (PS: only at this stage is the code actually executed and the Python syntax recognized. When execution reaches name = "egon", memory is allocated for the string "egon".)
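The three stages can be imitated by hand. The following is a minimal sketch, not how the interpreter is actually implemented: it assumes a throwaway file named test.txt (a hypothetical name, chosen to show that the extension does not matter), reads it as plain characters, and only then hands the text to the built-in compile() and exec().

```python
import os
import tempfile

# Hypothetical source code; it is saved with a ".txt" extension on purpose,
# because only the file content matters when the code is executed.
source_text = 'name = "egon"\nresult = name.upper()\n'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")
    with open(path, "wt", encoding="utf8") as f:
        f.write(source_text)

    # Stage two: read the file from disk into memory as ordinary characters.
    with open(path, "rt", encoding="utf8") as f:
        code_text = f.read()

# Stage three: only now is the text treated as Python syntax and executed.
namespace = {}
exec(compile(code_text, "test.txt", "exec"), namespace)
print(namespace["name"], namespace["result"])
```

Until exec() runs, code_text is just a string, exactly as it would be inside a text editor.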

Similarities and differences between the Python interpreter and a text editor

  • Similarity: the Python interpreter interprets the content of a file, so it must be able to read py files, just as a text editor does.
  • Difference: a text editor loads the file content into memory in order to display or edit it, and ignores Python syntax entirely. The Python interpreter loads the file content into memory not to show you what the code says, but to execute it, which means it must recognize Python syntax.

Introduction to character encoding

Computers recognize only 0 and 1, so for a computer to understand human characters it needs a character encoding. The encoding process is: character -> translation -> number.
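As a small illustration of "character -> translation -> number", the sketch below uses Python's built-in ord() and str.encode() to show the number behind a character and the bits that would actually be stored. The helper name to_bits is made up for this example.

```python
def to_bits(ch, encoding="utf-8"):
    """Return the bytes of one character as a string of 0s and 1s."""
    return " ".join(f"{byte:08b}" for byte in ch.encode(encoding))


for ch in ["A", "你"]:
    # ord() gives the Unicode code point; encode() gives the stored bytes.
    print(ch, ord(ch), ch.encode("utf-8"), to_bits(ch))
```

The letter A fits in one byte, while the Chinese character needs three bytes in utf-8.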

Character encoding classification

utf-8 (future trends)

gbk (China)

unicode (universal, covers the characters of all countries)

shift_jis (Japan)

euc-kr (Korea)

ascii (United States)
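The encodings listed above are all available as codecs in Python. The sketch below encodes one sample character with each of them; the sample characters are chosen only for illustration, and the codec names are the spellings Python accepts (for example euc_kr).

```python
# One illustrative character per encoding (the choice of characters is arbitrary).
samples = {
    "utf-8": "你",
    "gbk": "你",
    "shift_jis": "あ",
    "euc_kr": "한",
    "ascii": "A",
}

encoded = {name: ch.encode(name) for name, ch in samples.items()}
for name, data in encoded.items():
    print(f"{name:10} {samples[name]} -> {data!r} ({len(data)} bytes)")

# ascii has no code for a Chinese character, so strict encoding fails.
try:
    "你".encode("ascii")
    ascii_can_encode_chinese = True
except UnicodeEncodeError:
    ascii_can_encode_chinese = False
print("ascii can encode a Chinese character:", ascii_can_encode_chinese)
```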

Analysis of garbled text

First, two concepts to make clear

  • Flushing a file from memory to the hard disk is referred to as saving the file.
  • Reading a file from the hard disk into memory is referred to as reading the file.

Two cases of garbled text:

  • Case one: the text is already garbled when the file is saved

A file may contain text from several countries. If we save it with shiftjis alone, the characters from other countries have no mapping in shiftjis, so storing them fails. If we insist on saving anyway, the editor does not complain (should the whole editor crash over one encoding error?), but whatever cannot be stored is certainly stored as garbage. The file is therefore already garbled at the save stage: when we reopen it with shiftjis, the Japanese may display normally, but the Chinese is garbled.
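Save-stage garbling can be reproduced without an editor. This is a sketch under one assumption: that an editor which "does not complain" behaves roughly like encoding with errors="replace". Real editors may handle the failure differently.

```python
# One Simplified Chinese character and one Japanese kana in the same text.
text = "这 あ"

# Strict encoding refuses, because shift_jis has no code for the Chinese character.
try:
    text.encode("shift_jis")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Assumption: a forgiving editor substitutes a placeholder for what it cannot store.
saved = text.encode("shift_jis", errors="replace")
reopened = saved.decode("shift_jis")
print(strict_failed, reopened)
```

The Japanese character survives the round trip, but the Chinese character was replaced before the bytes ever reached the disk, so no choice of decoding can bring it back.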

  • Case two: the file is saved correctly but is garbled when read

If the file is saved with utf-8, which is compatible with the characters of all countries, nothing is garbled at the save stage. If the wrong decoding is then chosen when reading the file, such as gbk, the text becomes garbled at the read stage. Garbling at the read stage can be fixed: choose the right decoding and the text is fine.
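Read-stage garbling is easy to demonstrate with bytes in memory standing in for the file on disk. In this sketch, errors="replace" is used only so the wrong decoding never raises; the variable names are illustrative.

```python
text = "你好"

# "Save" correctly with utf-8.
stored = text.encode("utf-8")

# Read with the wrong decoding: the bytes are intact, but the result is garbled.
wrong = stored.decode("gbk", errors="replace")

# Read with the right decoding: the original text comes back.
right = stored.decode("utf-8")

print(repr(wrong), repr(right))
```

Because the stored bytes were never damaged, switching to the correct decoding is enough to repair the display.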

Summary

  1. The core rule for avoiding garbled text: whatever standard was used to encode the characters must also be used to decode them. "Standard" here means the character encoding.
  2. All characters held in memory are treated equally and are encoded as Unicode. For example, when we open an editor and type the character "你" ("you"), we cannot yet say that it is a Chinese character. At that point it is just a symbol that many countries may use, and its glyph may vary with the input method. Only when we save it to the hard disk or send it over the network is it decided whether this "你" is a Chinese character or a Japanese one, and that is the process of converting Unicode into another encoding format. In short, memory always uses Unicode; the only thing we can change is the encoding used when storing to the hard disk.
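The second point can be checked with files: the same in-memory string is written to disk under two encodings, the raw bytes on disk differ, and each file decodes back to the identical string when read with the encoding it was saved with. The file names below are made up for the example.

```python
import os
import tempfile

text = "你"  # one string in memory
raw = {}      # bytes as they sit on disk, per encoding
decoded = {}  # text read back with the matching encoding

with tempfile.TemporaryDirectory() as tmp_dir:
    for encoding in ("utf8", "gbk"):
        path = os.path.join(tmp_dir, f"sample_{encoding}.txt")
        with open(path, "wt", encoding=encoding) as f:
            f.write(text)
        with open(path, "rb") as f:
            raw[encoding] = f.read()
        with open(path, "rt", encoding=encoding) as f:
            decoded[encoding] = f.read()

print(raw)
print(decoded)
```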

File Operations

Three basic operations

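Open the file in r mode: it can only be read, not written

# rt: read as text
# The default encoding of Windows (on a Chinese-locale system) is gbk, so utf8 must be specified explicitly
f = open('32.txt', mode='rt', encoding='utf8')
data = f.read()
print(data)
print(f"type(data): {type(data)}")
f.close()  # After read(), the file pointer is at the end of the file, so reading again would return an empty string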
aaa
bbb
ccc
nick最帅吗
type(data): <class 'str'>
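The behaviour of the file pointer can be observed directly. This sketch creates its own temporary 32.txt (so the contents here are an assumption, not the author's file), reads it twice, and then uses seek(0) to move the pointer back to the start.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "32.txt")
    with open(path, "wt", encoding="utf8") as f:
        f.write("aaa\nbbb\nccc\n")

    f = open(path, mode="rt", encoding="utf8")
    first = f.read()   # reads everything; the pointer is now at the end
    second = f.read()  # nothing left to read, so this is an empty string
    f.seek(0)          # move the pointer back to the beginning
    third = f.read()   # the full content again
    f.close()

print(repr(first), repr(second), repr(third))
```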

readline() and readlines()

# f.readline()/f.readlines()
f = open('32.txt', mode='rt', encoding='utf8')
print(f"f.readable(): {f.readable()}")  # Check whether the file is readable
data1 = f.readline()
data2 = f.readlines()
print(f"data1: {data1}")
print(f"data2: {data2}")
f.close()
f.readable(): True
data1: aaa

data2: ['bbb\n', 'ccc\n', 'nick最帅吗']
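readlines() loads every line into a list at once. For large files it is common to iterate over the file object instead, which yields one line at a time. The sketch below again uses a temporary file with assumed contents.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "32.txt")
    with open(path, "wt", encoding="utf8") as f:
        f.write("aaa\nbbb\nccc\n")

    lines = []
    f = open(path, mode="rt", encoding="utf8")
    for line in f:  # one line per iteration, newline included
        lines.append(line.rstrip("\n"))
    f.close()

print(lines)
```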

Open the file in w mode: it can only be written, not read

# wt
f = open('34w.txt', mode='wt', encoding='utf8')
print(f"f.readable(): {f.readable()}")
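f.write('nick 真帅呀\n')  # '\n' is the newline character
f.write('nick,nick, you drop, I drop.')
f.write('nick 帅的我五体投地')
f.flush()  # Immediately flush the content from memory to the hard disk. Note that w mode clears all existing content when the file is opened, before anything is written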
f.close()
f.readable(): False

Open the file in a mode: content can only be appended

# at
f = open('34a.txt', mode='at', encoding='utf8')
print(f"f.readable(): {f.readable()}")
f.write('nick 真帅呀\n')  # '\n' is the newline character
f.write('nick,nick, you drop, I drop.')
f.write('nick 帅的我五体投地')
f.close()
f.readable(): False
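The difference between w and a only becomes visible when the same file is opened more than once. In this sketch (file names reuse the ones above, inside a temporary directory) each file is opened and written twice.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    w_path = os.path.join(tmp_dir, "34w.txt")
    a_path = os.path.join(tmp_dir, "34a.txt")

    for _ in range(2):
        # w mode: the file is emptied every time it is opened.
        f = open(w_path, mode="wt", encoding="utf8")
        f.write("line\n")
        f.close()
        # a mode: new content is added after the existing content.
        f = open(a_path, mode="at", encoding="utf8")
        f.write("line\n")
        f.close()

    f = open(w_path, mode="rt", encoding="utf8")
    w_content = f.read()
    f.close()
    f = open(a_path, mode="rt", encoding="utf8")
    a_content = f.read()
    f.close()

print(repr(w_content), repr(a_content))
```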

Open a file in b (binary) mode, which can be used to access audio and pictures

try:
    import requests

    response = requests.get(
        'http://www.chenyoude.com/Python从入门到放弃/文件的三种打开模式-mv.jpg?x-oss-process=style/watermark')
    data = response.content  # the raw bytes of the picture

    # Save under a plain file name; the URL query string is not a valid file name
    f = open('mv.jpg', 'wb')
    f.write(data)
    print('done...')
    f.close()
except Exception as e:
    print(e, 'An error occurred; never mind, this is covered in detail later in the section on web crawlers')
done...
f = open('34w.txt', 'wb')
f.write('nick 好帅啊'.encode('utf8'))
f.close()
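In b mode, read() returns bytes rather than str, and writing a str is rejected; encoding and decoding become our own responsibility. A minimal sketch, using a temporary copy of 34w.txt:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "34w.txt")

    f = open(path, "wb")
    try:
        f.write("nick 好帅啊")  # a str is not accepted in binary mode
        str_write_rejected = False
    except TypeError:
        str_write_rejected = True
    f.write("nick 好帅啊".encode("utf8"))  # bytes are accepted
    f.close()

    f = open(path, "rb")
    data = f.read()  # bytes, exactly as stored on disk
    f.close()

text = data.decode("utf8")  # decode by hand to get the characters back
print(type(data), text)
```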

Managing file operations with a context manager

Previously we operated on files with open(). After opening a file this way, we also have to release the operating-system resources it occupies by hand, by calling close(). Python provides a more convenient context-management tool for opening files: with open().

with open('32.txt', 'rt', encoding='utf8') as f:
    print(f.read())

sdf

with open() not only releases the operating-system resources automatically; it can also open several files at once, separated by commas, which makes copying a file quick.

with open('32.txt', 'rb') as fr, \
        open('35r.txt', 'wb') as fw:
    fw.write(fr.read())  # read from the source file and write to the target file
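Reading the whole source file at once is fine for small files. For larger files, a copy is often done in chunks. The function below is a sketch of that idea; the name copy_file and the chunk size are assumptions for this example, not part of the original tutorial.

```python
import os
import tempfile


def copy_file(src, dst, chunk_size=1024):
    """Copy src to dst in binary mode, chunk_size bytes at a time."""
    with open(src, "rb") as fr, open(dst, "wb") as fw:
        while True:
            chunk = fr.read(chunk_size)
            if not chunk:  # an empty bytes object means end of file
                break
            fw.write(chunk)
    # Both files are closed automatically when the with block ends.
    return fr.closed and fw.closed


with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, "32.txt")
    dst = os.path.join(tmp_dir, "35r.txt")
    with open(src, "wb") as f:
        f.write(b"aaa\nbbb\nccc\n" * 500)

    both_closed = copy_file(src, dst)
    with open(src, "rb") as f1, open(dst, "rb") as f2:
        same_content = f1.read() == f2.read()

print(both_closed, same_content)
```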

Origin www.cnblogs.com/leaf-wind/p/11316574.html