Crawler from entry to jail (2) - CSV file operation

The content of the article is from "python crawler development"

2.1 File Operations

2.1.1 Opening files: open and with open() as

Before opening a text file with Python, make sure that the file exists.
If the text file is placed in the same directory as the Python script (and the script is run from that directory), the file can be opened using just its filename.
When reading a file, the "file operation mode" parameter can be omitted, or it can be written as "r", the first letter of "read".

The built-in "open" function opens a file and creates a file object.
Method 1: open (the file must be closed manually)
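A minimal sketch of Method 1, assuming a hypothetical sample file named text.txt that is created in a temporary directory so the example can run on its own. The try/finally is one common way to make sure close() is still called if reading fails.

```python
import os
import tempfile

# Hypothetical sample file, created first so that it exists before it is read.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')
f = open(path, 'w', encoding='utf-8')
f.write('hello\n')
f.close()

# 'r' is the default mode and could be omitted.
f = open(path, 'r', encoding='utf-8')
try:
    content = f.read()
finally:
    f.close()  # with plain open(), closing is the caller's responsibility

print(content)
print(f.closed)
```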

Method 2: context manager (the file is closed automatically)
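A minimal sketch of Method 2, again assuming a hypothetical text.txt in a temporary directory. The names closed_inside and closed_after exist only to show when the file gets closed.

```python
import os
import tempfile

# Hypothetical sample file, created first so that it exists before it is read.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('hello\n')

with open(path, encoding='utf-8') as f:
    content = f.read()
    closed_inside = f.closed  # still open inside the with block

closed_after = f.closed  # closed automatically when the block ends

print(content, closed_inside, closed_after)
```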

Parameters: encoding

open accepts an "encoding" parameter, which tells Python which encoding to use when decoding the file. Passing the encoding the file was actually saved in (commonly UTF-8) avoids garbled characters. The built-in open accepts this parameter only in Python 3; passing it in Python 2 raises an error. If the file was created on Windows and opening it as UTF-8 produces garbled characters or a decoding error, try changing the encoding to GBK, the usual default on Simplified Chinese Windows systems.
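The effect of the encoding parameter can be sketched as follows. The file name gbk_text.txt and its content are hypothetical. The file is deliberately saved as GBK, so reading it as UTF-8 fails (in Python 3 this particular mismatch raises UnicodeDecodeError rather than showing garbled text), while reading it as GBK works.

```python
import os
import tempfile

# Hypothetical file saved with GBK encoding.
path = os.path.join(tempfile.mkdtemp(), 'gbk_text.txt')
with open(path, 'w', encoding='gbk') as f:
    f.write('中文内容')

# Decoding GBK bytes as UTF-8 fails for this content.
try:
    with open(path, encoding='utf-8') as f:
        f.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Using the encoding the file was saved in reads it correctly.
with open(path, encoding='gbk') as f:
    text = f.read()

print(utf8_failed, text)
```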

2.1.2 Reading files: read and readlines

read: returns the entire contents of the file as a single string:

f.read()

readlines: reads all lines and returns them as a list of strings, one per line:

f.readlines()
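The difference between the two calls can be sketched with a hypothetical three-line file. Note that readlines keeps the trailing newline on each line; the stripped list is an optional extra step, not something the original text requires.

```python
import os
import tempfile

# Hypothetical three-line sample file.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('line 1\nline 2\nline 3\n')

with open(path, encoding='utf-8') as f:
    whole = f.read()       # one string containing the whole file

with open(path, encoding='utf-8') as f:
    lines = f.readlines()  # list of strings, each ending with '\n'

# Optional: drop the trailing newline from each line.
stripped = [line.rstrip('\n') for line in lines]

print(repr(whole))
print(lines)
print(stripped)
```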

2.1.3 Writing files: write and writelines

To write to a file, pass one more parameter to open: "w", the first letter of "write", which opens the file for writing. Instead of "w", this parameter can also be "a". The difference between them is that if the file (new.txt in this example) already exists, "w" overwrites it and the original content is lost, while "a" appends the new content to the end of the original file.
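Other modes exist as well, such as "x" (create a new file and fail if it already exists), "r+" (open for both reading and writing), and "b" (binary mode, combined with another mode, as in "rb").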
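A short sketch of the difference between "w" and "a", using the new.txt name from the text above (placed in a temporary directory here so the example does not touch real files). The written strings are arbitrary sample data.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'new.txt')

with open(path, 'w', encoding='utf-8') as f:
    f.write('first\n')

# 'a' keeps the existing content and adds to the end.
with open(path, 'a', encoding='utf-8') as f:
    f.write('second\n')
with open(path, encoding='utf-8') as f:
    after_append = f.read()

# 'w' discards the existing content before writing.
with open(path, 'w', encoding='utf-8') as f:
    f.write('third\n')
with open(path, encoding='utf-8') as f:
    after_overwrite = f.read()

print(repr(after_append))
print(repr(after_overwrite))
```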
Method 1: write
To write a single large string to the file, use the following line of code:

f.write('a long passage of text')

Method 2: writelines
To write all the strings in a list to the file, use the following line of code:

f.writelines(['first paragraph','second paragraph','third paragraph'])

Note that writelines does not add line breaks automatically, so each string needs an explicit newline character wherever a line break is wanted.
For example:

f.writelines(['first paragraph\n','second paragraph\n','third paragraph'])
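The newline behavior of writelines can be checked with a small sketch. The file name and the paragraph strings are hypothetical, and appending '\n' with a generator expression is just one convenient way to add the line breaks.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'new.txt')
paragraphs = ['first paragraph', 'second paragraph', 'third paragraph']

# Without newlines, the strings run together on one line.
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(paragraphs)
with open(path, encoding='utf-8') as f:
    run_together = f.read()

# Adding '\n' to each string puts every paragraph on its own line.
with open(path, 'w', encoding='utf-8') as f:
    f.writelines(p + '\n' for p in paragraphs)
with open(path, encoding='utf-8') as f:
    separated = f.read()

print(repr(run_together))
print(repr(separated))
```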

2.2 Read and write CSV files

2.2.1 CSV file

CSV files are essentially plain text files in which the values are separated by commas, but they are not very readable when opened directly in a text editor.
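To illustrate what the raw text of a CSV file looks like, here is a small sketch with made-up data (the column names match the ones used later in this article; the row itself is hypothetical). It also shows why splitting on commas by hand is fragile, which is one reason to use the csv module: a quoted field may itself contain a comma.

```python
import csv
import io

# Hypothetical raw CSV text: a header line and one data line.
raw = 'username,content,reply_time\nalice,"Hello, world",2018-01-01 10:00\n'
print(raw)

# Naive splitting breaks the quoted field "Hello, world" in two.
data_line = raw.splitlines()[1]
naive = data_line.split(',')

# csv.reader understands the quoting rules.
parsed = list(csv.reader(io.StringIO(raw)))

print(naive)
print(parsed[1])
```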

2.2.2 Reading CSV files in Python: DictReader()

Python's standard CSV library documentation: https://docs.python.org/3/library/csv.html

To read a CSV file, you first need to import Python's CSV module:

import csv

Since a CSV file is essentially a text file, it must first be opened as a text file, and the file object is then passed to the CSV module.
Example:

import csv

# newline='' is what the csv documentation recommends when opening CSV files
with open('result.csv', encoding='utf-8', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row)

Each row is printed as a dictionary. Its keys (username, content, and reply_time in this example) are the column names taken from the first line of the CSV file.
Each row produced by the for loop is an OrderedDict (ordered dictionary) in Python 3.6 and 3.7, and a plain dict from Python 3.8 onward. Either can be used like a normal dictionary:

username = row['username']
content = row['content']
reply_time = row['reply_time']

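Putting the pieces together, a usage sketch might look like the following. The contents of result.csv are made up for illustration, and the file is created in a temporary directory so that the example is self-contained.

```python
import csv
import os
import tempfile

# Hypothetical result.csv with the three columns used in this article.
path = os.path.join(tempfile.mkdtemp(), 'result.csv')
with open(path, 'w', encoding='utf-8', newline='') as f:
    f.write('username,content,reply_time\n')
    f.write('alice,first reply,2018-01-01 10:00\n')
    f.write('bob,second reply,2018-01-02 11:30\n')

summaries = []
with open(path, encoding='utf-8', newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        # Each row behaves like a dictionary keyed by column name.
        username = row['username']
        content = row['content']
        reply_time = row['reply_time']
        summaries.append(f'{username} replied at {reply_time}: {content}')

for line in summaries:
    print(line)
```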

After a few lines of code, the CSV file has been converted into a dictionary.

Note: the code that reads the rows must be placed inside the indented with block; otherwise it raises an error.

This is because the reader, like the file object f it wraps, is a lazy iterator: it reads text from the file only when it is iterated. Once the indented with block is exited, Python closes the file, so nothing more can be read (Python raises ValueError: I/O operation on closed file).
This limitation can be worked around by using a list comprehension inside the with block to read every row into a list, which remains usable after the file is closed.
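A sketch of both the failure and the list-comprehension workaround, using a hypothetical result.csv with made-up rows. The names rows, lazy_reader, and failed_after_close are illustrative only.

```python
import csv
import os
import tempfile

# Hypothetical result.csv, created so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'result.csv')
with open(path, 'w', encoding='utf-8', newline='') as f:
    f.write('username,content,reply_time\n')
    f.write('alice,first reply,2018-01-01 10:00\n')
    f.write('bob,second reply,2018-01-02 11:30\n')

# Workaround: read every row into a list while the file is still open.
with open(path, encoding='utf-8', newline='') as f:
    reader = csv.DictReader(f)
    rows = [row for row in reader]

# The file is closed now, but the list is still usable.
usernames = [row['username'] for row in rows]

# By contrast, iterating the reader after the with block fails.
with open(path, encoding='utf-8', newline='') as f:
    lazy_reader = csv.DictReader(f)
try:
    next(lazy_reader)
    failed_after_close = False
except ValueError:  # I/O operation on closed file
    failed_after_close = True

print(usernames, failed_after_close)
```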

2.2.3 Writing CSV files in Python: DictWriter()

Writing a CSV file in Python is slightly more involved than reading one, because the column names must be specified. The column names must correspond one-to-one with the keys of the dictionaries being written.
Writing uses the csv.DictWriter() class. It takes two parameters: the first is the file object f; the second, named fieldnames, is the list of dictionary keys.
Write the row of column names: writeheader()
Write a list of dictionaries to the CSV file: writerows()
Write a single dictionary: writerow()
Usage example:

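import csv

data = [
    {'name': '老大', 'age': 20, 'salary': 99999},
    {'name': '老二', 'age': 18, 'salary': 12345},
    {'name': '老三', 'age': 15, 'salary': 0},
]
# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('a.csv', 'w', encoding='gbk', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'age', 'salary'])
    writer.writeheader()  # write the first row: name,age,salary
    writer.writerows(data)  # write a list of dictionaries
    writer.writerow({'name': '超人', 'age': 999, 'salary': 98976654})  # write a single dictionary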
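The requirement that dictionary keys match fieldnames can be sketched as follows. The rows and the extra "bonus" key are hypothetical. With DictWriter's default settings, a dictionary containing a key that is not in fieldnames is rejected with ValueError. Reading the file back also shows that values come back as strings.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'a.csv')
fieldnames = ['name', 'age', 'salary']

with open(path, 'w', encoding='utf-8', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'name': 'A', 'age': 20, 'salary': 99999})
    try:
        # 'bonus' is not in fieldnames, so this row is rejected.
        writer.writerow({'name': 'B', 'age': 18, 'bonus': 1})
        rejected = False
    except ValueError:
        rejected = True

# Read the file back: numbers were written as text and return as strings.
with open(path, encoding='utf-8', newline='') as f:
    rows = list(csv.DictReader(f))

print(rejected)
print(rows)
```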

Origin blog.csdn.net/weixin_55159605/article/details/124096190