Python Crawler 3.2 - csv Usage Tutorial

Overview

This series is a simple tutorial for learning Python crawler techniques. I write it to explain and consolidate my own technical knowledge; if it happens to be useful to you as well, so much the better.
The Python version used is 3.7.4.

The previous article covered how to save data in JSON format. This one looks at how to read and write data with the csv module.

csv Introduction

What is csv

CSV (Comma-Separated Values; sometimes called character-separated values, because the delimiter need not be a comma) is a common plain-text format for storing tabular data such as numbers and text.

CSV is very widely used and many programs work with it, but it has no universal standard, so handling CSV files often causes trouble. When you use CSV you should therefore follow a fixed convention. No single convention is mandated, but everyone should settle on their own so that simple mistakes are avoided.
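One way to pin down such a convention in code is to register a named dialect and use it for both writing and reading. The sketch below is only an illustration: the dialect name "house" and its settings (semicolon delimiter, every field quoted) are assumptions made for the example, not something the csv library prescribes.

```python
import csv
import io

# Hypothetical house convention: semicolon-separated, every field quoted.
csv.register_dialect(
    "house",
    delimiter=";",
    quoting=csv.QUOTE_ALL,
    lineterminator="\n",
)

# Write two rows into an in-memory buffer using the registered dialect.
buffer = io.StringIO()
writer = csv.writer(buffer, dialect="house")
writer.writerow(["order_no", "user_name"])
writer.writerow([1001, "Zhang San"])
text = buffer.getvalue()

# Reading with the same dialect recovers the fields (always as strings).
rows = list(csv.reader(io.StringIO(text), dialect="house"))
print(text)
print(rows)
```

Because both sides refer to the same dialect name, the delimiter and quoting rules only have to be decided once.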

Using csv library

Like the json library, the csv library is part of the Python standard library, so it can be used without installing anything. Its use likewise falls into two parts, reading and writing. The library has four main entry points:

  • csv.reader(): reads data from a csv file
  • csv.DictReader(): creates an object that operates like a regular reader but maps each row it reads to a dict
  • csv.writer(): writes data to a csv file
  • csv.DictWriter(): creates an object that operates like a regular writer but maps dictionaries onto output rows
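As a rough side-by-side of the four entry points listed above, the sketch below reads the same small sample with reader() and DictReader(), then writes it back with writer() and DictWriter(). The sample data is made up for the illustration, and io.StringIO stands in for a real file.

```python
import csv
import io

# Made-up sample data; a real program would open a file instead.
sample = "order_no,user_name\n1001,Zhang San\n1002,Li Si\n"

# reader() yields each row as a list of strings (header included).
list_rows = list(csv.reader(io.StringIO(sample)))
# DictReader() uses the first row as keys and yields one mapping per data row.
dict_rows = list(csv.DictReader(io.StringIO(sample)))

print(list_rows[1])
print(dict_rows[0]["user_name"])

# writer() takes sequences; DictWriter() takes dictionaries plus fieldnames.
out_a = io.StringIO()
csv.writer(out_a, lineterminator="\n").writerows(list_rows)

out_b = io.StringIO()
dict_writer = csv.DictWriter(
    out_b, fieldnames=["order_no", "user_name"], lineterminator="\n"
)
dict_writer.writeheader()
dict_writer.writerows(dict_rows)

# Both writers reproduce the original text.
print(out_a.getvalue() == out_b.getvalue() == sample)
```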

Reading with reader()

reader(iterable, dialect='excel', *args, **kwargs)

It returns a reader object that iterates over the rows of the given csv file.

iterable can be any object that supports the iterator protocol and returns a string each time its __next__() method is called; file objects and lists of strings are both suitable. If it is a file object, it should be opened with newline=''. An optional dialect parameter can be given to select a set of parameters specific to a particular CSV dialect. To change the separator between columns, pass the delimiter parameter. Other optional keyword arguments override individual formatting parameters of the current dialect. Sample code:

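    # Import the csv library
    import csv
    
    # Open order.csv for reading; newline='' lets the csv module handle line endings
    with open('order.csv', 'r', newline='') as csv_file:
        reader = csv.reader(csv_file)
        # To skip the header row, call next() so that iteration starts at the second row
        next(reader)
        for row in reader:
            print(row)
            # A specific column can be read by index
            # print(row[2])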
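The description above also says that a list of strings is a valid input and that delimiter changes the column separator. A minimal sketch of both, using made-up semicolon-separated lines:

```python
import csv

# Made-up input: a plain list of strings, separated by semicolons.
lines = [
    "order_no;user_name;amount",
    '1001;"Zhang; San";25.5',
    "1002;Li Si;30",
]

reader = csv.reader(lines, delimiter=";")
header = next(reader)   # first line holds the column names
rows = list(reader)     # remaining lines, each a list of strings

print(header)
for row in rows:
    print(row)
```

The quoted field in the second line keeps its embedded semicolon, because the reader only splits on delimiters that appear outside quotes.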

Reading with DictReader()

class csv.DictReader(f, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds)

This is the other way to read. It creates an object that operates like a regular reader but maps the information in each row to a dict whose keys are given by the optional fieldnames parameter. fieldnames is a sequence whose elements are associated with the fields of the input data in order; these elements become the keys of the resulting dictionary. If fieldnames is omitted, the values in the first row of the file f are used as the field names. If a row has more fields than the fieldnames sequence, the remaining data is added as a list under the key given by restkey. If a row has fewer fields than the fieldnames sequence, the remaining keys take the value of the optional restval parameter. Any other optional or keyword arguments are passed to the underlying reader instance. Sample code:

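    # Import the csv library
    import csv
    
    # Open order.csv for reading
    with open('order.csv', 'r', newline='') as csv_file:
        # Because fieldnames is given, the first line of the file is treated as data, not as a header
        reader = csv.DictReader(csv_file, fieldnames=['order_no', 'user_name'])
        for row in reader:
            print(row)
            # print(row['order_no'])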
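The restkey and restval behaviour described above is easier to see with rows that are too long or too short. A small sketch with made-up data; the key name "extras" and the filler value "N/A" are arbitrary choices for the illustration:

```python
import csv
import io

# Made-up data: the first row has two extra fields, the second is missing one.
sample = "1001,Zhang San,extra1,extra2\n1002\n"

reader = csv.DictReader(
    io.StringIO(sample),
    fieldnames=["order_no", "user_name"],
    restkey="extras",   # surplus fields are collected in a list under this key
    restval="N/A",      # missing fields are filled with this value
)
rows = [dict(row) for row in reader]

for row in rows:
    print(row)
```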

Writing with writer()

writer(fileobj, dialect='excel', *args, **kwargs)

Returns a writer object responsible for converting the user's data into delimited strings on the given file-like object. fileobj can be any object with a write() method. If fileobj is a file object, it should be opened with newline=''. An optional dialect parameter can be given to select a set of parameters specific to a particular CSV dialect. Sample code:

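    # Import the csv library
    import csv
    
    # Define the header row
    header = ['username', 'age', 'height']
    # Define a list of data rows
    lt = [
        ('张三', 25, 180),
        ('李四', 26, 172),
        ('王五', 27, 183)
    ]
    
    # Use UTF-8 encoding and newline='' so the csv module controls line endings
    # (otherwise extra blank lines can appear on Windows)
    with open('user.csv', 'w', encoding='utf-8', newline='') as fp:
        writer = csv.writer(fp)
        # Write a single row
        writer.writerow(header)
        # Write multiple rows
        writer.writerows(lt)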
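Since the writer only needs an object with a write() method, an in-memory io.StringIO buffer works as well as a file, which makes it easy to inspect what is actually written. The sketch below uses made-up data to show how a field containing a comma and quotation marks is quoted, and that reading it back restores the original value.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["username", "note"])
# This field contains both the delimiter and the quote character.
writer.writerow(["Zhang San", 'likes "tea", coffee'])
text = buffer.getvalue()
print(text)

# Reading the text back restores the original field value.
rows = list(csv.reader(io.StringIO(text)))
print(rows[1][1])
```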

Writing with DictWriter()

class csv.DictWriter(f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds)

Creates an object that operates like a regular writer but maps dictionaries onto output rows. The fieldnames parameter is a sequence of keys that identifies the order in which values from the dictionary passed to the writerow() method are written to the file f. If the dictionary is missing a key in fieldnames, the optional restval parameter specifies the value to be written. If the dictionary passed to writerow() contains a key not found in fieldnames, the optional extrasaction parameter indicates the action to take: 'raise' (the default) raises a ValueError, and 'ignore' drops the extra values. Sample code:

    # Import the csv library
    import csv
    
    with open('user.csv', 'w', encoding='utf-8', newline='') as fp:
        # Define the header row
        header = ['username', 'age', 'height']
        # Define a list of dictionaries, one per row
        lt = [
            {'username': '张三', 'age': 26, 'height': 183},
            {'username': '李四', 'age': 24, 'height': 180},
            {'username': '王五', 'age': 23, 'height': 189},
        ]
        writer = csv.DictWriter(fp, header)
        # Call writeheader() to write the header row
        writer.writeheader()
        writer.writerows(lt)
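The effect of restval and extrasaction can be sketched with dictionaries that do not match fieldnames exactly. The data, the filler value "unknown" and the extra "city" key are made up for the illustration:

```python
import csv
import io

fieldnames = ["username", "age", "height"]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames,
    restval="unknown",        # written when a key from fieldnames is missing
    extrasaction="ignore",    # silently drop keys that are not in fieldnames
    lineterminator="\n",
)
writer.writeheader()
writer.writerow({"username": "Zhang San", "age": 26})  # no height
writer.writerow({"username": "Li Si", "age": 24, "height": 180, "city": "Beijing"})
text = buffer.getvalue()
print(text)

# With the default extrasaction='raise', an unknown key raises ValueError.
strict_writer = csv.DictWriter(io.StringIO(), fieldnames)
try:
    strict_writer.writerow({"username": "Wang Wu", "city": "Shanghai"})
    raised = False
except ValueError:
    raised = True
print(raised)
```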

Summary

Points to note when using the csv library:

  1. Pay attention to the mode when opening the file: r for reading, w for writing;
  2. Set newline='' (the empty string) when opening the file;
  3. Specify the encoding when opening the file, and use the same encoding for reading as for writing;
  4. If you set a custom delimiter when writing, use the same delimiter when reading;
  5. The csv library does not validate the format of your data (there is a strict mode, but even it performs only limited checks), so take care with the format when writing.
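The first four points can be sketched as one round trip. The tab delimiter and the temporary file location are assumptions made for the example; any delimiter works as long as both sides agree.

```python
import csv
import os
import tempfile

rows = [["username", "age"], ["张三", "25"], ["李四", "26"]]
# Hypothetical location: a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "user.csv")

# Write: mode 'w', explicit encoding, newline='', custom delimiter.
with open(path, "w", encoding="utf-8", newline="") as fp:
    csv.writer(fp, delimiter="\t").writerows(rows)

# Read: mode 'r', the same encoding, newline='', the same delimiter.
with open(path, "r", encoding="utf-8", newline="") as fp:
    loaded = list(csv.reader(fp, delimiter="\t"))

print(loaded == rows)
```

All values in this example are strings on purpose: the reader returns every field as a string, so numbers written from int values would come back as text.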

That covers how to use the csv library and what to watch out for.

Origin blog.csdn.net/Zhihua_W/article/details/101674341