Python Data Processing Toolkit: A Complete Summary of Configuration File Handling

1 Introduction

Real projects usually rely on configuration files of one kind or another; keeping settings out of the code makes a project easier to maintain

Common configuration file formats include JSON, ini / config, YAML, and XML

This article summarizes how to read, write, and modify each of these formats in Python

2.JSON

Python's built-in json module makes it easy to work with JSON data

The four common functions are:

  • json.load(json_file) parses JSON from an open file object and converts it to the corresponding Python data type
  • json.loads(json_string) parses a JSON-formatted string; the result is the corresponding Python type (a dict for a JSON object, a list for a JSON array)
  • json.dump(python_content, file) writes Python data such as a dict or list to an open file object as JSON
  • json.dumps(python_content) converts Python data such as a dict to a JSON-formatted string
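To make the pairing of the four functions concrete, here is a minimal sketch that round-trips a small dictionary through a string and through a file-like object. The `settings` dictionary is made up for illustration, and `io.StringIO` stands in for a real file so the example is self-contained.

```python
import io
import json

# Hypothetical settings used only for this illustration
settings = {"mysql": {"host": "127.0.0.1", "port": 3306}}

# dumps()/loads(): Python object <-> JSON string
text = json.dumps(settings)
from_text = json.loads(text)

# dump()/load(): Python object <-> file object (StringIO stands in for a file)
buffer = io.StringIO()
json.dump(settings, buffer)
buffer.seek(0)
from_file = json.load(buffer)

print(type(text).__name__, from_text == settings, from_file == settings)
```

Note that dump() and load() take a file object rather than a path, so with a real file they are normally called inside a `with open(...)` block.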

Take the following JSON configuration file as an example:

#config.json
{
  "mysql": {
    "host": "198.0.0.1",
    "port": 3306,
    "db": "xh",
    "username": "root",
    "password": "123456",
    "desc": "Mysql配置文件"
  }
}

1. Read the configuration file

There are two ways to read the configuration file:

Use json.load() to parse the open file directly

Or read the file content first, and then use json.loads() to convert the string to Python data

Note that for deeply nested JSON configuration files you can use jsonpath to read values; jsonpath is similar to XPath and lets you locate data quickly with path expressions

import json

def read_json_file(file_path):
    """
    Read a JSON file
    :param file_path: path of the JSON file
    :return: the parsed content
    """
    with open(file_path, 'r', encoding='utf-8') as file:

        # Two ways to read; choose one
        # Method one
        result = json.load(file)

        # Method two
        # result = json.loads(file.read())

        # Extract values
        host_mysql = result['mysql']['host']
        port_mysql = result['mysql']['port']
        db = result['mysql']['db']

        print('Mysql address:', host_mysql, ",port number:", port_mysql, ",database:", db)
    return result
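For deeply nested configuration, a path-based lookup avoids long chains of square brackets. jsonpath libraries provide this; the sketch below is only a hand-written stand-in using dotted paths, not the jsonpath API. The helper name `get_by_path` and the `replicas` key are hypothetical.

```python
from functools import reduce

def get_by_path(data, path, default=None):
    """Look up a nested value with a dotted path such as 'mysql.host'.

    Hypothetical helper: a hand-written stand-in, not the jsonpath API.
    """
    def step(current, key):
        # Lists are indexed by integer position, dicts by key
        if isinstance(current, list):
            return current[int(key)]
        return current[key]

    try:
        return reduce(step, path.split('.'), data)
    except (KeyError, IndexError, ValueError, TypeError):
        return default

# 'replicas' is an invented key that adds a list level to the example
config = {
    "mysql": {
        "host": "198.0.0.1",
        "port": 3306,
        "replicas": [{"host": "198.0.0.2"}],
    }
}

print(get_by_path(config, "mysql.host"))
print(get_by_path(config, "mysql.replicas.0.host"))
print(get_by_path(config, "mysql.missing", default="n/a"))
```

Returning a default instead of raising keeps call sites short when a key is optional.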

2. Save the configuration file

Use json.dump() to write a dictionary to a JSON file

def write_content_to_json_file(output_file, content):
    """
    Write content to a JSON file
    :param output_file: path of the output file
    :param content: data to write
    :return:
    """
    with open(output_file, 'w', encoding='utf-8') as file:
        # Write to the file
        # Note: to keep Chinese text readable in the file, set ensure_ascii=False
        json.dump(content, file, ensure_ascii=False)

content_dict = {
    'mysql': {
        'host': '127.0.0.1',
        'port': 3306,
        'db': 'xh',
        'username': 'admin',
        'password': '123456',
        'desc': 'Mysql database'
    }
}

write_content_to_json_file('./output.json', content_dict)
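The effect of ensure_ascii is easiest to see side by side. The short sketch below serializes the same dictionary both ways; the `content` value reuses the desc text from the config.json example.

```python
import json

content = {"desc": "Mysql配置文件"}

escaped = json.dumps(content)                       # default: ensure_ascii=True
readable = json.dumps(content, ensure_ascii=False)  # keeps non-ASCII characters

print(escaped)
print(readable)

# Both forms decode to the same Python data
assert json.loads(escaped) == json.loads(readable) == content
```

Both outputs are valid JSON. The second is simply easier to read in an editor, provided the file is written with a UTF-8 encoding.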

3. Modify the configuration file

If you need to modify the configuration file, you only need to read the content from the configuration file first, then modify the content, and finally save the modified content in the configuration file.

def modify_json_file():
    """
    Modify the JSON configuration file
    :return:
    """
    result = read_json_file('./config.json')

    # Modify a value
    result['mysql']['host'] = '198.0.0.1'

    write_content_to_json_file('./config.json', result)
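Because the file is opened for writing in place, an interruption halfway through the write could leave a truncated configuration file. One possible way to reduce that risk, sketched here under the assumption that the temporary file and the target live on the same filesystem, is to write to a temporary file and then swap it in with os.replace(). The helper name `update_json_file` is hypothetical.

```python
import json
import os
import tempfile

def update_json_file(path, section, key, value):
    """Hypothetical helper: read, modify, and rewrite a JSON file via a temp file."""
    with open(path, 'r', encoding='utf-8') as file:
        data = json.load(file)

    data[section][key] = value

    # Write to a temporary file in the same directory, then swap it in
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    with os.fdopen(fd, 'w', encoding='utf-8') as tmp:
        json.dump(data, tmp, ensure_ascii=False, indent=2)
    os.replace(tmp_path, path)
    return data

# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as workdir:
    config_path = os.path.join(workdir, 'config.json')
    with open(config_path, 'w', encoding='utf-8') as file:
        json.dump({"mysql": {"host": "127.0.0.1"}}, file)

    update_json_file(config_path, 'mysql', 'host', '198.0.0.1')

    with open(config_path, 'r', encoding='utf-8') as file:
        updated = json.load(file)

print(updated)
```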

3.ini/config

ini and config configuration files are parsed in the same way; only the file extension differs

Here we take the ini configuration file as an example

# config.ini
[mysql]
host = 139.199.1.1
username = root
password = 123456
port = 3306

An ini file is made up of three parts: sections, keys, and values

There are two common ways to process ini files in Python:

  • Use the built-in configparser standard library module
  • Use the third-party configobj library

Let's first take a look at the built-in configparser module

3.1.1 Read configuration file

Instantiate a ConfigParser object and use its read() method to load the ini configuration file

from configparser import ConfigParser 

# instantiate parsing object 
cfg = ConfigParser() 

# read ini file content 
cfg.read(file_path)

Use the sections() method to get a list of all sections

# sections() returns all section names as a list
sections = cfg.sections()
print(sections)

To get all the keys under a section, use the options(section_name) method

# Get all the keys of a section
# cfg.options(section_name)
keys = cfg.options('mysql')
print(keys)

With the items(section_name) method, you can get all the key-value pairs under a section

# Get the key-value pairs under a section
items = cfg.items("mysql")
print(items)

To get the value of one key under a section, use the get(section_name, key_name) method

# Read the value of a key under a section
host = cfg.get("mysql", "host")
print(host)
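The reading methods above can be tried without a file on disk. This sketch feeds the config.ini content to read_string() and also shows that get() always returns a string, while getint() converts the value. The `timeout` key is hypothetical and only demonstrates the fallback argument.

```python
from configparser import ConfigParser

# Same content as the config.ini example, inlined for a self-contained demo
INI_TEXT = """
[mysql]
host = 139.199.1.1
username = root
password = 123456
port = 3306
"""

cfg = ConfigParser()
cfg.read_string(INI_TEXT)

print(cfg.sections())        # section names
print(cfg.options('mysql'))  # keys in the section
print(cfg.items('mysql'))    # (key, value) pairs

port_text = cfg.get('mysql', 'port')       # always a string
port_number = cfg.getint('mysql', 'port')  # converted to int

# 'timeout' is a hypothetical key that is absent, so the fallback is used
timeout = cfg.getint('mysql', 'timeout', fallback=30)
print(repr(port_text), port_number, timeout)
```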

3.1.2 Write configuration file

As with reading, you first instantiate a ConfigParser object

First, use the add_section(section_name) method to add a section

# Add sections and key-value pairs
# Add a section
cfg.add_section("redis")

Then, use the set(section_name, key, value) method to add key-value pairs to the section; values must be strings

# Add key-value pairs to the section
cfg.set("redis", "host", "127.0.0.1")
cfg.set("redis", "port", "12345")

Finally, use the write() method to write to the configuration file

# Write to the file
with open('./raw/output.ini', 'w') as file:
    cfg.write(file)
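The writing steps can be checked without touching the filesystem. In this sketch `io.StringIO` stands in for the output file, and the try/except shows why the port is passed as text: ConfigParser rejects non-string values.

```python
import io
from configparser import ConfigParser

cfg = ConfigParser()
cfg.add_section('redis')
cfg.set('redis', 'host', '127.0.0.1')

# Values must be strings; passing an int raises TypeError
rejected = False
try:
    cfg.set('redis', 'port', 12345)
except TypeError as error:
    rejected = True
    print('rejected:', error)

cfg.set('redis', 'port', str(12345))

# StringIO stands in for a real file opened in text mode
buffer = io.StringIO()
cfg.write(buffer)
output = buffer.getvalue()
print(output)
```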

3.1.3 Modify the configuration file

To modify the configuration file, read it, change values with set(section_name, key, value), and finally write the result back with the write() method.

def modify_ini_file(file_path):
    """
    Modify the ini file
    :param file_path: path of the ini file
    :return:
    """
    cfg = ConfigParser()
    cfg.read(file_path)

    cfg.set("mysql", "host", "139.199.11.11")

    # Write back
    with open(file_path, "w") as file:
        cfg.write(file)
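One side effect of this read-modify-write cycle is worth knowing: configparser skips comment lines while parsing, so they are not written back. The sketch below demonstrates this with an in-memory example; the comment text in `INI_TEXT` is made up for illustration.

```python
import io
from configparser import ConfigParser

INI_TEXT = """\
# primary database
[mysql]
host = 139.199.1.1
port = 3306
"""

cfg = ConfigParser()
cfg.read_string(INI_TEXT)
cfg.set('mysql', 'host', '139.199.11.11')

# StringIO stands in for the file that would be rewritten
buffer = io.StringIO()
cfg.write(buffer)
rewritten = buffer.getvalue()
print(rewritten)
```

If the comments in a hand-edited ini file matter, that is a reason to check the rewritten output before overwriting the original.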

Next, let's look at using configobj to work with ini configuration files

First install the configobj library

# Install the dependency
# pip3 install configobj

3.2.1 Read configuration file

Pass the path of the ini configuration file directly to the ConfigObj class to construct an object

from configobj import ConfigObj

# Instantiate the object
config = ConfigObj(file_path, encoding='UTF8')

Looking at the source code, you can see that ConfigObj is a subclass of Section, and Section is a subclass of dict

So you can get sections and values directly by key, just as with a dictionary

# <class 'configobj.ConfigObj'>
print(type(config))

# <class 'configobj.Section'>
print(type(config['mysql']))

# A section
print(config['mysql'])

# The value corresponding to a key
print(config['mysql']['host'])

3.2.2 Modify configuration file

Read the configuration file, modify the ConfigObj object directly, and finally call the write() method to save the changes to the file

def modify_ini_file(file_path):
    """
    Modify the ini file
    :param file_path: path of the ini file
    :return:
    """
    # Read the configuration file
    config = ConfigObj(file_path, encoding='UTF8')

    # Modify a value directly
    config['mysql']['host'] = '139.199.1.1'

    # Delete a key-value pair
    try:
        del config['mysql']['db']
    except KeyError:
        print('key does not exist')

    # Write back to the same file
    config.write()

3.2.3 Write configuration file

To write a configuration file, first instantiate a ConfigObj object and pass in the file path

Then, create a section and set its key-value pairs

Finally, call the write() method to write to the configuration file

def write_to_ini_file(output):
    """
    Write to an ini file
    :param output: path of the output file
    :return:
    """
    config = ConfigObj(output, encoding='UTF8')
    config['website'] = {}
    config['website']['url'] = "www.baidu.com"
    config['website']['name'] = "百度"

    # Save
    config.write()

4.YAML

There are two common libraries for working with YAML files in Python: pyyaml and ruamel.yaml

Use pip to install the dependencies

# Install the dependencies
# Option one
pip3 install pyyaml

# Option two
pip3 install ruamel.yaml

Let's take a simple YAML configuration file as an example and work through it with both libraries

# Fruits
Fruits:
  # Apple
  - Apple:
      name: apple
      price: 1
      address: Guangdong
  # Orange
  - Orange:
      name: orange
      price: 3
      address: Hunan
  # Banana
  - Banana:
      name: banana
      price: 2
      address: Hainan

Let's take a look at pyyaml first

4.1.1 Read configuration file

First, read the configuration file and load the data with yaml.safe_load(); for this file the result is a dictionary

import yaml

with open(file_path, "r", encoding='utf-8') as file:
    data = file.read()

    # safe_load() parses the configuration file content
    # Result data type: dict
    result = yaml.safe_load(data)

    print(result)

Then, you can get values by following the hierarchy of the YAML configuration file

# Get values from the YAML data
name = result['Fruits'][0]['Apple']['name']
price = result['Fruits'][0]['Apple']['price']
address = result['Fruits'][0]['Apple']['address']
print("name:", name, ",price:", price, ",address:", address)
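The indexing above follows from how the YAML structure maps to Python types. As a minimal sketch, assuming pyyaml is installed, the example below parses a shortened copy of the fruits file from a string and walks the resulting structure instead of hard-coding index 0.

```python
import yaml  # PyYAML

# Shortened copy of the fruits example, inlined for a self-contained demo
YAML_TEXT = """
Fruits:
  - Apple:
      name: apple
      price: 1
      address: Guangdong
  - Orange:
      name: orange
      price: 3
      address: Hunan
"""

result = yaml.safe_load(YAML_TEXT)

# The top level is a dict, 'Fruits' is a list, and each item is a single-key dict
prices = {}
for entry in result['Fruits']:
    for fruit_name, details in entry.items():
        prices[fruit_name] = details['price']

print(type(result).__name__, type(result['Fruits']).__name__)
print(prices)
```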

4.1.2 Write configuration file

Use yaml.dump() to write a dictionary to a YAML configuration file

Note that to keep Chinese text readable in the output, you need to set allow_unicode=True

def write_to_yaml_file(content, file_path):
    """
    Write content to a YAML file
    :param content: data to write
    :param file_path: path of the output file
    :return:
    """

    # Write to the file
    with open(file_path, 'w', encoding='utf-8') as file:
        yaml.dump(content, file, default_flow_style=False, encoding='utf-8', allow_unicode=True)

# Define a dictionary
content = {
    "websites": [
        {"baidu": {'url': "www.baidu.com", 'name': 'Baidu', "price": 100}},
        {"alibaba": {'url': "www.taobao.com", 'name': 'Taobao', "price": 200}},
        {"tencent": {'url': "www.tencent.com", 'name': '腾讯', "price": 300}},
    ]
}

write_to_yaml_file(content, "./raw/new.yaml")
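To see what allow_unicode changes, this small sketch (again assuming pyyaml is installed) dumps the same dictionary to a string with and without the option. The `content` value reuses a name from the example above.

```python
import yaml  # PyYAML

content = {"name": "腾讯"}

escaped = yaml.safe_dump(content)                       # default: allow_unicode=False
readable = yaml.safe_dump(content, allow_unicode=True)  # keeps non-ASCII characters

print(escaped)
print(readable)

# Both forms load back to the same Python data
assert yaml.safe_load(escaped) == yaml.safe_load(readable) == content
```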

4.1.3 Modify the configuration file

As with ini files, first read the configuration file, then modify the content of the dictionary, and finally write it back with the function above to update the configuration

def modify_yaml_file():
    """
    Modify a YAML file
    :return:
    """
    # Read the YAML file as in 4.1.1
    with open('./raw/norm.yaml', 'r', encoding='utf-8') as file:
        content = yaml.safe_load(file)
    print(content)

    # Modify the dict
    content['Fruits'][0]['Apple']['price'] = 10086

    # Write to a new YAML file
    write_to_yaml_file(content, './raw/output.yaml')

Next, let's look at using ruamel.yaml to work with YAML configuration files

ruamel.yaml is a fork of pyyaml. On top of pyyaml's behavior it adds a round-trip mode that keeps the key order and comments of a YAML file the same between reading and writing

Reading, modifying, and writing therefore work much as they do in pyyaml

4.2.1 Read configuration file

from ruamel import yaml

def read_yaml_file(file_path):
    """
    Read a YAML file
    :param file_path: path of the YAML file
    :return: the parsed content
    """
    # YAML() uses round-trip mode by default; the module-level load()/dump()
    # helpers are deprecated in newer ruamel.yaml releases
    ryaml = yaml.YAML()

    with open(file_path, 'r', encoding='utf-8') as file:
        # Parse the YAML file
        # Type: CommentedMap (an ordered, dict-like mapping)
        result = ryaml.load(file)

        name = result['Fruits'][0]['Apple']['name']
        price = result['Fruits'][0]['Apple']['price']
        address = result['Fruits'][0]['Apple']['address']
        print("name:", name, ",price:", price, ",address:", address)

    return result

4.2.2 Write configuration file

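def write_to_yaml_file(filepath, data):
    """
    Write data to a YAML file
    :param filepath: path of the output file
    :param data: data to write
    :return:
    """
    # YAML() uses round-trip mode by default; the module-level dump()
    # helper is deprecated in newer ruamel.yaml releases
    ryaml = yaml.YAML()
    ryaml.allow_unicode = True

    with open(filepath, 'w', encoding='utf-8') as file:
        ryaml.dump(data, file)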

4.2.3 Modify the configuration file

def modify_yaml_file():
    """
    Modify a YAML file
    :return:
    """
    content = read_yaml_file('./raw/norm.yaml')

    print(content)

    # Modify the dict
    content['Fruits'][0]['Apple']['price'] = 10086

    # Write to a new YAML file
    write_to_yaml_file('./raw/output.yaml', content)

5.XML

XML is a markup language designed for storing and transmitting data. Many projects use XML for configuration files and as a data exchange format

Python's built-in xml package handles XML configuration files easily

Take the following configuration file as an example:

<?xml version="1.0" encoding="utf-8"?>
<dbconfig>
    <mysql>
        <host>127.0.0.1</host>
        <port>3306</port>
        <dbname>test</dbname>
        <username>root</username>
        <password>4355</password>
    </mysql>
</dbconfig>

First, use xml.dom.minidom.parse(file_path) to parse the configuration file, and use the documentElement property to get the XML root node

import xml.dom.minidom 

# Read configuration file 
dom = xml.dom.minidom.parse("./raw.xml") 

# Use documentElement property to get XML root node 
# Root node 
root = dom.documentElement

Then, use the getElementsByTagName(tag_name) method to get a node

# Get mysql node 
node_mysql = root.getElementsByTagName('mysql')[0]

Finally, use the childNodes property to traverse the child nodes and obtain each node's name and value

# Traverse the child nodes to obtain the name and value
for node in node_mysql.childNodes:
    # Node type
    # 1: Element
    # 2: Attribute
    # 3: Text
    # print(node.nodeType)
    if node.nodeType == 1:
        print(node.nodeName, node.firstChild.data)
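The three steps above can be combined into one self-contained sketch. It parses a shortened copy of the example from a string with parseString() instead of reading ./raw.xml, and collects the values into a dictionary; the `settings` name is just for illustration.

```python
import xml.dom.minidom

# Shortened copy of the example configuration, inlined for a self-contained demo
XML_TEXT = """<?xml version="1.0" encoding="utf-8"?>
<dbconfig>
    <mysql>
        <host>127.0.0.1</host>
        <port>3306</port>
        <dbname>test</dbname>
    </mysql>
</dbconfig>"""

dom = xml.dom.minidom.parseString(XML_TEXT)
node_mysql = dom.documentElement.getElementsByTagName('mysql')[0]

settings = {}
for node in node_mysql.childNodes:
    # Skip the whitespace-only text nodes between elements
    if node.nodeType == node.ELEMENT_NODE and node.firstChild is not None:
        settings[node.nodeName] = node.firstChild.data

print(settings)
```

All values come back as strings, so a numeric setting such as the port needs an explicit int() conversion before use.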

6. Finally

That wraps up this summary of configuration file handling in Python!


Origin blog.csdn.net/weixin_43881394/article/details/109074821