Data-driven testing in Pytest with Pandas and Excel

Introduction

An earlier post, on data-driven testing (DDT) as a must-have for test automation, introduced how to work with JSON and YAML files in the unittest framework to make tests data-driven. So how should the same thing be implemented in pytest?

 

Data-driven testing in pytest with JSON/YAML files

Following the basic approach to data-driven testing in pytest, first create a function that reads JSON and YAML files:

import codecs
import json
import os

import yaml


def test_read_data_from_json_yaml(data_file):
    return_value = []
    data_file_path = os.path.abspath(data_file)
    print(data_file_path)

    _is_yaml_file = data_file_path.endswith((".yml", ".yaml"))
    with codecs.open(data_file_path, 'r', 'utf-8') as f:
        # Load the data from the YAML or JSON file
        if _is_yaml_file:
            data = yaml.safe_load(f)
        else:
            data = json.load(f)

    # Each top-level key is one test case
    for i, elem in enumerate(data):
        if isinstance(data, dict):
            key, value = elem, data[elem]
            if isinstance(value, dict):
                # A nested dict becomes one tuple of its values
                case_data = []
                for v in value.values():
                    case_data.append(v)
                return_value.append(tuple(case_data))
            else:
                return_value.append((value,))
    return return_value

The function test_read_data_from_json_yaml reads a JSON or YAML file, extracts its data, and returns it as a list of tuples, which is the format that pytest.mark.parametrize accepts. Its return value can therefore be passed to the decorator directly.

Note that because the name of this helper starts with test_, pytest will also try to collect it as a test when it lives in a test module. Giving it a name without the test_ prefix, or moving it into a separate helper module, avoids this.
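To make the target format concrete, here is a minimal sketch of the conversion step on its own, using an in-memory JSON string instead of a file. The helper name cases_from_mapping is an assumption made for this illustration and is not part of the code above.

```python
import json


def cases_from_mapping(data):
    """Turn {"case": {"a": 1, "b": 2}} into [(1, 2)], the shape parametrize expects."""
    rows = []
    for value in data.values():
        if isinstance(value, dict):
            # One tuple per case, holding the values in file order
            rows.append(tuple(value.values()))
        else:
            # A bare value becomes a one-element tuple
            rows.append((value,))
    return rows


sample = json.loads("""
{
  "case1": {"search_string": "testing", "expect_string": "Testing"},
  "case2": {"search_string": "hello_world.com", "expect_string": "Testing"}
}
""")

rows = cases_from_mapping(sample)
print(rows)
```

Each tuple lines up positionally with the argument names given to pytest.mark.parametrize, so the key order inside each case matters.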

 

Let's put this into practice. Create the following files under the root directory of the APITest project:

|--APITest
    |--tests_pytest_ddt
        |--test_baidu_ddt.py
        |--test_baidu_ddt.json
        |--test_baidu_ddt.yaml
        |--test_baidu_ddt.xlsx
        |--__init__.py
        |--conftest.py

The content of the test_baidu_ddt.json file is as follows:

{
  "case1": {
    "search_string": "testing",
    "expect_string": "Testing"
  },
  "case2": {
    "search_string": "hello_world.com",
    "expect_string": "Testing"
  }
}

The content of the test_baidu_ddt.yaml file is as follows:

"case1":
  "search_string": "testing"
  "expect_string": "Testing"
"case2":
  "search_string": "hello_world.com"
  "expect_string": "Testing"

The code of the test_baidu_ddt.py file is as follows:

import codecs
import json
import os
import time
import pytest
import yaml


def test_read_data_from_json_yaml(data_file):
    return_value = []
    data_file_path = os.path.abspath(data_file)
    print(data_file_path)

    _is_yaml_file = data_file_path.endswith((".yml", ".yaml"))
    with codecs.open(data_file_path, 'r', 'utf-8') as f:
        # Load the data from the YAML or JSON file
        if _is_yaml_file:
            data = yaml.safe_load(f)
        else:
            data = json.load(f)

    for i, elem in enumerate(data):
        if isinstance(data, dict):
            key, value = elem, data[elem]
            if isinstance(value, dict):
                case_data = []
                for v in value.values():
                    case_data.append(v)
                return_value.append(tuple(case_data))
            else:
                return_value.append((value,))
    return return_value


@pytest.mark.baidu
class TestBaidu:

    @pytest.mark.parametrize('search_string, expect_string',  test_read_data_from_json_yaml('tests_pytest_ddt/test_baidu_ddt.yaml'))
    def test_baidu_search(self, login, search_string, expect_string):
        driver, s, base_url = login
        driver.get(base_url + "/")
        driver.find_element_by_id("kw").send_keys(search_string)
        driver.find_element_by_id("su").click()
        time.sleep(2)

        search_results = driver.find_element_by_xpath('//*[@id="1"]/h3/a').get_attribute('innerHTML')
        print(search_results)
        assert (expect_string in search_results) is True


if __name__ == "__main__":
    pytest.main(['-s', '-v'])

The code in this file is almost the same as the code explained in the earlier article on data-driven testing with the Pytest framework. The only changes are the new function test_read_data_from_json_yaml and the argument of @pytest.mark.parametrize, which now gets its parameters from a file instead of having them written inline:

test_read_data_from_json_yaml('tests_pytest_ddt/test_baidu_ddt.yaml')
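When parameters come from a file, the generated test ids are built from the values, which can be hard to read in a report. As a possible refinement, the sketch below wraps each case in pytest.param so that the case name from the file becomes the test id. The helper as_params and the in-memory CASES dict are assumptions for illustration; the test function is a stand-in that does not touch a browser.

```python
import pytest

# Hypothetical in-memory copy of the case data, shaped like test_baidu_ddt.json
CASES = {
    "case1": {"search_string": "testing", "expect_string": "Testing"},
    "case2": {"search_string": "hello_world.com", "expect_string": "Testing"},
}


def as_params(cases):
    """Wrap each case in pytest.param so the case name is used as the test id."""
    return [pytest.param(*case.values(), id=name) for name, case in cases.items()]


PARAMS = as_params(CASES)


@pytest.mark.parametrize("search_string, expect_string", PARAMS)
def test_case_values_are_strings(search_string, expect_string):
    # Stand-in check; a real test would drive the browser here
    assert isinstance(search_string, str)
    assert isinstance(expect_string, str)
```

With this, a run with -v should list the tests by case name (case1, case2) rather than by their parameter values.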

 

Run it in the command line as follows:

D:\Python_Test\APITest>pytest tests_pytest_ddt -s -v

 

After running, it can be seen that both test cases were executed and that the data in the YAML file was read correctly.

What if we now want to run with the data in the JSON file? In the code above, change the .yaml suffix of the file passed to the function to .json and run it again.
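Since the two data files differ only in their suffix, the switch can be sketched with pathlib. Deriving the path from the test module also avoids depending on the directory pytest is started from, which the relative path above does. The helper name data_file_for is an assumption for this sketch.

```python
from pathlib import Path


def data_file_for(test_module, suffix):
    """Return the path of the data file that sits next to test_module."""
    return Path(test_module).with_suffix(suffix)


# In a real test module, __file__ would be passed instead of this literal path.
yaml_path = data_file_for("tests_pytest_ddt/test_baidu_ddt.py", ".yaml")
json_path = data_file_for("tests_pytest_ddt/test_baidu_ddt.py", ".json")

print(yaml_path.name)
print(json_path.name)
```

The resulting path can be passed to the reader function as str(yaml_path) or str(json_path).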

 

Data-driven testing in pytest with Excel files

In practice, many companies also keep their test data in Excel. Python has several libraries for reading and writing Excel files; common ones are xlrd, xlwt, and openpyxl. Because xlrd only reads and xlwt only writes, doing both requires two libraries and more code, so they have gradually become less popular. The rest of this section therefore focuses on openpyxl.

 

openpyxl installation

pip install openpyxl

 

openpyxl usage

from openpyxl import load_workbook, Workbook

if __name__ == "__main__":
    # Create a workbook
    file_name = r'c:\test.xlsx'
    wb = Workbook()

    # Create a sheet named Testing and insert it at the front
    wb.create_sheet('Testing', 0)

    # Create a sheet named TEST and insert it at index 1
    wb.create_sheet('TEST', 1)

    # Save the workbook
    wb.save(file_name)

    # Read and write
    # Load the workbook that was just saved
    wb2 = load_workbook(file_name)

    # Read: get the names of all sheets
    print(wb2.sheetnames)

    # Get the sheet named Testing
    s = wb2['Testing']

    # Set the value of cell A1 to Testing
    s['A1'] = 'Testing'

    # Set the value at row 2, column 1 to 1
    s.cell(row=2, column=1).value = 1

    # Print the value at row 2, column 1 -- method 1
    print(s.cell(row=2, column=1).value)

    # Print the value at row 2, column 1 -- method 2
    print(s['A2'].value)

    # Save the loaded workbook (wb2) so the edits above are written to disk
    wb2.save(file_name)

The code block above briefly shows how openpyxl is used: creating a workbook, creating sheets, reading the value of a cell, setting the value of a cell, and so on. As you can see, working with Excel through openpyxl is fairly simple.
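For test data it is often handier to write whole rows than single cells. The sketch below does this in memory with the worksheet's append method, without touching the disk. The row contents are an assumption that mirrors the JSON cases used earlier, not the actual content of test_baidu_ddt.xlsx.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active          # the default sheet that every new workbook starts with
ws.title = "Testing"

# append() writes each list as the next empty row, starting at row 1
ws.append(["search_string", "expect_string"])
ws.append(["testing", "Testing"])
ws.append(["hello_world.com", "Testing"])

# Cells can then be addressed by coordinate or by row/column number
print(ws["A2"].value)
print(ws.cell(row=3, column=1).value)
print(ws.max_row, ws.max_column)
```

Calling wb.save with a file name afterwards would persist the sheet in the same way as in the example above.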

 

Data-driven testing with openpyxl and pytest

The content of the file test_baidu_ddt.xlsx is as follows (sheet name Testing):
[Image: content of the Testing sheet in test_baidu_ddt.xlsx]
Let's write a function to read the Excel file. The code is as follows:

def test_read_data_from_excel(excel_file, sheet_name):

    return_value = []

    # Check that the file exists
    if not os.path.exists(excel_file):
        raise ValueError("File not exists")

    # Open the workbook
    wb = load_workbook(excel_file)

    # Output the data of the requested sheet in the format pytest accepts
    for s in wb.sheetnames:
        if s == sheet_name:
            sheet = wb[sheet_name]
            for row in sheet.rows:
                return_value.append([col.value for col in row])

    # The first row is the header, so skip it
    return return_value[1:]
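One thing to be aware of is that the function above returns an empty list when the sheet name does not match, so a typo in the sheet name silently produces zero test cases. A possible variant, sketched below under the name read_rows (an assumption for this example), raises an error instead and lets openpyxl skip the header row through iter_rows. The workbook is a throwaway file created only to exercise the helper.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def read_rows(excel_file, sheet_name):
    """Return the data rows (header skipped) as tuples; fail on an unknown sheet."""
    wb = load_workbook(excel_file)
    if sheet_name not in wb.sheetnames:
        raise KeyError(f"Sheet {sheet_name!r} not found in {excel_file}")
    # min_row=2 skips the header row; values_only=True yields plain values
    return list(wb[sheet_name].iter_rows(min_row=2, values_only=True))


# Build a small workbook in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
demo = Workbook()
sheet = demo.active
sheet.title = "Testing"
sheet.append(["search_string", "expect_string"])
sheet.append(["testing", "Testing"])
demo.save(path)

print(read_rows(path, "Testing"))
```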

Update the test_baidu_ddt.py file and add the test_read_data_from_excel method. The updated code is as follows:

import codecs
import json
import os
import time
import pytest
import yaml
from openpyxl import load_workbook


def test_read_data_from_json_yaml(data_file):
    return_value = []
    data_file_path = os.path.abspath(data_file)
    print(data_file_path)

    _is_yaml_file = data_file_path.endswith((".yml", ".yaml"))
    with codecs.open(data_file_path, 'r', 'utf-8') as f:
        # Load the data from the YAML or JSON file
        if _is_yaml_file:
            data = yaml.safe_load(f)
        else:
            data = json.load(f)

    for i, elem in enumerate(data):
        if isinstance(data, dict):
            key, value = elem, data[elem]
            if isinstance(value, dict):
                case_data = []
                for v in value.values():
                    case_data.append(v)
                return_value.append(tuple(case_data))
            else:
                return_value.append((value,))
    return return_value


def test_read_data_from_excel(excel_file, sheet_name):

    return_value = []
    if not os.path.exists(excel_file):
        raise ValueError("File not exists")

    wb = load_workbook(excel_file)
    for s in wb.sheetnames:
        if s == sheet_name:
            sheet = wb[sheet_name]
            for row in sheet.rows:
                return_value.append([col.value for col in row])
    print(return_value)
    return return_value[1:]


@pytest.mark.baidu
class TestBaidu:
    # Note: the call here now uses the function that reads Excel
    @pytest.mark.parametrize('search_string, expect_string',  test_read_data_from_excel(r'D:\Python_Test\APITest\tests_pytest_ddt\test_baidu_ddt.xlsx', 'Testing'))
    def test_baidu_search(self, login, search_string, expect_string):
        driver, s, base_url = login
        driver.get(base_url + "/")
        driver.find_element_by_id("kw").send_keys(search_string)
        driver.find_element_by_id("su").click()
        time.sleep(2)

        search_results = driver.find_element_by_xpath('//*[@id="1"]/h3/a').get_attribute('innerHTML')
        print(search_results)
        assert (expect_string in search_results) is True


if __name__ == "__main__":
    pytest.main(['-s', '-v','tests_pytest_ddt'])

Run it again on the command line as follows:

D:\Python_Test\APITest>pytest tests_pytest_ddt -s -v

Check the results after the run: the test is executed correctly, and the test data comes from the specified sheet of the Excel file.

 

 

Data-driven testing with Pandas

openpyxl makes working with Excel fairly concise, but Pandas is more concise still. Keep in mind that Pandas itself reads .xlsx files through an engine such as openpyxl, so the advantage lies mainly in how little code is needed, particularly for sheets with many rows.

Pandas is a powerful toolset for analyzing structured data. It is built on NumPy, which provides high-performance array and matrix operations. Pandas is used for data mining and data analysis and also provides data-cleaning functions. With Pandas, handling Excel data becomes very simple.

 

Pandas installation

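# pandas reads .xlsx files through openpyxl, so install it first
# (xlrd 2.0 and later only handles legacy .xls files)
pip install openpyxl

# Install pandas
pip install pandas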

 

Pandas syntax

import pandas as pd

# Initialize first; with engine=None pandas picks the engine from the file type
s = pd.ExcelFile(path_or_buffer, engine=None)

# Then parse (signature as of pandas 1.x; squeeze, convert_float and
# mangle_dupe_cols were removed in pandas 2.0)
s.parse(sheet_name=0, header=0, names=None, index_col=None, usecols=None,
        squeeze=False, converters=None, true_values=None, false_values=None,
        skiprows=None, nrows=None, na_values=None, parse_dates=False,
        date_parser=None, thousands=None, comment=None, skipfooter=0,
        convert_float=True, mangle_dupe_cols=True, **kwds)

Reading an Excel file with Pandas is very simple. First, initialize an ExcelFile. Of its two parameters, path_or_buffer is the path of the file we want to read.

It is recommended to use English (ASCII) names for the path and the Excel file rather than Chinese ones.

import pandas as pd

path_or_buffer = r'D:\Python_Test\APITest\tests_pytest_ddt\test_baidu_ddt.xlsx'

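engine is the engine Pandas uses to read the file. The available options include "xlrd", "openpyxl", "odf", and "pyxlsb". If it is not provided, current versions of pandas choose an engine from the file extension (openpyxl for .xlsx); older versions used xlrd by default.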
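As a small illustration of the two ExcelFile parameters, the sketch below names the engine explicitly and uses ExcelFile as a context manager so that the file is closed afterwards. The temporary file and its contents are assumptions made so the example is self-contained; openpyxl needs to be installed.

```python
import os
import tempfile

import pandas as pd

# Write a one-row sheet to a temporary file to have something to read
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
pd.DataFrame(
    {"search_string": ["testing"], "expect_string": ["Testing"]}
).to_excel(path, sheet_name="Testing", index=False)

# Name the engine explicitly; the with-block closes the file on exit
with pd.ExcelFile(path, engine="openpyxl") as xls:
    sheet_names = xls.sheet_names
    df = xls.parse("Testing")

print(sheet_names)
print(df.values.tolist())
```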

 


Once the ExcelFile has been initialized, the s.parse() function can be used. parse takes many parameters; only a few commonly used ones are listed here.

 

sheet_name: the Excel sheet to read
sheet_name can be an integer index, a sheet name, or a list combining the two.

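# Initialize the ExcelFile (sheet_name is an argument of parse, not of ExcelFile)
s = pd.ExcelFile(path_or_buffer)

# Read by sheet name
data = s.parse(sheet_name = 'Testing')

# Read by index: the first sheet (sheet indexes start at 0)
data = s.parse(sheet_name = 0)

# Combined read: the 4th sheet, the sheet named Testing, and the sheet named Sheet6
# (passing a list returns a dict of DataFrames)
data = s.parse(sheet_name = [3, 'Testing', 'Sheet6'])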
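The return type depends on what is passed as sheet_name, which is easy to trip over. The sketch below writes a two-sheet workbook to a temporary file (the sheet names First and Testing and their contents are assumptions for this example) and shows the three cases.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"n": [1]}).to_excel(writer, sheet_name="First", index=False)
    pd.DataFrame({"n": [2]}).to_excel(writer, sheet_name="Testing", index=False)

with pd.ExcelFile(path) as xls:
    one = xls.parse(sheet_name="Testing")        # a single DataFrame
    many = xls.parse(sheet_name=[0, "Testing"])  # dict keyed by the requested entries
    everything = xls.parse(sheet_name=None)      # dict of all sheets, keyed by name

print(type(one).__name__)
print(list(many))
print(list(everything))
```

A reader function that passes the result straight to parametrize should therefore request exactly one sheet.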

 

header: which row to use for the column names
The default value of header is 0, that is, the first row. It can also be a list such as [0, x]; for example, [0, 1] uses the first two rows as a multi-level column index.

data = s.parse(sheet_name = 'Testing', header = 0)

Note: Pandas treats the first row as the header by default, so the first row of the Excel sheet must be a title row. If the first row holds data, that row is consumed as column names and drops out of the data. If the sheet has no header row, pass header=None.
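The effect of the note above can be seen on a sheet written without a header row. In this sketch the file, the two data rows, and the column names passed through names are assumptions chosen to match the earlier cases.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")

# Two data rows and no header row at all
pd.DataFrame(
    [["testing", "Testing"], ["hello_world.com", "Testing"]]
).to_excel(path, sheet_name="Testing", index=False, header=False)

with pd.ExcelFile(path) as xls:
    default = xls.parse("Testing")                 # header=0 consumes the first data row
    no_header = xls.parse("Testing", header=None)  # both rows kept, columns named 0 and 1
    named = xls.parse(
        "Testing", header=None, names=["search_string", "expect_string"]
    )

print(len(default), len(no_header))
print(list(named.columns))
```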

 

usecols: the columns to read
usecols accepts a list of integer column indexes starting from 0, such as [0, 1, 2], or a string of Excel column letters such as "A:D, F", which reads columns A through D plus column F.

data = s.parse(sheet_name = 'Testing', usecols='A:D')

 

skiprows: rows to skip when reading
skiprows=n skips the first n rows; skiprows=[a, b, c] skips the rows with 0-based indexes a, b, and c, that is, rows a+1, b+1, and c+1 of the sheet.

data = s.parse(sheet_name = 'Testing', skiprows = [1,2,3])

 

nrows: the number of rows to read
Only the first nrows data rows are read.

data = s.parse(sheet_name = 'Testing', nrows = 3)
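To check how usecols, skiprows, and nrows behave, here is a small sketch on a sheet with three columns and five data rows. The column names a, b, c and the values are made up for the example.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
pd.DataFrame(
    {"a": [1, 2, 3, 4, 5], "b": [10, 20, 30, 40, 50], "c": list("vwxyz")}
).to_excel(path, sheet_name="Testing", index=False)

with pd.ExcelFile(path) as xls:
    # Excel column letters: only columns A and B are read
    first_two_cols = xls.parse("Testing", usecols="A:B")
    # 0-based row indexes: row 0 is the header, rows 1 and 2 are the first two data rows
    skipped = xls.parse("Testing", skiprows=[1, 2])
    # Only the first two data rows after the header
    first_rows = xls.parse("Testing", nrows=2)

print(list(first_two_cols.columns))
print(skipped["a"].tolist())
print(first_rows["a"].tolist())
```

Because the header is row 0, a list passed to skiprows should start at 1 if the column names are to be kept.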

 

Data-driven testing with Pandas and Pytest

After understanding the Pandas syntax, let's take a look at how to use Pandas to read Excel data:

def test_read_data_from_pandas(excel_file, sheet_name):

    if not os.path.exists(excel_file):
        raise ValueError("File not exists")

    # Initialize
    s = pd.ExcelFile(excel_file)

    # Parse the Excel sheet
    df = s.parse(sheet_name)

    # Return the data as a list
    return df.values.tolist()

As you can see, reading Excel data with pandas takes even less code.
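One practical detail: empty cells in a sheet come back from pandas as NaN, so trailing blank rows would turn into extra test cases. The sketch below works on an in-memory DataFrame (its contents are an assumption standing in for a parsed sheet), drops rows that are entirely empty, and converts the rest to tuples.

```python
import pandas as pd

# Stand-in for a parsed sheet; the last row simulates a blank row in Excel
df = pd.DataFrame(
    {
        "search_string": ["testing", "hello_world.com", None],
        "expect_string": ["Testing", "Testing", None],
    }
)

# Drop rows in which every cell is empty
cleaned = df.dropna(how="all")

as_lists = cleaned.values.tolist()
as_tuples = list(cleaned.itertuples(index=False, name=None))

print(as_lists)
print(as_tuples)
```

Either form works with pytest.mark.parametrize; tuples simply match what the JSON/YAML reader returns.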
Finally, let's update the test_baidu_ddt.py file. The updated code is as follows:

import codecs
import json
import os
import time
import pytest
import yaml
from openpyxl import load_workbook
import pandas as pd

# Read YAML and JSON files
def test_read_data_from_json_yaml(data_file):
    return_value = []
    data_file_path = os.path.abspath(data_file)
    print(data_file_path)

    _is_yaml_file = data_file_path.endswith((".yml", ".yaml"))
    with codecs.open(data_file_path, 'r', 'utf-8') as f:
        # Load the data from the YAML or JSON file
        if _is_yaml_file:
            data = yaml.safe_load(f)
        else:
            data = json.load(f)

    for i, elem in enumerate(data):
        if isinstance(data, dict):
            key, value = elem, data[elem]
            if isinstance(value, dict):
                case_data = []
                for v in value.values():
                    case_data.append(v)
                return_value.append(tuple(case_data))
            else:
                return_value.append((value,))
    return return_value

# Read an Excel file -- openpyxl
def test_read_data_from_excel(excel_file, sheet_name):
    return_value = []

    if not os.path.exists(excel_file):
        raise ValueError("File not exists")

    wb = load_workbook(excel_file)
    for s in wb.sheetnames:
        if s == sheet_name:
            sheet = wb[sheet_name]
            for row in sheet.rows:
                return_value.append([col.value for col in row])
    print(return_value)
    return return_value[1:]

# Read an Excel file -- Pandas
def test_read_data_from_pandas(excel_file, sheet_name):
    if not os.path.exists(excel_file):
        raise ValueError("File not exists")
    s = pd.ExcelFile(excel_file)
    df = s.parse(sheet_name)
    return df.values.tolist()

@pytest.mark.baidu
class TestBaidu:

    @pytest.mark.parametrize('search_string, expect_string',  test_read_data_from_pandas(r'D:\Python_Test\APITest\tests_pytest_ddt\test_baidu_ddt.xlsx', 'Testing'))
    def test_baidu_search(self, login, search_string, expect_string):
        driver, s, base_url = login
        driver.get(base_url + "/")
        driver.find_element_by_id("kw").send_keys(search_string)
        driver.find_element_by_id("su").click()
        time.sleep(2)

        search_results = driver.find_element_by_xpath('//*[@id="1"]/h3/a').get_attribute('innerHTML')
        print(search_results)
        assert (expect_string in search_results) is True


if __name__ == "__main__":
    pytest.main(['-s', '-v', 'tests_pytest_ddt'])

Run it again on the command line as follows:

D:\Python_Test\APITest>pytest tests_pytest_ddt -s -v

Check the results after the run: the test is executed correctly, and the test data is read through Pandas from the specified sheet of the Excel file.


In fact, Pandas can read not only Excel files but also HTML, text/CSV, and JSON files, SQL databases, and more. Pandas is widely used in data analysis. For more detailed usage, refer to the pandas documentation.
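As a brief sketch of the same idea with other formats, the snippet below feeds CSV and JSON text through pandas and ends up with the same row structure as the Excel reader. The inline strings are assumptions standing in for real files.

```python
import io

import pandas as pd

csv_text = (
    "search_string,expect_string\n"
    "testing,Testing\n"
    "hello_world.com,Testing\n"
)
json_text = '[{"search_string": "testing", "expect_string": "Testing"}]'

# StringIO stands in for an open file; a file path could be passed instead
csv_rows = pd.read_csv(io.StringIO(csv_text)).values.tolist()
json_rows = pd.read_json(io.StringIO(json_text)).values.tolist()

print(csv_rows)
print(json_rows)
```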


Origin blog.csdn.net/weixin_41754309/article/details/113662924