Automated testing: which file format is best for storing test cases?

When implementing automated testing, we often take a data-driven approach: the test data is saved separately in a file of a specific format, and the automated test code is driven by reading that file.
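The skeleton of that pattern looks like this (a minimal sketch using pytest; the inline list stands in for data that would normally come from a file):

import pytest

# Stand-in data; in practice these rows would be loaded from excel/csv/yaml.
test_data = [
    (1, 'name1', 'http://www.example.com/1'),
    (2, 'name2', 'http://www.example.com/2'),
]

@pytest.mark.parametrize('case_id, name, url', test_data)
def test_demo(case_id, name, url):
    # The same test body runs once per data row: the essence of data-driven testing.
    assert url.startswith('http')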

This article compares three mainstream file formats, excel, csv, and yaml, to see which one works best.

1. First, the most widely used: Excel

Excel is the most widely used data file format in the world. When writing automated tests in Python, you can use the third-party library openpyxl to work with excel files.

It is often said that Excel imposes a lot of restrictions and that writing and reading it are very slow. Is that really true?

Let's run an experiment: create an empty Excel file, insert 1,000 rows of data, and measure how long inserting and reading take.

import openpyxl

def test_insert_1000_lines_data():
    lines = 1000
    workbook = openpyxl.Workbook()
    worksheet = workbook.create_sheet('demo')
    for i in range(lines):
        # Each test case is one row: id, name, url.
        data = (i, f'name{i}', f'http://www.example.com/{i}')
        worksheet.append(data)
    workbook.save('1000lines.xlsx')

def test_read_1000_lines_data():
    workbook = openpyxl.load_workbook('1000lines.xlsx')
    worksheet = workbook['demo']
    for row in worksheet.values:
        # Iterate over every row; a real test would consume the data here.
        pass

How long does it take to insert 1,000 rows? The answer is 0.09 seconds, which is not slow at all. Reading those 1,000 rows back takes only about 0.07 seconds.
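The exact numbers depend on hardware, of course. A minimal sketch of how such timings can be taken, wrapping the functions above with time.perf_counter:

import time

start = time.perf_counter()
test_insert_1000_lines_data()
print(f'insert: {time.perf_counter() - start:.2f}s')

start = time.perf_counter()
test_read_1000_lines_data()
print(f'read: {time.perf_counter() - start:.2f}s')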

For a single project, 1,000 test cases is a reasonable scale, so in ordinary test scenarios excel is perfectly adequate for managing test case data in terms of read efficiency.

But as the data grows larger, Excel gets slower and slower: at 100,000 rows it takes about 7 seconds just to read the data. This means that if several projects need to be tested at the same time, parsing excel can have a noticeable impact on overall test efficiency.

             1,000 rows   50,000 rows   100,000 rows
insert time  0.09s        3.75s         7.44s
read time    0.07s        3.45s         7.1s

2. Next, the csv format

For both automated testing and data analysis, csv is often the better fit. Unlike Excel, the format carries no table styling at all; it focuses purely on the data.

Moreover, Python ships with a built-in csv module that is very simple to use and requires almost no extra learning.

import csv

def test_insert_1000_lines_data():
    lines = 1000
    # newline='' prevents extra blank lines on Windows when writing csv.
    with open('1000lines.csv', 'w', newline='') as f:
        csv_writer = csv.writer(f)
        for i in range(lines):
            data = (i, f'name{i}', f'http://www.example.com/{i}')
            csv_writer.writerow(data)

def test_read_1000_lines_data():
    with open('1000lines.csv', newline='') as f:
        csv_reader = csv.reader(f)
        for row in csv_reader:
            print(row)

Inserting and reading 1,000, 50,000, and 100,000 rows respectively, csv is an order of magnitude faster than excel.

             1,000 rows   50,000 rows   100,000 rows
insert time  0.08s        0.09s         0.17s
read time    0.02s        0.04s         0.09s

For large volumes of data, csv is much faster than excel, and the code is simpler to write. One thing to watch out for is comma handling: in csv, the fields in each row are separated by commas by default, so a field that itself contains a comma must be wrapped in double quotes.
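Python's csv module applies that quoting automatically, on both write and read. A quick sketch (the file name quoted.csv is just for illustration):

import csv

with open('quoted.csv', 'w', newline='') as f:
    # The writer quotes the comma-containing field on its own.
    csv.writer(f).writerow([1, 'hello, world', 'http://www.example.com/1'])

with open('quoted.csv', newline='') as f:
    row = next(csv.reader(f))
    print(row)  # ['1', 'hello, world', 'http://www.example.com/1']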

Also, the data types csv supports are very limited: every field is read back as a string, so any structured value needs an extra parsing step of your own.

id,17,18,"{'name': 'yuz', 'age': 11}"
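For a row like the one above, one way to recover the structured field is ast.literal_eval from the standard library (a sketch):

import ast

row = ['id', '17', '18', "{'name': 'yuz', 'age': 11}"]
# csv hands every field back as a string; convert them yourself.
age = int(row[2])
data = ast.literal_eval(row[3])
print(age, data['name'])  # 18 yuz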

3. Finally, the performance of yaml

The advantage of yaml lies in its rich data type support: lists, dictionaries, numbers, and Boolean values can all be parsed directly into the corresponding Python types.

import yaml

def test_insert_1000_lines_data():
    lines = 1000
    with open('1000lines.yaml', 'w') as f:
        # Nested dictionaries are written out as structured yaml.
        all_data = [{"id": i,
                     "name": f"name{i}",
                     "data": {"username": "yuz", "password": 123456}}
                    for i in range(lines)]
        yaml.safe_dump(all_data, f)

def test_read_1000_lines_data():
    with open('1000lines.yaml', encoding='utf-8') as f:
        # SafeLoader parses scalars into native Python types.
        data = yaml.load(f, Loader=yaml.SafeLoader)
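To see the typed parsing in action, safe_load turns yaml scalars and mappings directly into native Python types (a small standalone sketch):

import yaml

doc = """
id: 17
active: true
score: 3.5
data: {username: yuz, password: 123456}
"""
parsed = yaml.safe_load(doc)
# Booleans, floats, ints, and nested mappings arrive as native types.
print(type(parsed['active']), type(parsed['score']), parsed['data'])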

For small amounts of data, yaml is very convenient. But once the data grows to tens of thousands of entries, parsing becomes very slow: at 100,000 rows, reading takes close to a minute.

             1,000 rows   50,000 rows   100,000 rows
insert time  0.27s        14.79s        31.78s
read time    0.47s        26.8s         53.63s

Recently, more and more automation testers have been using yaml to store test cases, partly for the rich data types it supports and partly under the influence of certain frameworks.

httprunner, an interface automation testing framework, stores its use case data in yaml. When a large amount of data has to be read, however, yaml is an order of magnitude slower than even Excel. In terms of test efficiency, a framework like httprunner is therefore better suited to running a single use case or a small batch of use cases; running an entire project, or several projects at once, will be slower.

Comparing the operating efficiency of Excel, csv, and yaml, we can draw the following conclusions.

  1. If you only need to run a small number of use cases, or you have higher requirements for the structure of the test data, yaml makes the use case data easiest to parse. In that scenario, though, a mature tool such as Postman is usually enough, and there is no need to build it yourself.

  2. If you are used to working in Excel, you can simply stick with it; Excel is still very fast below 10,000 rows or so.

  3. In any case, I recommend giving the csv format a try. First, even at 100,000 rows its processing speed is very fast. Second, the csv module is built into Python, its usage is very similar to the built-in open() function, and there is almost no extra learning cost (see the sketch below).
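As a final convenience for the csv route, csv.DictReader returns each row as a dictionary keyed by field name. Since the file written earlier has no header row, the field names are passed in explicitly (a sketch):

import csv

with open('1000lines.csv', newline='') as f:
    # fieldnames supplies the header the file itself lacks.
    for row in csv.DictReader(f, fieldnames=['id', 'name', 'url']):
        print(row['url'])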

If this article helped you, remember to like, bookmark, and follow. I'll share more useful content from time to time...


Origin blog.csdn.net/myh919/article/details/131483454