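Speed test: exporting data to Excel with pandas

I needed to export data from a database to an Excel file for download. The data set is fairly large and the export took too long, so I tested how fast different Python modules are.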

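pandas can write .xlsx files with either openpyxl or xlsxwriter as the engine, so I compared the two. (Without an explicit engine argument, pandas prefers xlsxwriter when it is installed and falls back to openpyxl otherwise.) Straight to the code: the table holds about 230,000 rows, and the resulting Excel file is about 35 MB.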

Hardware used for the test:

CPU: Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz

Memory: 8.0 GB

openpyxl:

import pandas as pd
import MySQLdb
from MySQLdb import cursors
import time

db_settings = {
    'user': '',
    'password': '',
    'db': '',
    'host': '127.0.0.1',
    'port': 3306,
    'cursorclass': cursors.DictCursor,
    'charset': 'utf8'
}


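def main():
    start = time.time()
    con = MySQLdb.connect(**db_settings)
    cursor = con.cursor()
    cursor.execute('select * from test')
    data = cursor.fetchall()
    con.close()  # all rows are in memory, so the connection can be released
    df = pd.DataFrame(data)
    # The with block saves and closes the workbook on exit; it replaces
    # writer.save(), which recent pandas releases have removed.
    with pd.ExcelWriter('output.xlsx', engine='openpyxl') as writer:
        df.to_excel(writer, sheet_name='Sheet1', index=False)
    end = time.time()
    # Elapsed seconds, covering both the query and the Excel export.
    print(end - start)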


if __name__ == '__main__':
    main()

Total time: 197.815589427948 seconds
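The figure above covers the database query as well as the Excel write. A minimal sketch for timing the two phases separately is shown here; the `timed` helper is hypothetical (it is not part of the original script), and the list comprehension and `sum(...)` call merely stand in for the real query and export steps:

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed('query', timings):
    rows = [{'id': i} for i in range(1000)]  # placeholder for cursor.fetchall()
with timed('write', timings):
    total = sum(row['id'] for row in rows)   # placeholder for df.to_excel(...)

for label, seconds in timings.items():
    print(f'{label}: {seconds:.4f}s')
```

Wrapping the real `fetchall()` and `to_excel()` calls this way would show how much of the total is actually spent in the Excel engine.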

xlsxwriter:

import pandas as pd
import MySQLdb
from MySQLdb import cursors
import time

db_settings = {
    'user': '',
    'password': '',
    'db': '',
    'host': '127.0.0.1',
    'port': 3306,
    'cursorclass': cursors.DictCursor,
    'charset': 'utf8'
}


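def main():
    start = time.time()
    con = MySQLdb.connect(**db_settings)
    cursor = con.cursor()
    cursor.execute('select * from test')
    data = cursor.fetchall()
    con.close()  # all rows are in memory, so the connection can be released
    df = pd.DataFrame(data)
    # The with block saves and closes the workbook on exit; it replaces
    # writer.save(), which recent pandas releases have removed.
    with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1', index=False)
    end = time.time()
    # Elapsed seconds, covering both the query and the Excel export.
    print(end - start)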


if __name__ == '__main__':
    main()

Total time: 102.69404006004333 seconds
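The two measurements can be compared with a few lines of arithmetic. This sketch uses the two timings reported in this post; `row_count` is only the approximate figure of 230,000 rows, and the rows-per-second numbers include the database query time because the original timings do:

```python
openpyxl_seconds = 197.815589427948      # measured with engine='openpyxl'
xlsxwriter_seconds = 102.69404006004333  # measured with engine='xlsxwriter'
row_count = 230_000                      # approximate row count of the table

# Ratio of the two total run times.
speedup = openpyxl_seconds / xlsxwriter_seconds
print(f'speedup: {speedup:.2f}x')

# Rough end-to-end throughput for each engine.
for name, seconds in [('openpyxl', openpyxl_seconds),
                      ('xlsxwriter', xlsxwriter_seconds)]:
    print(f'{name}: {row_count / seconds:.0f} rows/s')
```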

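The timings show that xlsxwriter is roughly 1.9 times as fast as openpyxl in this test. It may also be worth skipping pandas altogether and trying xlsxwriter directly.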
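Writing with xlsxwriter directly might look like the following. This is only a sketch under assumptions: `rows_to_table` and `write_xlsx` are hypothetical helper names, the input is assumed to be the sequence of dicts that a `DictCursor` returns, and this variant was not timed in this post, so no speed claim is made for it.

```python
def rows_to_table(dict_rows):
    """Turn DictCursor-style rows into a header list plus a list of value lists."""
    if not dict_rows:
        return [], []
    header = list(dict_rows[0].keys())
    body = [[row[col] for col in header] for row in dict_rows]
    return header, body


def write_xlsx(path, dict_rows):
    """Write the rows straight to an .xlsx file with xlsxwriter, bypassing pandas."""
    import xlsxwriter  # imported here so rows_to_table works without the package

    header, body = rows_to_table(dict_rows)
    # constant_memory flushes each row once the next one starts,
    # so rows must be written in order, top to bottom.
    workbook = xlsxwriter.Workbook(path, {'constant_memory': True})
    worksheet = workbook.add_worksheet('Sheet1')
    worksheet.write_row(0, 0, header)
    for row_index, values in enumerate(body, start=1):
        worksheet.write_row(row_index, 0, values)
    workbook.close()
```

Calling `write_xlsx('output.xlsx', cursor.fetchall())` would then take the place of the `DataFrame` and `ExcelWriter` steps in the scripts above.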


Reposted from www.cnblogs.com/shouwangrenjian/p/12576323.html