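I needed to export data from a database to an Excel file for download. Because the data set is large, the export took too long, so I measured how fast different Python modules are.
pandas writes .xlsx files with openpyxl when xlsxwriter is not installed, so openpyxl is the baseline here and xlsxwriter is the comparison. Straight to the code: the table holds about 230,000 rows, and the resulting Excel file is about 35 MB.
The hardware used for the test:
CPU: Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz
Memory: 8.0 GB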
openpyxl:
import pandas as pd
import MySQLdb
from MySQLdb import cursors
import time

# Connection settings; fill in user, password, and db before running.
db_settings = {
    'user': '',
    'password': '',
    'db': '',
    'host': '127.0.0.1',
    'port': 3306,
    'cursorclass': cursors.DictCursor,
    'charset': 'utf8'
}


def main():
    start = time.time()
    con = MySQLdb.connect(**db_settings)
    cursor = con.cursor()
    cursor.execute('select * from test')
    data = cursor.fetchall()
    con.close()
    df = pd.DataFrame(data)
    # The context manager saves the file on exit; writer.save() was removed in pandas 2.0.
    with pd.ExcelWriter('output.xlsx', engine='openpyxl') as writer:
        df.to_excel(writer, sheet_name='Sheet1', index=False)
    end = time.time()
    print(end - start)


if __name__ == '__main__':
    main()

Elapsed time: 197.815589427948 seconds
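The measured time covers the database query, the DataFrame construction, and the Excel write together. To see how much of it belongs to the Excel engine alone, each stage can be timed separately. The following is a minimal sketch, not part of the original test: the `timed` helper and the stand-in workloads are hypothetical and would be replaced by the real `cursor.fetchall()` and `df.to_excel(...)` calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


if __name__ == "__main__":
    timings = {}

    with timed("query", timings):
        # Stand-in for cursor.execute(...) / cursor.fetchall().
        rows = [{"id": i, "name": f"row {i}"} for i in range(100_000)]

    with timed("write", timings):
        # Stand-in for df.to_excel(writer, ...).
        text = "\n".join(str(row) for row in rows)

    for label, seconds in timings.items():
        print(f"{label}: {seconds:.3f} s")
```

Since the query stage is identical for both engines, comparing only the "write" stage gives a cleaner picture of the engine difference.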
xlsxwriter:
import pandas as pd
import MySQLdb
from MySQLdb import cursors
import time

# Connection settings; fill in user, password, and db before running.
db_settings = {
    'user': '',
    'password': '',
    'db': '',
    'host': '127.0.0.1',
    'port': 3306,
    'cursorclass': cursors.DictCursor,
    'charset': 'utf8'
}


def main():
    start = time.time()
    con = MySQLdb.connect(**db_settings)
    cursor = con.cursor()
    cursor.execute('select * from test')
    data = cursor.fetchall()
    con.close()
    df = pd.DataFrame(data)
    # The context manager saves the file on exit; writer.save() was removed in pandas 2.0.
    with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1', index=False)
    end = time.time()
    print(end - start)


if __name__ == '__main__':
    main()

Elapsed time: 102.69404006004333 seconds
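In this test xlsxwriter was roughly twice as fast as openpyxl (about 103 seconds versus 198 seconds). It may also be worth skipping pandas and writing the file with xlsxwriter directly.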
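A rough sketch of what writing with xlsxwriter directly could look like is given below. It was not timed as part of this test, so no speed claim is made for it. The function names `write_rows` and `export_to_xlsx` are hypothetical, and the rows are assumed to be dictionaries such as those returned by a `DictCursor`. The `constant_memory` option is an xlsxwriter workbook setting that flushes each row as it is written, which requires rows to be written in order.

```python
from typing import Any, Iterable, Mapping, Sequence


def write_rows(worksheet: Any, columns: Sequence[str],
               rows: Iterable[Mapping[str, Any]]) -> int:
    """Write a header row, then one sheet row per record.

    `worksheet` is any object with a write_row(row, col, data) method,
    for example an xlsxwriter worksheet. Returns the number of data rows.
    """
    worksheet.write_row(0, 0, columns)
    count = 0
    for count, record in enumerate(rows, start=1):
        # Missing keys become None, which xlsxwriter writes as a blank cell.
        worksheet.write_row(count, 0, [record.get(name) for name in columns])
    return count


def export_to_xlsx(path: str, columns: Sequence[str],
                   rows: Iterable[Mapping[str, Any]]) -> int:
    """Write rows to an .xlsx file with xlsxwriter, without pandas."""
    import xlsxwriter  # third-party; imported here so write_rows has no hard dependency

    workbook = xlsxwriter.Workbook(path, {"constant_memory": True})
    try:
        worksheet = workbook.add_worksheet("Sheet1")
        return write_rows(worksheet, columns, rows)
    finally:
        workbook.close()  # the file is only written to disk on close()


if __name__ == "__main__":
    # Sample data standing in for cursor.fetchall() with a DictCursor.
    sample = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    written = export_to_xlsx("output.xlsx", ["id", "name"], sample)
    print(written, "rows written")
```

With the database code above, `columns` could be taken from the keys of the first fetched row and `rows` from `cursor.fetchall()`. Date and time columns may need an explicit cell format to display as dates in Excel.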