How to use Python to export large amounts of data to Excel

Reprinted from Pinlue Library: http://www.pinlue.com/article/2020/04/0210/1610100837708.html

(1) Problem description: For presenting data, an Excel file is often more convenient than a plain text file. But how do you export data to Excel from Python specifically? And what do you do when you need to export a large amount of data? This article addresses both questions.


(2) Steps:

1. First, install openpyxl.

pip install openpyxl is all it takes. However, the version installed on Windows was 2.2.6, while on CentOS version 4.1 was installed automatically.

The code ran without problems on Windows, but on CentOS it raised an error saying that ew = ExcelWriter(workbook=wb) takes at least one argument. So I installed version 2.2.6 on my 237 server as well, and the problem was solved.

pip install openpyxl==2.2.6
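Since the behaviour described above depends on which openpyxl version pip installed, it can help to check the installed version before running anything. The following is a minimal sketch, assuming Python 3.8 or later (the article's own code is Python 2); the helper name installed_version is invented for this illustration and is not part of the original script.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what pip actually installed on this machine.
found = installed_version("openpyxl")
if found is None:
    print("openpyxl is not installed")
else:
    print("openpyxl version:", found)
```

If the reported version differs from the one the code was written for, pinning it with the pip command above is the simplest remedy.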

2. Second, without further ado, go straight to the code. Note that the code contains two implementations, one with xlwt and one with openpyxl.

(3) Further reading: from material found online, the main points are as follows:

Python has two groups of libraries for working with Excel: one is xlrd, xlwt and xlutils, the other is openpyxl.

The former group (xlrd, xlwt) is older and can only handle xls files generated by Excel 97-2003 or earlier. xlwt does not support Excel 2007 and later at all. A sheet in an xls file supports at most 256 columns and 65,536 rows.
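To make the limit concrete, here is a small sketch that checks whether a table fits in a single sheet. The xls figures are the ones quoted above; the xlsx figures (1,048,576 rows and 16,384 columns) are Excel's documented limits for the newer format and are not from the original article. The function name fits_in_sheet is hypothetical.

```python
# Sheet limits: .xls (Excel 97-2003) versus .xlsx (Excel 2007+).
XLS_MAX_ROWS, XLS_MAX_COLS = 65536, 256
XLSX_MAX_ROWS, XLSX_MAX_COLS = 1048576, 16384


def fits_in_sheet(n_rows, n_cols, fmt="xls"):
    """Return True if a table of n_rows x n_cols fits in one sheet of fmt."""
    if fmt == "xls":
        max_rows, max_cols = XLS_MAX_ROWS, XLS_MAX_COLS
    elif fmt == "xlsx":
        max_rows, max_cols = XLSX_MAX_ROWS, XLSX_MAX_COLS
    else:
        raise ValueError("unknown format: %r" % fmt)
    return n_rows <= max_rows and n_cols <= max_cols


# 100,000 records plus one header row, five columns each.
print(fits_in_sheet(100001, 5, "xls"))   # False
print(fits_in_sheet(100001, 5, "xlsx"))  # True
```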

So when a large amount of data has to be exported to Excel, you have three options: 1) store it in a different format, such as a CSV file; 2) use openpyxl, because it supports the xlsx/xlsm formats of Excel 2007+; 3) use win32 COM (Windows only).

Of course, we should face the difficulty head-on: to present the product and its data better to users, we still choose the second option.
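For comparison, the first option (saving as CSV) needs nothing beyond the standard library. The sketch below assumes Python 3 and tab-separated input lines like those the script reads from its txt file; the function name records_to_csv, the sample records and the output file name are all made up for this illustration.

```python
import csv
import os
import tempfile


def records_to_csv(records, head_row, csv_name):
    """Write a header row and tab-separated records to a CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(csv_name, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(head_row)
        for record in records:
            writer.writerow(record.rstrip("\n").split("\t"))


# Hypothetical sample data written to the system temp directory.
csv_path = os.path.join(tempfile.gettempdir(), "demo_export.csv")
records_to_csv(["1\tAlice\n", "2\tBob\n"], ["id", "name"], csv_path)
```

CSV has no row limit of its own, but it also carries no styling, which is why the article prefers a real Excel file for presentation.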

PS: fortunately, after some searching I found openpyxl. It supports Excel 2007+, it is actively maintained, and its documentation is clear. With the tutorial and the API documentation you can get started quickly. This is the one.

[Added 2018-07-13: PS, the code below passes testing with openpyxl __version__ = '2.2.6' and with openpyxl __version__ = '2.4.8'. Since the latest openpyxl version is __version__ = '2.5.4', see the example of operating Excel with Python for the latest version.]

(4) Without further ado, here is the code for reference:

# coding: utf-8
'''
created by yaoyz
date: 2017/01/24
'''
# Python 2 code, tested by the author with openpyxl 2.2.6 and 2.4.8
import xlrd
import xlwt
# Workbook class
from openpyxl.workbook import Workbook
# ExcelWriter wraps the logic for writing an Excel file
from openpyxl.writer.excel import ExcelWriter
# Helper that converts a column number into its column letter
from openpyxl.utils import get_column_letter
from openpyxl.reader.excel import load_workbook

class HandleExcel():
    '''Excel-related operations'''

    def __init__(self):
        self.head_row_labels = [u'Student ID', u'Student name', u'Contact', u'Knowledge ID', u'Knowledge name']

    def read_from_file(self, filename):
        """
        Function:
            Read every record from a txt file and store it in a list.
        Param:
            filename: name of the file to read
        Return:
            res_list: the list of records
        """
        res_list = []
        with open(filename, "r") as file_obj:
            for line in file_obj.readlines():
                res_list.append(line)
        return res_list

    def read_excel_with_openpyxl(self, excel_name="testexcel2007.xlsx"):
        """
        Function:
            Read every record from an .xlsx file and return it in data_dic.
        Param:
            excel_name: name of the file to read
        Return:
            data_dic: dict of records, keyed by row number
        """
        # Read an Excel 2007 file
        wb = load_workbook(filename=excel_name)
        # Show the named ranges and the sheet names
        print "Worksheet range(s):", wb.get_named_ranges()
        print "Worksheet name(s):", wb.get_sheet_names()
        # Take the first sheet
        sheetnames = wb.get_sheet_names()
        ws = wb.get_sheet_by_name(sheetnames[0])
        # Show the sheet title, number of rows and number of columns
        print "Work Sheet Title:", ws.title
        print "Work Sheet Rows:", ws.get_highest_row()
        print "Work Sheet Cols:", ws.get_highest_column()
        # Get the number of rows and columns of the sheet
        row_num = ws.get_highest_row()
        col_num = ws.get_highest_column()
        print "row_num:", row_num, "col_num:", col_num
        # Dictionary that stores the data
        data_dic = {}
        sign = 1
        # Store each row in the dictionary
        for row in ws.rows:
            temp_list = []
            # print "row", row
            for cell in row:
                print cell.value,
                temp_list.append(cell.value)
            print ""
            data_dic[sign] = temp_list
            sign += 1
        print data_dic
        return data_dic

    def write_to_excel_with_openpyxl(self, records, head_row, save_excel_name="save.xlsx"):
        """
        Function:
            Write the records to an .xlsx file.
        Param:
            records: list containing the records to save
            head_row: labels for the header row
            save_excel_name: name of the file to save
        Return:
            none
        """
        # Create a new workbook
        wb = Workbook()
        # Create an ExcelWriter
        ew = ExcelWriter(workbook=wb)
        # Set the output path and file name
        dest_filename = save_excel_name.decode('utf-8')
        # The first sheet is ws
        ws = wb.worksheets[0]
        # Set the title of ws
        ws.title = "range names"
        # Write the first row, the header row
        for h_x in range(1, len(head_row) + 1):
            h_col = get_column_letter(h_x)
            # print h_col
            ws.cell('%s%s' % (h_col, 1)).value = '%s' % (head_row[h_x - 1])
        # Write the second row and everything after it
        i = 2
        for record in records:
            record_list = str(record).strip().split("\t")
            for x in range(1, len(record_list) + 1):
                col = get_column_letter(x)
                ws.cell('%s%s' % (col, i)).value = '%s' % (record_list[x - 1].decode('utf-8'))
            i += 1
        # Write the file
        ew.save(filename=dest_filename)

    def read_excel(self, excel_name):
        """
        Function:
            Read an Excel file with xlrd and print its contents as a test.
        Param:
            excel_name: name of the Excel file to read
        Return:
            the sheet named 'Sheet1'
        """
        workbook = xlrd.open_workbook(excel_name)
        # Get all sheet names
        print workbook.sheet_names()  # [u'sheet1', u'sheet2']
        sheet2_name = workbook.sheet_names()[1]
        # Get a sheet by index or by name
        sheet2 = workbook.sheet_by_index(1)  # sheet index starts from 0
        sheet2 = workbook.sheet_by_name('Sheet1')
        # Sheet name, number of rows, number of columns
        print sheet2.name, sheet2.nrows, sheet2.ncols
        # Get the values of a whole row or column (as a list)
        rows = sheet2.row_values(3)  # contents of the fourth row
        cols = sheet2.col_values(2)  # contents of the third column
        print rows
        print cols
        # Get cell contents
        print sheet2.cell(1, 0).value
        print sheet2.cell_value(1, 0)
        print sheet2.row(1)[0].value
        # Get the data type of the cell contents
        print sheet2.cell(1, 0).ctype
        # Get a sheet by name
        return workbook.sheet_by_name(u'Sheet1')

    def set_style(self, name, height, bold=False):
        """
        Function:
            Set the cell style.
        Param:
            name: name of the font
            height: font height
            bold: whether the font is bold
        Return:
            style: the configured style object
        """
        style = xlwt.XFStyle()  # initialize the style
        font = xlwt.Font()  # create a font for the style
        font.name = name  # e.g. 'Times New Roman'
        font.bold = bold
        font.colour_index = 4  # xlwt spells this attribute colour_index
        font.height = height
        borders = xlwt.Borders()
        borders.left = 6
        borders.right = 6
        borders.top = 6
        borders.bottom = 6
        style.font = font
        style.borders = borders
        return style

    def write_to_excel(self, dataset, save_excel_name, head_row):
        """
        Function:
            Save the records read from the txt file to an Excel file,
            applying the cell styles defined in set_style.
        Param:
            dataset: the data to save, stored as a list
            save_excel_name: name of the file to save
            head_row: labels for the header row
        Return:
            none (the result is saved to the Excel file)
        """
        f = xlwt.Workbook()  # create a workbook
        # Create the first sheet: sheet1
        count = 1
        sheet1 = f.add_sheet(u'sheet1', cell_overwrite_ok=True)  # create the sheet
        # First row: the header
        head_style = self.set_style('Times New Roman', 250, True)
        for p in range(len(head_row)):
            sheet1.write(0, p, head_row[p], head_style)
        # The style has to be defined outside the loop
        default = self.set_style('Times New Roman', 200, False)
        for line in dataset:
            row_list = str(line).strip("\n").split("\t")
            for pp in range(len(row_list)):
                sheet1.write(count, pp, row_list[pp].decode('utf-8'), default)
            count += 1
        f.save(save_excel_name)  # save the file

    def run_main_save_to_excel_with_openpyxl(self):
        print "Test reading and writing Excel 2007+ xlsx files, which allow more data to be written"
        print "1. Read the txt file into memory and store it in a list"
        dataset_list = self.read_from_file("test_excel.txt")
        # Use openpyxl to handle Excel 2007
        print "2. Write the records to an Excel file"
        head_row_label = self.head_row_labels
        save_name = "test_openpyxl.xlsx"
        self.write_to_excel_with_openpyxl(dataset_list, head_row_label, save_name)
        print "3. Done: the txt file has been saved as an Excel file"

    def run_main_save_to_excel_with_xlwt(self):
        print "4. Read the txt file into memory and store it in a list"
        dataset_list = self.read_from_file("test_excel.txt")
        # Use xlwt to handle Excel 97-2003
        print "5. Write the records to an Excel file"
        head_row_label = self.head_row_labels
        save_name = "test_xlwt.xls"
        # Call the xlwt writer here; an .xls file must not be written with openpyxl
        self.write_to_excel(dataset_list, save_name, head_row_label)
        print "6. Done: the txt file has been saved as an Excel file"

if __name__ == '__main__':
    print "create handle Excel Object"
    obj_handle_excel = HandleExcel()
    # Write the data to files with openpyxl and with xlwt
    obj_handle_excel.run_main_save_to_excel_with_openpyxl()
    obj_handle_excel.run_main_save_to_excel_with_xlwt()
    # Test reading the files back. Note: openpyxl cannot read xls files;
    # use xlrd for xls files and openpyxl for xlsx files.
    # obj_handle_excel.read_excel_with_openpyxl("testexcel2003.xls")  # wrong usage
    obj_handle_excel.read_excel("testexcel2003.xls")
    obj_handle_excel.read_excel_with_openpyxl("testexcel2007.xlsx")
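The listing above targets Python 2 and the openpyxl versions the author tested. As a rough point of comparison, here is a sketch of how the same txt-to-xlsx export might look on Python 3 with a recent openpyxl, using its write-only mode, which streams rows out instead of building the whole sheet in memory and therefore suits large exports. The names parse_record and export_records are invented for this illustration; this is not the original author's code, and it saves through Workbook.save rather than ExcelWriter.

```python
def parse_record(line):
    """Split one tab-separated text line into a list of cell values."""
    return line.rstrip("\n").split("\t")


def export_records(records, head_row, xlsx_name):
    """Stream a header row and the parsed records into an .xlsx file."""
    # Imported here so that parse_record stays usable without openpyxl.
    from openpyxl import Workbook

    wb = Workbook(write_only=True)  # write-only mode keeps memory use low
    ws = wb.create_sheet(title="records")  # write-only workbooks start with no sheet
    ws.append(list(head_row))
    for line in records:
        ws.append(parse_record(line))
    wb.save(xlsx_name)


# Hypothetical sample line in the same tab-separated layout as the txt input.
print(parse_record("1001\tAlice\t555-0100\n"))
```

Because records can be any iterable of lines, an open file object can be passed directly, for example inside a with open("test_excel.txt", encoding="utf-8") block, so the input never has to be loaded into a list first.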

That is all for this post. Did it all make sense?

 


Origin blog.csdn.net/yihuliunian/article/details/105310605