Using Python's openpyxl library to copy information from a source Excel table into a target Excel table

In modern life it is hard to avoid dealing with Excel sheets. Excel is easy to use, but when a table holds a large amount of data and we have to copy and paste some of it (such as ID numbers) from other tables, the work quickly becomes tiring. After all, we are not machines, and nobody can keep up a boring, repetitive operation for long. Imagine a table with thousands of rows in which you must enter each person's ID number next to their name. We have filled in similar tables before, so for some of these people a complete name and ID number already exist elsewhere. Without a tool, we have to look up each name one by one, find the ID number, and copy it into the current table.

After repeating these operations day after day, I kept wishing for an automated tool that would free me from this inhuman torture. In the end I thought of Python, because with it I do not have to worry about the low-level details of the language and can concentrate on solving the problem.

Install openpyxl from the command line with pip install openpyxl, or with easy_install openpyxl.

Working with openpyxl can be divided into four steps. The first step is to create a new workbook or load an existing one into memory, using Workbook and load_workbook respectively:

from openpyxl import load_workbook
from openpyxl import Workbook
# Load a workbook that already exists
wb1 = load_workbook('lalala.xlsx')
"""
When the source table holds a large amount of data, we can load it with openpyxl's read_only mode.
The benefit of doing so is that the whole table is not loaded into memory.
"""
wb1 = load_workbook(filename='lalala.xlsx', read_only=True)
# Create a new workbook
wb2 = Workbook()

The second step is to work with the sheets of the workbook. A workbook created with Workbook() has one active sheet, named Sheet by default; you can verify this in the Python interactive shell.

# Access the active sheet (wb is a workbook created or loaded as in the first step)
ws = wb.active
# Set the sheet title
ws.title = "range names"
# Create a sheet titled Pi
ws = wb.create_sheet(title="Pi")
# Get the sheet titled Sheet1
ws = wb['Sheet1']

The third step is to work with the cells in a sheet. The location of a cell is determined by its column and its row: a cell in column A and the third row, for example, is accessed through ws['A3']. A cell also has row and column attributes; the data types of cell.row and cell.column are shown in the figure below.

[Figure: data types of cell.row and cell.column]

Pay special attention when loading a workbook in read_only mode: cell.row and cell.column are both int objects. cell.column records the position of the cell's column counted from the first column; it is not the capital letter, such as "A", that identifies the column in the workbook. (In openpyxl 2.6 and later, cell.column is an integer in normal mode as well, and openpyxl.utils.get_column_letter converts it to the letter.)
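To make the relationship between a coordinate such as A3 and the integer row and column values concrete, here is a minimal pure-Python sketch. The helper names split_coordinate and column_letters_to_number are hypothetical and are not part of openpyxl; openpyxl ships its own helpers for this in openpyxl.utils (for example column_index_from_string), which are probably the better choice in real code.

```python
import re


def split_coordinate(coordinate):
    """Split a coordinate such as 'A3' into its column letters and row number."""
    match = re.fullmatch(r"([A-Za-z]+)([1-9][0-9]*)", coordinate)
    if match is None:
        raise ValueError(f"not a cell coordinate: {coordinate!r}")
    letters, row = match.groups()
    return letters.upper(), int(row)


def column_letters_to_number(letters):
    """Convert column letters to a 1-based column number ('A' is column 1)."""
    number = 0
    for char in letters.upper():
        # Treat the letters as digits of a base-26 number without a zero digit.
        number = number * 26 + (ord(char) - ord("A") + 1)
    return number
```

For example, split_coordinate('F5') separates the column letter F from the row number 5, and column_letters_to_number('F') gives the integer that cell.column would report for that column.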

# Get the first row; the data type is tuple
row = ws[1]
# Get column A; the data type is tuple
column = ws['A']
# Set the value of cell F5
ws['F5'] = 'sfs'
# Set the value of the cell through its value attribute
ws['F5'].value = 'hello'
# Get the row number of a cell
m = ws['F5'].row
# Get the column number of a cell
n = ws['F5'].column
# Get the cells of a specific range, such as F5 to F30; the data type is tuple
k = ws['F5':'F30']
# Get the cells of a specific range, such as F5 to G30; the data type is tuple
j = ws['F5':'G30']
# Get the maximum number of rows in the sheet
row_count = ws.max_row
# Get the maximum number of columns in the sheet
column_count = ws.max_column

The last step is to save the changes. Note that if the file you want to save is open in another program (Microsoft Office or WPS), the save operation will raise an error.

wb1.save('empty_book.xlsx')
wb2.save(filename='other_book.xlsx')
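The save error mentioned above can be handled in code. The following is only a sketch under an assumption: that a file held open by another program surfaces as PermissionError (this is the usual behaviour on Windows, but it is not guaranteed everywhere). The helper save_with_fallback and its fallback_path parameter are hypothetical names; the function relies only on the workbook object having a save(path) method.

```python
def save_with_fallback(workbook, path, fallback_path):
    """Save to path; if that file is locked, save to fallback_path instead.

    Returns the path that was actually written.
    """
    try:
        workbook.save(path)
        return path
    except PermissionError:
        # Assumption: a file open in Excel or WPS is reported as PermissionError.
        workbook.save(fallback_path)
        return fallback_path
```

With the workbooks above this could be called as save_with_fallback(wb2, 'other_book.xlsx', 'other_book_copy.xlsx'), so that the edited data is not lost when the original file happens to be open.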

Implementing the requirement

Create a new file named Get_info_from_excel.py and open it in your favorite text editor. First import the load_workbook function from the openpyxl library; load_workbook loads an Excel file that already exists.

from openpyxl import load_workbook

Our aim is to extract information from the source Excel table and copy it in bulk into the target Excel table. We first define some variables.

# File name of the source table
source_file_name = 'lalala.xlsx'
# File name of the target table
target_file_name = 'lelele.xlsx'
# Sheet in the source table to extract information from
source_sheet_name = 'Sheet2'
# Sheet in the target table to copy information into
target_sheet_name = 'Sheet2'
# Row number of the header row in the source table
source_header_row = 3
# Row number of the header row in the target table
target_header_row = 2
# Header of the source column whose values are used as the matching condition
source_cell_condition = 'name'
# Header of the target column whose values are used as the matching condition
target_cell_condition = 'name'
# Header of the source column to extract information from
source_cell_filled = 'ID number'
# Header of the target column to copy information into
target_cell_filling = 'identity No.'

Load the source table and the target table into memory so that the next steps can work on both.

# When the source table holds a large amount of data, we can load it with openpyxl's read_only mode; the benefit is that the whole table is not loaded into memory
# wb_r = load_workbook(source_file_name)
wb_r = load_workbook(filename=source_file_name, read_only=True)
wb_w = load_workbook(target_file_name)

Using the sheet names and header row numbers already defined, get the sheets and header rows of the source table and the target table:

ws_r=wb_r[source_sheet_name]
ws_w=wb_w[target_sheet_name]

header_row_r=ws_r[source_header_row]
header_row_w=ws_w[target_header_row]

Process the header row of the source table to get the information we need:

"""
When openpyxl loads a workbook in read_only mode, the cells it returns are not ordinary cells.
In testing, cell.column is an integer column offset, so here we define a function that
converts the integer into the real Excel column name, such as "A" or "BB".
"""
def readOnly_offsetColunmNumber_toRealColumn(number):
    column = ''
    if number <= 26:
        # One-letter columns: 1 is 'A', 26 is 'Z'
        column = chr(number + ord('A') - 1)
    else:
        # Two-letter columns: 27 is 'AA', 52 is 'AZ', 702 is 'ZZ'
        number1, number2 = divmod(number - 1, 26)
        column1 = chr(number1 + ord('A') - 1)
        column2 = chr(number2 + ord('A'))
        column = column1 + column2
    return column

# Initialize two variables: the condition column of the source table and the column to copy from
source_condition_column = ''
source_filled_column = ''
"""
Loop over the header row of the source table to find the position of the condition column
and the position of the column to copy from. The maximum row of the condition column
is used later, in the nested loops.
"""
for cell in header_row_r:
    if cell.value == source_cell_condition:
        source_condition_column = readOnly_offsetColunmNumber_toRealColumn(cell.column)
    elif cell.value == source_cell_filled:
        source_filled_column = readOnly_offsetColunmNumber_toRealColumn(cell.column)
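The conversion function above is written for one-letter and two-letter columns. If a sheet might be wider than that, a general conversion is needed. The following sketch uses the hypothetical name column_number_to_letters; it is intended to give the same results as openpyxl.utils.get_column_letter, which can be used instead of a hand-written helper.

```python
def column_number_to_letters(number):
    """Convert a 1-based column number to Excel column letters of any length."""
    if number < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while number > 0:
        # Subtract 1 first because Excel columns have no zero digit:
        # 26 must map to 'Z', not to 'A' followed by a zero.
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters
```

The subtraction before each division is the step that is easy to get wrong: without it, column numbers that are exact multiples of 26 come out as the wrong letters.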

Process the header row of the target table to get the information we need:

# Initialize two variables: the condition column of the target table and the column to paste into
target_condition_column = ''
target_filling_column = ''
"""
Loop over the header row of the target table to find the position of the condition column
and the position of the column to paste into. The maximum row of the condition column
is used later, in the nested loops.
"""
# In openpyxl 2.6 and later cell.column is an int, so the letter is read from column_letter;
# in older versions cell.column already holds the letter and can be used directly.
for cell_j in header_row_w:
    if cell_j.value == target_cell_condition:
        target_condition_column = cell_j.column_letter
    elif cell_j.value == target_cell_filling:
        target_filling_column = cell_j.column_letter
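If a header is misspelled or missing, the loops above leave the column variable as an empty string and the failure only shows up later as a confusing range error. One possible way to fail early is sketched below. The function locate_columns is a hypothetical helper, not part of openpyxl; it works on a plain list of header values, and stripping surrounding whitespace from headers is an assumption about how the sheets are filled in.

```python
def locate_columns(header_values, wanted):
    """Return {header: 1-based column position} for every header in wanted.

    Raises KeyError if any wanted header is absent from header_values.
    """
    positions = {}
    for index, value in enumerate(header_values, start=1):
        # Assumption: stray spaces around a header should not prevent a match.
        text = value.strip() if isinstance(value, str) else value
        if text in wanted and text not in positions:
            positions[text] = index
    missing = [name for name in wanted if name not in positions]
    if missing:
        raise KeyError(f"header(s) not found: {missing}")
    return positions
```

With the variables defined earlier, the header values could be collected with [cell.value for cell in header_row_w] and passed in together with [target_cell_condition, target_cell_filling].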

Now that we have all the required information, it is time to actually paste the data.

"""
Loop over the condition column of the target table and, inside it, loop over the condition column
of the source table. Whenever a cell in the target condition column has the same value as a cell
in the source condition column, assign the value of the cell to copy in the same source row
to the cell to paste into in the same target row.
"""
for cell_m in ws_w[target_condition_column + str(target_header_row + 1):target_condition_column + str(ws_w.max_row)]:
    # Skip blank target cells so that they never match blank source cells
    if cell_m[0].value is None:
        continue
    for cell_n in ws_r[source_condition_column + str(source_header_row + 1):source_condition_column + str(ws_r.max_row)]:
        if cell_m[0].value == cell_n[0].value:
            ws_w[target_filling_column + str(cell_m[0].row)].value = ws_r[source_filled_column + str(cell_n[0].row)].value
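The nested loops above compare every target row with every source row, which becomes slow for tables with thousands of rows. A common alternative, sketched here on plain Python data rather than on worksheets, is to build a dictionary from the source rows once and then look each target name up in it. The names build_lookup and fill_from_lookup are hypothetical, and the sketch assumes the source rows have already been read as (name, ID number) pairs.

```python
def build_lookup(source_rows):
    """Build {name: id_number} from an iterable of (name, id_number) pairs.

    Later rows overwrite earlier ones, which mirrors the nested loops,
    where the last matching source row determines the copied value.
    """
    lookup = {}
    for name, id_number in source_rows:
        if name is not None:
            lookup[name] = id_number
    return lookup


def fill_from_lookup(target_names, lookup):
    """Return the ID number for each target name, or None when there is no match."""
    return [lookup.get(name) for name in target_names]
```

The dictionary makes each lookup roughly constant time, so the total work grows with the sum of the two table sizes rather than their product. Duplicate names remain ambiguous either way and are worth checking by hand.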

Finally, save the target workbook.

wb_w.save(target_file_name)

This article is reproduced from https://www.py.cn/toutiao/11131.html

Origin www.cnblogs.com/jsdd/p/11599321.html