Python: Writing crawled web page data to an Excel file
After a web crawler has fetched the information we need, we usually save the content to a txt file or a database. It can also be written to an Excel file. This article shows a simple way to save crawled web page data to an Excel file.
Required third-party libraries: requests, beautifulsoup4, xlwt.
Let's start with a simple example that stores data in an Excel file.
# Import the xlwt module
import xlwt

# Create a Workbook object, i.e. a new Excel workbook
f = xlwt.Workbook()

# Create the student information sheet; sheet1 represents one worksheet in the Excel file.
# The sheet is named "Student Information". cell_overwrite_ok controls whether a cell
# may be written more than once; it is a parameter of the Worksheet instance and
# defaults to False.
sheet1 = f.add_sheet('Student Information', cell_overwrite_ok=True)

# Header row
rowTitle = ['Student ID', 'Name', 'Sex', 'Date of birth']

# Student data rows
rowDatas = [
    ['10001', 'Zhang San', 'Male', '1998-2-3'],
    ['10002', 'Li Si', 'Female', '1999-12-12'],
    ['10003', 'Wang Wu', 'Male', '1998-7-8'],
]

# Write the header row
for i in range(0, len(rowTitle)):
    # 0 is the row and i is the column; together they identify a cell,
    # and rowTitle[i] is the content written to that cell
    sheet1.write(0, i, rowTitle[i])

# Write the student data
for k in range(0, len(rowDatas)):          # outer loop: one iteration per data row
    for j in range(0, len(rowDatas[k])):   # inner loop: j is the column index
        # k + 1 skips the header row, j is the column,
        # rowDatas[k][j] is the value inserted into the cell
        sheet1.write(k + 1, j, rowDatas[k][j])

# Path and file name to save to. xlwt writes the legacy .xls format,
# so the file uses the .xls extension rather than .xlsx.
f.save('D:/WriteToExcel.xls')

Open the file WriteToExcel.xls on drive D and you will find that the information has been written to the sheet.
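The header loop and the nested data loop from the example can be wrapped in small helpers so they can be reused for crawled data. The following is only a sketch. The function names `records_to_rows` and `write_table` are hypothetical, not part of xlwt, and it assumes the crawler yields one dict per item. The sheet argument is only expected to provide a `write(row, col, value)` method, which the worksheet returned by `add_sheet` does.

```python
def records_to_rows(records, fields):
    """Turn a list of dicts (one per crawled item) into a list of row lists.

    Missing fields are filled with an empty string so every row
    has the same number of columns.
    """
    return [[record.get(field, "") for field in fields] for record in records]


def write_table(sheet, header, rows):
    """Write `header` into row 0 and `rows` into the rows below it.

    `sheet` is any object with a write(row, col, value) method,
    for example the worksheet returned by xlwt's Workbook.add_sheet().
    Returns the number of cells written.
    """
    count = 0
    for col, title in enumerate(header):
        sheet.write(0, col, title)
        count += 1
    # start=1 leaves row 0 for the header
    for row_index, row in enumerate(rows, start=1):
        for col, value in enumerate(row):
            sheet.write(row_index, col, value)
            count += 1
    return count
```

With the names from the example above, this could be called as `write_table(sheet1, rowTitle, rowDatas)` before `f.save(...)`. Using `enumerate` avoids the manual `range(0, len(...))` indexing and the `k + 1` offset.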