Python (1): Formatting an Excel File

Requirement:

  A customer uploaded an Excel file for a POC test over SFTP. I pulled it down to the cloud desktop, opened it, and found a pile of formatting problems. What to do? The company does not allow files to be downloaded to a local machine for processing, so it could only be handled on the server.
A number of types needed converting and the time format was wrong. Thinking it over, Python could do the job: convert the file to CSV with every value as a string, which also matches what the interface expects.
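
The post does not say how the dates were malformed. One common case, assumed here, is that date cells arrive as Excel serial numbers (days counted in the 1900 date system). A minimal Python 3 sketch of turning such a number into a string; the function name and the sample value are hypothetical:

```python
from datetime import datetime, timedelta

# Day zero of the Excel 1900 date system. This offset is only valid for
# serials from 1 March 1900 onward, because Excel treats 1900 as a leap year.
_EXCEL_EPOCH = datetime(1899, 12, 30)


def excel_serial_to_text(serial, fmt="%Y-%m-%d"):
    """Convert an Excel serial date (assumed 1900 date system) to text."""
    moment = _EXCEL_EPOCH + timedelta(days=float(serial))
    return moment.strftime(fmt)


print(excel_serial_to_text(43831))  # hypothetical sample serial
```

If the dates are already text in the wrong layout, parsing them with `datetime.strptime` and re-emitting them with `strftime` would be the equivalent step.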
 
Having thought it through, I got started. Since I had never written Python before, I was still a little nervous.
I found a demo that parses Excel and adapted it as I went. I ran it against an Excel file I had processed before and printed the fields, and the test worked.

Problem 1: on the server, reading failed on the very first cell (a header field) with an encoding error.

I knew it was an encoding problem but not why (I had processed files locally before without any trouble). I had asked a Python expert earlier, who told me to try encode('utf-8').

With that, the script ran successfully. I still do not know why, since the file itself is UTF-8 encoded. (I did not get to the bottom of it!)
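
One plausible explanation, which the post does not confirm: in Python 2, xlrd returns cell text as unicode, and printing it or mixing it with byte strings falls back to the ASCII codec when the server's locale is not UTF-8, while a local terminal often is. The Python 3 sketch below only illustrates that failure mode; the header text and the helper name are hypothetical.

```python
# Hypothetical non-ASCII header cell, as xlrd would return it (text, not bytes).
header = "姓名"


def encode_or_none(text, codec):
    """Return text encoded with codec, or None if the codec cannot represent it."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


# ASCII cannot represent the header, which is what an ASCII-only locale implies.
print(encode_or_none(header, "ascii"))
# Encoding explicitly to UTF-8 works regardless of the locale.
print(encode_or_none(header, "utf-8"))
```

Under this reading, calling encode('utf-8') simply removes the dependency on the server's default encoding.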
 

Problem 2: after writing the contents to a CSV file, I found that the column order did not meet the requirements. I thought about it for a while, could not come up with anything clever, and fell back on the crudest method: writing the columns out by hand in the required order.

Fortunately the amount of data is small, so performance is not a concern.
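
A less manual option is the standard library's csv.DictWriter, which writes dictionaries in whatever column order is passed as fieldnames. The sketch below is Python 3 and is not the code used in this post; the function name and the sample row are made up for illustration.

```python
import csv
import io

# Required column order, taken from the header list used in this post.
COLUMNS = ["Name", "id_card", "phone", "identitu_type", "Date"]


def rows_to_csv(rows, out, columns=COLUMNS):
    """Write a list of dicts to the file-like object out in a fixed column order."""
    # extrasaction="ignore" drops keys that are not listed in columns.
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample row whose keys are deliberately out of order.
sample = [{"Date": "2019-07-22", "phone": "0917", "Name": "Ana",
           "id_card": "A1", "identitu_type": "Passport", "extra": "x"}]
buf = io.StringIO()
rows_to_csv(sample, buf)
print(buf.getvalue())
```

When writing to a real file in Python 3, open it with newline="" as the csv module documentation recommends.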

Problem 3: some strings contain emoji, and I did not manage to handle them. [The demos I found online all failed my tests, so I have put this aside for now and will look for a ready-made library.]
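
For reference, a regex-based approach could look like the sketch below. It is Python 3 and is not the remove_emoji that the code in this post calls. The character ranges are an assumption about what counts as emoji: they also strip every other character outside the Basic Multilingual Plane, including rare CJK characters, so it is a blunt tool.

```python
import re

# Assumed definition of "emoji" for this sketch.
_EMOJI = re.compile(
    "["
    "\U00010000-\U0010FFFF"  # everything outside the Basic Multilingual Plane
    "\u2600-\u27BF"          # Miscellaneous Symbols and Dingbats
    "\uFE0F"                 # variation selector-16 (emoji presentation)
    "\u200D"                 # zero-width joiner used in emoji sequences
    "]+"
)


def remove_emoji(text):
    """Return text with every character matched by _EMOJI removed."""
    return _EMOJI.sub("", text)


print(remove_emoji("Juan \U0001F600Cruz"))
```

On a narrow Python 2 build the same ranges would have to be written as surrogate pairs, which may be why demos found online fail there.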

Summary:

Handling formats with Python did not feel hard. The difficulty lies in not knowing which packages to use and in some basic syntax. For plain text processing, it is not difficult.

The main code is as follows (don't laugh, I'm a complete Python beginner):
# Python 2 code (print statement, byte strings).
import csv
import xlrd

'''
    Read the Excel file
'''
def read_from_excel(filepath):
    data = xlrd.open_workbook(filepath)
    table = data.sheets()[0]
    nor = table.nrows  # number of rows
    nol = table.ncols  # number of columns

    print 'row: %d , column: %d' % (nor, nol)
    result = []

    for i in range(1, nor):  # row 0 holds the column titles
        record = {}
        flag = True  # set to False when identitu_type is not recognised
        # if i == 10:
        #     break
        for j in range(nol):
            # Cell text comes back as unicode; encode it to UTF-8 bytes.
            # Assumes every cell is text: numeric cells are floats and have no encode().
            title = table.cell_value(0, j).encode('utf-8')
            print(str(i) + '--' + str(j) + '---' + title)
            # print(chardet.detect(table.cell_value(i, j)))
            value = table.cell_value(i, j).encode('utf-8').replace('\n', '')
            print(str(i) + '--' + str(j) + '---' + value)
            # Map the raw identity type to its display name.
            if title == 'identitu_type':
                if value == 'SSS':
                    value = 'SSS card'
                elif value == 'PASSPORT':
                    value = 'Passport'
                elif value == 'DRIVERLICENCE':
                    value = "Driver's license"
                elif value == 'PHILHEALTH':
                    value = "PhilHealth"
                elif value == 'UMID':
                    value = "UMID"
                else:
                    flag = False
            print(str(i) + '--' + str(j) + '---' + value)

            # remove_emoji is a helper that is not shown in this post.
            record[title] = remove_emoji(value)
        if flag:
            result.append(record)

    return result
'''
    Write the list of dicts to a CSV file
'''
def nestedlist2csv(rows, out_file):
    with open(out_file, 'wb') as f:  # Python 2: the csv module expects binary mode
        w = csv.writer(f)
        fieldnames = rows[0].keys()  # keys found in the data, printed for reference only
        print fieldnames

        # Write the header and every row in the required column order.
        title = ['Name', 'id_card', 'phone', 'identitu_type', 'Date']
        w.writerow(title)
        for row in rows:
            print(row.values())
            value = [row[name] for name in title]
            w.writerow(value)
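
The chain of if/elif branches that renames the identity types can also be expressed as a lookup table. This is an alternative sketch, not the code that was run; the names IDENTITY_TYPES and normalize_identity_type are hypothetical, and the mapping values are copied from the branches above.

```python
# Raw value from the spreadsheet -> display name, copied from the if/elif chain.
IDENTITY_TYPES = {
    "SSS": "SSS card",
    "PASSPORT": "Passport",
    "DRIVERLICENCE": "Driver's license",
    "PHILHEALTH": "PhilHealth",
    "UMID": "UMID",
}


def normalize_identity_type(raw):
    """Return the display name for raw, or None when the type is unknown."""
    return IDENTITY_TYPES.get(raw)


print(normalize_identity_type("DRIVERLICENCE"))
```

A None result plays the role of flag = False: the caller would skip that row.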

 

 


Origin www.cnblogs.com/idea-persistence/p/11222525.html