[01] Project: loading and storing shop data

 

Aims:

1. Successfully read the "Store Data .csv" file.
2. Parse the data into a list of dictionaries: [{'var1': value1, 'var2': value2, 'var3': value3, ...}, ..., {}]
3. Clean the data:
   ① comment and price fields: clean each into a number
   ② remove records with missing fields
   ③ commentlist: split into three fields and clean each into a number
4. Store the result as a .pkl file.

 

Reading the data

f = open('C:/Users/83759/Python micro professional analysts data information items _/Store data .csv', 'r', encoding='utf8')
for i in f.readlines()[:20]:
    print(i.split(','))
    # print(i.split(',')[-1].split(' '))
f.seek(0)  # move back to the start of the file so it can be read again
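The same preview can also be sketched with a `with` block, which closes the file automatically, and the standard `csv` module, which handles quoted fields that contain commas. The function name `preview_rows` and its parameters are illustrative and not part of the original code.

```python
import csv
from itertools import islice

def preview_rows(path, n=20, encoding='utf8'):
    """Return the first n rows of a CSV file as lists of strings."""
    # newline='' is the setting the csv module documentation recommends
    with open(path, 'r', encoding=encoding, newline='') as f:
        # islice stops after n rows, so the whole file is not loaded
        return list(islice(csv.reader(f), n))
```

Calling `preview_rows(path)` with the path to the data file returns the first 20 rows, and no `f.seek(0)` is needed afterwards because each call opens the file afresh.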

Cleaning data

# Create the cleaning functions for comment, price and commentlist
# (functional programming style)

def fcm(s):
    # comment cleaning function: split on a space, take the first item of the
    # resulting list (the number of reviews) and convert it to an integer
    if '条' in s:  # assumes the raw text marks the review count with the counter word '条'
        return int(s.split(' ')[0])
    else:
        return 'missing data'

def fpr(s):
    # price cleaning function: split on ¥, take the last item of the
    # resulting list (the price per person) and convert it to a float
    if '¥' in s:
        return float(s.split('¥')[-1])
    else:
        return 'missing data'

def fcl(s):
    # commentlist cleaning function: s is the field already split on spaces;
    # clean the quality, environment and service scores and convert them to floats
    if len(s) == 3:
        quality = float(s[0][2:])
        environment = float(s[1][2:])
        service = float(s[2][2:])
        return [quality, environment, service]
    else:
        return 'missing data'

# Try the commentlist cleaning on the first 10 lines
for i in f.readlines()[:10]:
    cl = fcl(i.split(',')[-1].split(' '))
    print(cl)
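To connect the cleaning functions to aim 2 (a list of dictionaries), here is a minimal sketch that cleans whole lines and drops records with missing data. The column order (name, comment, price, commentlist), the dictionary keys, the sample lines and the helper `clean_line` are all assumptions made for illustration; the real file may be laid out differently.

```python
MISSING = 'missing data'

def fcm(s):
    # review count, e.g. '120 条点评' -> 120
    return int(s.split(' ')[0]) if '条' in s else MISSING

def fpr(s):
    # price per person, e.g. '人均 ¥85' -> 85.0
    return float(s.split('¥')[-1]) if '¥' in s else MISSING

def fcl(parts):
    # three scores, each prefixed by a two-character label, e.g. '口味8.5' -> 8.5
    if len(parts) == 3:
        return [float(p[2:]) for p in parts]
    return MISSING

def clean_line(line):
    """Return a cleaned dict for one CSV line, or None if any field is missing."""
    fields = line.strip().split(',')
    if len(fields) != 4:  # hypothetical layout: name, comment, price, commentlist
        return None
    name, comment, price, commentlist = fields
    # split() with no argument tolerates runs of several spaces
    cm, pr, cl = fcm(comment), fpr(price), fcl(commentlist.split())
    if any(v == MISSING for v in (cm, pr, cl)):
        return None
    return {'name': name, 'comment': cm, 'price': pr,
            'quality': cl[0], 'environment': cl[1], 'service': cl[2]}

# Made-up sample lines standing in for the real file contents
lines = [
    'Shop A,120 条点评,人均 ¥85,口味8.5 环境8.0 服务7.5\n',
    'Shop B,暂无点评,人均 ¥40,口味7.0 环境7.0 服务7.0\n',   # no review count
    'Shop C,45 条点评,人均 ¥40,口味7.0 环境7.2\n',           # only two scores
]

data = [d for d in map(clean_line, lines) if d is not None]
print(data)
```

Only the first sample line survives, because the other two each have one field that cannot be cleaned.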





Result

[Screenshot of the printed results: image-20200306192157269]

 

Differences between pkl, csv and tsv files

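(1) pkl file:

A .pkl file is Python's own format for saving objects. If you open it directly, for example in a text editor, it shows the serialized data as unreadable bytes.

The correct way to open it is as follows:

import pickle  # Python 3; cPickle only exists in Python 2
f = open('path', 'rb')  # pickle files must be opened in binary mode
data = pickle.load(f)
f.close()
print(data)  # show the file contents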

1) .pkl is a file format used for storage in Python.

  2) It can store temporary variables used while a Python project runs, or strings, lists, dictionaries and other data that need to be extracted and kept for later.

  3) Saving means writing the data into a newly created .pkl file.

  4) When the data is needed again, open the file and load it.
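The save and load steps above can be sketched as a round trip with the standard `pickle` module. The records and the temporary file location are made up for the example.

```python
import os
import pickle
import tempfile

# Made-up cleaned records standing in for the real result
records = [
    {'name': 'Shop A', 'comment': 120, 'price': 85.0},
    {'name': 'Shop B', 'comment': 45, 'price': 40.0},
]

# Write into a temporary folder so the example leaves the project untouched
path = os.path.join(tempfile.mkdtemp(), 'data.pkl')

# 'wb': pickle writes bytes, so the file is opened in binary mode
with open(path, 'wb') as f:
    pickle.dump(records, f)

# 'rb': read the bytes back and rebuild the original Python objects
with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored == records)
```

One caution: `pickle.load` can execute arbitrary code while loading, so it should only be used on files you created yourself or otherwise trust.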

(2) csv file: a comma-delimited text file; it can be opened with Excel.

(3) tsv file: a tab-delimited text file; it can also be opened with Excel.
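The only difference between the two text formats is the delimiter, which a short sketch with the standard `csv` module makes visible. The sample rows and the helper `to_text` are illustrative.

```python
import csv
import io

rows = [['name', 'price'], ['Shop A', '85.0']]

def to_text(rows, delimiter):
    """Serialize rows to delimited text held in memory."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator='\n')
    writer.writerows(rows)
    return buf.getvalue()

csv_text = to_text(rows, ',')    # fields separated by commas
tsv_text = to_text(rows, '\t')   # fields separated by tab characters

# Reading back works the same way once the matching delimiter is given
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter='\t'))

print(repr(csv_text))
print(repr(tsv_text))
```

Unlike a .pkl file, both outputs are plain text and every value comes back as a string, so numbers have to be converted again after reading.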

Origin www.cnblogs.com/Lilwhat/p/12431065.html