Python data processing (VI): data cleaning, standardization and scripting

1. Data normalization and standardization

a. Normalization: rescale the values in the data set so that they fall within a specific range (for example, 0 to 1)

b. Standardization: rescale the values so that they have a mean of 0 and a standard deviation of 1

c. Remove outliers
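
The three operations listed above can be sketched with the standard library alone. This is only an illustration under stated assumptions: the function names (normalize, standardize, drop_outliers) and the z-score cutoff are hypothetical choices, not something the original notes prescribe.

```python
from statistics import mean, pstdev


def normalize(values, low=0.0, high=1.0):
    """Min-max rescale values into the range [low, high]."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # All values are identical, so there is no spread to rescale.
        return [low for _ in values]
    scale = (high - low) / (largest - smallest)
    return [low + (v - smallest) * scale for v in values]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def drop_outliers(values, max_z=3.0):
    """Keep only values whose absolute z-score is at most max_z.

    The cutoff of 3 is a common rule of thumb, assumed here for illustration.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= max_z]
```

For example, normalize([2, 4, 6]) maps the smallest value to 0 and the largest to 1. With small samples a single extreme value rarely reaches a z-score of 3, so a lower max_z may be needed.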

2. Data storage

a. Save to a SQLite database

b. Export to a simple CSV file
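
Both storage options can be sketched with Python's built-in sqlite3 and csv modules. The table name (cleaned), the two columns (name, value) and the function names are assumptions made for this example; a real project would use its own schema.

```python
import csv
import sqlite3


def save_to_sqlite(rows, db_path):
    """Insert (name, value) tuples into a 'cleaned' table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.execute(
                "CREATE TABLE IF NOT EXISTS cleaned (name TEXT, value REAL)"
            )
            conn.executemany(
                "INSERT INTO cleaned (name, value) VALUES (?, ?)", rows
            )
    finally:
        conn.close()


def export_to_csv(rows, csv_path):
    """Write (name, value) tuples to a CSV file with a header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "value"])
        writer.writerows(rows)
```

The parameterized INSERT (the ? placeholders) lets sqlite3 handle quoting, which is safer than building the SQL string by hand.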

3. Find the data cleaning method that suits your project

Before writing a script, decide how the code should be structured so that it can be reused, learned from and shared, and ask whether the code will actually save you time and improve efficiency

4. Scripting the data cleaning

4.1 Read the Zen of Python and follow its coding guidelines to make the code clearer

4.2 Make the code more reusable and generic

4.3 Document the code
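
As a rough illustration of points 4.2 and 4.3, the sketch below shows one small, generic cleaning function with a docstring. The function name clean_record and its fields parameter are hypothetical, chosen only for this example.

```python
def clean_record(record, fields=None):
    """Return a copy of record with whitespace stripped from string values.

    Args:
        record: dict mapping column names to raw values.
        fields: optional iterable of keys to clean; all keys when None.

    Returns:
        A new dict; the input record is left unmodified.
    """
    keys = list(record.keys()) if fields is None else fields
    cleaned = dict(record)
    for key in keys:
        value = cleaned.get(key)
        # Only strings are touched; numbers and missing keys pass through.
        if isinstance(value, str):
            cleaned[key] = value.strip()
    return cleaned
```

Because the function takes the keys to clean as a parameter instead of hard-coding column names, the same code can be reused across data sets.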

List the main tasks that need to be done

Write the script

Optimize the script

Add docstrings to functions, along with inline comments

Test with new data

Find similar data to test the script against

Use the built-in unittest module, or the nose / pytest libraries
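
A minimal sketch of the testing step with the built-in unittest module follows. The function under test (strip_and_lower) is a hypothetical stand-in for whatever cleaning function the script actually defines.

```python
import unittest


def strip_and_lower(text):
    """Hypothetical cleaning step: trim whitespace and lowercase the text."""
    return text.strip().lower()


class StripAndLowerTest(unittest.TestCase):
    def test_removes_surrounding_whitespace(self):
        self.assertEqual(strip_and_lower("  Alice "), "alice")

    def test_leaves_clean_input_unchanged(self):
        self.assertEqual(strip_and_lower("bob"), "bob")
```

Saved in a file, these tests can be run with python -m unittest; pytest can also collect unittest.TestCase classes.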

 


Origin www.cnblogs.com/qiu-hua/p/12622818.html