1. Data normalization and standardization of
. A normalization: the data set is calculated, so that the data are in a specific range \
. B Standardization:
c. Delete outliers
2. Data storage
a. Save the SQLite database
b. Export to a simple csv file
3. Find the data cleaning method is suitable for projects
Write the script (to determine the structure of the code, for subsequent use, learning and sharing) whether the code can help you save time and improve efficiency
4. Data cleaning scripted
4.1 Zen python code specification see it, to make the code more clear
4.2 make the code more reusable generic
4.3 documenting the code
The primary task list needs to be done
Scripting
Optimization script
Add some strings and documents as a function inline comments
The test with new data
Looking for similar data test scripts
Built-in test module unittest / nose / pytest library