Study Notes (01): Python Data Cleaning in Practice - Course Introduction

Start learning now: https://edu.csdn.net/course/play/26990/361139?utm_source=blogtoedu

Keywords: data cleaning; NumPy, pandas

Outline:

Common tools (NumPy; pandas Series and DataFrame)

File operations (CSV, Excel, MySQL)

Data table processing (filtering, additions and deletions, sorting)

Data conversion (string, date, format conversion)

Statistics (grouping with groupby, aggregate functions, the apply function)

Data preprocessing (duplicate values, missing values, outliers, discretizing data)
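
To make the outline more concrete, here is a minimal sketch that touches several of the topics once: building a DataFrame, a CSV round trip, type conversion, filtering and sorting, and groupby with an aggregate and with apply. The table, its column names (`city`, `date`, `sales`), and its values are hypothetical and invented for illustration; they do not come from the course.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame({
    "city": ["Beijing", "Shanghai", "Beijing", "Shanghai"],
    "date": ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"],
    "sales": ["100", "250", "300", "50"],
})

# File operations: write to CSV text and read it back (in memory here,
# so the sketch does not need a file on disk).
csv_text = df.to_csv(index=False)
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Data conversion: strings to integers and to datetimes.
df["sales"] = df["sales"].astype(int)
df["date"] = pd.to_datetime(df["date"])

# Data table processing: filter rows, then sort them.
large = df[df["sales"] >= 100].sort_values("sales", ascending=False)

# Statistics: group by a column, aggregate, and apply a custom function.
totals = df.groupby("city")["sales"].sum()
spread = df.groupby("city")["sales"].apply(lambda s: s.max() - s.min())

print(large)
print(totals)
print(spread)
```

Each step maps to one heading in the outline; a real project would read from actual files or a database instead of an in-memory string.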

 

Problems that require data cleaning:

1. Missing data - attribute values are empty;

2. Noisy data - data values are unreasonable;

3. Inconsistent data - the data contradict each other;

4. Redundant data - the amount of data or the number of attributes exceeds what the analysis requires;

5. Outliers (discrete points);

6. Duplicate data
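
Several of the problems listed above can be detected with a few pandas calls. The sketch below is one possible approach under stated assumptions, not the course's own code: the table, the column names (`name`, `age`, `unused_note`), and the valid age range of 0 to 120 are all hypothetical. Inconsistent data is not shown, since checking it depends on domain-specific rules.

```python
import numpy as np
import pandas as pd

# Hypothetical records, invented for illustration only.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Bob", "Cid", "Dee", "Eve"],
    "age": [25, 31, 31, np.nan, 29, 240],
    "unused_note": ["a", "b", "b", "c", "d", "e"],
})

# 1. Missing data: count empty values per column.
missing_per_column = df.isna().sum()

# 6. Duplicate data: flag rows that repeat an earlier row exactly.
duplicate_rows = df.duplicated()

# 2 and 5. Noise and outliers: flag values outside an assumed valid range.
# The 0-120 bounds are an assumption chosen for this example.
unreasonable = df[(df["age"] < 0) | (df["age"] > 120)]

# 4. Redundant data: drop a column the analysis does not need,
# then remove duplicates, missing ages, and out-of-range ages.
cleaned = (
    df.drop(columns=["unused_note"])
      .drop_duplicates()
      .dropna(subset=["age"])
)
cleaned = cleaned[cleaned["age"].between(0, 120)]

print(missing_per_column)
print(duplicate_rows.sum())
print(unreasonable)
print(cleaned)
```

Dropping rows is only one option; depending on the analysis, missing values might instead be filled with `fillna`, and suspicious values might be reviewed rather than removed.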

 


Origin blog.csdn.net/weixin_44943394/article/details/105063460