Introduction to Data Mining (1)

One: The main process of data mining:

1: Define your goals

2: Get data (with a web crawler, or by downloading datasets from statistical websites)

3: Data exploration

4: Data preprocessing (data cleaning, data integration, data transformation, and data reduction, i.e. streamlining the data)

5: Modeling (classification, clustering, association analysis, prediction)

6: Model evaluation and release
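
Steps 3 to 6 above can be sketched on a toy dataset using only the Python standard library. This is a minimal illustration, not a recommended workflow: the records, the function names, and the choice of a nearest-centroid classifier are all assumptions made for the example, and the model is scored on its own training data only to keep the sketch short.

```python
from statistics import mean

# Hypothetical toy records: (height_cm, weight_kg, label); None marks a missing value.
raw = [
    (150.0, 50.0, "small"),
    (155.0, None, "small"),
    (160.0, 55.0, "small"),
    (180.0, 80.0, "large"),
    (185.0, 90.0, "large"),
    (190.0, 85.0, "large"),
]

def explore(rows):
    """Data exploration: count rows and rows with missing values."""
    missing = sum(1 for r in rows if None in r)
    return {"rows": len(rows), "rows_with_missing": missing}

def clean(rows):
    """Data cleaning: drop rows that contain a missing value."""
    return [r for r in rows if None not in r]

def min_max_scale(rows):
    """Data transformation: rescale each numeric column to [0, 1]."""
    cols = list(zip(*[r[:-1] for r in rows]))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    scaled = []
    for r in rows:
        # "or 1.0" avoids dividing by zero when a column is constant.
        feats = tuple((v - lo) / ((hi - lo) or 1.0)
                      for v, lo, hi in zip(r[:-1], lows, highs))
        scaled.append((feats, r[-1]))
    return scaled

def fit_centroids(samples):
    """Modeling (classification): compute one mean point per class."""
    by_label = {}
    for feats, label in samples:
        by_label.setdefault(label, []).append(feats)
    return {label: tuple(mean(col) for col in zip(*pts))
            for label, pts in by_label.items()}

def predict(centroids, feats):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(feats, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def accuracy(centroids, samples):
    """Model evaluation: fraction of samples classified correctly."""
    hits = sum(predict(centroids, f) == y for f, y in samples)
    return hits / len(samples)

summary = explore(raw)                  # step 3
samples = min_max_scale(clean(raw))     # step 4
model = fit_centroids(samples)          # step 5
score = accuracy(model, samples)        # step 6 (on training data, for brevity)
```

In a real project the evaluation in step 6 would use data held out from training, and the libraries introduced in the next section would replace most of these hand-written helpers.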

Two: Introduction to related modules

1: numpy processes data efficiently and provides array support. Many other modules, such as pandas, scipy, and matplotlib, depend on it, so it is the foundation of the stack. Besides giving Python fast array processing, its other major role in data analysis is as a container for passing data between algorithms.

2: pandas is mainly used for data exploration and data analysis

3: matplotlib is a plotting module used for data visualization

4: scipy mainly performs numerical calculations, supports matrix operations, and provides many advanced data processing functions, such as integration, Fourier transform, and differential equation solving

5: statsmodels is mainly used for statistical analysis

6: gensim is mainly used for text mining

7: sklearn and keras: the former is for machine learning, the latter for deep learning
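
To illustrate the point about numpy serving as a fast array type and as a container passed between algorithms, here is a small sketch. The data and the two helper functions (`standardize`, `pairwise_distances`) are hypothetical names invented for this example; only the numpy calls themselves are part of the library.

```python
import numpy as np

# Hypothetical measurements: rows are samples, columns are features.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

def standardize(x):
    """Centre each column on 0 and scale it to unit standard deviation."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

def pairwise_distances(x):
    """Euclidean distance between every pair of rows, computed with broadcasting."""
    diff = x[:, None, :] - x[None, :, :]   # shape (n, n, features)
    return np.sqrt((diff ** 2).sum(axis=-1))

# The same ndarray flows from one step to the next without conversion.
z = standardize(data)
d = pairwise_distances(z)
```

Both steps operate on whole arrays at once rather than looping over elements in Python, which is where the speed advantage comes from.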


