Data mining introductory notes

Recently I have wanted to learn about data mining, so I found the book "Data Mining: Concepts and Techniques" online and started reading it. I plan to write some notes to record the more basic and important points. I hope this helps me lay a solid foundation, and I hope it also helps other beginners who, like me, are just getting started.

1. Data mining can be viewed as a result of the natural evolution of information technology: it is a product of the continuous development of the database and data management industry.

2. What is data mining? In layman's terms, data mining is the process of discovering interesting patterns and knowledge from large amounts of data. It is an essential step in the larger process of knowledge discovery from data, which consists of the following steps: data cleaning (removing noise and inconsistent data); data integration (combining multiple data sources); data selection (retrieving the data relevant to the analysis task from the database); data transformation (transforming and consolidating data into a form suitable for mining, for example through summary or aggregation operations); data mining (applying intelligent methods to extract data patterns); pattern evaluation (identifying the truly interesting patterns that represent knowledge, according to some interestingness measure); and knowledge presentation (using visualization and knowledge representation techniques to present the mined knowledge to users).
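
To make the sequence of steps more concrete, here is a minimal sketch in Python of the whole process on a toy example. Everything in it is an assumption made for illustration and does not come from the book: the shopping-basket data, the function names, and the choice of "pairs of items bought together" as the pattern to mine.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: baskets from two separate sources (made up for illustration).
store_a = [
    {"id": 1, "items": ["milk", "bread", None]},    # contains a missing value
    {"id": 2, "items": ["milk", "bread", "eggs"]},
]
store_b = [
    {"id": 3, "items": ["Bread", "milk"]},          # inconsistent capitalization
    {"id": 4, "items": ["eggs"]},
]

def clean(records):
    """Data cleaning: drop missing items and normalize inconsistent spelling."""
    return [
        {"id": r["id"], "items": [i.lower() for i in r["items"] if i is not None]}
        for r in records
    ]

def integrate(*sources):
    """Data integration: combine several sources into one list of records."""
    return [r for src in sources for r in src]

def select(records):
    """Data selection: keep only baskets relevant to pair analysis (2+ items)."""
    return [r for r in records if len(r["items"]) >= 2]

def transform(records):
    """Data transformation: reduce each basket to a sorted tuple of distinct items."""
    return [tuple(sorted(set(r["items"]))) for r in records]

def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return counts

def evaluate(counts, min_support):
    """Pattern evaluation: keep only pairs that reach a minimum support count."""
    return {pair: n for pair, n in counts.items() if n >= min_support}

def present(patterns):
    """Knowledge presentation: render the patterns as readable lines."""
    return [f"{a} + {b}: bought together {n} times"
            for (a, b), n in sorted(patterns.items())]

integrated = integrate(clean(store_a), clean(store_b))
patterns = evaluate(mine(transform(select(integrated))), min_support=2)
report = present(patterns)
print("\n".join(report))
```

Real systems are far more involved at every step, but the shape is the same: each stage takes the output of the previous one, and the mining step itself is only one link in the chain.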

3. The difference between a database and a data warehouse. A relational database is a collection of tables. The data in each table is highly structured: every record is described by the same set of attributes, so the main characteristics of the data are easy to grasp when it is used. (Relational data is one of the main forms of data studied in data mining.) A data warehouse is a repository of information collected from multiple data sources, stored under a unified schema, and usually residing at a single site. Data warehouses are constructed through data cleaning, data transformation, data integration, data loading, and periodic data refreshing.
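
As a rough illustration of the difference, the sketch below uses Python's built-in sqlite3 module. Two small operational databases with inconsistent schemas stand in for the "multiple data sources", and a third database with one unified table plays the role of the warehouse. The table names, column names, and figures are all hypothetical, and a real warehouse would of course not be an in-memory SQLite database.

```python
import sqlite3

# Two hypothetical operational sources with inconsistent schemas (made-up data).
branch_a = sqlite3.connect(":memory:")
branch_a.execute("CREATE TABLE sales (item TEXT, amount_usd REAL)")
branch_a.executemany("INSERT INTO sales VALUES (?, ?)",
                     [("pen", 2.0), ("book", 10.0)])

branch_b = sqlite3.connect(":memory:")
branch_b.execute("CREATE TABLE orders (product TEXT, cents INTEGER)")
branch_b.executemany("INSERT INTO orders VALUES (?, ?)",
                     [("pen", 250), ("bag", 3000)])

# The "warehouse": a single store with one consistent schema.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_sales (source TEXT, item TEXT, amount_usd REAL)"
)

def load(target):
    """Extract from each source, transform to the unified schema, then load."""
    for item, amount in branch_a.execute("SELECT item, amount_usd FROM sales"):
        target.execute("INSERT INTO fact_sales VALUES ('A', ?, ?)", (item, amount))
    for product, cents in branch_b.execute("SELECT product, cents FROM orders"):
        # Transformation: rename the column and convert cents to dollars.
        target.execute("INSERT INTO fact_sales VALUES ('B', ?, ?)",
                       (product, cents / 100))
    target.commit()

load(warehouse)

# Once the data shares one schema, analysis across sources is a single query.
totals = dict(warehouse.execute(
    "SELECT item, SUM(amount_usd) FROM fact_sales GROUP BY item ORDER BY item"
))
print(totals)
```

In practice the load would be rerun on a schedule, which corresponds to the periodic refreshing mentioned above.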

4. What are the data mining functionalities? There are many of them, including characterization and discrimination; the mining of frequent patterns, associations, and correlations; classification and regression; cluster analysis; and outlier analysis. Data mining functionalities are used to specify the kinds of patterns to be found in data mining tasks. Generally speaking, these tasks fall into two categories: descriptive and predictive. Descriptive mining tasks characterize the general properties of the data in the target data set. Predictive mining tasks perform induction on the current data in order to make predictions.

(Getting sleepy; to be continued tomorrow.)
