R Language and Data Analysis (1): The Data Analysis Process, Data Mining, and Data Visualization

R software

  • R is free
  • R is a comprehensive statistical research platform that provides a variety of data analysis techniques
  • R has first-rate plotting capabilities

Data analysis

What is data

Data are symbols that record objective events and make them identifiable. They are physical symbols, or combinations of such symbols, that record the nature, state, and relationships of objective things.

Why do data analysis?

To use the results of the analysis to guide decision making.

Data analysis process

Data collection → data storage → data analysis → data mining → data visualization → decision making

Data collection

The collected data is called raw data.

Data storage

Store the data as files.
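
As a rough illustration of storing collected data as a file, the sketch below writes a few raw records to a CSV file and reads them back. It is written in Python for illustration only; the records, field names, and file name are hypothetical and do not come from the original post.

```python
import csv
import os
import tempfile

# Hypothetical raw records; field names are illustrative only.
FIELDS = ["id", "city", "temperature"]
raw_records = [
    {"id": 1, "city": "Beijing", "temperature": 21.5},
    {"id": 2, "city": "Shanghai", "temperature": 24.0},
    {"id": 3, "city": "Guangzhou", "temperature": 28.3},
]


def save_records(path, records):
    """Write the records to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)


def load_records(path):
    """Read the CSV file back; every value comes back as a string."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Store the file in a temporary directory so the sketch leaves no clutter.
path = os.path.join(tempfile.mkdtemp(), "raw_data.csv")
save_records(path, raw_records)
loaded = load_records(path)
```

Note that CSV keeps no type information, so numeric columns have to be converted again after loading.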

Statistics

Statistical methods are used to analyze and process the collected data with a specific purpose in mind, and to interpret the results of the analysis.
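
A minimal sketch of this kind of purposeful analysis, using Python's standard `statistics` module. The scores are made-up sample data; the point is only that the quantities to compute are known in advance.

```python
import statistics

# Hypothetical exam scores; not taken from the original post.
scores = [72, 85, 90, 66, 85]

total = sum(scores)                  # sum of all scores
mean = statistics.mean(scores)       # arithmetic average
median = statistics.median(scores)   # middle value after sorting
stdev = statistics.stdev(scores)     # sample standard deviation

print(f"total={total}, mean={mean}, median={median}, stdev={stdev:.2f}")
```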

Data mining

Data mining, also known as data exploration, generally refers to the process of using algorithms to search for information hidden in large amounts of data.

The difference between data mining and statistics

  • Data mining does not know in advance what it will dig up. It is used to explore the unknown, and the specific method is not fixed beforehand. Statistics generally has a clear goal: you know which values to calculate, such as a sum or an average, and only need to apply the appropriate statistical method
  • Data mining is usually associated with computer science. Its goals are achieved through many methods, such as statistics, online analytical processing, information retrieval, machine learning, artificial intelligence, expert systems, and pattern recognition
  • In statistics, different statisticians using different methods arrive at the same result. In data mining, different people working on the same data may get different results
  • Data mining and statistics are not independent of each other: statistical knowledge is also required in the process of data mining
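
The contrast above can be sketched with a toy example. The average below is a statistic with one well-defined answer, while the grouping is exploratory: a tiny one-dimensional k-means (chosen here purely for illustration, not a method named in the original post) finds different groups depending on how many starting centers the analyst picks. The data and all names are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: repeatedly assign values to the nearest center."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assign each value to the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (unchanged if empty).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spending figures for ten customers.
spending = [12, 15, 14, 13, 45, 50, 80, 85, 78, 82]

# Statistics: the goal is known in advance and the answer is unique.
mean_spending = sum(spending) / len(spending)

# Mining: two analysts choose different starting points and find
# different structure in the same data.
centers_a, clusters_a = kmeans_1d(spending, centers=[0, 100])
centers_b, clusters_b = kmeans_1d(spending, centers=[0, 50, 100])

print(mean_spending)
print(clusters_a)
print(clusters_b)
```

With two starting centers the customers fall into two groups; with three, a separate middle group appears, even though the data has not changed.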

Data mining and three major shifts in thinking

1. Analyze all the data related to something, instead of relying on a small sample
2. Accept that data is messy and complex, and no longer insist on exactness
3. No longer chase elusive causality, and focus instead on correlation between things
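
To make the idea of correlation concrete, here is a small sketch that computes the Pearson correlation coefficient by hand. The temperature and sales figures are invented for illustration. A coefficient close to 1 says the two series rise together, but says nothing about whether one causes the other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily temperature (°C) and ice cream sales.
temperature = [20, 22, 25, 27, 30]
ice_cream_sales = [110, 125, 150, 160, 190]

r = pearson(temperature, ice_cream_sales)
print(f"r = {r:.3f}")
```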

Data visualization

Graphics are often clearer than numbers. For example, latitude and longitude obtained from GPS positioning are easier to understand when displayed on a map.

Decision making

Origin blog.csdn.net/qq_44520665/article/details/113479746