What are the big data technologies, and how is big data analyzed?

Common analysis methods for big data

1. Visual analysis

The users of big data analysis range from big data analysis experts to ordinary users, but the most basic requirement both groups share is visual analysis, because visual analysis presents the characteristics of big data intuitively and is as easy for readers to accept as looking at a picture.
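
As a minimal sketch of the idea (my illustration, not from the original article), the snippet below uses pandas and matplotlib to plot the distribution of one numeric column; the file name `measurements.csv` and the column name `value` are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load a (hypothetical) tabular dataset; any source works the same way.
df = pd.read_csv("measurements.csv")  # assumed to contain a numeric 'value' column

# A histogram makes the distribution of the data visible at a glance,
# which is the core promise of visual analysis.
df["value"].hist(bins=50)
plt.xlabel("value")
plt.ylabel("frequency")
plt.title("Distribution of 'value'")
plt.show()
```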

 

2. Data mining algorithm

The theoretical core of big data analysis is data mining algorithms. Different data mining algorithms are built for different data types and formats, so as to present the characteristics of the data itself more scientifically. It is precisely because these statistical methods are recognized by statisticians around the world (one might call them established truths) that they can penetrate the data and uncover value that is widely acknowledged. The other consideration is speed: data mining algorithms must be able to process big data quickly, for if an algorithm took several years to reach a conclusion, the value of big data could hardly be realized.
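
As a hedged illustration of the speed point (my example, not the article's), the sketch below uses scikit-learn's MiniBatchKMeans, a k-means variant that scales to large datasets by processing small batches at a time; the data is synthetic and purely for demonstration.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Synthetic stand-in for a large dataset: 100,000 points in 10 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 10))

# MiniBatchKMeans trades a little accuracy for much faster training,
# which matters when an exact algorithm would be too slow on big data.
model = MiniBatchKMeans(n_clusters=5, batch_size=1024, random_state=0)
labels = model.fit_predict(X)

print(labels[:10])                    # cluster assignments of the first 10 points
print(model.cluster_centers_.shape)   # (5, 10)
```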

 

3. Predictive analysis

One of the ultimate application areas of big data analysis is predictive analysis: extracting features from big data, building a model scientifically, and then feeding new data into the model to predict future data.
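
A minimal sketch of that fit-then-predict workflow (my own toy example, not the article's method), using scikit-learn's LinearRegression:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy historical data: hours of usage vs. observed load (both hypothetical).
hours = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
load = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# "Build a model scientifically": fit the model on historical data.
model = LinearRegression().fit(hours, load)

# "Feed new data into the model": predict values for unseen inputs.
new_hours = np.array([[6.0], [7.0]])
print(model.predict(new_hours))  # predicted future load values
```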

 

4. Semantic engine

The diversification of unstructured data brings new challenges to data analysis, and a set of tools is needed to parse and refine the data. A semantic engine must be designed with enough artificial intelligence to actively extract information from the data.
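
As a rough, hedged sketch of extracting information from unstructured text (my illustration, not the article's design), TF-IDF scoring can surface the terms that characterize each document:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# A tiny corpus of unstructured text (hypothetical documents).
docs = [
    "big data platforms store and process massive datasets",
    "a semantic engine extracts meaning from unstructured text",
    "predictive models forecast future values from historical data",
]

# TF-IDF weights a term highly when it is frequent in one document
# but rare across the corpus: a simple proxy for "informative" terms.
vec = TfidfVectorizer(stop_words="english")
tfidf = vec.fit_transform(docs)
terms = vec.get_feature_names_out()

# Print the top 3 terms per document.
for i in range(tfidf.shape[0]):
    row = tfidf[i].toarray().ravel()
    top = row.argsort()[::-1][:3]
    print([terms[j] for j in top])
```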

The above four aspects are the basis of big data analysis. Of course, if you dig deeper into big data analysis, there are many more distinctive, more in-depth, and more specialized big data analysis methods.

Big data technology

Data collection: ETL tools are responsible for extracting data from distributed, heterogeneous data sources, such as relational databases and flat data files, into a temporary staging layer, where it is cleaned, transformed, and integrated, and finally loading it into a data warehouse or data mart, where it becomes the basis for online analytical processing and data mining.
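
A minimal extract-transform-load sketch (assumptions: a hypothetical `orders.csv` source file and a local SQLite database standing in for the warehouse), using pandas:

```python
import sqlite3
import pandas as pd

# Extract: pull raw data from a flat-file source (hypothetical CSV).
raw = pd.read_csv("orders.csv")

# Transform: clean and integrate in a temporary staging area.
staged = raw.dropna(subset=["order_id"])           # drop incomplete rows
staged["amount"] = staged["amount"].astype(float)  # normalize types
staged["order_date"] = pd.to_datetime(staged["order_date"])

# Load: write the cleaned data into the warehouse (SQLite as a stand-in).
with sqlite3.connect("warehouse.db") as conn:
    staged.to_sql("orders", conn, if_exists="replace", index=False)
```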

Data access: relational databases, NoSQL, SQL, etc.
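
Continuing the sketch above (still assuming the hypothetical `warehouse.db`), relational access is simply SQL issued over a connection:

```python
import sqlite3

# Query the table loaded by the ETL step; SQL is the access layer here.
with sqlite3.connect("warehouse.db") as conn:
    rows = conn.execute(
        "SELECT order_id, amount FROM orders ORDER BY amount DESC LIMIT 5"
    ).fetchall()

for order_id, amount in rows:
    print(order_id, amount)
```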

 

Infrastructure: Cloud storage, distributed file storage, etc.

Data processing: natural language processing (NLP) is the discipline that studies the language problems of human-computer interaction. The key to processing natural language is getting the computer to "understand" it, so natural language processing is also called natural language understanding, or computational linguistics. It is on the one hand a branch of language information processing, and on the other hand one of the core topics of artificial intelligence.
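
A very small, hedged NLP sketch (standard-library only, my illustration): tokenizing text and counting word frequencies, one of the most basic preprocessing steps before any deeper understanding:

```python
import re
from collections import Counter

text = "Natural language processing lets computers understand natural language."

# Tokenize: lowercase the text and split it into word tokens.
tokens = re.findall(r"[a-z]+", text.lower())

# Count token frequencies, a first step toward representing text numerically.
freq = Counter(tokens)
print(freq.most_common(3))  # e.g. [('natural', 2), ('language', 2), ...]
```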

Statistical analysis: hypothesis testing, significance testing, difference analysis, correlation analysis, t-tests, analysis of variance, chi-square analysis, partial correlation analysis, distance analysis, regression analysis (simple regression, multiple regression, stepwise regression, regression prediction and residual analysis, ridge regression, logistic regression, curve estimation), factor analysis, cluster analysis, principal component analysis, fast clustering and other clustering methods, discriminant analysis, correspondence analysis, multiple correspondence analysis (optimal scaling), bootstrap methods, etc.
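
As a hedged taste of two items from this list (with synthetic example data of my own), SciPy covers many of these tests directly:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(loc=10.0, scale=2.0, size=200)  # sample A (synthetic)
b = rng.normal(loc=10.5, scale=2.0, size=200)  # sample B (synthetic)

# t-test: do the two samples have significantly different means?
t_stat, p_value = stats.ttest_ind(a, b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

# Simple regression analysis: fit y = slope * x + intercept.
x = np.arange(50, dtype=float)
y = 3.0 * x + rng.normal(scale=5.0, size=50)
res = stats.linregress(x, y)
print(f"slope = {res.slope:.3f}, intercept = {res.intercept:.3f}")
```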

Data mining: classification, estimation, prediction, affinity grouping or association rules, clustering, description and visualization, and mining of complex data types (text, Web, graphics and images, video, audio, etc.).
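
As a small, hedged example of the first task on this list (classification), using scikit-learn's decision tree on its bundled iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Classification: learn a mapping from features to known class labels.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Accuracy on held-out data indicates how well the classes were learned.
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```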
