Good Programmers: An Inventory of Big Data Technologies You Should Know

  Big data refers to data collections that cannot be captured, managed, and processed with conventional software tools within a reasonable period of time. Big data technology is the ability to quickly obtain valuable information from many types of data.

  First, data acquisition

  ETL tools are responsible for extracting data from distributed, heterogeneous sources (such as relational databases and flat data files) into a temporary intermediate layer, where it is cleaned, transformed, and integrated, and finally loaded into a data warehouse or data mart, where it becomes the foundation for online analytical processing and data mining.
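  The extract, transform, and load flow described above can be sketched in a few lines of Python. This is only a minimal illustration, not a production ETL tool: the sample CSV text, the `sales` table, and the function names are all assumptions made up for the example, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; a real job would read files or other databases.
RAW_CSV = """id,name,amount
1, alice ,10.5
2,BOB,
3,carol,7
"""

def extract(text):
    """Extract: read rows from a flat-file source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and normalize names, drop rows with a missing amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # cleaning step: discard incomplete records
        cleaned.append((int(row["id"]), row["name"].strip().title(), float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into a (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def run_etl(text=RAW_CSV):
    """Run the three stages against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn
```

  Calling `run_etl()` returns a connection whose `sales` table holds only the cleaned rows, which can then be queried for analysis.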

  Second, data access

  Relational databases, NoSQL, SQL, and so on.

  Third, infrastructure

  Cloud storage, distributed file storage.

  Fourth, data processing

  Natural language processing (NLP) is the discipline that studies language problems in human-computer interaction. The key to natural language processing is getting the computer to "understand" natural language, so it is also known as natural language understanding (NLU) or computational linguistics. On the one hand it is a branch of language information processing; on the other hand it is one of the core topics of artificial intelligence (AI).
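  As a very small illustration of what processing natural language involves at the lowest level, the sketch below splits text into tokens and counts them (a "bag of words"). It is an assumption-laden toy: the regular expression only handles lowercase ASCII letters, digits, and apostrophes, and real NLP systems use far richer tokenization and models.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def bag_of_words(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))
```

  For example, `bag_of_words("Big data is big.")` counts the token `big` twice, because the text is lowercased before counting.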

  Fifth, statistical analysis

  Hypothesis testing, significance testing, difference analysis, correlation analysis, t-tests, analysis of variance, chi-square analysis, partial correlation analysis, distance analysis, regression analysis, simple regression analysis, multiple regression analysis, stepwise regression, regression forecasting and residual analysis, ridge regression, logistic regression analysis, curve estimation, factor analysis, cluster analysis, principal component analysis, fast clustering and other clustering methods, discriminant analysis, correspondence analysis, multiple correspondence analysis (optimal scaling analysis), and bootstrap techniques.
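  To make one item from this list concrete, here is a minimal sketch of simple regression analysis with residuals, fitted by ordinary least squares using only the standard library. The function names are hypothetical, and the sketch assumes at least two points with distinct x values; otherwise the slope is undefined.

```python
from statistics import mean

def simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx  # raises ZeroDivisionError if all x values are equal
    intercept = y_bar - slope * x_bar
    return slope, intercept

def residuals(xs, ys, slope, intercept):
    """Return observed minus fitted values for each data point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

  For data lying exactly on a line, such as x = 1, 2, 3, 4 and y = 3, 5, 7, 9, the fit recovers a slope of 2 and an intercept of 1, and every residual is zero.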

  Sixth, data mining

  Classification, estimation, prediction, affinity grouping or association rules, clustering, description and visualization, and mining of complex data types (text, Web, graphics and images, video, audio, etc.).
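  As an illustration of association rules, the sketch below computes the two basic measures used to judge a rule such as "bread implies milk": support (how often the items appear together) and confidence (how often the consequent appears when the antecedent does). The function names and the shopping-basket data are assumptions for the example; full algorithms such as Apriori build on these measures.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical shopping baskets used only for illustration.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
```

  With these baskets, bread and milk appear together in two of four transactions, and two of the three baskets containing bread also contain milk.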

  Seventh, model prediction

  Predictive models, machine learning, modeling and simulation.

  Eighth, presentation of results

  Cloud computing, tag clouds, diagrams, and the like.
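  A tag cloud shows more frequent tags in larger type. The sketch below shows one simple way this could be computed, by scaling tag counts linearly to font sizes; the function name and the default size range of 10 to 40 are assumptions for illustration, and real tag-cloud renderers often use logarithmic scaling and layout logic as well.

```python
from collections import Counter

def tag_cloud_sizes(tags, min_size=10, max_size=40):
    """Map each tag to a font size proportional to how often it occurs."""
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    sizes = {}
    for tag, n in counts.items():
        if hi == lo:
            sizes[tag] = max_size  # all tags equally frequent
        else:
            scale = (n - lo) / (hi - lo)
            sizes[tag] = round(min_size + scale * (max_size - min_size))
    return sizes
```

  The most frequent tag receives `max_size`, the least frequent receives `min_size`, and the rest fall in between.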

  In fact, the technical details of big data go far beyond the above, so please stay tuned.

Origin www.cnblogs.com/gcghcxy/p/10955638.html