1 The 4V characteristics of big data
① Volume: large amounts of data
Scale: TB to PB to ZB
Storage: HDFS (Hadoop Distributed File System)
② Variety: many kinds of data
Structured data: stored and processed in MySQL
Unstructured data: images, audio, etc.
HDFS, MR, Hive
Semi-structured data: XML, HTML
HDFS, MR, Hive, Spark
③ Velocity: speed
Data growth speed
TB to PB to ZB
HDFS
Data processing speed
MR, Hive, Pig, Impala (offline)
Spark, Flink (online)
④ Value: low value density
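
To make the volume figures concrete, here is a rough back-of-the-envelope sketch of how much space a file takes in HDFS. It assumes binary units and the common HDFS defaults of 128 MiB blocks and a replication factor of 3; neither value comes from these notes, and a real cluster may be configured differently. The function name hdfs_footprint is made up for illustration.

```python
# Assumptions (not from the notes): binary units, 128 MiB blocks,
# replication factor 3 (common HDFS defaults; clusters may override them).
UNITS = {"TB": 1024**4, "PB": 1024**5, "ZB": 1024**7}
BLOCK_SIZE = 128 * 1024**2
REPLICATION = 3

def hdfs_footprint(size_bytes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return (number of blocks, raw bytes stored across the cluster)."""
    # Integer ceiling division: a partly filled last block still counts as a block.
    blocks = (size_bytes + block_size - 1) // block_size
    raw_bytes = size_bytes * replication
    return blocks, raw_bytes

blocks, raw = hdfs_footprint(1 * UNITS["TB"])
print(blocks)             # 8192 blocks for a 1 TB file
print(raw / UNITS["TB"])  # 3.0 TB of raw disk once replicated
```

Under these assumptions, every step from TB to PB multiplies both numbers by 1024, which is why storage has to be distributed across many machines.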
2 Big data project framework
① Data acquisition: FTP, socket
② Data storage: HDFS
③ Data analysis: MR + Hive + Impala + Spark
④ Application layer on top of big data processing: machine learning
⑤ Data presentation: Oracle + SSM
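
The five layers above can be sketched as a chain of functions. This is only a toy illustration: every function below is a hypothetical in-memory stand-in (a list instead of HDFS, a plain count instead of MR/Hive/Spark, a string instead of an Oracle + SSM front end), and the sample records are invented.

```python
from collections import Counter

def acquire():
    """① Data acquisition: pretend these lines arrived over FTP or a socket."""
    return ["user1 click", "user2 view", "user1 view",
            "user3 click", "user2 click"]

def store(lines, storage):
    """② Data storage: append raw lines to a list standing in for HDFS."""
    storage.extend(lines)
    return storage

def analyze(storage):
    """③ Data analysis: a MapReduce-style count of events per action."""
    mapped = [(line.split()[1], 1) for line in storage]  # map: (action, 1)
    counts = Counter()
    for action, one in mapped:                           # reduce: sum per key
        counts[action] += one
    return dict(counts)

def apply_layer(counts):
    """④ Application layer: placeholder where a learned model would sit."""
    return max(counts, key=counts.get)

def present(counts, top_action):
    """⑤ Data presentation: format the result for display."""
    return f"counts={counts}, most common action={top_action}"

storage = store(acquire(), [])
counts = analyze(storage)
print(present(counts, apply_layer(counts)))
```

The point of the sketch is the direction of data flow: each layer consumes only the output of the layer before it.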
3 Development of Artificial Intelligence
3.1 The three waves of AI
Checkers: expert systems
Chess: statistical models
Go: deep learning
3.2 AI application scenarios
Image recognition, autonomous driving, smart healthcare, machine translation, speech recognition, data mining
4 Machine learning and artificial intelligence: differences and connections
Machine learning is a branch of artificial intelligence
Deep learning is a branch of machine learning
5 Data, data analysis, and data mining: differences and connections
Data are observed or measured values
Information is credible data
Data analysis: data -> information
Data mining: information -> valuable information
6 Machine Learning
Machine learning studies how to learn by computational means: given an algorithm, combine it with data to build a model, then use the model to make predictions.
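
As a minimal sketch of "algorithm + data -> model -> prediction", the example below uses ordinary least squares for a straight line as the algorithm. The choice of algorithm, the function names fit_line and predict, and the sample data are all assumptions made for illustration; the notes do not name a specific algorithm.

```python
def fit_line(xs, ys):
    """Algorithm + data -> model: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the "model" is just these two numbers

def predict(model, x):
    """Model -> prediction for a new input."""
    slope, intercept = model
    return slope * x + intercept

# Made-up training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)
print(model)                # (2.0, 1.0)
print(predict(model, 5.0))  # 11.0
```

Here the model was never told the rule y = 2x + 1; it recovered the slope and intercept from the data alone.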
7 Rule-based learning and model-based learning
Rule-based learning relies on hard-coded rules.
Model-based learning uses machine learning to build a model from data, then makes predictions with that model.
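
The contrast can be sketched with a single threshold decision. In the rule-based version a person hard-codes the cutoff; in the model-based version the cutoff is chosen from labeled examples. The temperature task, the value 30, the training data, and the function names are all hypothetical, and picking the threshold with the fewest training errors is just one simple way to learn it.

```python
def rule_based_is_hot(temp_c):
    """Rule-based: the threshold 30 is hard-coded by a person."""
    return temp_c > 30

def learn_threshold(samples):
    """Model-based: pick the threshold that misclassifies the fewest samples.

    samples is a list of (value, label) pairs with boolean labels.
    """
    values = sorted(v for v, _ in samples)
    # Candidate thresholds: midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum((v > t) != label for v, label in samples)

    return min(candidates, key=errors)

# Made-up labeled data: (temperature, did people call it hot?)
training = [(18, False), (22, False), (25, False),
            (27, True), (31, True), (35, True)]

threshold = learn_threshold(training)
print(threshold)              # 26.0
print(rule_based_is_hot(27))  # False: the fixed rule ignores the data
print(27 > threshold)         # True: the learned threshold follows the data
```

If the labeled data change, the learned threshold changes with them, while the hard-coded rule stays at 30 until someone edits the code.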