"Big Data Technology: Principles and Applications" (Second Edition) - Chapter 1: Big Data Overview

1.2 The concept of Big Data

  • Volume: a very large amount of data
  • Variety: many data types
  • Velocity: fast processing speed
  • Value: low value density

1.3 Impact of Big Data

  • Scientific research has moved from experiment, to theory, to computation, and now to data
  • Changes in thinking
    1. The full data set rather than a sample
    2. Efficiency rather than precision
    3. Correlation rather than causation

1.6 Big Data computing modes

  1. Batch computing: batch processing of large-scale data. MapReduce performs parallel computation on large data sets (larger than 1 TB). Spark is a low-latency distributed cluster computing system for large data sets and is much faster than MapReduce.
  2. Stream computing: a data stream is a series of dynamic data that is unbounded in both volume and time, so it must be computed in real time, with responses within seconds. The first category is commercial-grade platforms: Streams, StreamBase. The second category is open-source platforms: Storm, Yahoo! S4, Spark Streaming.
  3. Graph computing: Pregel is a parallel graph-processing system, used mainly for graph traversal, shortest paths, and PageRank computation. Others include Giraph, GraphX, PowerGraph, GoldenOrb, and Hama.
  4. Query and analysis computing: must provide real-time or near-real-time responses. Examples: Google's Dremel, Impala, Hive, Cassandra.
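
To make the difference between the batch and stream modes above concrete, here is a minimal single-machine sketch in plain Python. It is not Hadoop, Spark, or Storm code; the function names (map_phase, shuffle, reduce_phase, batch_word_count, stream_word_count) are hypothetical and chosen only for illustration. The batch version follows the MapReduce pattern of map, shuffle, and reduce over a data set that is fully available up front, while the stream version updates a running result as each record arrives.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def batch_word_count(lines: List[str]) -> Dict[str, int]:
    """Batch mode: the whole data set exists before processing starts."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


def stream_word_count(lines: Iterable[str]) -> Iterator[Dict[str, int]]:
    """Stream mode: yield an updated snapshot after every incoming record."""
    running: Counter = Counter()
    for line in lines:
        running.update(word.lower() for word in line.split())
        yield dict(running)


if __name__ == "__main__":
    data = ["big data", "Big cluster"]
    print(batch_word_count(data))
    for snapshot in stream_word_count(data):
        print(snapshot)
```

In a real cluster the map and reduce calls would run in parallel on many machines and the shuffle would move data across the network; this sketch only shows the shape of the computation.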

1.8 Big Data, cloud computing, and the Internet of Things

  1. Cloud computing has three typical service models: IaaS (Infrastructure as a Service, i.e. computing and storage resources), PaaS (Platform as a Service), and SaaS (Software as a Service).
  2. Cloud types: public cloud, private cloud, hybrid cloud.
  3. Key cloud computing technologies include virtualization, distributed storage, distributed computing, and multi-tenancy.
  4. The Internet of Things is an extension of the Internet. It uses local networks, the Internet, and other communication technologies to link sensors, controllers, machines, people, and objects together in new ways, connecting people with things and things with things, and enabling informatization, remote management, and control.

Source: www.cnblogs.com/tsruixi/p/12078843.html