How Do Cloud HBase & X-Pack Spark Unify Online Storage and Computing?

Challenges of Data Processing

As enterprise data accumulates and grows, data architectures have evolved from a single-node relational database, to sharded databases and tables, and then to NoSQL and the Hadoop ecosystem. The Hadoop ecosystem is flourishing, but it has no uniform standard architecture. The most widely used today is the Lambda architecture, whose main characteristic is that stream computing, batch processing, and online storage are independent of one another and connected by pipelines.



The big data Lambda architecture is relatively complex: stream processing, batch processing, and online storage each have to be built separately, and data pipelines have to be built to move data between them.
  • Data writes: batch processing, stream processing, and online storage each need their data written separately. On the one hand, the batch and stream layers each require their own write path; on the other, many businesses write detailed data directly to online storage systems such as HBase, Cassandra, and MongoDB.
  • Data exchange: moving data between batch processing and online storage requires building many batch ETL jobs.
  • Data quality: because data is written separately to batch processing, stream processing, and online storage, data maintenance becomes complicated, and the data written along the different paths may be inconsistent.
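
The data quality problem in the list above can be illustrated with a minimal sketch. It uses in-memory stand-ins rather than real HBase, Spark, or message queue clients; the `Sink` class, the `write_everywhere` and `divergent_keys` helpers, and the `order-N` keys are all hypothetical names chosen for illustration. The point is only that three independent write paths can drift apart when one of them fails.

```python
# Hypothetical in-memory stand-ins for the three Lambda-architecture targets.
# None of this is a real HBase or Spark API; it only models separate write paths.
class Sink:
    def __init__(self, name, fail_on=None):
        self.name = name
        self.fail_on = fail_on or set()  # keys whose writes are simulated to fail
        self.rows = {}

    def write(self, key, value):
        if key in self.fail_on:
            raise IOError(f"{self.name}: write failed for {key}")
        self.rows[key] = value


def write_everywhere(sinks, key, value):
    """Write one record to every sink independently; return the sinks that failed."""
    failed = []
    for sink in sinks:
        try:
            sink.write(key, value)
        except IOError:
            failed.append(sink.name)
    return failed


def divergent_keys(sinks):
    """Return the keys that are missing from at least one sink."""
    all_keys = set().union(*(s.rows.keys() for s in sinks))
    return sorted(k for k in all_keys if any(k not in s.rows for s in sinks))


batch = Sink("batch")
stream = Sink("stream", fail_on={"order-2"})  # simulate one failed streaming write
online = Sink("online")
sinks = [batch, stream, online]

for key in ["order-1", "order-2", "order-3"]:
    write_everywhere(sinks, key, {"amount": 10})

print(divergent_keys(sinks))  # keys on which the three copies disagree
```

In this toy run the streaming copy silently lacks one record that the batch and online copies hold, which is the kind of inconsistency that has to be detected and repaired by extra maintenance work when each layer is written separately.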


Origin yq.aliyun.com/articles/719243