Eight application scenarios for HBase seen across the industry

Overview of HBase

HBase is a distributed storage and database engine that can support tens of millions of QPS and PB-level storage. Both figures have been verified in production at a large number of companies. In particular, Alibaba, Xiaomi, JD.com, and Didi run HBase clusters at the scale of thousands or tens of thousands of nodes. A first criterion when choosing a technology is to align with what large companies use, because they invest a lot of engineering effort in maintaining and improving it and contribute back to the community.

About the relationship between NewSQL and NoSQL

Technology keeps moving forward, and NewSQL is now widely discussed. In my opinion, NewSQL is a packaging of NoSQL and a sub-scenario built on top of it. A wide table in NoSQL typically provides the mapping K → V1, V2, ..., Vn, where each V can be as small as 1 byte or as large as 100 MB. This mapping is a primitive, similar to 0 and 1 in the digital world, that can be combined arbitrarily. In the NoSQL systems represented by HBase, these primitives can be combined to serve almost any scenario. NewSQL is one such sub-scenario: an SQL layer added on top, or transactions added at a layer closer to the storage.
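The K → V1...Vn shape described above can be illustrated with a small in-memory model. This is only a sketch: the `Table` type and the `put` and `get` helpers are hypothetical names invented for illustration, not the HBase client API, and the row keys and column names are made up.

```python
from typing import Dict, Optional

# Hypothetical stand-in for a wide table: row key -> {column -> value}.
# It mimics the K -> V1..Vn shape only; it is not the HBase client API.
Table = Dict[bytes, Dict[bytes, bytes]]


def put(table: Table, row_key: bytes, column: bytes, value: bytes) -> None:
    """Store one value under (row_key, column), creating the row if needed."""
    table.setdefault(row_key, {})[column] = value


def get(table: Table, row_key: bytes, column: bytes) -> Optional[bytes]:
    """Return the stored value, or None if the row or column is absent."""
    return table.get(row_key, {}).get(column)


table: Table = {}
# One row may mix tiny values and large blobs in different columns.
put(table, b"user#1001", b"profile:name", b"alice")
put(table, b"user#1001", b"blob:avatar", b"\x89PNG" + b"\x00" * 1024)
# Another row may carry only some of the columns (a sparse row).
put(table, b"user#1002", b"profile:name", b"bob")
```

Columns that a row does not set simply do not exist for that row, which is why sparse data such as user profiles fits this model well.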

About the separation of computing and storage

On the cloud, the core of an engine is the separation of storage and computing. Storage can be billed on demand, or at least scaled elastically. Compute is provisioned per node. If compute were charged purely by QPS, either the cost would be terrifyingly high or the model would fail to fit many scenarios. For example, if one request stores 10 MB, how many QPS should it count as? Because HBase inherently separates storage from computing, it is naturally well suited to cloud architectures, and it has more advantages on the cloud.

HBase scenarios

HBase can be described as both a database and a storage system. With this dual nature, it fits a wide range of application scenarios. Version 2.0 introduced off-heap memory use to reduce latency and meet online serving demands. It also introduced MOB (medium-sized object storage), which can store objects of about 10 MB and makes HBase well adapted to object storage. In addition, given its concurrency and storage capacity, it is arguably the most competitive engine.

  • Object storage: Many news feed and news applications store their articles, web pages, and pictures in HBase, and some antivirus companies store their virus databases in HBase as well.
  • Time series data: OpenTSDB runs on top of HBase and meets the needs of time series scenarios.
  • Recommendation and user profiles: A user profile is a relatively large sparse matrix. Ant Financial's risk control is built on HBase.
  • Spatiotemporal data: Mainly trajectories, meteorological grids, and similar data. Didi's trajectory data is mainly stored in HBase, and connected-car (Internet of Vehicles) companies with larger data volumes also store their data in HBase.
  • Cube OLAP: Kylin is a cube analysis tool whose underlying data is stored in HBase. Many customers also build their own cubes with offline computing and store them in HBase to serve online report queries.
  • Message/Order: In telecommunications and banking, the underlying storage for many order queries is HBase, and many communication and message synchronization applications are built on it.
  • Feed streams: Typical applications are those similar to xx Moments.
  • NewSQL: Phoenix runs on top of HBase and provides secondary indexes and SQL. It serves workloads that integrate with traditional data and need SQL but not transactions.
  • More scenarios remain to be explored.
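For the time series scenario in the list above, much of the design effort goes into the row key, because HBase keeps rows sorted by key. The following is a minimal sketch, assuming points are bucketed by metric and by hour so that one row holds an hour of data. It is loosely inspired by that general idea and is not OpenTSDB's actual byte layout. The names `row_key` and `qualifier`, the separator byte, and the sample metric are all hypothetical.

```python
import struct

HOUR = 3600  # seconds per row bucket (an assumption for this sketch)


def row_key(metric: str, ts: int, tags: dict) -> bytes:
    """Build a sortable key: metric, hour-aligned timestamp, then sorted tags."""
    base = ts - ts % HOUR  # align the timestamp down to the start of its hour
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    # Big-endian packing keeps byte order equal to numeric order.
    return metric.encode() + b"\x00" + struct.pack(">I", base) + tag_part.encode()


def qualifier(ts: int) -> bytes:
    """Column qualifier: offset in seconds of the point within its hour."""
    return struct.pack(">H", ts % HOUR)


# Two points in the same hour share a row and differ only in the qualifier.
k1 = row_key("cpu.load", 7200 + 15, {"host": "a"})
k2 = row_key("cpu.load", 7200 + 3599, {"host": "a"})
# A point in the next hour lands in a later row of the same metric.
k3 = row_key("cpu.load", 10800, {"host": "a"})
```

With keys of this shape, all rows for one metric are contiguous and ordered by time, so a time range query becomes a scan between two row keys.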

To recap, the list above is a simple classification of HBase scenarios. Later articles will present real cases for some of them. You are welcome to follow the HBase technical community.
