A very popular article in 2021: HBase principles and practice notes, jointly written by two senior engineers from Xiaomi and NetEase

Apache HBase is a highly available, high-performance, multi-version distributed NoSQL database built on Apache Hadoop. It is an open source implementation of Google BigTable. By building a large-scale structured storage cluster on inexpensive servers, it provides high-performance random read and write access to massive data.

Alibaba, Xiaomi, Tencent, NetEase, Huawei, Didi, Kuaishou, China Mobile, and others all regard HBase as extremely important infrastructure, and many of these companies have made long-term investments in the HBase community.

At present, few distributed persistent KV storage systems in the open source community have been widely accepted by the market, and HBase is one of the best of them. The open ecosystem of the Apache community has also allowed HBase to develop healthily, and HBase appears frequently at database and big data conferences around the world. Looking at the HBase ecosystem as a whole, frameworks such as Phoenix and Omid can be built on top of HBase to meet different business needs for SQL and transactions.

Within NetEase, HBase has evolved from supporting a single log storage workload to supporting hundreds of different businesses across all business departments at the same time. Storage systems based on HBase + SSD have been applied successfully in scenarios such as real-time recommendation and real-time risk control, as well as in more general scenarios such as log storage, order storage, and user profiles. I hope this article can help readers understand HBase more deeply and systematically.

This article systematically analyzes and explains the overall system architecture and core components of HBase from a design perspective. It also introduces commonly used performance tuning strategies and problem diagnosis techniques to help readers apply HBase in real production environments.

This article was created by HBase PMC members and senior NetEase engineers, and is recommended by many technical experts. It goes deep into the HBase core and works through, layer by layer, the basic theory, development, and operation and maintenance of the HBase database.

HBASE principle and practice PDF jointly compiled by two senior engineers of Xiaomi and Netease

Overview of HBase: this chapter opens the book and introduces HBase to readers through its historical development, data model, architecture, and system characteristics.

  • 1.1 HBase past and present
  • 1.2 HBase data model
  • 1.3 HBase architecture
  • 1.4 HBase system characteristics

Basic data structures and algorithms: this chapter introduces the core data structures of HBase, including skip lists, LSM trees, and Bloom filters. To reinforce the material, the authors also designed a lightweight KV storage engine, MiniBase, and provide some related programming exercises.

  • 2.1 Skip list
  • 2.2 LSM tree
  • 2.3 Bloom filter
  • 2.4 Designing the KV storage engine MiniBase
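To make the Bloom filter item above more concrete, here is a minimal sketch in Python. It is not taken from the book, from MiniBase, or from HBase itself; the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the use of SHA-256 with double hashing are all assumptions chosen for illustration. The property it demonstrates is the one that matters for a storage engine: a Bloom filter may report false positives, but never false negatives, so a "not present" answer can safely skip reading a file.

```python
import hashlib


class BloomFilter:
    """Illustrative fixed-size Bloom filter (hypothetical, not HBase's implementation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        if num_bits <= 0 or num_hashes <= 0:
            raise ValueError("num_bits and num_hashes must be positive")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from one digest via double hashing.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key):
        # Set every bit position derived from the key.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

A caller would typically check `might_contain(row_key)` first and only read the underlying file when it returns `True`. Larger `num_bits` lowers the false-positive rate at the cost of memory.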

Services that HBase depends on:

  • 3.1 Introduction to ZooKeeper
  • 3.2 ZooKeeper core configuration in HBase
  • 3.3 Introduction to HDFS
  • 3.4 File layout of HBase in HDFS

HBase client:

  • 4.1 HBase client implementation
  • 4.2 HBase client pitfall-avoidance guide

The core modules of RegionServer: this chapter breaks RegionServer down and gives an in-depth introduction to its core modules. Note that the discussion of each module in this chapter is limited to its core functions and internal structure, and does not cover its role in the overall HBase read and write process.

  • 5.1 Internal Structure of RegionServer
  • 5.2 HLog
  • 5.3 MemStore
  • 5.4 HFile
  • 5.5 BlockCache

HBase read and write process:

  • 6.1 HBase write process
  • 6.2 BulkLoad function
  • 6.3 HBase read process
  • 6.4 In-depth understanding of Coprocessor

Compaction implementation:

  • 7.1 Basic working principle of Compaction
  • 7.2 Advanced Compaction strategies

Load balancing implementation:

  • 8.1 Region Migration
  • 8.2 Region Merge
  • 8.3 Region Split
  • 8.4 Load Balancing Application of HBase

Principles of failure recovery: this chapter focuses on common RegionServer failures in HBase, the basic principles of failure recovery, and the process of recovering data after a crash.

  • 9.1 Analysis of common failures of HBase
  • 9.2 Basic Principles of HBase Failure Recovery
  • 9.3 HBase failure recovery process
  • 9.4 HBase failure time optimization

Replication:

  • 10.1 Replication scenarios and principles
  • 10.2 Serial replication
  • 10.3 Synchronous replication

Backup and restore:

  • 11.1 Overview of Snapshot
  • 11.2 Snapshot creation
  • 11.3 Snapshot recovery
  • 11.4 Advanced Snapshot

HBase operation and maintenance: drawing on the authors' years of production operation and maintenance experience, this chapter focuses on best practices for HBase in monitoring and alerting, performance testing, and business isolation.

  • 12.1 HBase system monitoring
  • 12.2 HBase cluster benchmark performance test
  • 12.3 HBase YCSB
  • 12.4 HBase business isolation
  • 12.5 HBase HBCK
  • 12.6 HBase core parameter configuration
  • 12.7 HBase table design
  • 12.8 Salted Table
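As a rough illustration of the salted table idea listed above, the following Python sketch prefixes each row key with a one-byte bucket number derived from a hash of the key, so that sequential keys (for example, timestamps) are spread across several key ranges instead of landing in a single hot region. This is an assumption-laden sketch and not the scheme used by the book, HBase, or Phoenix: the function names `salted_row_key` and `scan_prefixes`, the bucket count of 16, and the use of SHA-256 are all hypothetical.

```python
import hashlib


def salted_row_key(row_key, num_buckets=16):
    """Return row_key with a one-byte salt prefix (hypothetical scheme)."""
    if not 1 <= num_buckets <= 256:
        raise ValueError("num_buckets must fit in one byte (1..256)")
    # The salt is a deterministic function of the key, so readers can recompute it.
    bucket = hashlib.sha256(row_key).digest()[0] % num_buckets
    return bytes([bucket]) + row_key


def scan_prefixes(prefix, num_buckets=16):
    """Return the per-bucket prefixes a range scan would have to cover."""
    return [bytes([bucket]) + prefix for bucket in range(num_buckets)]
```

The trade-off shown here is the usual one for salting: point reads stay cheap because the salt can be recomputed from the key, while a prefix or range scan has to fan out into one scan per bucket and merge the results.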

HBase system tuning: this chapter covers tuning techniques and introduces tuning methods and ideas through the topics most relevant to HBase, including GC tuning, operating system tuning, and HBase read and write performance tuning.

  • 13.1 HBase GC tuning
  • 13.2 G1 GC performance tuning
  • 13.3 HBase operating system tuning
  • 13.4 HBase-HDFS tuning strategy
  • 13.5 HBase read performance optimization
  • 13.6 HBase write performance tuning

HBase operation and maintenance case analysis:

  • 14.1 RegionServer is down
  • 14.2 HBase write exception
  • 14.3 Analysis of problems during HBase operation and maintenance

HBase 2.x core technology:

  • 15.1 Procedure function
  • 15.2 In Memory Compaction
  • 15.3 MOB Object Storage
  • 15.4 Offheap read path and Offheap write path
  • 15.5 Asynchronous design

Advanced topics: this chapter introduces several advanced HBase topics. Secondary indexes are a database feature that developers use frequently, but the community version of HBase does not support them, so this chapter presents some common ideas for designing secondary indexes. Transactions are a core database feature that deserves discussion. HBase currently supports only single-row transactions and does not support cross-row transactions that span multiple partitions; this chapter gives design ideas for cross-row transactions. Finally, it introduces the HBase community's operating mechanism and HBase development and testing, which should be useful to readers interested in contributing to the community.

  • 16.1 Secondary Index
  • 16.2 Single-row transactions and cross-row transactions
  • 16.3 HBase Development and Testing

Readers who want this 328-page [HBASE principle and practice] technical document can forward this article, follow the editor, and scan the code below to get it.

Experts speak highly of this article

I hope this article helps developers learn HBase. If you like it, please repost it so that more people can benefit, and follow along for new technical articles every day.

Origin blog.csdn.net/bjmashibing001/article/details/111593597