Something about HBase architecture

§ Strength in unity

In primitive society, several clans related by blood banded together and lived collectively; this was the tribe. The chief was the supreme leader, and there was also a military leader, and together they helped the tribe thrive.

Toward the end of primitive society, wars became frequent. Tribes that lived close to each other or shared common interests formed temporary or permanent alliances to fight others together; this was the tribal alliance.

A tribal alliance is similar in nature to a tribe: it also has a supreme leader, governing bodies, and so on. In Chinese historical records, the Yellow Emperor, Chiyou, Yao, Shun, and Yu were all leaders of tribal alliances.

From this I take away at least two things. The first is that there is great strength in unity, which is obvious. The second is the famous diplomatic saying: "no permanent enemies, only permanent interests."

§ Tribes move into the computer world

Computers have only a few decades of history; compared with human society, they are truly "young." Although computers have become very powerful, the capability of a single machine always has an upper limit.

Since even primitive society understood the power of unity, let computers unite as well. A number of computers joined over a network form a tribe (or tribal alliance) of the computer world.

We have to elect one computer to be the "chief." The elected machine is usually called the Master node, and the remaining nodes are called Slaves; this is the Master/Slave arrangement we often talk about. But because Slave literally means a slave, the term is objected to in some countries, so there is another set of names: Leader/Follower. In Chinese, the arrangement is usually just called master/slave.

In the computer world, such a tribe is called a cluster.

§ HBase cluster

HBase is designed for massive storage capacity, so it must run as a cluster. Its "chief" is called the Master node, and each of the remaining nodes is called a Region Server.

To manage the tribe better, a chief usually appoints a top military leader to assist him. You can think of this role as the strategist we often hear about.

An HBase cluster also has its own "strategist": ZooKeeper. ZooKeeper is itself a cluster.

The chief's role is very important. Once the chief is killed, the whole tribe is left leaderless and prone to infighting, so a new chief must be elected immediately.

A chief needs strong abilities, and not just anyone can fill the role. So there usually needs to be a backup chief on standby at all times in case of emergency. It is, in effect, a spare tire.

In HBase, the Master node that is working normally is called the Active Master, and there is at least one standby node called the Backup Master. The two Masters communicate regularly to keep their information consistent.

Once the Active Master dies a heroic death, the "strategist" ZooKeeper is responsible for electing one of the Backup Masters as the new boss. Because it has been on standby all along, it can take over the work as soon as it steps up, with no "training period."

The Master node is responsible for management work, while the Region Server nodes do the actual work, such as reading and writing data.
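
The failover described above can be pictured with a small toy model. This is only a sketch: `ToyCoordinator` and the node names are hypothetical stand-ins for ZooKeeper and the Masters, and the "election" here is simply "the longest-registered live candidate wins," which is an assumption made for illustration and not the real ZooKeeper API.

```python
class ToyCoordinator:
    """Hypothetical stand-in for ZooKeeper: tracks live masters in arrival order."""

    def __init__(self):
        self.candidates = []  # live masters, oldest registration first

    def register(self, name):
        self.candidates.append(name)

    def active_master(self):
        # The longest-registered live candidate holds the "chief" role.
        return self.candidates[0] if self.candidates else None

    def backup_masters(self):
        return self.candidates[1:]

    def report_failure(self, name):
        # Dropping a dead node promotes the next candidate automatically.
        self.candidates.remove(name)


zk = ToyCoordinator()
zk.register("master-a")   # becomes the Active Master
zk.register("master-b")   # waits as a Backup Master
before = zk.active_master()
zk.report_failure("master-a")
after = zk.active_master()
print(before, "->", after)
```

Because the backup was already registered and waiting, the switch needs no extra step beyond noticing that the old chief is gone.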

§ Cutting the table apart

HBase implements column-oriented storage on top of rows, so a table is still organized row by row. It can support billions of rows, which makes a table a very long one.

In real life, long objects are troublesome to transport, so people cut them into sections, handle each section separately, and finally join them back together.

In the same way, a long HBase table is divided into several segments, and each segment is called a Region. This corresponds to horizontal partitioning of a table in a traditional relational database.

All the Regions joined together form a complete HBase table, just as all the carriages joined together form a complete train.
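
One way to picture the cutting is as a list of split points over the row keys. The sketch below assumes each Region covers a contiguous range of keys starting at its own start key; the split points and Region names are made up for illustration.

```python
import bisect

# Hypothetical split points: each Region covers [start_key, next_start_key).
region_start_keys = ["", "g", "n", "t"]   # "" means "from the very first row"
region_names = ["region-0", "region-1", "region-2", "region-3"]


def find_region(row_key):
    """Return the name of the Region whose key range contains row_key."""
    # bisect_right counts the start keys that are <= row_key;
    # the last of those start keys belongs to the Region we want.
    index = bisect.bisect_right(region_start_keys, row_key) - 1
    return region_names[index]


print(find_region("apple"))   # falls before "g"
print(find_region("g"))       # a start key belongs to the Region it opens
print(find_region("zebra"))   # falls after "t"
```

Every row key lands in exactly one Region, and reading the Regions in start-key order walks the whole table from the first carriage to the last.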

HBase is column-oriented, so when a row is stored, some column families may have no data. For example, a newly graduated student has no work experience, so the work-experience column family is empty.

Since the amount of data can differ greatly from one column family to another, column families are also stored separately: each column family gets its own storage unit, which is called a Store.

So a Region can contain multiple Stores.
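
A minimal sketch of that layout, assuming a Region is just a mapping from column family to its own Store. The class `ToyRegion`, the family names, and the sample rows are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


class ToyRegion:
    """Hypothetical model of a Region: one Store per column family."""

    def __init__(self, column_families):
        # Each Store maps row key -> {column: value}.
        self.stores = {family: defaultdict(dict) for family in column_families}

    def put(self, row_key, family, column, value):
        self.stores[family][row_key][column] = value

    def get_row(self, row_key):
        # Only families that actually hold data for this row appear in the result.
        return {
            family: dict(store[row_key])
            for family, store in self.stores.items()
            if row_key in store
        }


region = ToyRegion(["basic", "experience"])
region.put("student-1", "basic", "name", "Li")
region.put("employee-1", "basic", "name", "Wang")
region.put("employee-1", "experience", "company", "Acme")

print(region.get_row("student-1"))    # no "experience" entry at all
print(region.get_row("employee-1"))
```

The new graduate costs nothing in the experience Store, which is the point of keeping each column family apart.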

§ Storage structure

In an HBase cluster, the servers that actually handle the data are the many Region Servers. Each table is logically divided into many Regions.

Ultimately these Regions have to be assigned to Region Servers, and this assignment is done by the Master node.
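
As a rough illustration of assignment, the sketch below hands Regions out to Region Servers in round-robin order. Round-robin is an assumption chosen for simplicity; it is not a description of how the HBase Master actually balances load, and the function and names are hypothetical.

```python
from itertools import cycle


def assign_regions(regions, region_servers):
    """Spread regions over servers one at a time, wrapping around as needed."""
    assignment = {server: [] for server in region_servers}
    for region, server in zip(regions, cycle(region_servers)):
        assignment[server].append(region)
    return assignment


plan = assign_regions(["r0", "r1", "r2", "r3", "r4"], ["rs1", "rs2"])
for server, hosted in plan.items():
    print(server, hosted)
```

Each Region ends up on exactly one server, while each server hosts several Regions.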

HBase is designed to support real-time reads and writes, so writes must be fast. There is also an implicit prerequisite: the data must be kept safe.

Writing data to memory is very fast (think of Redis), but data is only considered safe once it has been written to disk.

A Region Server hosts many Regions. With a large number of concurrent writes, the data ends up at different locations on disk, and moving the disk head back and forth to seek is a very large overhead. Only the remaining time is spent actually writing data.

§ How to optimize

Copying or deleting many small files is very time-consuming. If they are first packed into a single archive and then copied or deleted, it is much faster.

Writes can be optimized in the same way, by reducing the number of files written to. If we write to only one file and always append at its end, disk head movement is kept to a minimum.

In HBase this method is called the write-ahead log (WAL). A write operation can return as soon as its data has been appended to the log file.

A Region Server has only one such WAL file, which is shared by all the Stores of all the Regions on that server.
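
The append-only idea can be sketched as follows. `ToyWAL`, its JSON-lines record format, and the file name are all assumptions made for illustration; the real HBase WAL has its own format and lives on HDFS.

```python
import json
import os
import tempfile


class ToyWAL:
    """Hypothetical single log file shared by every Region and Store on a server."""

    def __init__(self, path):
        self.path = path

    def append(self, region, family, row_key, value):
        record = {"region": region, "family": family, "row": row_key, "value": value}
        # Mode "a" always writes at the end of the file, whoever the writer is.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(record) + "\n")
            log.flush()
            os.fsync(log.fileno())  # ask the OS to push the bytes to disk

    def replay(self):
        # Reading the log back returns the writes in the order they arrived.
        with open(self.path, encoding="utf-8") as log:
            return [json.loads(line) for line in log]


wal_dir = tempfile.mkdtemp()
wal = ToyWAL(os.path.join(wal_dir, "server.wal"))
wal.append("region-0", "basic", "row-1", "a")
wal.append("region-7", "experience", "row-900", "b")
wal.append("region-0", "basic", "row-2", "c")
records = wal.replay()
print([r["region"] for r in records])
```

Writes for different Regions are interleaved in one file, so the disk only ever has to extend a single file instead of jumping between many.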

The write-ahead log's data format is not suitable for final storage, so each Store also has a MemStore, a data structure that resides in memory. It collects all the data written and keeps it sorted by row key.

When certain conditions are met, the data in the MemStore is flushed and persisted to disk. The final data is stored on HDFS in the form of StoreFiles.
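
The flush step might look roughly like this. The sketch makes several assumptions: the flush condition is a simple row count (`flush_threshold` is hypothetical), the MemStore is a plain dict that is sorted only at flush time, and StoreFiles are kept as in-memory tuples instead of real files on HDFS.

```python
class ToyStore:
    """Hypothetical Store: an in-memory MemStore plus a list of flushed StoreFiles."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # assumed condition: flush after N rows
        self.memstore = {}       # row key -> value, held in memory
        self.store_files = []    # each flush adds one immutable, sorted file

    def put(self, row_key, value):
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.memstore:
            return
        # Persist the rows in row-key order, then start a fresh MemStore.
        self.store_files.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, row_key):
        # The newest data wins: check memory first, then files from newest to oldest.
        if row_key in self.memstore:
            return self.memstore[row_key]
        for store_file in reversed(self.store_files):
            for key, value in store_file:
                if key == row_key:
                    return value
        return None


store = ToyStore(flush_threshold=3)
for key in ["row-c", "row-a", "row-b", "row-d"]:
    store.put(key, key.upper())

print(store.store_files)   # one sorted StoreFile holding row-a, row-b, row-c
print(store.memstore)      # row-d is still waiting in memory
```

Rows arrive in arbitrary order, yet every StoreFile comes out sorted by row key, which is what makes it a suitable final storage format.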

Origin www.cnblogs.com/CQqf2019/p/11242316.html