HBase Coprocessors and the Java API

Coprocessor concept

There are two types of coprocessors: observers and endpoints.

1. Observer coprocessors

An observer is similar to a trigger in a traditional database: the server calls this type of coprocessor when certain events occur. Observer coprocessors are a set of hooks scattered throughout the HBase server code and invoked at fixed points. For example, the prePut hook runs on the RegionServer before a put operation is executed, and the postPut hook runs after it.

Taking HBase 2.0.0 as an example, it provides three observer interfaces:

RegionObserver: provides hooks for client-side data-manipulation events such as Get, Put, Delete, and Scan.
WALObserver: provides hooks for WAL-related operations.
MasterObserver: provides hooks for DDL-type operations, such as creating, deleting, and modifying tables.
A RegionServerObserver was also added in version 0.96. The following, using RegionObserver as an example, explains how an observer coprocessor works:

1. The client initiates a Get request.
2. The request is dispatched to the appropriate RegionServer and Region.
3. The RegionCoprocessorHost intercepts the request and calls preGet() on every RegionObserver registered on the table.
4. If the request is not intercepted by preGet(), it continues on to the Region, which processes it.
5. The result produced by the Region is intercepted by the RegionCoprocessorHost again, which calls postGet() to process it.
6. If postGet() does not intercept the response, the final result is returned to the client.


2. Endpoint coprocessors

Endpoint coprocessors are similar to stored procedures in traditional databases. The client can invoke an endpoint coprocessor to execute a piece of code on the server side and have the result returned for further processing. The most common use is aggregation. Without a coprocessor, a user who needs to find the maximum value in a table (a max aggregation) must perform a full table scan, traverse the scan results in client code, and compute the maximum there. Such an approach cannot exploit the concurrency of the underlying cluster, and concentrating all computation on the client is bound to be inefficient.
With a coprocessor, the user deploys the max-finding code to the HBase servers, and HBase runs it concurrently on multiple nodes of the underlying cluster: the code executes inside each Region, each RegionServer computes the maximum for its Regions, and only those per-Region maxima are returned to the client. The client then takes the maximum of these values to obtain the global maximum. Overall execution efficiency improves greatly as a result.
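The divide-and-conquer idea behind this max example can be sketched in plain Java, independent of any HBase API. Each inner list below stands in for the rows held by one Region; regionMax is the part that would run server-side, and globalMax is the client-side reduction. All names and values are illustrative, not part of HBase:

```java
import java.util.Arrays;
import java.util.List;

public class MaxAggregationSketch {

    // Server side (conceptual): each Region scans only its own rows
    // and returns a single local maximum.
    static long regionMax(List<Long> regionRows) {
        return regionRows.stream().mapToLong(Long::longValue).max().orElse(Long.MIN_VALUE);
    }

    // Client side: reduce the per-Region maxima to the global maximum.
    static long globalMax(List<List<Long>> regions) {
        return regions.stream().mapToLong(MaxAggregationSketch::regionMax).max().orElse(Long.MIN_VALUE);
    }

    public static void main(String[] args) {
        List<List<Long>> regions = Arrays.asList(
                Arrays.asList(3L, 17L, 9L),   // rows held by Region 1
                Arrays.asList(42L, 5L),       // rows held by Region 2
                Arrays.asList(28L, 31L));     // rows held by Region 3
        System.out.println(globalMax(regions)); // prints 42
    }
}
```

The point of the split is that each per-Region scan runs in parallel on its own server, and only one value per Region crosses the network instead of the full table.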

Coprocessor Java API

1. pom.xml configuration
    <!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase-client -->
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>2.2.4</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.hbase/hbase-server -->
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-server</artifactId>
        <version>2.2.4</version>
    </dependency>
2. Create a class in a custom package

The package name in this example is com.niitchina.hbasedemo.coprocessor and the class name is MyRegionObserver. This path and name are tied to the configuration in later steps; if you want to avoid editing the configuration, copy these names exactly.

3. Write code
package com.niitchina.hbasedemo.coprocessor;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CoprocessorEnvironment;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.*;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.wal.WALEdit;

import java.io.FileWriter;
import java.io.IOException;
import java.util.List;
import java.util.Optional;

public class MyRegionObserver implements RegionObserver, RegionCoprocessor {

    // Since HBase 2.x, a RegionCoprocessor must expose its RegionObserver.
    @Override
    public Optional<RegionObserver> getRegionObserver() {
        return Optional.of(this);
    }

    // Append a line to a log file so we can see when each hook fires.
    private static void outInfo(String str) {
        try {
            FileWriter fw = new FileWriter("/training/hbase-2.2.4/coprocessor.txt", true);
            fw.write(str + "\r\n");
            fw.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public void start(CoprocessorEnvironment env) throws IOException {
        RegionCoprocessor.super.start(env);
        outInfo("MyRegionObserver.start()");
    }

    @Override
    public void stop(CoprocessorEnvironment env) throws IOException {
    }

    @Override
    public void preGetOp(ObserverContext<RegionCoprocessorEnvironment> e, Get get, List<Cell> results) throws IOException {
        RegionObserver.super.preGetOp(e, get, results);
        String rowkey = Bytes.toString(get.getRow());
        // Custom code here; this runs before the get operation.
        outInfo("MyRegionObserver.preGetOp() : Before get operation rowkey = " + rowkey);
    }

    @Override
    public void postGetOp(ObserverContext<RegionCoprocessorEnvironment> e, Get get, List<Cell> results) throws IOException {
        RegionObserver.super.postGetOp(e, get, results);
        String rowkey = Bytes.toString(get.getRow());
        // Custom code here; this runs after the get operation.
        outInfo("MyRegionObserver.postGetOp() : After get operation rowkey = " + rowkey);
    }

    @Override
    public void prePut(ObserverContext<RegionCoprocessorEnvironment> c, Put put, WALEdit edit, Durability durability) throws IOException {
        RegionObserver.super.prePut(c, put, edit, durability);
        String rowkey = Bytes.toString(put.getRow());
        // Custom code here; this runs before the put operation.
        outInfo("MyRegionObserver.prePut() : rowkey = " + rowkey);
    }

    @Override
    public void postPut(ObserverContext<RegionCoprocessorEnvironment> c, Put put, WALEdit edit, Durability durability) throws IOException {
        RegionObserver.super.postPut(c, put, edit, durability);
        String rowkey = Bytes.toString(put.getRow());
        // Custom code here; this runs after the put operation.
        outInfo("MyRegionObserver.postPut() : rowkey = " + rowkey);
    }

    @Override
    public void preDelete(ObserverContext<RegionCoprocessorEnvironment> e, Delete delete, WALEdit edit, Durability durability) throws IOException {
        RegionObserver.super.preDelete(e, delete, edit, durability);
        String rowkey = Bytes.toString(delete.getRow());
        // Custom code here; this runs before the delete operation.
        outInfo("MyRegionObserver.preDelete() : rowkey = " + rowkey);
    }

    @Override
    public void postDelete(ObserverContext<RegionCoprocessorEnvironment> e, Delete delete, WALEdit edit, Durability durability) throws IOException {
        RegionObserver.super.postDelete(e, delete, edit, durability);
        String rowkey = Bytes.toString(delete.getRow());
        // Custom code here; this runs after the delete operation.
        outInfo("MyRegionObserver.postDelete() : rowkey = " + rowkey);
    }
}
4. Package the project into a jar

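The screenshots for this step did not survive translation. As a sketch of the same result, assuming the Maven project from step 1 (the jar file name below is an assumption; check your pom.xml's artifactId and version), the jar can be built on the command line and copied to where HBase can load it:

```shell
# Build the jar from the project root (the directory containing pom.xml).
mvn clean package

# Copy the resulting jar onto every node, e.g. into HBase's lib directory,
# so the class configured in the next step is on the RegionServer classpath.
# The jar name and the install path /training/hbase-2.2.4 are assumptions.
cp target/hbasedemo-1.0-SNAPSHOT.jar /training/hbase-2.2.4/lib/
```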

5. HBase configuration

1. In the conf folder under the HBase installation path, edit hbase-site.xml and add the property below. com.niitchina.hbasedemo.coprocessor is the package name and MyRegionObserver is the class name; if you used different names in the earlier steps, replace them here accordingly.

<property>
    <name>hbase.coprocessor.region.classes</name>
    <value>com.niitchina.hbasedemo.coprocessor.MyRegionObserver</value>
</property>
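As an alternative (standard HBase practice, though not what the original post does), the observer can be loaded dynamically onto a single table from the HBase shell. This avoids restarting HBase and does not apply the class to every table. The table name 'demo' and the jar location are assumptions; the attribute value format is jar-path|class-name|priority|arguments:

```shell
hbase> disable 'demo'
hbase> alter 'demo', METHOD => 'table_att', 'coprocessor' =>
    'hdfs:///user/hbase/coprocessor.jar|com.niitchina.hbasedemo.coprocessor.MyRegionObserver|1001|'
hbase> enable 'demo'
```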

2. Restart HBase.

6. Run put, delete, and get operations to trigger the coprocessor

The log file is written to the path set in our code, /training/hbase-2.2.4/coprocessor.txt. You can see that the log file has been created.
Open it with vi coprocessor.txt and examine the log entries to confirm the hooks ran successfully.
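To reproduce the run without the screenshots, the hooks can be triggered from the HBase shell. The table and column names here are assumptions; any table will do, since the observer is registered globally via hbase-site.xml:

```shell
hbase> create 'demo', 'info'
hbase> put 'demo', 'row1', 'info:name', 'alice'   # fires prePut/postPut
hbase> get 'demo', 'row1'                         # fires preGetOp/postGetOp
hbase> delete 'demo', 'row1', 'info:name'         # fires preDelete/postDelete
```

Afterwards, check /training/hbase-2.2.4/coprocessor.txt on the node hosting the table's region for the corresponding log lines.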

Origin blog.csdn.net/agatha_aggie/article/details/127961928