Using YCSB to test HBase

This article is reprinted from: http://www.cnblogs.com/gpcuster/archive/2011/08/16/2141430.html Author: gpcuster. If you reproduce it, please include this statement.

YCSB Introduction

YCSB (Yahoo! Cloud Serving Benchmark) is a general-purpose performance testing tool open-sourced by Yahoo.

With this tool we can run performance tests against many kinds of NoSQL products.

For instructions on using YCSB, refer to:

  1. Getting Started
  2. Running a Workload
  3. Adding a Database

Compared with the performance testing tool that ships with HBase (PerformanceEvaluation), YCSB has the following advantages:

  • Extensible: the client under test is not tied to a single product, and it can also target different versions of HBase.
  • Flexible: when running a test you can choose the test mode (read + write, read + scan, and so on), as well as the frequency of each operation and the way keys are selected.
  • Monitoring:
    • During a test, progress is displayed in real time:
      1340 sec: 751515 operations; 537.74 current ops/sec; [INSERT AverageLatency(ms)= 1.77 ]
      1350 sec: 755945 operations; 442.82 current ops/sec; [INSERT AverageLatency(ms)= 2.18 ]
      1360 sec: 761545 operations; 559.72 current ops/sec; [INSERT AverageLatency(ms)= 1.71 ]
      1370 sec: 767616 operations; 606.92 current ops/sec; [INSERT AverageLatency(ms)= 1.58 ]
    • After the test completes, an overall summary is reported:
      [OVERALL], RunTime(ms), 1762019.0
      [OVERALL], Throughput(ops/sec), 567.5307700995279
      [INSERT], Operations, 1000000
      [INSERT], AverageLatency(ms), 1.698302
      [INSERT], MinLatency(ms), 0
      [INSERT], MaxLatency(ms), 14048
      [INSERT], 95thPercentileLatency(ms), 2
      [INSERT], 99thPercentileLatency(ms), 3
      [INSERT], Return= 0 , 1000000
      [INSERT], 0 , 29
      [INSERT], 1 , 433925
      [INSERT], 2 , 549176
      [INSERT], 3 , 10324
      [INSERT], 4 , 3629
      [INSERT], 5 , 1303
      [INSERT], 6 , 454
      [INSERT], 7 , 140
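
The summary figures above are related in a simple way: overall throughput is the number of operations divided by the run time in seconds. The following is a minimal sketch, not part of YCSB, that parses summary lines of the form shown above and recomputes the throughput. The class name `YcsbSummaryCheck` and its methods are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of YCSB): reads "[SECTION], Metric, value" lines.
public class YcsbSummaryCheck {

    /** Parses summary lines into a map keyed by "SECTION/Metric"; other lines are ignored. */
    public static Map<String, Double> parse(List<String> lines) {
        Map<String, Double> metrics = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s*,\\s*");
            if (parts.length != 3 || !parts[0].startsWith("[") || !parts[0].endsWith("]")) {
                continue; // progress lines and anything else are skipped
            }
            String section = parts[0].substring(1, parts[0].length() - 1);
            try {
                metrics.put(section + "/" + parts[1], Double.parseDouble(parts[2]));
            } catch (NumberFormatException e) {
                // non-numeric value: skip this line
            }
        }
        return metrics;
    }

    /** Operations per second for one operation type, derived from the overall run time. */
    public static double throughput(Map<String, Double> metrics, String operation) {
        double seconds = metrics.get("OVERALL/RunTime(ms)") / 1000.0;
        return metrics.get(operation + "/Operations") / seconds;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "[OVERALL], RunTime(ms), 1762019.0",
                "[OVERALL], Throughput(ops/sec), 567.5307700995279",
                "[INSERT], Operations, 1000000");
        Map<String, Double> metrics = parse(lines);
        System.out.println("recomputed ops/sec: " + throughput(metrics, "INSERT"));
        System.out.println("reported ops/sec:   " + metrics.get("OVERALL/Throughput(ops/sec)"));
    }
}
```

With the values from the run above, the recomputed figure should agree with the reported `Throughput(ops/sec)` line up to floating-point rounding.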

Shortcomings of YCSB:

The built-in workload models are too simple, and there is no MapReduce-based way to run a test, so running a test with a high degree of parallelism is rather cumbersome.

For example, when loading data, the only options are to use multiple threads or to start several load processes, each with a different "start key value" in its startup parameters. During the transaction phase, the only option is to run multiple threads on multiple machines.
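
To illustrate the manual splitting described above, here is a small sketch that divides a record count into contiguous key ranges, one per load process. It prints the ranges as `insertstart` and `insertcount` properties; those names come from YCSB's documentation on running a load in parallel, and whether YCSB-0.1.3 honors both of them is an assumption to verify. The class `LoadPartitioner` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits [0, recordCount) into one contiguous range per load process.
public class LoadPartitioner {

    /** Returns one {start, count} pair per process; the counts sum to recordCount. */
    public static List<long[]> partition(long recordCount, int processes) {
        if (recordCount < 0 || processes <= 0) {
            throw new IllegalArgumentException("recordCount must be >= 0 and processes > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        long base = recordCount / processes;
        long remainder = recordCount % processes;
        long start = 0;
        for (int i = 0; i < processes; i++) {
            // The first `remainder` processes each take one extra record.
            long count = base + (i < remainder ? 1 : 0);
            ranges.add(new long[] {start, count});
            start += count;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed property names: insertstart / insertcount (check against your YCSB version).
        for (long[] range : partition(1000000L, 4)) {
            System.out.println("-p insertstart=" + range[0] + " -p insertcount=" + range[1]);
        }
    }
}
```

Each printed fragment would be appended to the load command of one process, so that no two processes insert the same keys.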

Using YCSB to test HBase-0.90.4

Download the YCSB-0.1.3 source code from the official website:

http://github.com/brianfrankcooper/YCSB/tarball/0.1.3

Compile YCSB:

[hdfs@hd0004-sw1 guopeng]$ cd YCSB-0.1.3/
[hdfs@hd0004-sw1 YCSB-0.1.3]$ pwd
/home/hdfs/guopeng/YCSB-0.1.3
[hdfs@hd0004-sw1 YCSB-0.1.3]$ ant
Buildfile: /home/hdfs/guopeng/YCSB-0.1.3/build.xml

compile:
[javac] /home/hdfs/guopeng/YCSB-0.1.3/build.xml:50: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds

makejar:

BUILD SUCCESSFUL
Total time: 0 seconds
[hdfs@hd0004-sw1 YCSB-0.1.3]$

Because the HBase client code that ships with YCSB has some compatibility problems, we replace the bundled file (db/hbase/src/com/yahoo/ycsb/db/HBaseClient.java) with the following code:

package com.yahoo.ycsb.db;

import java.io.IOException;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.Vector;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

import com.yahoo.ycsb.DBException;

/**
 * HBase client for YCSB framework
 *
 * @see http://blog.data-works.org
 * @see http://gpcuster.cnblogs.com/
 */
public class HBaseClient extends com.yahoo.ycsb.DB {
    private static final Configuration config = new Configuration();

    static {
        config.addResource("hbase-default.xml");
        config.addResource("hbase-site.xml");
    }

    public boolean _debug = false;

    public String _table = "";
    public HTable _hTable = null;
    public String _columnFamily = "";
    public byte _columnFamilyBytes[];

    public static final int Ok = 0;
    public static final int ServerError = -1;
    public static final int HttpError = -2;
    public static final int NoMatchingRecord = -3;

    public static final Object tableLock = new Object();

    /**
     * Initialize any state for this DB. Called once per DB instance; there is
     * one DB instance per client thread.
     */
    public void init() throws DBException {
        if ("true".equals(getProperties().getProperty("debug"))) {
            _debug = true;
        }

        _columnFamily = getProperties().getProperty("columnfamily");
        if (_columnFamily == null) {
            System.err.println("Error, must specify a columnfamily for HBase table");
            throw new DBException("No columnfamily specified");
        }
        _columnFamilyBytes = Bytes.toBytes(_columnFamily);

        // read hbase client settings.
        for (Object key : getProperties().keySet()) {
            String pKey = key.toString();
            if (pKey.startsWith("hbase.")) {
                String pValue = getProperties().getProperty(pKey);
                if (pValue != null) {
                    config.set(pKey, pValue);
                }
            }
        }
    }

    /**
     * Cleanup any state for this DB. Called once per DB instance; there is one
     * DB instance per client thread.
     */
    public void cleanup() throws DBException {
        try {
            if (_hTable != null) {
                _hTable.flushCommits();
            }
        } catch (IOException e) {
            throw new DBException(e);
        }
    }

    public void getHTable(String table) throws IOException {
        synchronized (tableLock) {
            _hTable = new HTable(config, table);
        }
    }

    /**
     * Read a record from the database. Each field/value pair from the result
     * will be stored in a HashMap.
     *
     * @param table The name of the table
     * @param key The record key of the record to read.
     * @param fields The list of fields to read, or null for all of them
     * @param result A HashMap of field/value pairs for the result
     * @return Zero on success, a non-zero error code on error
     */
    public int read(String table, String key, Set<String> fields,
            HashMap<String, String> result) {
        // if this is a "new" table, init HTable object. Else, use existing one
        if (!_table.equals(table)) {
            _hTable = null;
            try {
                getHTable(table);
                _table = table;
            } catch (IOException e) {
                System.err.println("Error accessing HBase table: " + e);
                return ServerError;
            }
        }

        Result r = null;
        try {
            if (_debug) {
                System.out.println("Doing read from HBase columnfamily " + _columnFamily);
                System.out.println("Doing read for key: " + key);
            }
            Get g = new Get(Bytes.toBytes(key));
            if (fields == null) {
                g.addFamily(_columnFamilyBytes);
            } else {
                for (String field : fields) {
                    g.addColumn(_columnFamilyBytes, Bytes.toBytes(field));
                }
            }
            r = _hTable.get(g);
        } catch (IOException e) {
            System.err.println("Error doing get: " + e);
            return ServerError;
        } catch (ConcurrentModificationException e) {
            // do nothing for now...need to understand HBase concurrency model
            // better
            return ServerError;
        }

        for (KeyValue kv : r.raw()) {
            result.put(Bytes.toString(kv.getQualifier()),
                    Bytes.toString(kv.getValue()));
            if (_debug) {
                System.out.println("Result for field: "
                        + Bytes.toString(kv.getQualifier()) + " is: "
                        + Bytes.toString(kv.getValue()));
            }
        }
        return Ok;
    }

    /**
     * Perform a range scan for a set of records in the database. Each
     * field/value pair from the result will be stored in a HashMap.
     *
     * @param table The name of the table
     * @param startkey The record key of the first record to read.
     * @param recordcount The number of records to read
     * @param fields The list of fields to read, or null for all of them
     * @param result A Vector of HashMaps, where each HashMap is a set of
     *            field/value pairs for one record
     * @return Zero on success, a non-zero error code on error
     */
    public int scan(String table, String startkey, int recordcount,
            Set<String> fields, Vector<HashMap<String, String>> result) {
        // if this is a "new" table, init HTable object. Else, use existing one
        if (!_table.equals(table)) {
            _hTable = null;
            try {
                getHTable(table);
                _table = table;
            } catch (IOException e) {
                System.err.println("Error accessing HBase table: " + e);
                return ServerError;
            }
        }

        Scan s = new Scan(Bytes.toBytes(startkey));
        // HBase has no record limit. Here, assume recordcount is small enough
        // to bring back in one call.
        // We get back recordcount records
        s.setCaching(recordcount);

        // add specified fields or else all fields
        if (fields == null) {
            s.addFamily(_columnFamilyBytes);
        } else {
            for (String field : fields) {
                s.addColumn(_columnFamilyBytes, Bytes.toBytes(field));
            }
        }

        // get results
        ResultScanner scanner = null;
        try {
            scanner = _hTable.getScanner(s);
            int numResults = 0;
            for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                // get row key
                String key = Bytes.toString(rr.getRow());
                if (_debug) {
                    System.out.println("Got scan result for key: " + key);
                }

                HashMap<String, String> rowResult = new HashMap<String, String>();

                for (KeyValue kv : rr.raw()) {
                    rowResult.put(Bytes.toString(kv.getQualifier()),
                            Bytes.toString(kv.getValue()));
                }
                // add rowResult to result vector
                result.add(rowResult);
                numResults++;
                if (numResults >= recordcount) {
                    // hit recordcount, bail out
                    break;
                }
            }
        } catch (IOException e) {
            if (_debug) {
                System.out.println("Error in getting/parsing scan result: " + e);
            }
            return ServerError;
        } finally {
            // getScanner() may have thrown before scanner was assigned
            if (scanner != null) {
                scanner.close();
            }
        }

        return Ok;
    }

    /**
     * Update a record in the database. Any field/value pairs in the specified
     * values HashMap will be written into the record with the specified record
     * key, overwriting any existing values with the same field name.
     *
     * @param table The name of the table
     * @param key The record key of the record to write
     * @param values A HashMap of field/value pairs to update in the record
     * @return Zero on success, a non-zero error code on error
     */
    public int update(String table, String key, HashMap<String, String> values) {
        // if this is a "new" table, init HTable object. Else, use existing one
        if (!_table.equals(table)) {
            _hTable = null;
            try {
                getHTable(table);
                _table = table;
            } catch (IOException e) {
                System.err.println("Error accessing HBase table: " + e);
                return ServerError;
            }
        }

        if (_debug) {
            System.out.println("Setting up put for key: " + key);
        }
        Put p = new Put(Bytes.toBytes(key));
        for (Map.Entry<String, String> entry : values.entrySet()) {
            if (_debug) {
                System.out.println("Adding field/value " + entry.getKey() + "/"
                        + entry.getValue() + " to put request");
            }
            p.add(_columnFamilyBytes, Bytes.toBytes(entry.getKey()),
                    Bytes.toBytes(entry.getValue()));
        }

        try {
            _hTable.put(p);
        } catch (IOException e) {
            if (_debug) {
                System.err.println("Error doing put: " + e);
            }
            return ServerError;
        } catch (ConcurrentModificationException e) {
            // do nothing for now...hope this is rare
            return ServerError;
        }

        return Ok;
    }

    /**
     * Insert a record in the database. Any field/value pairs in the specified
     * values HashMap will be written into the record with the specified record
     * key.
     *
     * @param table The name of the table
     * @param key The record key of the record to insert.
     * @param values A HashMap of field/value pairs to insert in the record
     * @return Zero on success, a non-zero error code on error
     */
    public int insert(String table, String key, HashMap<String, String> values) {
        return update(table, key, values);
    }

    /**
     * Delete a record from the database.
     *
     * @param table The name of the table
     * @param key The record key of the record to delete.
     * @return Zero on success, a non-zero error code on error
     */
    public int delete(String table, String key) {
        // if this is a "new" table, init HTable object. Else, use existing one
        if (!_table.equals(table)) {
            _hTable = null;
            try {
                getHTable(table);
                _table = table;
            } catch (IOException e) {
                System.err.println("Error accessing HBase table: " + e);
                return ServerError;
            }
        }

        if (_debug) {
            System.out.println("Doing delete for key: " + key);
        }

        Delete d = new Delete(Bytes.toBytes(key));
        try {
            _hTable.delete(d);
        } catch (IOException e) {
            if (_debug) {
                System.err.println("Error doing delete: " + e);
            }
            return ServerError;
        }

        return Ok;
    }
}

With the modified HBase client, the client settings needed for a test can be specified directly as command-line parameters, for example the ZooKeeper connection information (-p hbase.zookeeper.quorum=hd0004-sw1.dc.sh-wgq.sdo.com,hd0001-sw1.dc.sh-wgq.sdo.com), the client's local write buffer size (-p hbase.client.write.buffer=100), and so on.

Then copy the JAR packages and configuration files that are needed to compile and run the client.
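
The forwarding works because init() copies every property whose name starts with "hbase." into the HBase configuration. The following standalone sketch shows the same filtering step using only java.util classes, so it can be run without HBase on the classpath. The class `HBasePropertyFilter` is a hypothetical name, and the returned map stands in for the `Configuration` object used in the real client.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical stand-in for the "hbase." forwarding loop in HBaseClient.init().
public class HBasePropertyFilter {

    /** Returns only the properties whose names start with "hbase.", sorted by name. */
    public static Map<String, String> hbaseSettings(Properties props) {
        Map<String, String> settings = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith("hbase.")) {
                settings.put(name, props.getProperty(name));
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Values as they would arrive from "-p name=value" arguments.
        props.setProperty("columnfamily", "f1");
        props.setProperty("hbase.client.write.buffer", "100");
        props.setProperty("hbase.zookeeper.quorum", "hd0004-sw1.dc.sh-wgq.sdo.com");

        for (Map.Entry<String, String> e : hbaseSettings(props).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Properties such as columnfamily are not forwarded, because they are read by the YCSB client itself rather than by HBase.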

[hdfs@hd0004-sw1 YCSB-0.1.3]$ cp ~/hbase-current/*.jar ~/hbase-current/lib/*.jar ~/hbase-current/conf/hbase-*.xml db/hbase/lib/
[hdfs@hd0004-sw1 YCSB-0.1.3]$

Now compile the HBase client:

[hdfs@hd0004-sw1 YCSB-0.1.3]$ ant dbcompile-hbase
Buildfile: /home/hdfs/guopeng/YCSB-0.1.3/build.xml

compile:
[javac] /home/hdfs/guopeng/YCSB-0.1.3/build.xml:50: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds

makejar:

dbcompile-hbase:

dbcompile:
[javac] /home/hdfs/guopeng/YCSB-0.1.3/build.xml:63: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds

makejar:

BUILD SUCCESSFUL
Total time: 0 seconds
[hdfs@hd0004-sw1 YCSB-0.1.3]$

Finally, create the HBase test table (usertable):

hbase(main):004:0> create 'usertable', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
0 row(s) in 1.2940 seconds

The environment is now ready.

The following command then starts loading the data to be tested:

java -cp build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=f1 -p recordcount=1000000 -p hbase.zookeeper.quorum=hd0004-sw1.dc.sh-wgq.sdo.com,hd0001-sw1.dc.sh-wgq.sdo.com,hd0003-sw1.dc.sh-wgq.sdo.com,hd0149-sw18.dc.sh-wgq.sdo.com,hd0165-sw13.dc.sh-wgq.sdo.com -s
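
The command above runs the load phase. As a sketch of how the follow-up transaction phase might be launched, the helper below assembles the same command line but swaps -load for -t and adds a -threads option; both are options of the YCSB client, but the exact combination used here is an assumption and was not part of the original test. The class `YcsbCommandLine` and the host names in main() are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles a YCSB command line for the load or transaction phase.
public class YcsbCommandLine {

    public static List<String> build(boolean loadPhase, String columnFamily,
            String zkQuorum, int threads) {
        List<String> cmd = new ArrayList<>(Arrays.asList(
                "java", "-cp", "build/ycsb.jar:db/hbase/lib/*", "com.yahoo.ycsb.Client"));
        cmd.add(loadPhase ? "-load" : "-t"); // -t selects the transaction phase
        cmd.addAll(Arrays.asList("-db", "com.yahoo.ycsb.db.HBaseClient"));
        cmd.addAll(Arrays.asList("-P", "workloads/workloada"));
        cmd.addAll(Arrays.asList("-p", "columnfamily=" + columnFamily));
        cmd.addAll(Arrays.asList("-p", "hbase.zookeeper.quorum=" + zkQuorum));
        if (threads > 1) {
            cmd.addAll(Arrays.asList("-threads", String.valueOf(threads)));
        }
        cmd.add("-s"); // print status periodically, as in the load command
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder quorum; substitute the real ZooKeeper hosts.
        List<String> cmd = build(false, "f1", "zk1.example.com,zk2.example.com", 4);
        System.out.println(String.join(" ", cmd));
    }
}
```

The printed line can be pasted into a shell, or the list can be handed to a ProcessBuilder if the run should be started from Java.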


Origin blog.csdn.net/xfxf996/article/details/103887254