- Configuring SparkSQL access to HBase
- Test verification
Configuring SparkSQL access to HBase:
- Copy the HBase-related jar packages to the $SPARK_HOME/lib directory on the Spark nodes. The list is as follows (a copy sketch follows the list):
```
guava-14.0.1.jar
htrace-core-3.1.0-incubating.jar
hbase-common-1.1.2.2.4.2.0-258.jar
hbase-common-1.1.2.2.4.2.0-258-tests.jar
hbase-client-1.1.2.2.4.2.0-258.jar
hbase-server-1.1.2.2.4.2.0-258.jar
hbase-protocol-1.1.2.2.4.2.0-258.jar
hive-hbase-handler-1.2.1000.2.4.2.0-258.jar
```
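A minimal copy sketch, assuming the stock HDP 2.4.2 layout in which the HBase jars sit under /usr/hdp/2.4.2.0-258/hbase/lib and the Hive HBase handler under /usr/hdp/2.4.2.0-258/hive/lib; verify the source paths on your own nodes before running it:

```bash
# Assumed source locations for the jars listed above; adjust to your cluster layout.
HDP=/usr/hdp/2.4.2.0-258
cp ${HDP}/hbase/lib/guava-*.jar \
   ${HDP}/hbase/lib/htrace-core-*.jar \
   ${HDP}/hbase/lib/hbase-common-*.jar \
   ${HDP}/hbase/lib/hbase-client-*.jar \
   ${HDP}/hbase/lib/hbase-server-*.jar \
   ${HDP}/hbase/lib/hbase-protocol-*.jar \
   ${HDP}/spark/lib/
cp ${HDP}/hive/lib/hive-hbase-handler-*.jar ${HDP}/spark/lib/
```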
- In Ambari, configure $SPARK_HOME/conf/spark-env.sh on the Spark nodes and add the jar packages above to SPARK_CLASSPATH, as shown below.
- The configuration entry is as follows. Note that there must be no spaces or line breaks between the jar paths:
```bash
export SPARK_CLASSPATH=/usr/hdp/2.4.2.0-258/spark/lib/guava-11.0.2.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-client-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-common-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hbase-server-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/hive-hbase-handler-1.2.1000.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/spark/lib/htrace-core-3.1.0-incubating.jar:/usr/hdp/2.4.2.0-258/spark/lib/protobuf-java-2.5.0.jar:${SPARK_CLASSPATH}
```
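As an optional sanity check, the classpath entries can be verified to exist on disk. This is only a sketch and assumes spark-env.sh can be sourced directly in a shell on the Spark node:

```bash
# Print any SPARK_CLASSPATH entry that does not exist on disk.
source /usr/hdp/2.4.2.0-258/spark/conf/spark-env.sh
echo "${SPARK_CLASSPATH}" | tr ':' '\n' | while read -r entry; do
  [ -n "${entry}" ] && [ ! -e "${entry}" ] && echo "MISSING: ${entry}"
done
```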
- Copy hbase-site.xml to ${HADOOP_CONF_DIR}. Since the Hadoop configuration directory ${HADOOP_CONF_DIR} is set in spark-env.sh, hbase-site.xml will be loaded from there. The file mainly needs the following parameters (a copy sketch follows the snippet):
```xml
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>r,hdp2,hdp3</value>
  <description>ZooKeeper nodes used by HBase</description>
</property>
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value>
  <description>HBase client scanner caching, which helps query performance considerably</description>
</property>
```
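A copy along these lines is usually enough to place the file; it assumes the typical HDP locations, i.e. the HBase client configuration under /etc/hbase/conf and ${HADOOP_CONF_DIR} pointing at /etc/hadoop/conf:

```bash
# Assumed default HDP paths; use the HADOOP_CONF_DIR configured in spark-env.sh if it differs.
cp /etc/hbase/conf/hbase-site.xml /etc/hadoop/conf/
```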
- In Ambari, restart the component services affected by the modified configuration.
Test verification:
- Verify on any Spark client node:
- Command: cd /usr/hdp/2.4.2.0-258/spark/bin (the Spark installation directory)
- Command: ./spark-sql
- Execute: select * from stocksinfo; (stocksinfo is a Hive external table mapped to an HBase table; a hypothetical definition sketch follows below)
- If the configuration is correct, the query returns the rows stored in the underlying HBase table.
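For reference, a Hive external table such as stocksinfo is typically created from the Hive side with the Hive HBase storage handler. The sketch below is hypothetical: the column names, the info: column family/qualifiers, and the backing HBase table name are illustrative assumptions, not the actual schema used here.

```bash
# Hypothetical table definition; run with the Hive CLI (or beeline) on a Hive client node.
hive <<'EOF'
CREATE EXTERNAL TABLE stocksinfo (
  rowkey STRING,
  name   STRING,
  price  STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name,info:price")
TBLPROPERTIES ("hbase.table.name" = "stocksinfo");
EOF
```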