Integrating Spark2 with Phoenix on Ambari HDP

1. Environment Description

Operating system: CentOS Linux release 7.4.1708 (Core)
Ambari: 2.6.x
HDP: 2.6.3.0
Spark: 2.x
Phoenix: 4.10.0-HBase-1.2

2. Prerequisites

  1. HBase installation is complete.
  2. Phoenix has been enabled, as shown in the Ambari interface.
  3. Spark2 installation is complete.

3. Spark2 and Phoenix Integration

The official Phoenix integration tutorial is at http://phoenix.apache.org/phoenix_spark.html

Steps:

  1. In Ambari, open the Spark2 configuration page.
  2. Find "Custom spark2-defaults" and add the following configuration items:

   spark.driver.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.10.0-HBase-1.2-client.jar
   spark.executor.extraClassPath=/usr/hdp/current/phoenix-client/phoenix-4.10.0-HBase-1.2-client.jar
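With the Phoenix client JAR on the driver and executor classpaths, a Phoenix table can be read from Spark 2 as a DataFrame through the phoenix-spark connector, as described in the tutorial linked above. A minimal sketch; the table name `TABLE1` and ZooKeeper quorum `zk1:2181` are placeholders for your own environment:

```scala
import org.apache.spark.sql.SparkSession

object SparkOnPhoenixExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SparkOnPhoenixExample")
      .getOrCreate()

    // Load a Phoenix table as a DataFrame via the phoenix-spark data source.
    // "TABLE1" and "zk1:2181" are placeholders -- substitute your table name
    // and the ZooKeeper quorum of your HBase cluster.
    val df = spark.read
      .format("org.apache.phoenix.spark")
      .option("table", "TABLE1")
      .option("zkUrl", "zk1:2181")
      .load()

    df.show()
  }
}
```

This runs against a live HBase/Phoenix cluster, so it is submitted with spark-submit rather than executed standalone.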


4. Yarn HA Problem

If Yarn HA is configured, you also need to modify the Yarn HA failover-proxy setting; otherwise, tasks submitted with spark-submit will fail with the following error:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.getProxyInternal()Ljava/lang/Object; from class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider
        at org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider.init(RequestHedgingRMFailoverProxyProvider.java:75)
        at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:163)
        at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
        at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:922)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:914)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:914)
        at cn.spark.sxt.SparkOnPhoenix$.main(SparkOnPhoenix.scala:13)
        at cn.spark.sxt.SparkOnPhoenix.main(SparkOnPhoenix.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.i

Modify the Yarn HA configuration by changing the original setting:

yarn.client.failover-proxy-provider=org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider

to the new setting:

yarn.client.failover-proxy-provider=org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider

If Yarn HA is not configured, this step is not required.
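In Ambari this property is edited under the yarn-site configuration; expressed as a raw yarn-site.xml fragment, the final state of the change above would look like this (a sketch, shown for reference only, since Ambari manages the file for you):

```xml
<!-- yarn-site.xml: use the configured (non-hedging) RM failover proxy
     so that Spark's YARN client can access it without the
     IllegalAccessError shown above. -->
<property>
  <name>yarn.client.failover-proxy-provider</name>
  <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
</property>
```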

---


Origin blog.csdn.net/u011026329/article/details/104414845