Problems encountered when creating ES external tables in Hive

1. Missing jar package: commons-httpclient

Error reported:

"HiveServer2-Handler-Pool: Thread-696" java.lang.NoClassDefFoundError: org/apache/commons/httpclient/protocol/ProtocolSocketFactory

Fix: load commons-httpclient-3.1.jar into Hive.
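
One way to load the jar for the current session is Hive's ADD JAR command. This is only a minimal sketch: the directory /opt/hive/auxlib is a hypothetical location, so substitute wherever the jar actually lives on your cluster.

```sql
-- Assumption: the jar has been copied to this (hypothetical) path
-- and is readable by HiveServer2.
ADD JAR /opt/hive/auxlib/commons-httpclient-3.1.jar;
```

ADD JAR only affects the current session. To make the jar available permanently, it is usually placed on Hive's auxiliary classpath (for example via hive.aux.jars.path), followed by a HiveServer2 restart.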

2. Missing jar package: elasticsearch-hadoop

Error reported:

FAILED: SemanticException Cannot find class 'org.elasticsearch.hadoop.hive.EsStorageHandler'

Fix: load the elasticsearch-hadoop jar whose version matches the ES version in use. In this case that is elasticsearch-hadoop-7.6.1.jar.
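
As a rough sketch, the connector jar can be registered for the session and then checked with LIST JARS. The path below is hypothetical; the version number is the one used in this post and should match your own ES cluster.

```sql
-- Assumption: hypothetical path; adjust to where the jar is stored.
ADD JAR /opt/hive/auxlib/elasticsearch-hadoop-7.6.1.jar;

-- Show the jars registered in the current session to confirm it was added.
LIST JARS;
```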

3. After the ES table is created in Hive, queries against it fail

Error reported:

Error: java.io.IOException: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Expected to find keystore file at [hdfs:///path/to/esh.keystore] but was unable to. Make sure that it is available on the classpath, or if not, that you have specified a valid file URI. (state=,code=0)

In this setup the keystore is stored on HDFS.

The following property needs to be specified in the table creation statement: 'es.nodes.wan.only' = 'true'
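
For illustration, a table creation statement with this property set might look like the sketch below. The table name, columns, index name, and host are hypothetical placeholders. The storage handler class, the keystore path, and es.nodes.wan.only are taken from the messages above, and es.keystore.location is assumed to be the setting that pointed at the HDFS keystore.

```sql
-- Hypothetical table, columns, index, and host; replace with your own.
CREATE EXTERNAL TABLE es_demo (
  id   STRING,
  name STRING
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.resource'          = 'demo_index',                    -- ES index to map (assumed name)
  'es.nodes'             = 'es.example.com',                -- ES entry point (assumed host)
  'es.port'              = '9200',
  'es.keystore.location' = 'hdfs:///path/to/esh.keystore',  -- keystore path from the error above
  'es.nodes.wan.only'    = 'true'                           -- connect only through es.nodes
);
```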

Explanation:

For details on the "es.nodes.wan.only" setting, see https://www.elastic.co/guide/en/elasticsearch/hadoop/master/configuration.html

This setting is meant for accessing an ES instance in the cloud or in a restricted network, such as AWS, over the public network (WAN). When it is enabled, the connector disables discovery of other nodes, and all subsequent operations, including reads and writes, go only through the nodes declared in es.nodes. Adding this property therefore makes it possible to reach ES in the cloud or in a restricted network. However, because all reads and writes pass through the declared nodes, performance is significantly affected.

Origin blog.csdn.net/qq_44696532/article/details/134708040