Configuring a Hadoop cluster connection in Kettle

1. Configure the Hadoop version that Kettle uses

Modify data-integration\plugins\pentaho-big-data-plugin\plugin.properties

active.hadoop.configuration=hdp23

The supported Hadoop versions (shims) are the subdirectories of data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations
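A quick way to check which shims are available and which one is active, assuming Kettle is installed on a Windows machine and the commands are run from the data-integration directory (paths follow the ones above):

   dir plugins\pentaho-big-data-plugin\hadoop-configurations
   findstr active.hadoop.configuration plugins\pentaho-big-data-plugin\plugin.properties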

 

2. Select the Hadoop distribution in the Kettle interface

Tools>Hadoop Distribution>


 

3. Fill in the connection details on the Hadoop cluster configuration page (the hostnames and ports can be taken from the Ambari management interface), then click "Test" to check the result
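If you are unsure of the HDFS hostname and port, they can also be read from core-site.xml on a cluster node; the path below is the usual HDP client-config location, and the host/port shown is only a placeholder:

   grep -A1 fs.defaultFS /etc/hadoop/conf/core-site.xml
   # e.g. <name>fs.defaultFS</name>
   #      <value>hdfs://namenode.example.com:8020</value>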



 
Here are some problems you may run into:

1) The "shim configuration verification" check shows a red X

Solution:

Replace the Hadoop configuration XML files under data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\hdp23 with the corresponding files from the Hadoop cluster, mainly core-site.xml, hbase-site.xml, mapred-site.xml and yarn-site.xml.
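A sketch of pulling the files over with scp, assuming the cluster keeps its client configuration under /etc/hadoop/conf and /etc/hbase/conf and that namenode.example.com is a reachable cluster node (hostname and paths are placeholders; Ambari's "Download Client Configs" action produces the same files):

   scp namenode.example.com:/etc/hadoop/conf/core-site.xml plugins\pentaho-big-data-plugin\hadoop-configurations\hdp23\
   scp namenode.example.com:/etc/hadoop/conf/mapred-site.xml plugins\pentaho-big-data-plugin\hadoop-configurations\hdp23\
   scp namenode.example.com:/etc/hadoop/conf/yarn-site.xml plugins\pentaho-big-data-plugin\hadoop-configurations\hdp23\
   scp namenode.example.com:/etc/hbase/conf/hbase-site.xml plugins\pentaho-big-data-plugin\hadoop-configurations\hdp23\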

 

2) The "user home directory access" and "verify user home permission" checks show red X's

1. (Abandoned approach) There are few solutions to this problem online. The HDFS user is the user who started the process, and Ambari uses the hdfs user by default, so the suggestion was to copy Kettle to the hdfs user's directory on the Hadoop cluster.

I was configuring Kettle on my office computer at the time, and the test kept failing: because the office computer's user is not hdfs, Kettle always connects to the Hadoop cluster as the local user.

 

2. Create a home directory for the office computer's user in the HDFS file system:

   hadoop fs -mkdir /user/username
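If the directory is created as another user (for example hdfs), it may also need to be handed over to your local account; username here is just a placeholder for the office computer's user name:

   hadoop fs -mkdir -p /user/username
   hadoop fs -chown username /user/username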

 

4. After the test passes



 

 

 
