Connecting Kettle to Hive
1) First, copy the jars from Hive's lib directory into the Kettle directory D:\software\data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\cdh510\lib (adjust the path to your own installation).
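The copy in step 1 can also be scripted. A minimal sketch using only the JDK, which copies the top-level *.jar files from one directory to another (the source path `/opt/hive/lib` is an assumption; the destination mirrors the Kettle directory named above — adjust both to your installation):

```java
import java.io.IOException;
import java.nio.file.*;

public class CopyHiveJars {
    // Copies every top-level *.jar from srcDir into dstDir; returns the count.
    static int copyJars(Path srcDir, Path dstDir) throws IOException {
        Files.createDirectories(dstDir);
        int copied = 0;
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(srcDir, "*.jar")) {
            for (Path jar : jars) {
                Files.copy(jar, dstDir.resolve(jar.getFileName()),
                           StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical Hive lib location; the destination is the Kettle dir from step 1.
        copyJars(Paths.get("/opt/hive/lib"),
                 Paths.get("D:\\software\\data-integration\\plugins\\pentaho-big-data-plugin"
                         + "\\hadoop-configurations\\cdh510\\lib"));
    }
}
```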
2) Go to the D:\software\data-integration\plugins\pentaho-big-data-plugin\hadoop-configurations\cdh510 directory.
Add the following to core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://bigData01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/modules/hadoop-2.6.0-cdh5.13.0/data</value>
</property>
</configuration>
Add the following to hive-site.xml:
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://IP_ADDRESS:9083</value>
</property>
<property>
<name>hive.exec.reducers.bytes.per.reducer</name>
<value>1073741824</value>
</property>
<property>
<name>hive.support.concurrency</name>
<value>true</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.min.worker.threads</name>
<value>5</value>
</property>
<property>
<name>hive.server2.thrift.max.worker.threads</name>
<value>500</value>
</property>
<property>
<name>hive.jdbc_passwd.auth.zjl</name>
<value>123456</value>
<description/>
</property>
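The hive-site.xml values above determine the endpoint Kettle's Hive database connection will use: HiveServer2 listens on `hive.server2.thrift.port` (10000). A minimal sketch of how the JDBC URL is assembled from those values (the `bigData01` host comes from core-site.xml, `default` is an assumed database name, and the commented-out credentials echo the `hive.jdbc_passwd.auth.zjl` property above — adjust all of these to your cluster):

```java
public class HiveUrl {
    // Builds a HiveServer2 JDBC URL from the host/port configured in hive-site.xml.
    static String hiveJdbcUrl(String host, int port, String db) {
        return String.format("jdbc:hive2://%s:%d/%s", host, port, db);
    }

    public static void main(String[] args) {
        // bigData01 from core-site.xml; 10000 from hive.server2.thrift.port.
        String url = hiveJdbcUrl("bigData01", 10000, "default");
        System.out.println(url); // prints "jdbc:hive2://bigData01:10000/default"
        // With the Hive JDBC driver on the classpath, Kettle (or any client) connects with:
        // DriverManager.getConnection(url, "zjl", "123456"); // per hive.jdbc_passwd.auth.zjl
    }
}
```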
Add the following to mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Add the following to yarn-site.xml:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
3) Open the plugin.properties file in the D:\software\data-integration\plugins\pentaho-big-data-plugin directory and activate the cdh510 configuration by adding:
active.hadoop.configuration=cdh510
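Once the cdh510 configuration is active, a quick sanity check before defining the connection in Kettle is to verify that the metastore port (9083) and the HiveServer2 port (10000) are reachable. A minimal sketch using only the JDK (the `bigData01` hostname is taken from core-site.xml above; swap in your own hosts and ports):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean reachable(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Ports from the hive-site.xml configuration above.
        System.out.println("metastore 9083:    " + reachable("bigData01", 9083, 3000));
        System.out.println("hiveserver2 10000: " + reachable("bigData01", 10000, 3000));
    }
}
```

If either check prints false, fix the network or service problem before troubleshooting the Kettle connection itself.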