1. KNIME Analytics Platform installation
Download the appropriate version from the official website: https://www.knime.com/downloads
Unzip the downloaded package into the installation path; see https://www.knime.com/installation-0 for details.
The following picture shows the welcome page after KNIME starts.
To interact with Spark, the KNIME® Extension for Apache Spark must be installed in KNIME Analytics Platform, and Spark Job Server must be installed on a Hadoop cluster edge node (or any node capable of executing spark-submit). The architecture diagram is as follows:
2. KNIME® Extension for Apache Spark installation
In KNIME Analytics Platform, click File -> Install KNIME Extensions..., select KNIME Big Data Extensions, and click Next to install.
3. Spark Job Server installation
The following steps use CentOS 6.5 + CDH 5.7 as an example.
3.1 Download Spark Job Server
$ wget http://download.knime.org/store/3.5/spark-job-server-0.6.2.3-KNIME_cdh-5.7.tar.gz
3.2 Log in as root (or switch with su root)
3.3 Installation
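The installation essentially amounts to unpacking the downloaded archive on the edge node. A minimal sketch, assuming the tarball from step 3.1 is in the current directory and /opt is used as the install prefix (both are assumptions; adjust to your environment):

```shell
# Assumption: the tarball from step 3.1 is in the current directory
# and /opt is the installation prefix.
tar xzf spark-job-server-0.6.2.3-KNIME_cdh-5.7.tar.gz -C /opt

# Optional: a stable symlink so later commands need not hard-code the version
ln -s /opt/spark-job-server-0.6.2.3-KNIME_cdh-5.7 /opt/spark-job-server
cd /opt/spark-job-server
```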
3.4 Startup
3.5 Edit environment.conf
set master, e.g.
master = "spark://ifrebdplatform1:7077"
Set the default settings for Spark contexts in the context-settings section.
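Putting step 3.5 together, the relevant portion of environment.conf might look like the following; the master URL matches this example's standalone cluster, and the context-settings values shown are illustrative defaults, not required settings:

```
spark {
  # Spark master URL from this example's standalone cluster
  master = "spark://ifrebdplatform1:7077"

  # Default settings applied to every newly created Spark context
  context-settings {
    num-cpu-cores = 2          # cores per context (illustrative value)
    memory-per-node = "512m"   # executor memory per node (illustrative value)
  }
}
```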
3.6 Edit settings.sh
Set SPARK_HOME; the default is correct in this example, so do not change it.
Set LOG_DIR if you do not want to use the default log directory.
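The two variables above are plain shell assignments in settings.sh. A sketch with illustrative paths (the SPARK_HOME shown is the usual CDH parcel layout; both values are assumptions to adapt to your cluster):

```
# Path to the Spark installation (CDH parcel path shown as an illustration)
SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark

# Override only if you do not want the default log directory
LOG_DIR=/var/log/spark-job-server
```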
3.7 Edit log4j-server.properties as needed
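If you do adjust logging, log4j-server.properties uses standard log4j syntax. For example, to reduce verbosity (illustrative; the appender names available depend on the shipped file):

```
# Illustrative: raise the root level to WARN to reduce log volume
log4j.rootLogger=WARN, console
```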
3.8 Start Spark Job Server
/etc/init.d/${LINKNAME} start
(where ${LINKNAME} is the service link name chosen during installation)
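After starting, you can verify the server is up by querying its REST API. A quick check, assuming Job Server listens on its default port 8090 (an assumption; the port is set in the shipped configuration):

```shell
# Assumption: Spark Job Server listens on its default port 8090
curl http://localhost:8090/contexts   # lists currently running Spark contexts
```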
3.9 In KNIME, add a Create Spark Context node to test the connection.
Right-click the Create Spark Context node and click Execute.
Right-click the Create Spark Context node and click Spark Context to view the results.
To be continued...