1. Cluster node planning
rzx1 all
rzx2 query
rzx3 query
Description:
A Kylin node can take one of three roles:
all: acts as both query and job node
query: query node, serves SQL queries
job: job node, builds cubes
2. Kylin depends heavily on the other big-data components; the following lists the components required before installing Kylin
JDK 1.8 <required>
HADOOP <required; HDFS is the underlying data storage; version here is hadoop-2.7.7>
ZOOKEEPER <required; cluster coordination; version here is zookeeper-3.4.13>
HBASE <required; can be understood as the data middleware that stores cube data; version here is hbase-2.0.4>
HIVE <required; the base data warehouse for Kylin OLAP, i.e. the OLAP data source; version here is hive-2.3.4>
KAFKA <optional; not installed here>
3. Unzip the downloaded package into the installation directory
<Download: https://archive.apache.org/dist/kylin/>
On the rzx1 node:
vim conf/kylin.properties:
kylin.server.mode=all
kylin.server.cluster-servers=rzx1:7070,rzx2:7070,rzx3:7070
kylin.coprocessor.local.jar=/home/bigdata/software/kylin-2.6.2/lib/kylin-coprocessor-2.6.2.jar
Description: this is only a minimal installation for a development/test environment; kylin.properties has a very large number of parameters, and a production environment should be configured according to its actual situation.
4. scp the Kylin directory configured on rzx1 above to the rzx2 and rzx3 nodes
From the parent directory of the Kylin directory:
scp -r kylin-2.6.2 root@rzx2:/home/bigdata/software/
scp -r kylin-2.6.2 root@rzx3:/home/bigdata/software/
Then, in the Kylin directory on the rzx2 and rzx3 nodes, edit conf/kylin.properties and change kylin.server.mode to query:
kylin.server.mode=query
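The role change on the query nodes can also be scripted instead of edited by hand. A minimal sketch, demonstrated on a local copy of the file so the edit is verifiable; on rzx2/rzx3 the real file lives under the conf/ directory of the path used in this guide:

```shell
# Switch a node's kylin.server.mode from "all" to "query".
# Demonstrated on a local copy; on a query node the file would be
# /home/bigdata/software/kylin-2.6.2/conf/kylin.properties (path from this guide).
PROPS=./kylin.properties
cat > "$PROPS" <<'EOF'
kylin.server.mode=all
kylin.server.cluster-servers=rzx1:7070,rzx2:7070,rzx3:7070
EOF
# Rewrite the role line in place
sed -i 's/^kylin\.server\.mode=.*/kylin.server.mode=query/' "$PROPS"
grep '^kylin\.server\.mode' "$PROPS"
```

The same one-line `sed` could be run over ssh against each query node after the scp step.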
5. Configure the Kylin environment variables
Prerequisite: the environment variables of the components Kylin depends on are already configured.
export KYLIN_HOME=/home/bigdata/software/kylin-2.6.2
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$ZK_HOME/bin:$KAFKA_HOME/bin:$HBASE_HOME/bin:$HCAT_HOME/bin:$KYLIN_HOME/bin:$PATH
For convenience, here is my complete environment variable configuration, covering all of the components Kylin depends on:
export JAVA_HOME=/home/bigdata/software/jdk1.8.0_201
export HADOOP_HOME=/home/bigdata/software/hadoop-2.7.7
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HIVE_HOME=/home/bigdata/software/hive-2.3.4
export HIVE_CONF_DIR=/home/bigdata/software/hive-2.3.4/conf
export HCAT_HOME=$HIVE_HOME/hcatalog
export ZK_HOME=/home/bigdata/software/zookeeper-3.4.13
export KAFKA_HOME=/home/bigdata/software/kafka_2.11-2.0.0
export HBASE_HOME=/home/bigdata/software/hbase-2.0.4
export KYLIN_HOME=/home/bigdata/software/kylin-2.6.2
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib:${HADOOP_HOME}/lib/native"
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$ZK_HOME/bin:$KAFKA_HOME/bin:$HBASE_HOME/bin:$HCAT_HOME/bin:$KYLIN_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
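With this many *_HOME variables, a quick sanity check that each one points at a real directory saves time before the first start. A small sketch; the helper and the directory names in the demo calls are illustrative, not part of Kylin:

```shell
# Report whether a *_HOME variable points at an existing directory.
check_home() {
  # $1 = variable name (used only in the message), $2 = directory it should point to
  if [ -d "$2" ]; then
    echo "OK: $1=$2"
  else
    echo "MISSING: $1=$2"
  fi
}

# Illustrative calls; on the cluster you would pass the real values, e.g.
#   check_home KYLIN_HOME "$KYLIN_HOME"
check_home DEMO_HOME /tmp
check_home KYLIN_HOME /no/such/dir
```

Looping the helper over JAVA_HOME, HADOOP_HOME, HIVE_HOME, ZK_HOME, HBASE_HOME and KYLIN_HOME catches most typos before check-env.sh does.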
6. After the above is configured correctly, check Kylin's dependencies before starting it
Prerequisite: make sure hadoop, zookeeper, hbase and hive all start normally beforehand.
Run the following commands in Kylin's bin directory, in order:
# running the check command below creates Kylin's working directory on HDFS
./check-env.sh
# check the data source (hive) and the data storage (hbase)
./find-hive-dependency.sh
./find-hbase-dependency.sh
Note: if the environment variables are not configured correctly or the dependent components are not running, these checks will fail; they let you troubleshoot the dependencies one by one.
7. Start Kylin on every node of the cluster
kylin.sh start
Description: after the start command is executed, Kylin automatically detects the related components it depends on and prints a hint for each one.
Note in particular: if Spark is not installed, the dependency check will report that the Spark execution engine is missing, and the startup script will prompt you to download it. If you do not otherwise use Spark as a big-data compute engine, you only need to run the download script Kylin provides in its bin directory:
bin/download-spark.sh
Running it in advance avoids the prompt at startup.
8. Verify
When the startup in step 7 finishes, each node prints a success message.
That proves the node started correctly; note that all three nodes must print it before the cluster start counts as fully successful, otherwise the missing node's query or job capability will be unavailable.
Then confirm further through the web UI, which by default is at http://<node>:7070/kylin (default account ADMIN, password KYLIN).
Note: the sample tables will not appear immediately after a correct start, because they belong to the data model; after a fresh successful start, Models, Data Source and Cubes are all empty.
9. Add the sample data and model
Kylin is very considerate: knowing that first-time users will be at a loss, it provides a script in its bin directory that creates an example covering Kylin's three core concepts (Models, Data Source, Cubes):
bin/sample.sh
With Kylin started properly, run bin/sample.sh. The process takes some time; when you see the success message, the sample has been created correctly on the Kylin instance.
The script then prompts that Kylin must be restarted for the sample to take effect, so restart Kylin.
Note: Kylin does not provide a restart command, i.e. there is no
kylin.sh restart
so you can only run
kylin.sh stop
first and then
kylin.sh start
again, and this has to be done on every node.
After performing the above operations, check the web UI again:
At this point the Kylin cluster deployment, configuration, startup and sample loading have all succeeded on every instance.
10. Verify that the sample data tables created by the Kylin instance also exist in Hive
Note: this is a simple cluster configuration built for development and testing; a production environment with very large data volumes may need a considerably more complex configuration.
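The Hive-side check in this step can be done from the shell. A sketch that wraps the query in a small function; the hive CLI is injected as a parameter so the sketch can be exercised without a cluster, and the kylin_* table-name pattern is what sample.sh loads in Kylin 2.6.x (an assumption worth verifying against your version):

```shell
# List the sample tables that sample.sh loaded into Hive.
# $1 = the hive CLI to invoke ("hive" on a cluster node).
list_sample_tables() {
  "$1" -e "show tables like 'kylin_*';"
}

# On a cluster node you would run:
#   list_sample_tables hive
# and expect tables such as kylin_sales and kylin_cal_dt in the output.
```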