Aside: sorting methods, the topological sorting algorithm
A: Installing Storm on the virtual machines
Storm configuration
1: Download and unzip the installation package
2: Modify the configuration file
(Storm's configuration file is /root/apps/storm-1.2.2/conf/storm.yaml)
storm.zookeeper.servers:
- "hdp-1"
- "hdp-2"
- "hdp-3"
storm.local.dir: "/root/stormdata"
nimbus.seeds: ["hdp-1"]
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
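Each port listed under supervisor.slots.ports gives that supervisor one worker slot, so the four ports above allow up to four workers per machine. As a quick sanity check, the sketch below counts the configured ports in a storm.yaml. The function name count_slots is hypothetical, and the parsing assumes the simple one-port-per-line layout shown above rather than general YAML.

```shell
#!/bin/bash
# count_slots FILE: print how many ports are listed under supervisor.slots.ports.
# Assumes the simple block-list layout shown above (one "- PORT" per line);
# this is not a general YAML parser.
count_slots() {
  awk '
    /^supervisor\.slots\.ports:/ { in_list = 1; next }      # start of the port list
    in_list && /^[ \t]*-[ \t]*[0-9]+[ \t]*$/ { n++; next }  # one "- PORT" entry
    in_list && /^[ \t]*(#.*)?$/ { next }                    # skip blank and comment lines
    in_list { in_list = 0 }                                 # any other line ends the list
    END { print n + 0 }
  ' "$1"
}
```

It could be called as count_slots /root/apps/storm-1.2.2/conf/storm.yaml after editing the file.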
Configure the environment variables: edit /etc/profile, define STORM_HOME, and add Storm's bin directory to PATH
export STORM_HOME=/root/apps/storm-1.2.2
export PATH=$PATH:$STORM_HOME/bin
After configuring the environment variables, remember to reload the file so the changes take effect. The command is: source /etc/profile
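To confirm that the variables took effect in the current shell, a small check like the following may help. The function name storm_env_ok is hypothetical; it only inspects STORM_HOME and PATH and does not run Storm itself.

```shell
#!/bin/bash
# storm_env_ok: succeed if STORM_HOME is set and $STORM_HOME/bin is on PATH.
storm_env_ok() {
  if [ -z "${STORM_HOME:-}" ]; then
    echo "STORM_HOME is not set" >&2
    return 1
  fi
  # Wrap PATH in colons so the first and last entries match like any other
  case ":$PATH:" in
    *":$STORM_HOME/bin:"*) return 0 ;;
    *) echo "$STORM_HOME/bin is not on PATH" >&2; return 1 ;;
  esac
}
```

Running storm_env_ok && echo "environment looks fine" after source /etc/profile gives a quick yes/no answer.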
3: Distribute the configuration to the other machines (scripts are used here to send the environment variables to the other machines and to create the stormdata directory, which stores the data Storm generates)
Script that creates stormdata
Put the following code in a script named test, then make it executable with chmod +x test (or chmod 777 test)
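#!/bin/bash
# Create the Storm data directory on each of the other machines
for host in hdp-2 hdp-3 hdp-4
do
    # -p avoids an error if the directory already exists
    ssh "${host}" "mkdir -p /root/stormdata"
done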
Script that sends the profile
#!/bin/bash
# Copy /etc/profile to each of the other machines
for host in hdp-2 hdp-3 hdp-4
do
    scp /etc/profile "${host}:/etc"
done
Copy the configured Storm directory to the other machines. For example, to copy it from hdp-1 into the apps directory on hdp-4:
scp -r /root/apps/storm-1.2.2 hdp-4:/root/apps
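The three distribution steps above can also be combined into one loop. The following is only a sketch under assumptions: the host list and paths are taken from the scripts and storm.yaml location above, and the run and distribute function names and the DRY_RUN switch are hypothetical helpers added so the commands can be previewed before anything is copied.

```shell
#!/bin/bash
# Hosts and paths taken from the steps above; adjust them to your cluster.
HOSTS="hdp-2 hdp-3 hdp-4"
STORM_DIR=/root/apps/storm-1.2.2

# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# distribute: for each host, create the data directory, then copy the
# profile and the Storm directory. Stops at the first failing command.
distribute() {
  local host
  for host in $HOSTS; do
    run ssh "$host" "mkdir -p /root/stormdata" || return 1
    run scp /etc/profile "$host:/etc" || return 1
    run scp -r "$STORM_DIR" "$host:/root/apps" || return 1
  done
}
```

Calling DRY_RUN=1 distribute prints the ssh and scp commands without running them; calling distribute on its own performs the copy.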
Running the Storm cluster (note that Storm, like Kafka, depends on ZooKeeper, so start ZooKeeper before starting Storm)
## Start nimbus
storm nimbus >/dev/null 2>&1 &
## Start supervisor
storm supervisor >/dev/null 2>&1 &
## Start core (the UI)
storm ui &
## Sign of success: check with jps
[root@hdp-1 bin]# jps
1760 core
1462 QuorumPeerMain
1656 Supervisor
1867 Jps
1500 nimbus
## Note: jps shows core rather than the ui named in the start command because this older version names the UI process core
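To avoid reading the jps listing by eye, a small filter can report which of the expected processes are absent. This is a sketch: the function name missing_daemons is hypothetical, and the process names are simply the ones shown in the listing above for hdp-1, so they may differ on other nodes or Storm versions.

```shell
#!/bin/bash
# missing_daemons: read jps output on stdin and print each expected
# process name that does not appear in it.
missing_daemons() {
  local jps_out name
  jps_out=$(cat)
  for name in QuorumPeerMain nimbus Supervisor core; do
    # jps prints "PID NAME"; compare the second column exactly
    if ! printf '%s\n' "$jps_out" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
    fi
  done
}
```

Used as jps | missing_daemons, it prints nothing when all four processes are running and otherwise lists the ones that are missing.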
Open hdp-1:8080 in a browser to see the Storm UI page
Some concepts and components in Storm
Nimbus: Storm's master, responsible for resource allocation and task scheduling. In this setup the cluster has a single Nimbus.
Supervisor: Storm's slave. It receives the tasks that Nimbus assigns and manages all the Workers (worker processes) on its node. One Supervisor node can contain multiple worker processes.
Worker: a worker process. Each worker process runs multiple Tasks.
Task: each Spout and Bolt is executed as a number of tasks in the Storm cluster. Each task corresponds to one thread of execution.
Parallelism: the degree of parallelism of a topology.
Topology: the computation topology (a topology can be understood as an algorithm laid out in advance).
A Storm topology packages the logic of a real-time computation application. Its role is similar to that of a MapReduce job, except that a MapReduce job always finishes once it has produced its results, whereas a topology keeps running in the cluster until you terminate it manually. A topology can also be understood as a set of Spouts and Bolts connected to one another by data streams (stream groupings).
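Because a topology runs until it is stopped, its life cycle on the command line is usually submit, list, then kill. The sketch below wraps the storm jar, storm list and storm kill subcommands. The jar path, class name and topology name are hypothetical placeholders, passing the topology name as the first argument is an assumption about how the main class is written, and the DRY_RUN switch is a helper added here for previewing the commands.

```shell
#!/bin/bash
# Hypothetical placeholders: replace with your own jar, main class and name.
TOPOLOGY_JAR=/root/demo-topology.jar
TOPOLOGY_CLASS=com.example.DemoTopology
TOPOLOGY_NAME=demo-topology

# storm_cmd ARGS...: run the storm CLI, or only print the command when DRY_RUN=1.
storm_cmd() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "storm $*"
  else
    storm "$@"
  fi
}

# Submit the topology; it then keeps running in the cluster.
submit_topology() { storm_cmd jar "$TOPOLOGY_JAR" "$TOPOLOGY_CLASS" "$TOPOLOGY_NAME"; }

# Show the topologies currently running.
list_topologies() { storm_cmd list; }

# Stop the topology manually, waiting 10 seconds before it is shut down.
kill_topology() { storm_cmd kill "$TOPOLOGY_NAME" -w 10; }
```

With DRY_RUN=1 set, each function only prints the storm command it would run, which is a safe way to check the arguments first.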
Unfinished