Table of Contents
1. Configuring Storm in a Linux Environment
Host name | tuge1 | tuge2 | tuge3 |
---|---|---|---|
Deployed services | Zookeeper/Nimbus | Zookeeper/Supervisor | Zookeeper/Supervisor |
(Deployment overview)
1.1 Configuring the Zookeeper Environment (all three machines need this; you can configure one and then distribute the result to the others)
Download apache-zookeeper-3.5.5-bin.tar.gz from the official website, then upload it to the /opt/zookeeper directory on Linux. (Create the directory if it does not exist.)
Extract the archive:
tar -xvf apache-zookeeper-3.5.5-bin.tar.gz
Configure the environment variables:
vim /etc/profile
export ZK_HOME=/opt/zookeeper/apache-zookeeper-3.5.5-bin
export PATH=$ZK_HOME/bin:$PATH
Configuring Zookeeper automatic log cleanup
Regular cleanup is enabled through two parameters in zoo.cfg: autopurge.purgeInterval and autopurge.snapRetainCount. Uncomment both and adjust the retention to your needs.
autopurge.purgeInterval specifies the cleanup frequency in hours. It must be an integer of 1 or greater; the default is 0, which means automatic cleanup is disabled.
autopurge.snapRetainCount, used together with the parameter above, specifies the number of snapshot files to retain. The default is 3.
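Putting the two settings together, the relevant zoo.cfg lines would look something like this (the values here are illustrative; tune them to your retention needs):

```properties
# keep the 3 most recent snapshots and their transaction logs
autopurge.snapRetainCount=3
# run the purge task every 1 hour (0 disables automatic cleanup)
autopurge.purgeInterval=1
```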
1.2 Configuring the Java Environment (required on all three machines)
Download jdk-8u221-linux-x64.tar.gz from the official website, then upload it to the /opt/java directory. (Create the directory if it does not exist; any Java 8+ version from the official site will do.)
Extract the archive:
tar -xvf jdk-8u221-linux-x64.tar.gz
Configure the environment variables:
vim /etc/profile
export JAVA_HOME=/opt/java/jdk1.8.0_221
export PATH=$JAVA_HOME/bin:$ZK_HOME/bin:$PATH
1.3 Configuring the Storm Environment (required on all three machines)
Configure the global environment
- Download apache-storm-2.1.0.tar.gz from the official website, then upload it to the /opt/storm directory. (Create the directory if it does not exist.)
- Extract the archive:
tar -xvf apache-storm-2.1.0.tar.gz
- Configure the environment variables:
vim /etc/profile
export STORM_HOME=/opt/storm/apache-storm-2.1.0
export PATH=$STORM_HOME/bin:$JAVA_HOME/bin:$ZK_HOME/bin:$PATH
- Reload the configuration file
source /etc/profile
Editing the storm.yaml configuration file
vim /opt/storm/apache-storm-2.1.0/conf/storm.yaml
- Zookeeper server configuration. Change the commented-out default:
# storm.zookeeper.servers:
#     - "server1"
#     - "server2"
to:
storm.zookeeper.servers:
    - "tuge1"
    - "tuge2"
    - "tuge3"
- Local state directory configuration:
Create a directory for Storm's local state and grant the needed permissions:
mkdir -p /opt/storm/apache-storm-2.1.0/status
storm.local.dir: "/opt/storm/apache-storm-2.1.0/status"
- Configure the master node (Nimbus) address:
nimbus.seeds: ["tuge1"]
- Configure the number of Workers (in a real production environment this is sized to the actual workload; here I follow the official website and configure four first).
Each port you add allows one more Worker to be assigned. Add the following to storm.yaml:
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
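For reference, the combined storm.yaml produced by the steps above would look roughly like this (the host names and local dir match this cluster; adjust them for yours):

```yaml
storm.zookeeper.servers:
  - "tuge1"
  - "tuge2"
  - "tuge3"
nimbus.seeds: ["tuge1"]
storm.local.dir: "/opt/storm/apache-storm-2.1.0/status"
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703
```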
1.4 Starting Storm
First start Zookeeper on all three machines; for the detailed startup steps, refer to the earlier blog post.
Start Nimbus and the UI; execute on tuge1:
./storm nimbus >./logs/nimbus.out 2>&1 &
./storm ui >>./logs/ui.out 2>&1 &
Start the Supervisors; run on tuge2 and tuge3:
./storm supervisor >>./logs/supervisor.out 2>&1 &
PS: >/dev/null 2>&1 means standard error is redirected to standard output, which is in turn sent to /dev/null (in the commands above the output goes to log files instead); the final & runs the process in the background.
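To see the redirection behavior in isolation, here is a small sketch using a temporary file (the path is arbitrary):

```shell
# stdout goes to the file; 2>&1 then routes stderr to the same place
( echo "to stdout"; echo "to stderr" 1>&2 ) > /tmp/redirect_demo.out 2>&1
cat /tmp/redirect_demo.out   # both lines appear in the file
```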
Access the UI page: http://tuge1:8080/
(PS: If anything is abnormal, check whether Zookeeper has started; also use jps to check whether nimbus and supervisor are running.)
If you get the exception: Could not find leader nimbus from seed hosts [tuge1]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
Go to the Zookeeper bin directory and run zkCli.sh to enter the Zookeeper console, then delete the /storm node.
Note: delete only removes nodes that have no children; to delete a node that contains child nodes, use the rmr command.
Then restart the Zookeeper nodes:
bin/zkServer.sh restart
If everything is fine, you should see the following interface ~
2. Running Storm Locally
Here is a small case that appends content to words:
Create a Maven project, then add the following class structure:
The code is as follows (following the structure shown above):
App.java (entry class):
package Demo.Storm;
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.LocalCluster.LocalTopology;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.thrift.TException;
import org.apache.storm.topology.TopologyBuilder;
/**
* Hello world!
*
*/
public class App {
public static void main(String[] args) {
try {
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("words", new TestWordSpout(), 6); // 6 spout executors run in parallel
builder.setBolt("exclaim1", new ExclamationBolt1(), 2).shuffleGrouping("words"); // 2 bolt executors run in parallel
builder.setBolt("exclaim2", new ExclamationBolt2(), 2).shuffleGrouping("exclaim1"); // 2 bolt executors run in parallel
LocalCluster lc = new LocalCluster(); // run on a local in-process cluster
lc.submitTopology("wordadd", new Config(), builder.createTopology()); // submit the topology
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
ExclamationBolt1.java(Bolt1):
package Demo.Storm;
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
public class ExclamationBolt1 extends BaseRichBolt {
OutputCollector _collector;
public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
// TODO Auto-generated method stub
_collector=collector;
}
public void execute(Tuple input) {
// TODO Auto-generated method stub
String val=input.getStringByField("words")+"!!!";
_collector.emit(input, new Values(val)); // anchor the output to the input tuple
_collector.ack(input); // ack the tuple
}
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// TODO Auto-generated method stub
declarer.declare(new Fields("exclaim1"));
}
}
ExclamationBolt2.java(Bolt2):
package Demo.Storm;
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;
public class ExclamationBolt2 extends BaseRichBolt {
OutputCollector _collector;
public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
// TODO Auto-generated method stub
this._collector=collector;
}
public void execute(Tuple input) {
// TODO Auto-generated method stub
String str= input.getStringByField("exclaim1")+"~~~";
System.err.println(str);
}
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// TODO Auto-generated method stub
}
}
TestWordSpout.java (continuously emits data):
package Demo.Storm;
import java.util.Map;
import java.util.Random;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;
public class TestWordSpout extends BaseRichSpout {
SpoutOutputCollector _collector;
public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
// TODO Auto-generated method stub
_collector=collector;
}
public void nextTuple() {
// TODO Auto-generated method stub
Utils.sleep(100);
final String[] words = new String[] { "你好啊", "YiMing" };
final Random rand = new Random();
final String word = words[rand.nextInt(words.length)]; // randomly pick a string to send
_collector.emit(new Values(word));
}
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// TODO Auto-generated method stub
declarer.declare(new Fields("words"));
}
}
pom.xml (Maven configuration file; if you hit problems, refer to the notes later):
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>Demo</groupId>
<artifactId>Storm</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>Storm</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.storm/storm-client -->
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-client</artifactId>
<version>2.1.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.storm/storm-server -->
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-server</artifactId>
<version>2.1.0</version>
</dependency>
</dependencies>
</project>
The results of the run are as follows:
Problems encountered:
Problem one: the LocalCluster class clearly exists, but it cannot be imported?
Solution: comparing it with jar packages that can be imported, this jar shows up gray in the IDE. Searching for why a jar package displays gray leads to the answer: it is caused by the pom configuration (the dependency's scope), so adjust the dependency in the pom accordingly.
Storm's Eight Grouping Strategies
1) shuffleGrouping (random distribution)
2) fieldsGrouping (grouping by a field, i.e. identical words are sent to the same Bolt task)
3) allGrouping (broadcast: every Bolt task receives a copy of each Tuple)
4) globalGrouping (global grouping: all Tuples go to the task with the lowest task id)
5) noneGrouping (random distribution, currently equivalent to shuffleGrouping)
6) directGrouping (direct grouping: the emitter designates which Bolt task receives the Tuple)
7) localOrShuffleGrouping (prefers tasks in the same worker process; otherwise shuffles)
8) customGrouping (custom grouping)
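The guarantee behind fieldsGrouping is that tuples with the same field value always reach the same task, which is typically achieved by hashing the field. A minimal stdlib-only sketch of that idea (the hash-modulo scheme here is an illustration, not Storm's exact internal partitioner):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldsGroupingSketch {
    // Pick a task index from the grouping field's hash, so identical
    // field values always map to the same task.
    static int taskFor(String fieldValue, int numTasks) {
        return Math.abs(fieldValue.hashCode() % numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 3;
        String[] words = { "hello", "nice", "hello", "to", "hello" };
        Map<Integer, List<String>> perTask = new HashMap<>();
        for (String w : words) {
            perTask.computeIfAbsent(taskFor(w, numTasks), k -> new ArrayList<>()).add(w);
        }
        // Every occurrence of "hello" lands in the same task's list.
        System.out.println(perTask);
    }
}
```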
3. Running Storm on a Linux Cluster
Here is a small word-count case:
Building on the project above, continue by adding the following class files, with this structure:
The code is as follows:
WordCountApp.java (class entry)
package Demo.Storm;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;
public class WordCountApp {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("words", new WordCountSpout(), 8); // 8 spout executors run in parallel
builder.setBolt("wordSplit", new WordCountSplitBolt(), 3).shuffleGrouping("words"); // 3 bolt executors run in parallel
builder.setBolt("wordSum", new WordCountSumBolt(), 3).fieldsGrouping("wordSplit", new Fields("word")); // 3 bolt executors run in parallel
if (args.length > 0) { // with an argument, submit to the cluster
try {
StormSubmitter.submitTopology(args[0], new Config(), builder.createTopology());
} catch (Exception ex) {
ex.printStackTrace();
}
} else { // with no argument, run locally
try {
LocalCluster lc = new LocalCluster();
lc.submitTopology("wordCount", new Config(), builder.createTopology());
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
}
WordCountSpout.java (provides a steady stream of data)
package Demo.Storm;
import java.util.Map;
import java.util.Random;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;
public class WordCountSpout extends BaseRichSpout {
SpoutOutputCollector _collector;
public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
// TODO Auto-generated method stub
this._collector=collector;
}
public void nextTuple() {
// TODO Auto-generated method stub
Utils.sleep(1000);
String[] words=new String[] {
"hello YiMing",
"nice to meet you"
};
Random r=new Random();
_collector.emit(new Values(words[r.nextInt(words.length)])); // randomly emit one of the strings
}
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// TODO Auto-generated method stub
declarer.declare(new Fields("words"));
}
}
WordCountSplitBolt.java (splitting class)
package Demo.Storm;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;
public class WordCountSplitBolt extends BaseRichBolt {
OutputCollector _collector;
public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
// TODO Auto-generated method stub
this._collector = collector;
}
// emit the individual words after splitting the line
public void execute(Tuple input) {
// TODO Auto-generated method stub
String line=input.getString(0);
String[] lineGroup= line.split(" ");
for(String str:lineGroup) {
_collector.emit(input, new Values(str)); // emit each word, anchored to the input tuple
}
_collector.ack(input); // ack the input tuple once, after all words are emitted
}
// declare the emitted field name as "word"; the next bolt fetches values by this name
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// TODO Auto-generated method stub
declarer.declare(new Fields("word"));
}
}
WordCountSumBolt.java (aggregation and counting class)
package Demo.Storm;
import java.util.HashMap;
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;
public class WordCountSumBolt extends BaseRichBolt {
OutputCollector _collector;
Map<String, Integer> map = new HashMap<String, Integer>();
public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
// TODO Auto-generated method stub
this._collector = collector;
}
// aggregate and count
public void execute(Tuple input) {
// TODO Auto-generated method stub
String word = input.getString(0);
if (map.containsKey(word)) {
map.put(word, (map.get(word) + 1));
} else {
map.put(word, 1);
}
System.err.println("word: " + word + " appeared " + map.get(word) + " times");
_collector.ack(input); // ack the anchored tuple
}
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// TODO Auto-generated method stub
}
}
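As an aside, the containsKey/put counting pattern in execute can be written more compactly with the standard Map.merge. A small self-contained sketch of just the counting step:

```java
import java.util.HashMap;
import java.util.Map;

public class WordCountMergeSketch {
    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : new String[] { "hello", "YiMing", "hello" }) {
            // merge stores 1 for a new key, otherwise adds 1 to the existing count
            counts.merge(w, 1, Integer::sum);
        }
        System.out.println(counts.get("hello")); // prints 2
    }
}
```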
When everything is ready, compile and package the project, then upload the jar to Linux.
Go to /opt/storm/apache-storm-2.1.0/bin and execute:
[root@tuge1 bin]# ./storm jar /opt/data/storm/WordCount.jar Demo.Storm.WordCountApp wc
The official example invocation is:
storm jar all-my-code.jar org.apache.storm.MyTopology arg1 arg2
Ending the task:
storm kill wc (wc is the topology name)
For how to view the results, refer to: https://blog.csdn.net/cuihaolong/article/details/52684396
PS: While a topology is running, the number of Tasks cannot be changed, but the number of Workers and Executors can.
ZJ ...