Big Data Architecture and Applications --- Storm

1. Configuring Storm in a Linux Environment

Host name               tuge1               tuge2                 tuge3
Deployment environment  Zookeeper/Nimbus    Zookeeper/Supervisor  Zookeeper/Supervisor

(Deployment overview)

1.1 Configuring the Zookeeper environment (all three machines need to be configured; you can configure one and then distribute the files to the others)

  • Download apache-zookeeper-3.5.5-bin.tar.gz from the official website, then upload it to the /opt/zookeeper directory on Linux. (Create the directory if it does not exist.)

  • Extract the archive

    tar -xvf apache-zookeeper-3.5.5-bin.tar.gz

  • Configure the environment variables

    vim /etc/profile

    export ZK_HOME=/opt/zookeeper/apache-zookeeper-3.5.5-bin
    
    export PATH=$ZK_HOME/bin:$PATH
  • Configure automatic log cleanup for Zookeeper
    Periodic cleanup is enabled by setting the two parameters autopurge.snapRetainCount and autopurge.purgeInterval.
    Both parameters live in zoo.cfg; remove the leading comment markers and set how many logs to retain:

    autopurge.purgeInterval: specifies the cleanup frequency, in hours. It must be an integer of 1 or greater; the default is 0, which means the cleanup function is disabled.

    autopurge.snapRetainCount: used together with the parameter above; it specifies how many snapshot/log files to keep. The default is 3.
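    As a minimal sketch, after removing the comment markers the two lines in zoo.cfg would look like this (the values here are illustrative):

    autopurge.snapRetainCount=3
    autopurge.purgeInterval=1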

1.2 Configuring the Java environment (all three machines must be configured)

  • Download jdk-8u221-linux-x64.tar.gz from the official website, then upload it to the /opt/java directory. (Create the directory if it does not exist; any Java 8+ version from the official site will do.)

  • Extract the archive

    tar -xvf jdk-8u221-linux-x64.tar.gz

  • Configure the environment variables

    vim /etc/profile

    export JAVA_HOME=/opt/java/jdk1.8.0_221
    export PATH=$JAVA_HOME/bin:$ZK_HOME/bin:$PATH
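  • Verify the installation

    Reload the profile and confirm the JDK is on the PATH; with the path above, java -version should report version "1.8.0_221":

    source /etc/profile
    java -version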

1.3 Configuring the Storm environment (all three machines must be configured)

  • Configuring Global Environment

    • Download apache-storm-2.1.0.tar.gz from the official website, then upload it to the /opt/storm directory. (Create the directory if it does not exist.)
    • Decompression

    tar -xvf apache-storm-2.1.0.tar.gz

    • Configure the environment variables

    vim /etc/profile

    export STORM_HOME=/opt/storm/apache-storm-2.1.0
    export PATH=$STORM_HOME/bin:$JAVA_HOME/bin:$ZK_HOME/bin:$PATH
    • Reload the configuration file

    source /etc/profile

  • Edit the storm.yaml configuration file

    vim /opt/storm/apache-storm-2.1.0/conf/storm.yaml

    • Zookeeper server configuration: change

    # storm.zookeeper.servers:
    #     - "server1"
    #     - "server2"

    to:

    storm.zookeeper.servers:
    - "tuge1"
    - "tuge2"
    - "tuge3"
  • Create a state directory:

    Create a local state directory (adjusting its permissions if needed), then point storm.local.dir in storm.yaml at it:

    mkdir -p /opt/storm/apache-storm-2.1.0/status

    storm.local.dir: "/opt/storm/apache-storm-2.1.0/status"
  • Configure the master node address

    nimbus.seeds: ["tuge1"]
  • Configure the number of Workers (in a real production environment this is set according to the tasks to be executed; here, following the official site, I configure four for learning)

    However many ports you add is the maximum number of Workers that can be assigned. Four are configured here to start with; see the consolidated sketch after this list.

    Add the following to storm.yaml:

    supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
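  • Putting it together

    Combining the pieces above, the edited portion of storm.yaml ends up looking roughly like this (a sketch assuming the host names used in this post):

    storm.zookeeper.servers:
    - "tuge1"
    - "tuge2"
    - "tuge3"
    storm.local.dir: "/opt/storm/apache-storm-2.1.0/status"
    nimbus.seeds: ["tuge1"]
    supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703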

1.4 Starting Storm

  • First start Zookeeper on all three machines (for the concrete startup steps, see the earlier post in this blog series), as sketched below.
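    A minimal sketch of starting and checking Zookeeper on each machine (zkServer.sh is on the PATH thanks to the configuration in section 1.1):

    zkServer.sh start     # start the Zookeeper daemon
    zkServer.sh status    # check its state (one machine should be leader, two followers)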

  • Start Nimbus and the UI, executed on tuge1

    ./storm nimbus >./logs/nimbus.out 2>&1 &
    ./storm ui >>./logs/ui.out 2>&1 &
  • Start the Supervisor, executed on tuge2 and tuge3

    ./storm supervisor >>./logs/supervisor.out 2>&1 &

    PS: in a command like >/dev/null 2>&1, standard error is redirected to standard output, which in turn is written to the file /dev/null (here, to the log files); the final & means run in the background.

    Access the UI page: http://tuge1:8080/

    (PS: If anything looks abnormal, check whether Zookeeper has started, and also use jps to check whether nimbus and supervisor have started.)

    If the following exception is reported: Could not find leader nimbus from seed hosts [tuge1]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?

    Go to the Zookeeper bin directory and run zkCli.sh to enter the Zookeeper console, then delete the Storm node:

    (screenshots: the zkCli session deleting the Storm node)

    Note: the delete command only removes nodes without children; to delete a node that contains child nodes, use the rmr command.

    (screenshot: recursive deletion with rmr)
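    A minimal sketch of that zkCli session, assuming the node Storm created is named /storm:

    ls /          # /storm should appear in the listing
    rmr /storm    # recursively delete the node and all of its children
    quit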

    Restart the Zookeeper node:
    bin/zkServer.sh restart

    If everything went well, you will see the following interface ~

    (screenshot: the Storm UI)

2. Running Storm locally

Here is a small example that appends characters to each word:

Create a Maven project, then add classes in the following structure:

(screenshot: project structure, StormProjectPreView)

The code is as follows (work through it following the architecture idea above):

App.java (entry class):
package Demo.Storm;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.topology.TopologyBuilder;

/**
 * Hello world!
 *
 */
public class App {
    public static void main(String[] args) {
        try {

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("words", new TestWordSpout(), 6);//6 spout instances run in parallel
            builder.setBolt("exclaim1", new ExclamationBolt1(), 2).shuffleGrouping("words");//2 bolt instances run in parallel
            builder.setBolt("exclaim2", new ExclamationBolt2(), 2).shuffleGrouping("exclaim1");//2 bolt instances run in parallel

            LocalCluster lc = new LocalCluster();//run on a local in-process cluster

            lc.submitTopology("wordadd", new Config(), builder.createTopology());//submit the topology

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
ExclamationBolt1.java (Bolt 1):
package Demo.Storm;

import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class ExclamationBolt1 extends BaseRichBolt {

    OutputCollector _collector;

    public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
        _collector = collector;
    }

    public void execute(Tuple input) {
        String val = input.getStringByField("words") + "!!!";
        _collector.emit(input, new Values(val));//anchor the emitted tuple to the input tuple
        _collector.ack(input);//acknowledge the input tuple
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("exclaim1"));
    }

}
ExclamationBolt2.java (Bolt 2):
package Demo.Storm;

import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

public class ExclamationBolt2 extends BaseRichBolt {

    OutputCollector _collector;

    public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
        this._collector = collector;
    }

    public void execute(Tuple input) {
        String str = input.getStringByField("exclaim1") + "~~~";
        System.err.println(str);//print the final result
        _collector.ack(input);//ack so the anchored tuple tree completes instead of timing out
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        //final bolt in the chain: no output fields to declare
    }

}
TestWordSpout.java (continuously emits a stream of data):
package Demo.Storm;

import java.util.Map;
import java.util.Random;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;

public class TestWordSpout extends BaseRichSpout {
    SpoutOutputCollector _collector;

    public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
        _collector = collector;
    }

    public void nextTuple() {
        Utils.sleep(100);
        final String[] words = new String[] { "你好啊", "YiMing" };
        final Random rand = new Random();
        final String word = words[rand.nextInt(words.length)];//randomly pick a string to send
        _collector.emit(new Values(word));
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("words"));
    }

}
pom.xml (Maven configuration file; if you run into problems, see the notes below):
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>Demo</groupId>
  <artifactId>Storm</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>Storm</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.storm/storm-client -->
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-client</artifactId>
        <version>2.1.0</version>
    
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.storm/storm-server -->
    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-server</artifactId>
        <version>2.1.0</version>      
    </dependency>
    
  </dependencies>
</project>

The results of the run are as follows:

(screenshot: run results, StormProjectResult)

Problems encountered:

Problem 1: the LocalCluster class obviously exists, so why can't it be imported?

Solution: comparing this jar with the packages that can be imported, the difference is that this jar shows up gray in the IDE, which was baffling at first. Searching online for why a package shows gray turned up the answer: it is caused by the <scope>test</scope> setting in the pom. Delete it and everything works. If a jar you need shows gray, remove its test scope, otherwise this is a real pit. (Reference source)
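The offending fragment presumably looked like the sketch below; deleting the scope line makes the dependency usable from main code:

    <dependency>
        <groupId>org.apache.storm</groupId>
        <artifactId>storm-server</artifactId>
        <version>2.1.0</version>
        <scope>test</scope><!-- removing this line fixes the gray jar -->
    </dependency>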

Storm's eight grouping strategies

1) shuffleGrouping (random distribution)

2) fieldsGrouping (groups by the specified field, i.e. tuples carrying the same word are always sent to the same Bolt task)

3) allGrouping (broadcast: every Bolt task receives a copy of every Tuple)

4) globalGrouping (global grouping: all Tuples are assigned to the task with the lowest task id)

5) noneGrouping (random distribution, currently equivalent to shuffleGrouping)

6) directGrouping (direct grouping: the emitter designates which task of the consuming Bolt receives the Tuple)

7) localOrShuffleGrouping (prefers tasks in the same Worker process when available, otherwise shuffles)

8) customGrouping (custom grouping logic)
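As a quick illustration of how a grouping is chosen in code, here is a sketch reusing the spout and bolts from the local-mode example above (the field name "exclaim1" is the one declared by ExclamationBolt1):

    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("words", new TestWordSpout(), 4);
    //shuffleGrouping: any exclaim1 task may receive any tuple
    builder.setBolt("exclaim1", new ExclamationBolt1(), 2).shuffleGrouping("words");
    //fieldsGrouping: tuples with equal "exclaim1" values always reach the same task
    builder.setBolt("exclaim2", new ExclamationBolt2(), 2).fieldsGrouping("exclaim1", new Fields("exclaim1"));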

3. Running Storm on a Linux cluster

Here is a small word-count example:

Continuing from the project above, add the following class files, giving this structure:

(screenshot: project structure, WordCountProjectPreView)

The code is as follows:

WordCountApp.java (entry class)
package Demo.Storm;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;


public class WordCountApp {

    /**
     * @param args
     */
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();

        builder.setSpout("words", new WordCountSpout(), 8);//8 spout instances run in parallel

        builder.setBolt("wordSplit", new WordCountSplitBolt(), 3).shuffleGrouping("words");//3 bolt instances run in parallel
        builder.setBolt("wordSum", new WordCountSumBolt(), 3).fieldsGrouping("wordSplit", new Fields("word"));//3 bolt instances run in parallel

        if (args.length > 0) {//with an argument, submit to the cluster
            try {
                StormSubmitter.submitTopology(args[0], new Config(), builder.createTopology());

            } catch (Exception ex) {
                ex.printStackTrace();
            }

        } else {//without arguments, run on the local machine

            try {
                LocalCluster lc = new LocalCluster();
                lc.submitTopology("wordCount", new Config(), builder.createTopology());

            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }

    }

}
WordCountSpout.java (provides a steady stream of data)
package Demo.Storm;

import java.util.Map;
import java.util.Random;

import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import org.apache.storm.utils.Utils;


public class WordCountSpout extends BaseRichSpout {

    SpoutOutputCollector _collector;

    public void open(Map<String, Object> conf, TopologyContext context, SpoutOutputCollector collector) {
        this._collector = collector;
    }

    public void nextTuple() {
        Utils.sleep(1000);
        String[] words = new String[] {
                "hello YiMing",
                "nice to meet you"
        };
        Random r = new Random();
        _collector.emit(new Values(words[r.nextInt(words.length)]));//randomly emit one of the sentences
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("words"));
    }

}
WordCountSplitBolt.java (splitting class)
package Demo.Storm;

import java.util.List;
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class WordCountSplitBolt extends BaseRichBolt {

    OutputCollector _collector;

    public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
        this._collector = collector;
    }

    //split each line and emit the individual words
    public void execute(Tuple input) {
        String line = input.getString(0);
        String[] lineGroup = line.split(" ");
        for (String str : lineGroup) {
            List list = new Values(str);
            _collector.emit(input, list);
        }
        _collector.ack(input);//ack once per input tuple, after all words are emitted
    }

    //declare the emitted field name as "word"; the next bolt fetches values by this name
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

}
WordCountSumBolt.java (aggregation and counting class)
package Demo.Storm;

import java.util.HashMap;
import java.util.Map;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

public class WordCountSumBolt extends BaseRichBolt {

    OutputCollector _collector;
    Map<String, Integer> map = new HashMap<String, Integer>();

    public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
        this._collector = collector;
    }

    //aggregate the counts
    public void execute(Tuple input) {
        String word = input.getString(0);

        if (map.containsKey(word)) {
            map.put(word, map.get(word) + 1);
        } else {
            map.put(word, 1);
        }
        System.err.println("word: " + word + ", count: " + map.get(word));
        _collector.ack(input);//ack so the anchored tuple tree completes
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        //final bolt: no output fields to declare
    }

}
When everything is ready, compile and package the project, then upload the jar to Linux.

(screenshots: packaging the jar and uploading it to Linux)

Go into /opt/storm/apache-storm-2.1.0/bin and execute:

[root@tuge1 bin]# ./storm jar /opt/data/storm/WordCount.jar Demo.Storm.WordCountApp wc

The official example invocation is:

storm jar all-my-code.jar org.apache.storm.MyTopology arg1 arg2

(screenshots: the console output and the Storm UI)

To end the task:

storm kill wc (wc is the topology name)

For more background, please refer to: https://blog.csdn.net/cuihaolong/article/details/52684396

PS: While a topology is running, the number of Tasks cannot be changed, but the numbers of Workers and Executors can.
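For example, the storm rebalance command can adjust the Worker and Executor counts of a running topology (a sketch using the word-count topology above):

    storm rebalance wc -n 4 -e wordSplit=6    # 4 workers; 6 executors for the wordSplit bolt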

ZJ ...

Series Portal

Origin www.cnblogs.com/shun7man/p/12424386.html