Storm configuration

2.2 Configuration of Storm
 
2.2.1 Configuration Types of Storm
 
Storm exposes a large number of configuration settings that tune the behavior of Nimbus, the Supervisors, and running topologies. Some of them are system settings that cannot be modified, while others can be overridden.
 
Every setting has a default value, defined in the defaults.yaml file in the Storm codebase. These defaults can be overridden by placing a storm.yaml file on the classpath of Nimbus and the Supervisors. When you submit a topology with StormSubmitter, you can also pass a topology-specific configuration, but only settings with the TOPOLOGY prefix (keys that start with "topology.") can be overridden this way.
 
Storm 0.7.0 and later also allow the configuration to be overridden per Spout or Bolt. The settings that can be overridden this way are mainly:
"topology.debug"
"topology.max.spout.pending"
"topology.max.task.parallelism"
"topology.kryo.register"
 
topology.kryo.register differs from the other settings in that the serializations it registers apply to every component in the topology, not only to the component that declares them.
 
Storm's Java API provides two ways to specify component configuration:
 
Internally
 
In the Spout or Bolt class, override the getComponentConfiguration method to return a Map object of the component configuration.
The getComponentConfiguration method is defined as follows:
Map<String, Object> getComponentConfiguration()
 
Externally
 
The setSpout method of the TopologyBuilder class returns a SpoutDeclarer object, and the setBolt method returns a BoltDeclarer object. Both SpoutDeclarer and BoltDeclarer extend the ComponentConfigurationDeclarer interface, whose addConfiguration and addConfigurations methods can be called to override the component's configuration.
 
The definition code of the SpoutDeclarer interface is as follows:
public interface SpoutDeclarer extends
ComponentConfigurationDeclarer<SpoutDeclarer> {
    
}
 
The definition code of the BoltDeclarer interface is as follows:
public interface BoltDeclarer extends InputDeclarer<BoltDeclarer>,
ComponentConfigurationDeclarer<BoltDeclarer> {
    
}
 
The definition code of the ComponentConfigurationDeclarer interface is as follows:
public interface ComponentConfigurationDeclarer<T
extends ComponentConfigurationDeclarer> {
    T addConfigurations(Map conf);
    T addConfiguration(String config, Object value);
    T setDebug(boolean debug);
    T setMaxTaskParallelism(Number val);
    T setMaxSpoutPending(Number val);
    T setNumTasks(Number val);
}
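
The self-referencing type parameter T is what lets calls on a declarer be chained. The following is only a sketch of that pattern in plain Java, not Storm's implementation: SketchDeclarer, SketchBoltDeclarer, and DeclarerDemo are hypothetical stand-ins, and only the two configuration methods are modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for ComponentConfigurationDeclarer, reduced to two methods.
interface SketchDeclarer<T extends SketchDeclarer<T>> {
    T addConfiguration(String config, Object value);
    T addConfigurations(Map<String, Object> conf);
}

// Hypothetical declarer that simply collects the overrides in a map.
class SketchBoltDeclarer implements SketchDeclarer<SketchBoltDeclarer> {
    private final Map<String, Object> overrides = new HashMap<>();

    @Override
    public SketchBoltDeclarer addConfiguration(String config, Object value) {
        overrides.put(config, value);
        return this; // returning the concrete type is what allows chaining
    }

    @Override
    public SketchBoltDeclarer addConfigurations(Map<String, Object> conf) {
        overrides.putAll(conf);
        return this;
    }

    Map<String, Object> overrides() {
        return overrides;
    }
}

class DeclarerDemo {
    // Chains two overrides, using setting names listed earlier in this section.
    static Map<String, Object> build() {
        return new SketchBoltDeclarer()
                .addConfiguration("topology.debug", true)
                .addConfiguration("topology.max.task.parallelism", 4)
                .overrides();
    }
}
```

In a real topology the declarer would come from TopologyBuilder.setBolt or setSpout rather than being constructed directly.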
 
Storm configuration values take precedence in the following order, from lowest to highest:
defaults.yaml < storm.yaml < topology-specific configuration < internal component-specific configuration < external component-specific configuration
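
One way to picture this ordering is as a series of maps merged from lowest to highest precedence, with later layers overwriting earlier ones. The sketch below assumes exactly that model and is not how Storm itself resolves configuration; the class ConfigPrecedence and every value in it are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ConfigPrecedence {
    // Layers are passed from lowest to highest precedence; later layers win.
    @SafeVarargs
    static Map<String, Object> resolve(Map<String, Object>... layers) {
        Map<String, Object> effective = new LinkedHashMap<>();
        for (Map<String, Object> layer : layers) {
            effective.putAll(layer);
        }
        return effective;
    }

    // All values are illustrative; they are not Storm's real defaults.
    static Map<String, Object> example() {
        Map<String, Object> defaultsYaml =
                Map.of("topology.debug", false, "topology.max.spout.pending", 100);
        Map<String, Object> stormYaml = Map.of("topology.max.spout.pending", 500);
        Map<String, Object> topologyConf = Map.of("topology.debug", true);
        // Returned by the component's getComponentConfiguration() (internal).
        Map<String, Object> internalComponentConf = Map.of("topology.max.spout.pending", 50);
        // Set through addConfiguration on the declarer (external).
        Map<String, Object> externalComponentConf = Map.of("topology.max.spout.pending", 20);
        return resolve(defaultsYaml, stormYaml, topologyConf,
                internalComponentConf, externalComponentConf);
    }
}
```

With these made-up layers, the external component setting supplies topology.max.spout.pending and the topology-specific configuration supplies topology.debug.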


-------------------------------------------------

public interface IComponent extends Serializable {
    void declareOutputFields(OutputFieldsDeclarer declarer);

    Map<String, Object> getComponentConfiguration();
}


public Map<String, Object> getComponentConfiguration() {
    Map<String, Object> conf = new HashMap<String, Object>();
    conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, emitFrequencyInSeconds);
    return conf;
}


--------------------------------------------------
I took another look at RollingTopWords, this time paying closer attention to the RollingCountBolt class. It calls TupleHelpers.isTickTuple(tuple) to decide whether the current window's data should be emitted. The basis for that decision confused me at first: the method checks whether the tuple comes from the "__system" component and the "__tick" stream.

As someone who doesn't know much about Storm, I was really confused. Aren't all tuples emitted by the upstream spout? Where would tuples from a different source come from?

So I started guessing: could there be a hidden spout? Or does RollingCountBolt send some special tuple to itself?

Just when I was out of ideas, a small miracle happened. When I hovered the mouse over the constant Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, a tooltip appeared:

How often a tick tuple from the "__system" component and "__tick" stream should be sent to tasks. Meant to be used as a component-specific configuration.

Oh, so in the getComponentConfiguration() method, the line

        conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, emitFrequencyInSeconds);

tells the system to send a tuple from the "__system" component on the "__tick" stream to the task once every emitFrequencyInSeconds seconds.

So that is how RollingTopWords generates its special tuples at regular intervals. For a novice like me, that counts as a bit of progress.


------------------------------------------------

Sliding windows are widely used in monitoring and statistics applications. For example, every 10s an application counts the requests or errors seen in the last 30s and reacts according to that count. Here is how a sliding window is implemented in Storm. See the figure below:

[Figure: sliding window]

The window size is 30s and the statistics are computed every 10s, so the window has 3 slots in total and can be backed by an array of length 3. In the Storm bolt, execute(tuple) keeps counting the tuples received during each 10s interval (say, in a variable tuple_count). Every 10s the window slides automatically (Array[slot3] = Array[slot2], Array[slot2] = Array[slot1]), the current tuple_count is stored in the slot freed by the shift, Array[slot1], and the statistics for the whole window can then be computed.
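
The slot mechanics described above can be sketched in plain Java. This is a minimal illustration under assumptions, not code from RollingTopWords: the class SlidingWindowCounter and its method names are hypothetical, the newest interval is kept at index 0 (an arbitrary choice), and the window statistic is assumed to be a simple sum.

```java
class SlidingWindowCounter {
    private final long[] slots; // slots[0] holds the most recently completed interval
    private long currentCount;  // tuples counted since the last slide

    SlidingWindowCounter(int numSlots) {
        if (numSlots < 1) {
            throw new IllegalArgumentException("numSlots must be at least 1");
        }
        slots = new long[numSlots];
    }

    // Call once for every regular tuple received.
    void increment() {
        currentCount++;
    }

    // Call once per interval: shift the slots, store the current count,
    // reset it, and return the total over the whole window.
    long advance() {
        for (int i = slots.length - 1; i > 0; i--) {
            slots[i] = slots[i - 1]; // the oldest slot is overwritten and drops out
        }
        slots[0] = currentCount;
        currentCount = 0;

        long total = 0;
        for (long slot : slots) {
            total += slot;
        }
        return total;
    }
}
```

For the 30s window with 10s intervals, the counter would be created with 3 slots and advance() called every 10s.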

So how is the slide triggered automatically every 10s? Storm provides tick tuples for exactly this purpose: the "__system" component periodically sends a tuple on the "__tick" stream to the task. The sending frequency is set by TOPOLOGY_TICK_TUPLE_FREQ_SECS, which can be configured in code through getComponentConfiguration():

public Map<String, Object> getComponentConfiguration() {
    Map<String, Object> conf = new HashMap<String, Object>();
    conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, emitFrequencyInSeconds);
    return conf;
}

Once this is configured, Storm periodically sends a tick tuple to the task. The bolt only needs to call isTickTuple to check whether an incoming tuple is a tick tuple, and the timed trigger is complete.

public static boolean isTickTuple(Tuple tuple) {
    return tuple.getSourceComponent().equals(Constants.SYSTEM_COMPONENT_ID) // SYSTEM_COMPONENT_ID == "__system"
        && tuple.getSourceStreamId().equals(Constants.SYSTEM_TICK_STREAM_ID); // SYSTEM_TICK_STREAM_ID == "__tick"
}
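
To show how such a check might drive a bolt's execute method, here is a sketch that runs without Storm on the classpath. SketchTuple is a hypothetical stand-in for Storm's Tuple that keeps only the two fields the check reads, TickAwareCounter is a hypothetical bolt-like class, and the emitted list stands in for emitting downstream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Storm's Tuple.
final class SketchTuple {
    final String sourceComponent;
    final String sourceStreamId;

    SketchTuple(String sourceComponent, String sourceStreamId) {
        this.sourceComponent = sourceComponent;
        this.sourceStreamId = sourceStreamId;
    }
}

class TickAwareCounter {
    static final String SYSTEM_COMPONENT_ID = "__system";
    static final String SYSTEM_TICK_STREAM_ID = "__tick";

    private long count;                                   // regular tuples since the last tick
    private final List<Long> emitted = new ArrayList<>(); // stand-in for emitting downstream

    static boolean isTickTuple(SketchTuple tuple) {
        return SYSTEM_COMPONENT_ID.equals(tuple.sourceComponent)
                && SYSTEM_TICK_STREAM_ID.equals(tuple.sourceStreamId);
    }

    // Regular tuples are counted; a tick tuple flushes the count and resets it.
    void execute(SketchTuple tuple) {
        if (isTickTuple(tuple)) {
            emitted.add(count);
            count = 0;
        } else {
            count++;
        }
    }

    List<Long> emitted() {
        return emitted;
    }
}
```

A tuple counts as a tick only when both the component and the stream match, so a tuple from "__system" on any other stream is treated as regular data in this sketch.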

