Getting started with Flink CEP

One, what CEP is

Events of all kinds occur in an application system. Some are triggered by users, some by the system itself, and some by third parties, but all of them can be viewed as observable changes in the system's state, for example a failed user login, a placed order, or a message returned by an RFID sensor. Strategies for responding to state changes fall into two categories. The first is simple event processing, which generally consists of two steps, filtering and routing: deciding whether an event should be handled, and by whom. The second is complex event processing (CEP). CEP can also handle single events, but it typically has to examine an event stream made up of multiple events, detect and analyze its characteristics, and respond accordingly.

Wikipedia also gives a definition of CEP, roughly: "CEP is an event-processing approach that takes events from multiple sources and detects complex patterns or circumstances in them. CEP aims to identify meaningful events (such as threats or opportunities) and respond to them as quickly as possible." The main characteristics of CEP are therefore: complexity, since patterns must be detected across event streams from multiple sources; low latency, with responses within seconds or milliseconds, for example when responding to a threat; and high throughput, since a very large number of events or streams must be handled quickly.

Earlier CEP frameworks could usually process only events that had already been collected in bulk, and could not process events while they were still being collected. Then Flink came along.

Two, Flink CEP

Flink is currently the mainstream framework for real-time computing in the big data field. It naturally supports low latency and high throughput, and together with its window model and state model this gives CEP very strong support. Flink implements complex event processing in a dedicated library, Flink CEP, which makes it easy to detect event patterns in an event stream.

The following simple example shows how to implement CEP in Flink:

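import java.util.List;
import java.util.Map;

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.nfa.aftermatch.AfterMatchSkipStrategy;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.IterativeCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class CepEvent {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env
                = StreamExecutionEnvironment.getExecutionEnvironment();
        // Step 1: build the data stream of (user ID, action, result) events.
        DataStream<Tuple3<Integer, String, String>> eventStream = env.fromElements(
                Tuple3.of(1500, "login", "fail"),
                Tuple3.of(1500, "login", "fail"),
                Tuple3.of(1500, "login", "fail"),
                Tuple3.of(1320, "login", "success"),
                Tuple3.of(1450, "exit", "success"),
                Tuple3.of(982, "login", "fail"));
        // Step 2: define the pattern: three failures within five seconds.
        // After a match, none of its events are reused for later matches.
        AfterMatchSkipStrategy skipStrategy = AfterMatchSkipStrategy.skipPastLastEvent();
        Pattern<Tuple3<Integer, String, String>, ?> loginFail =
                Pattern.<Tuple3<Integer, String, String>>begin("begin", skipStrategy)
                        .where(new IterativeCondition<Tuple3<Integer, String, String>>() {
                            @Override
                            public boolean filter(Tuple3<Integer, String, String> s,
                                                  Context<Tuple3<Integer, String, String>> context) throws Exception {
                                return s.f2.equalsIgnoreCase("fail");
                            }
                        }).times(3).within(Time.seconds(5));
        // Step 3: apply the pattern to the stream, keyed by user ID.
        // Note: since Flink 1.12, CEP uses event time by default; on those versions,
        // call inProcessingTime() on the PatternStream to keep processing-time semantics.
        PatternStream<Tuple3<Integer, String, String>> patternStream =
                CEP.pattern(eventStream.keyBy(x -> x.f0), loginFail);
        // Step 4: turn every match into an alarm message.
        DataStream<String> alarmStream =
                patternStream.select(new PatternSelectFunction<Tuple3<Integer, String, String>, String>() {
                    @Override
                    public String select(Map<String, List<Tuple3<Integer, String, String>>> map) throws Exception {
                        // The matched events are stored under the pattern name "begin".
                        return String.format("ID %d failed to log in 3 times within 5 seconds.",
                                map.get("begin").get(0).f0);
                    }
                });

        alarmStream.print();

        env.execute("cep event test");
    }
}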

Running the job shows that the user with ID 1500 is successfully captured.

Implementing CEP in Flink can be summarized in four steps: one, build the data stream; two, define the pattern correctly; three, combine the data stream with the pattern; four, retrieve the matched data from the resulting pattern stream. The first and third steps are generally standard practice. The core is the second step, building the pattern: you need to use the features Flink CEP supports to construct a pattern that correctly reflects the business requirement.

Three, Flink CEP's support for CEP

The core of Flink CEP is pattern matching, and how well a CEP framework supports the various features of pattern matching often determines how widely it can be applied. Flink CEP provides the following support for patterns:

(A) Support for pattern-matching semantics

Pattern matching relies on a set of common basic patterns, and how richly a framework can express pattern-matching semantics determines how far its patterns can be applied.

Flink CEP supports pattern-matching semantics with the following features:

  1. Match counts. A pattern can be required to match one or more times (oneOrMore), a fixed number of times (times(n)), or a number of times within a range (times(n, m)).
  2. History matching. During matching, a condition can be evaluated against the properties of the current event, and it can also look back at the events already matched so far.
  3. Pattern groups. Single patterns can be combined into a pattern group, with different combinators and semantics such as or, until, begin, next, followedBy, notNext, and notFollowedBy.
  4. Matching windows. Time windows are supported, so pattern matching can easily be restricted to a given time window.
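
To make the combination of a match count and a matching window more concrete, here is a minimal plain-Java sketch of the idea behind times(n) together with within(...). It is only an illustrative model and not Flink's implementation: the class `TimesWithinSketch` and the method `occursWithin` are hypothetical names, and the rule that the first and last event must be strictly less than the window apart is an assumption of this sketch.

```java
import java.util.List;

// Hypothetical helper, not part of Flink: models "n matching events within a time window".
final class TimesWithinSketch {

    /**
     * Returns true if the ascending timestamps (in milliseconds) contain n
     * consecutive entries whose first and last are less than windowMs apart.
     */
    static boolean occursWithin(List<Long> sortedTimestamps, int n, long windowMs) {
        for (int i = 0; i + n - 1 < sortedTimestamps.size(); i++) {
            long first = sortedTimestamps.get(i);
            long last = sortedTimestamps.get(i + n - 1);
            if (last - first < windowMs) {
                return true; // n events fit inside one window
            }
        }
        return false;
    }
}
```

With failure timestamps 0, 1000, and 2000 ms, three failures fit into a 5000 ms window; with 0, 4000, and 9000 ms they do not.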

(B) Support for different contiguity conditions

If only a single pattern is matched, contiguity does not need to be considered. In a pattern group, that is, when several patterns are combined, the contiguity condition determines how closely the events matched by consecutive patterns in the group must follow one another. Choosing a different contiguity condition can change the final match results significantly.

Flink CEP supports three contiguity conditions:

A. Strict Contiguity: matching events must directly follow one another, with no non-matching event in between.

B. Relaxed Contiguity: non-matching events may appear between matching events; they are simply ignored and do not prevent a successful match.

C. Non-Deterministic Relaxed Contiguity: relaxes contiguity further on top of relaxed contiguity. Even after one event has matched a pattern, later events can still match it, which produces additional matches that skip over some matching events.
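
The difference between the three contiguity conditions is easiest to see on a tiny stream. The following plain-Java sketch models only a two-step pattern "a then b" and is not how Flink implements matching; the class `ContiguitySketch` and its `match` method are hypothetical. In the Flink CEP API the three conditions correspond to the pattern methods next, followedBy, and followedByAny.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not part of Flink: matches the two-step pattern "a then b".
final class ContiguitySketch {

    enum Contiguity { STRICT, RELAXED, NON_DETERMINISTIC_RELAXED }

    /** Events are plain names; names starting with "a" or "b" match the first or second step. */
    static List<String> match(List<String> events, Contiguity mode) {
        List<String> matches = new ArrayList<>();
        for (int i = 0; i < events.size(); i++) {
            if (!events.get(i).startsWith("a")) {
                continue; // only an "a" event can start a match
            }
            for (int j = i + 1; j < events.size(); j++) {
                if (events.get(j).startsWith("b")) {
                    matches.add(events.get(i) + " " + events.get(j));
                    if (mode != Contiguity.NON_DETERMINISTIC_RELAXED) {
                        break; // strict and relaxed stop at the first "b"
                    }
                } else if (mode == Contiguity.STRICT) {
                    break; // strict: a non-matching event in between ends the attempt
                }
            }
        }
        return matches;
    }
}
```

For the stream a, c, b1, b2 this model yields no match under STRICT, the single match "a b1" under RELAXED, and the two matches "a b1" and "a b2" under NON_DETERMINISTIC_RELAXED.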

(C) Support for different after-match skip strategies

An after-match skip strategy determines how the events of a group that has successfully matched a pattern take part in subsequent pattern matching. Different strategies can lead to materially different match results, so in real development the strategy must be chosen with care.

Flink CEP supports the following five after-match skip strategies:

A. NO_SKIP: every event in the current match can take part in subsequent pattern matching without restriction.

B. SKIP_TO_NEXT: except for the first event, the events in the current match can take part in subsequent pattern matching without restriction; other partial matches that start with the same first event are discarded.

C. SKIP_PAST_LAST_EVENT: none of the events in the current match take part in subsequent pattern matching.

D. SKIP_TO_FIRST: a pattern name must be specified. Any partial match that started before the first event in the current match that matched the named pattern is discarded.

E. SKIP_TO_LAST: a pattern name must be specified. Any partial match that started before the last event in the current match that matched the named pattern is discarded.
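
As a rough illustration of how much the chosen strategy changes the output, the sketch below compares NO_SKIP with SKIP_PAST_LAST_EVENT for a pattern of n matching events in a row. It is a simplified model under stated assumptions, not Flink's implementation: the class `SkipStrategySketch` is hypothetical, every event in the input is assumed to satisfy the pattern condition, and the other three strategies are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not part of Flink: a pattern of n matching events in a row.
final class SkipStrategySketch {

    enum Strategy { NO_SKIP, SKIP_PAST_LAST_EVENT }

    /** Assumes every event in the list satisfies the pattern condition. */
    static List<List<String>> matches(List<String> events, int n, Strategy strategy) {
        List<List<String>> result = new ArrayList<>();
        int start = 0;
        while (start + n <= events.size()) {
            result.add(new ArrayList<>(events.subList(start, start + n)));
            // NO_SKIP: the next match may begin at the very next event.
            // SKIP_PAST_LAST_EVENT: the next match may only begin after the last matched event.
            start += (strategy == Strategy.NO_SKIP) ? 1 : n;
        }
        return result;
    }
}
```

For five consecutive failures f1 to f5 and n = 3, this model produces three overlapping matches under NO_SKIP (starting at f1, f2, and f3) but only the single match f1, f2, f3 under SKIP_PAST_LAST_EVENT, because f4 and f5 alone cannot complete another match.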

(D) Support for event time and out-of-order events

In CEP, the order of events is crucial, because the order in which events arrive decides whether they will actually match a given pattern. Existing CEP frameworks in the industry generally use the natural arrival order, that is, processing time, as the basis for pattern matching, and this model cannot meet the needs of CEP in today's distributed environments. In a distributed big data environment, the order in which events arrive often differs from the order in which they occurred, because events can be delayed or arrive out of order. Pattern matching therefore has to rely on the time at which the events actually occurred; otherwise matches will be wrong or missed.

Flink's event-time model, with its event-time semantics and watermark mechanism, also applies to Flink CEP. Flink CEP first buffers incoming events instead of matching them immediately. When a watermark arrives, the buffered events it covers are sorted by event time and only then run through pattern matching, which largely solves the ordering problem of CEP in a distributed environment.
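
The buffer-then-sort behaviour described above can be sketched in a few lines of plain Java. This is only an illustrative model, not Flink's operator code: the class `EventTimeBufferSketch` and its methods are hypothetical, and handling of late events (events older than an already received watermark) is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model, not part of Flink: buffers events and releases them in event-time order.
final class EventTimeBufferSketch {

    static final class Event {
        final String name;
        final long timestamp; // event time in milliseconds

        Event(String name, long timestamp) {
            this.name = name;
            this.timestamp = timestamp;
        }
    }

    // Buffered events, ordered by event time rather than by arrival order.
    private final PriorityQueue<Event> buffer =
            new PriorityQueue<>(Comparator.comparingLong((Event e) -> e.timestamp));

    /** An arriving event is only buffered; no pattern matching happens yet. */
    void onEvent(Event event) {
        buffer.add(event);
    }

    /** Releases, in event-time order, all buffered events not newer than the watermark. */
    List<String> onWatermark(long watermark) {
        List<String> ready = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek().timestamp <= watermark) {
            ready.add(buffer.poll().name); // these would now be fed to the pattern matcher
        }
        return ready;
    }
}
```

If events b (2000 ms), a (1000 ms), and c (3000 ms) arrive in that order, a watermark of 2000 releases a and then b, and c stays buffered until a later watermark covers it.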

Four, conclusion

CEP has many applications across industries: risk control, fraud detection, and market strategy in finance; attack alerting, hazard modeling, and vulnerability discovery in security; and intelligent transportation, user funnels, and the currently popular IoT, among numerous other scenarios. The capabilities of Flink CEP deserve to be explored and improved further, so that it can empower a variety of business scenarios and deliver value to them.


Origin www.cnblogs.com/029zz010buct/p/11570551.html