Log Service data processing: syntax framework and examples

1. Global event operations

1.1 Field assignment (set event)

1.1.1. Syntax Introduction

Syntax:

SET_EVENT[[_]number]_new_field = fixed value
SET_EVENT[[_]number]_new_field = expression function

Explanation

  • Sets the value of a single field named new_field; if the field already exists, its value is overwritten.
  • Character constraint for new_field: it may consist of English letters, digits, and _, but must not start with a digit. Note: Chinese characters are supported, but : is not, so fields such as log tags cannot be set this way; use the general operations (section 1.3) for such requirements.
  • If the expression function returns None (no value), the operation is ignored.
  • A value of any type returned by the expression is converted to a string (for example, a number is formatted as a string) before it is written back into the event.
  • When the same field has to be set more than once, insert a distinct number via [[_]number] as a placeholder to tell the statements apart.
  • For the complete list of expression functions, see Expression Functions (section 2).
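
The rules above can be sketched in plain Python. This is an illustrative model, not the service's implementation: set_event is a hypothetical helper, events are modelled as dicts, and expression functions as callables that take the event.

```python
def set_event(event, field, value):
    """Return a copy of event with field set, following the rules listed above."""
    if callable(value):
        value = value(event)      # expression functions are modelled as callables
    if value is None:
        return dict(event)        # a None result means the operation is ignored
    updated = dict(event)
    updated[field] = str(value)   # a value of any type is stored as a string
    return updated

event = {"ret": "FAIL", "count": "3"}
event = set_event(event, "city", "上海")                        # fixed value
event = set_event(event, "size", 42)                            # number becomes "42"
event = set_event(event, "missing", lambda e: e.get("nope"))    # returns None, ignored
```

After these calls the event contains city and size as strings, and no field named missing.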

1.1.2. Sample

Example 1: Set a fixed value
Add a new field city with the value 上海.

SET_EVENT_city = "上海"

Example 2: Copy a field value
Call the expression function v to assign the value of the existing field ret to the new field result.

SET_EVENT_result = v("ret")

Example 3: Dynamic assignment
Combine expression function calls: take the value of whichever of the fields ret and return exists first, convert it to lowercase, and assign it to the field result.

SET_EVENT_result = str_lower(v("ret", "return"))

Example 4: Set a field multiple times
First set a default value for the field event_type:

SET_EVENT_1_event_type = "login event"

After some other operations, if the value of the field ret is fail, event_type is set to login failed event; otherwise it keeps its current value:

# some other operations in between

SET_EVENT_2_event_type = op_if(op_eq(v("ret"), "fail"), "login failed event", v("event_type"))

Note: Statements that set the same field must use different numbers as placeholders in front of the field name.

1.2 Field extraction (extract event)

1.2.1. Syntax Introduction

Syntax:

EXTRACT_EVENT[[_]number]_field = string
EXTRACT_EVENT[[_]number]_field = field operation function

Explanation

  • Operates on the value of a single field and typically extracts multiple values from it, for example by regular expression, JSON expansion, lookup-table enrichment, or key-value splitting. It also covers splitting one event into several based on a field value.
  • A string is shorthand for the field operation function REGEX.
  • By default, an extracted value overwrites the target field only when that field does not exist or is empty; see "Field checking and overwrite modes for extracted values" in section 3.
  • Character constraint for field: it may consist of Chinese characters, English letters, and _, but must not start with a digit. Note: Chinese characters are supported, but : is not, so fields such as log tags cannot be handled this way; use the general operations (section 1.3) for such requirements.
  • If the expression function returns None (no value), the operation is ignored.
  • When the same field has to be extracted from more than once, insert a distinct number via [[_]number] as a placeholder to tell the statements apart.
  • For the complete list of field operation functions, see Field operation functions (section 3).

1.2.2. Sample

Example 1: Extract values with a regular expression
From the field email, extract the user name into user and the company part of the mail domain into company:

EXTRACT_EVENT_email = r"(?P<user>\w+)@(?P<company>\w+)\.com"

Note: A string is shorthand for the field operation function REGEX; see the user manual for more details.
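
The pattern in Example 1 uses Python-style named groups, so its effect can be tried with the standard re module. The sketch below is an assumption-laden model: extract is a hypothetical helper, it searches anywhere in the field value, and it only fills fields that are missing or empty, mirroring the default behaviour described above.

```python
import re

PATTERN = re.compile(r"(?P<user>\w+)@(?P<company>\w+)\.com")

def extract(event, field, regex):
    """Copy named groups matched in event[field] into the event (hypothetical helper)."""
    match = regex.search(event.get(field, ""))
    if match is None:
        return dict(event)
    updated = dict(event)
    for name, value in match.groupdict().items():
        # Only set non-empty values, and only when the target is missing or empty.
        if value and not updated.get(name):
            updated[name] = value
    return updated

event = extract({"email": "alice@example.com"}, "email", PATTERN)
```

Each named group becomes a field of the same name, so the event gains user and company while email stays unchanged.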

Example 2: Map a field value to a new field
Based on the value of the field level, call the field operation function LOOKUP to map it to a new field level_info:

EXTRACT_EVENT_level = LOOKUP({"1": "info","2": "warning","3": "error", "*": "other"},"level_info")
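
As a rough model of the mapping in Example 2, the sketch below uses a plain dict lookup. The helper lookup is hypothetical, and treating "*" as the fallback key for unmatched values is an assumption inferred from the example, not confirmed behaviour.

```python
LEVELS = {"1": "info", "2": "warning", "3": "error", "*": "other"}

def lookup(event, field, mapping, target):
    """Map event[field] through mapping into target (hypothetical helper)."""
    updated = dict(event)
    # Assumption: "*" supplies the value used when no exact key matches.
    mapped = mapping.get(event.get(field), mapping.get("*"))
    if mapped is not None:
        updated[target] = mapped
    return updated

known = lookup({"level": "2"}, "level", LEVELS, "level_info")
unknown = lookup({"level": "9"}, "level", LEVELS, "level_info")
```

Here known receives level_info set to warning, while unknown falls back to other.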

Example 3: Expand JSON
For the values of the fields request_body and response_body, call the field operation function JSON to expand them automatically into multiple values (default depth 10):

EXTRACT_EVENT_request_body = JSON
EXTRACT_EVENT_response_body = JSON(depth=1)

JSON without parameters is the shorthand form of the call. The JSON function accepts further parameters; see the user manual for details.

Example 4: Extract from a field multiple times
First expand the field response_body as JSON, then use a regular expression to extract a specific value from it:

EXTRACT_EVENT_1_response_body = JSON(depth=1)
EXTRACT_EVENT_2_response_body = r"trace_id=(?P<trace_id>[\w\-]+)"

Note: Statements that extract from the same field must use different numbers as placeholders in front of the field name.

1.3. General operations

1.3.1. Syntax Introduction

Syntax:

TRANSFORM_ANY_[placeholder] = operation
TRANSFORM_ANY_[placeholder] = operation list

Operation
A general operation takes one of three forms: extended forms of the two operations described above, plus a third, the event operation function:

field assignment operation = {"new field name": fixed value or expression function, "another field name": ...}
field extraction operation = field input, string or field operation function
general operation = event operation function

Operation list
A list of operations, written as [operation 1, operation 2, operation 3, ...]. The operations are executed in order until one of them discards the event.
Note: multiple operations must be enclosed in [].

Explanation

  • Field assignment: a set of key-value pairs in the form {key1: value1, key2: value2}; several fields can be assigned at once.
  • Field extraction: a single input, operation pair. The input is not limited to one field; for example, OSSLOOKUP supports several input fields for mapping.
  • General operation: covers the usual event-level operations, such as dropping the event, keeping or dropping particular fields, and outputting the event.
  • Multiple general operations must use different placeholder values to tell them apart, usually incrementing numbers.
  • For the complete list of event operation functions, see Event operation functions (section 4).
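
The execution order of an operation list can be illustrated with a small Python model. Everything here is hypothetical scaffolding: operations are modelled as callables that return the updated event, or None when the event is discarded.

```python
def run_operations(event, operations):
    """Apply operations in order; stop as soon as one returns None (event dropped)."""
    for operation in operations:
        event = operation(event)
        if event is None:
            return None
    return event

def assign(fields):
    # Field assignment: returns an operation that sets several fields at once.
    return lambda event: {**event, **fields}

def drop_fields(names):
    # Stand-in for an event operation that removes the named fields.
    return lambda event: {k: v for k, v in event.items() if k not in names}

def drop(event):
    # Stand-in for an operation that discards the whole event.
    return None

kept = run_operations({"a": "1", "b": "2"}, [assign({"level": "1"}), drop_fields(["b"])])
dropped = run_operations({"a": "1"}, [drop, assign({"level": "1"})])
```

In the second call the assignment never runs, because the preceding operation has already discarded the event.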

1.3.2. Sample

Example 1: Assign multiple fields
Assign several fields at once; expression functions are supported.

TRANSFORM_ANY_1 = {"__topic__": "default topic", "tag:__type__": v("event_type"), "level": "1"}

Example 2: Extract field values
Based on the value of the field request_body, call the field operation function JSON to expand it into multiple values:

TRANSFORM_ANY_2 = "request_body", JSON(depth=1)

Example 3: General operation
Drop the fields field1 and field2 from the event:

TRANSFORM_ANY_3 = DROP_F(["field1", "field2"])

Example 4: Multiple operations
Several operations are executed in order:

TRANSFORM_ANY_4 = [ {"email": "[email protected]"}, ("request_body", JSON) ]

Example 5: Interoperation of expression functions and event operations
Keep or drop the event depending on whether the value of the field valid is true:

TRANSFORM_ANY_5 = op_if(op_eq(v("valid"), "true"), KEEP, DROP)

Note: KEEP and DROP are identifiers for the event operations that keep and drop the event.

1.4 General operations with conditions

1.4.1. Syntax Introduction

Syntax:

TRANSFORM_EVENT_placeholder = conditional operation
TRANSFORM_EVENT_placeholder = conditional operation list

Conditional operation
A general operation paired with a condition: if the condition is met, the operation is executed; otherwise nothing is done:

conditional operation = condition, operation

Note: operation can also be an operation list; for details see General operations (section 1.3).

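Conditions
A condition is an expression used to determine whether the current event meets certain criteria. It can take the following forms:

- fixed condition identifier
- {"field name 1": "regular expression 1"}
- {"field name 1": NOT("regular expression 1")}            # NOT
- {"field name 1": "regular expression 1", "field name 2": "regular expression 2", ... }   # AND
- expression function
- a list of the above forms, for example: [ {"field name 1": "regular expression 1"}, {"field name 2": "regular expression 2"}, ... ]   # OR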

Conditional operation list
A list of several conditional operations, written as [conditional operation 1, conditional operation 2, conditional operation 3, ...]. The condition of each conditional operation is checked in turn: if it is met, its operation is executed; otherwise nothing is done. Checking then continues with the next conditional operation, unless a step has discarded the event.
Note: the whole list must be enclosed in [], and each conditional operation in it must be enclosed in ().

Condition syntax

  • Fixed condition identifier: a predefined identifier such as ANY or ALL, which means that every event matches and the paired operation is always executed.
  • Key-value pair: { key: value } means the value of the field must fully match the regular expression. Note that the regular expression must match the entire field value (from start to end) for the condition to be met.

    • For example, if the field user has the value "i love python", neither the regular expression "i love" nor "python" matches.
    • Multiple key-value pairs are combined with AND: all of them must be met before the paired operation (list) is executed.
    • Wrapping the regular expression in NOT negates the match.
  • The return value of an expression function can serve as the condition. An empty string, None, the boolean False, the number 0, an empty list, and similar values mean the condition is not met; any other value means it is met.
  • Combining several conditions in a list means OR: the paired operation (list) is executed as soon as any one of them is met.

    • Note: in the current version, OR, AND, and NOT cannot be nested arbitrarily.
    • Complex logic can be expressed with an expression function.
  • For expression functions, see Expression Functions (section 2).
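
The condition rules above can be modelled with Python's re.fullmatch, which also requires a match from start to end. This is a sketch under assumptions: NOT, ANY, and condition_met are local stand-ins written for this illustration, and a missing field is assumed never to match a regular expression.

```python
import re

ANY = object()  # local stand-in for the fixed condition identifier ANY

class NOT:
    """Marks a regular expression whose match result is negated."""
    def __init__(self, pattern):
        self.pattern = pattern

def field_matches(event, field, rule):
    value = event.get(field)
    if isinstance(rule, NOT):
        # Assumption: a missing field does not match, so NOT is satisfied.
        return value is None or re.fullmatch(rule.pattern, value) is None
    return value is not None and re.fullmatch(rule, value) is not None

def condition_met(event, condition):
    if condition is ANY:
        return True
    if isinstance(condition, dict):   # all key-value pairs must hold: AND
        return all(field_matches(event, f, r) for f, r in condition.items())
    if isinstance(condition, list):   # any listed condition may hold: OR
        return any(condition_met(event, c) for c in condition)
    if callable(condition):           # expression function: use its truthiness
        return bool(condition(event))
    raise TypeError("unsupported condition form")

event = {"user": "i love python", "result": "failed"}
partial = condition_met(event, {"user": "i love"})
full = condition_met(event, {"user": r"i love \w+"})
negated = condition_met(event, {"user": NOT("python"), "result": r"failed|failure"})
either = condition_met(event, [{"result": "ok"}, {"user": ".*python"}])
```

With this event, partial is not met because "i love" covers only part of the value, while full, negated, and either are met.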

1.4.2. Sample

Example 1: Operate when a value matches
When the field result is failed or failure, set the event topic to login_failed_event:

TRANSFORM_EVENT_1 = {"result": r"failed|failure"}, {"__topic__": "login_failed_event"}

Example 2: Check before extracting field values
When the field request_body exists and is not empty, call the field operation function JSON to expand request_body into multiple values:

TRANSFORM_EVENT_2 = NO_EMPTY("request_body"), ("request_body", JSON)

Here the expression function NO_EMPTY checks that the field request_body exists and is not empty.

Example 3: Advanced check before operating
When the value of the field valid is failed, drop the event:

TRANSFORM_EVENT_3 = op_eq(v("valid"), "failed"), DROP

Example 4: Multiple conditional operations
Several conditional operations are executed in order:

TRANSFORM_EVENT_4 = [
                          (ANY, {"__topic__": "default_login"}), 
                          ( {"valid": "failed"}, {"__topic__": "login_failed_event"} ) 
]

Note that multiple conditional operations are enclosed in [], and each conditional operation is enclosed in ().

1.5. Dispatch operations based on conditions

1.5.1. Syntax Introduction

Syntax:

DISPATCH_EVENT_placeholder = conditional operation list

Explanation

  • The form is essentially the same as general operations with conditions (section 1.4).
  • The conditional operations in the list are checked one by one: while a condition is not met, checking moves on to the next conditional operation; once a condition is met, its paired operation (list) is executed and the remaining conditional operations are skipped.
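
The difference between the two list forms can be sketched as follows: a TRANSFORM_EVENT list applies every conditional operation whose condition is met, while a DISPATCH_EVENT list stops at the first one. The helpers and topic names below are hypothetical, and conditions are limited to field-to-regex dicts for brevity.

```python
import re

def matches(event, condition):
    # Every field must fully match its regular expression (AND).
    return all(re.fullmatch(rx, event.get(f, "")) for f, rx in condition.items())

def transform(event, pairs):
    """Apply every matching conditional operation, in order."""
    for condition, fields in pairs:
        if matches(event, condition):
            event = {**event, **fields}
    return event

def dispatch(event, pairs):
    """Apply only the first matching conditional operation."""
    for condition, fields in pairs:
        if matches(event, condition):
            return {**event, **fields}
    return event

RULES = [
    ({"http_status": r"\d+"}, {"__topic__": "http_event"}),
    ({"http_status": r"4\d+"}, {"__topic__": "client_error_event"}),
]
transformed = transform({"http_status": "404"}, RULES)
dispatched = dispatch({"http_status": "404"}, RULES)
```

Both rules match status 404, so transform ends with the topic set by the last rule, whereas dispatch keeps the topic set by the first.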

1.5.2. Sample

Example 1: Conditional assignment
Set a different event topic according to the field http_status:

DISPATCH_EVENT_1 = [ 
                          ({"http_status": r"2\d+"} , {"__topic__": "success_event"}), 
                          ({"http_status": r"3\d+"} , {"__topic__": "redirection_event"}), 
                          ({"http_status": r"4\d+"} , {"__topic__": "unauthorized_event"}), 
                          ({"http_status": r"5\d+"} , {"__topic__": "internal_server_error_event"}), 
]

Note that multiple conditional operations are enclosed in [], and each conditional operation is enclosed in ().

1.6 Simplified macros for common event operations

1.6.1 Keep / drop events

Syntax
Keep or drop the events that meet the condition:

KEEP_EVENT_placeholder = condition
DROP_EVENT_placeholder = condition

Explanation

  • Condition: the same as the condition in general operations with conditions; it may also be a list. See Conditions in section 1.4.

1.6.2 Keep / drop fields

Syntax
Keep or drop the fields whose names meet the condition:

KEEP_FIELDS_placeholder = string or list of strings
DROP_FIELDS_placeholder = string or list of strings

String or list of strings

  • String: the string is a regular expression; a field is kept or dropped when its name matches it.
  • List: a list of regular expression strings enclosed in [], for example: ["abc", "xyz"]
  • Several predefined identifiers for meta field names can be used directly; for example, F_TIME stands for the time field and F_META for the time, topic, and other meta fields.

Explanation

  • Condition: the same as the condition in general operations with conditions; it may also be a list. See Conditions in section 1.4.
  • Events in Log Service also carry hidden meta fields such as __time__ and __topic__. If __time__ is deleted, the event time is reset to the current time, so take special care with KEEP_FIELDS_ not to delete it by mistake.
  • A common KEEP_FIELDS_ form is: [F_TIME, F_META, F_TAGS, "f1", "f2"]
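
A minimal Python model of field filtering shows why meta fields need attention when keeping fields. The helpers are hypothetical, and matching the whole field name with re.fullmatch is an assumption about how the patterns are applied.

```python
import re

def _as_list(patterns):
    # Accept either a single regular expression string or a list of them.
    return [patterns] if isinstance(patterns, str) else list(patterns)

def keep_fields(event, patterns):
    rules = _as_list(patterns)
    return {k: v for k, v in event.items() if any(re.fullmatch(p, k) for p in rules)}

def drop_fields(event, patterns):
    rules = _as_list(patterns)
    return {k: v for k, v in event.items() if not any(re.fullmatch(p, k) for p in rules)}

event = {"__time__": "1560000000", "__topic__": "web", "f1": "a", "f2": "b", "debug": "x"}
kept = keep_fields(event, ["__time__", r"f\d"])
dropped = drop_fields(event, "debug")
```

In this model __topic__ disappears from kept because no pattern names it, which is the kind of accidental loss the note about __time__ warns against.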

1.6.3 Renaming Fields

Syntax

ALIAS_xxx = {"existing field name regex 1": "new field name 1", "existing field name regex 2": "new field name 2",}
RENAME_FIELDS_xxx = {"existing field name regex 1": "new field name 1", "existing field name regex 2": "new field name 2",}

Explanation

  • ALIAS_ and RENAME_FIELDS_ are equivalent.
  • The existing field name is actually a regular expression. When several fields match, all of them are renamed to the new field name, and the new field takes the value of one of them; which one is unspecified. This is mainly useful for unifying field names when logs mix several data sources.
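
Renaming by regular expression can be pictured with the sketch below. rename_fields is a hypothetical helper, full-name matching is an assumption, and when several fields map to the same new name this model simply lets the later one win, whereas the text above leaves the choice unspecified.

```python
import re

def rename_fields(event, renames):
    """Rename every field whose name fully matches one of the regex keys."""
    updated = {}
    for name, value in event.items():
        for pattern, new_name in renames.items():
            if re.fullmatch(pattern, name):
                name = new_name
                break
        updated[name] = value
    return updated

# Two data sources use different names for the same information.
rules = {r"client_ip|remote_addr": "ip"}
from_source_a = rename_fields({"client_ip": "192.0.2.1", "status": "200"}, rules)
from_source_b = rename_fields({"remote_addr": "192.0.2.7", "status": "404"}, rules)
```

Both events end up with a unified ip field, which is the mixed-source use case mentioned above.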

1.6.4 Output events

Output the events that meet the condition.

Syntax

OUTPUT_xxx = condition
COUTPUT_xxx = condition

Explanation

  • Condition: the same as the condition in general operations with conditions; it may also be a list. See Conditions in section 1.4.
  • OUTPUT: after an event that meets the condition is output, it is not processed any further (it can be regarded as dropped).
  • COUTPUT: after an event that meets the condition is output, it continues through subsequent processing (it can be regarded as copy-and-output).
  • The event operation functions OUTPUT and COUTPUT support more customized behavior; see Event operation functions (section 4).
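
The contrast between the two macros can be sketched in a few lines of Python. The functions output and coutput and the list used as an output target are hypothetical stand-ins; None again represents an event that leaves the pipeline.

```python
def output(event, condition, sink):
    """Send a matching event to sink and stop processing it."""
    if condition(event):
        sink.append(dict(event))
        return None           # nothing is left for subsequent steps
    return event

def coutput(event, condition, sink):
    """Send a copy of a matching event to sink and keep processing the original."""
    if condition(event):
        sink.append(dict(event))
    return event

is_error = lambda e: e.get("level") == "error"
sink_a, sink_b = [], []
after_output = output({"level": "error"}, is_error, sink_a)
after_coutput = coutput({"level": "error"}, is_error, sink_b)
```

Both sinks receive the event, but only the COUTPUT variant hands it on to later steps.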

2. Expression Functions

An expression returns a specific value, usually through a single function call or a combination of calls. Expression functions fall into the following broad categories, with over 100 functions in total and more being added:

  • Basic operation functions: field value retrieval, control flow, comparison, container checks, content existence checks, multi-field operations, etc.
  • Conversion functions: basic type conversion, number conversion
  • Arithmetic functions: multi-value comparison, mathematical calculation, mathematical parameters, etc.
  • String functions: multi-field operations, encoding / decoding, sorting, reversing, replacing, general normalization, searching and checking, splitting, formatting, character set checks, etc.
  • Date and time functions: smart date-time conversion, getting date-time attributes, getting the date-time, getting the Unix time, getting date-time strings, modifying the date-time, comparing date-times
  • Regular expression functions: field extraction, match checks, replacing, splitting

For further details, please refer to the user manual.

3. Field operation functions

These functions operate on the value of the input field. Note: currently, field operation functions cannot interoperate with expression functions.

They cover the following broad categories, with more being added:

  • Regular expression extraction: full regular expression support, including dynamically extracted field names
  • CSV extraction: supports standard CSV
  • Dictionary mapping: direct field mapping
  • External OSS multi-column mapping: enriches data from an associated CSV on external OSS; supports incremental refresh and broad matching
  • External database multi-column mapping: enriches data from an associated external database; supports dynamic refresh and broad matching
  • External Logstore multi-column mapping: enriches data from an associated external Logstore; supports incremental refresh and broad matching
  • Automatic KV: automatic key-value extraction; custom delimiters and auto-escaping scenarios are also supported
  • JSON auto-expansion: automatically expands JSON content, including arrays; the expansion process can be customized
  • JSON-JMES filtering: supports dynamic selection and computation with JMES expressions before processing
  • Split event (based on a JSON array or string): splits an event based on a string array or a JSON array
  • Merge multiple rows (based on a JSON array or string): merges multiple fields based on a string array or a JSON array

For further details, please refer to the user manual.

Field checking and overwrite modes for extracted values

Field name character set:

  • Extraction methods subject to this rule: REGEX (dynamic key names), JSON, KV
  • Default: [\u4e00-\u9fa5\u0800-\u4e00a-zA-Z][\w\-\.]*
  • Examples of non-compliant names: 123=abc, 1k=200, {"123": "456"}, etc.

Setting the overwrite mode with the mode parameter

  • Supported extraction methods: REGEX, KV, CSV, Lookup, JSON

    • ("msg", REGEX(r"(\w+):(\d+)", {r"k_\1": r"v_\2"}, mode="fill-auto"))
  • fill - set when the original field does not exist or is empty
  • add - set when the original field does not exist
  • overwrite - always set
  • fill-auto / add-auto / overwrite-auto - as above, but only when the new value is non-empty
  • Default: fill-auto
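
The mode rules above amount to a small decision function, sketched here in Python. should_set is a hypothetical helper written for illustration: existing is None when the original field is absent, and the -auto suffix adds the requirement that the new value be non-empty.

```python
def should_set(mode, existing, new_value):
    """Decide whether an extracted value is written, following the mode list above."""
    base, _, suffix = mode.partition("-")
    if suffix == "auto" and not new_value:
        return False                 # -auto modes skip empty new values
    if base == "fill":
        return not existing          # original field missing or empty
    if base == "add":
        return existing is None      # original field missing
    if base == "overwrite":
        return True                  # always set
    raise ValueError(f"unknown mode: {mode}")

fill_empty = should_set("fill-auto", "", "x")
fill_present = should_set("fill-auto", "old", "x")
add_empty = should_set("add", "", "x")
overwrite_blank = should_set("overwrite", "old", "")
overwrite_auto_blank = should_set("overwrite-auto", "old", "")
```

For instance, an empty existing field is filled under fill-auto but left alone under add, and overwrite-auto refuses to replace a value with an empty one.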

4. Event operation functions

These functions operate directly on the event.

They cover the following categories:

  • KV multi-field extraction
  • Event meta: dropping fields, renaming fields
  • Event output: copy-and-output, output-and-drop, multi-target configuration, overriding meta fields, attaching additional tags, etc.

Note: Event operations also interoperate with certain expression functions, for example as the return value of an expression function.

For further details, please refer to the user manual.

Further reference

You are welcome to scan the QR code to join the official DingTalk group (11775223) for timely updates and direct support from Alibaba Cloud engineers.

Origin yq.aliyun.com/articles/704938