Log Service Data Processing: Rule Scenario Overview

Usage Scenario 1: Free orchestration

Transformations are orchestrated through a simple Python-style configuration file: without writing code, you can generally cover 80% of data processing requirements while keeping a high degree of freedom.

The built-in capabilities cover the following orchestration steps, and the set is extensible (a sketch of a complete rule follows the list):

  • Assign field values
  • Extract fields
  • Conditional operations
  • Value conversion
  • Chained (serial) conversion
  • Split events
  • Keep events
  • Drop events
  • Keep fields
  • Drop fields
  • Automatic KV extraction
  • Rename fields
  • Output (copy) events
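
As a minimal sketch of how these steps chain together, the rule below uses function names from the Log Service transformation DSL (e_set, e_regex, e_keep, e_drop_fields, e_rename, op_ne, v); the field names and values are hypothetical examples, not an excerpt from a real configuration:

    # One step per line; every step runs against each log event in order.
    e_set("service", "billing")                # assign a field
    e_regex("message", r"code=(?P<code>\d+)")  # extract a field via a named regex group
    e_keep(op_ne(v("code"), "200"))            # keep only events where code != 200
    e_drop_fields("debug_info")                # discard a field
    e_rename("code", "status_code")            # rename a field

Because every step is a plain function call, steps can be freely reordered, removed, or combined, which is what gives this mode its degree of freedom.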

Usage Scenario 2: Use the built-in conversion modules

The Log Service data processing feature provides a complete set of built-in processing modules, with especially flexible and complete support for regular expressions, KV, JSON, Lookup, and more. Taken together, the built-in conversion modules fully support conventional data processing and can cover roughly 80% of overall conversion needs (see the sketch after this list):

  • Set column values (static / copy / UDF): supports computation with various functions
  • Regex column extraction: full regular-expression support, including dynamically extracted field names
  • CSV-format extraction: supports standard CSV
  • Dictionary mapping: direct field-value mapping
  • External OSS multi-column mapping: data enrichment by joining against CSV files on external OSS; supports incremental refresh and wide matching
  • External database multi-column mapping: data enrichment by joining against external databases; supports dynamic refresh and wide matching
  • External Logstore multi-column mapping: data enrichment by joining against an external Logstore; supports incremental refresh and wide matching
  • Automatic KV: automatic key-value extraction; also supports custom separators and auto-escape scenarios
  • Automatic JSON expansion: auto-expands JSON content, including arrays; the expansion process can be customized
  • JSON JMES filtering: supports further dynamic computation on the result of a JMES selection
  • Event splitting (based on JSON arrays or strings): splits one event into multiple based on a string array or a JSON array
  • Multi-value merging (based on strings or JSON arrays): merges multiple fields based on string arrays or JSON arrays
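
As a minimal sketch of a few of these modules working together, the snippet below assumes the DSL function names e_kv, e_json, e_dict_map, e_table_map, tab_parse_csv, and res_oss_file; the endpoint, bucket, and field names are made-up placeholders:

    # Hypothetical input event:
    #   request: "user=alice&region=cn"     payload: '{"user": {"id": 42}}'
    e_kv("request", sep="=")                        # automatic KV extraction
    e_json("payload", depth=2)                      # auto-expand nested JSON
    e_dict_map({"cn": "China", "us": "USA"},        # direct dictionary mapping
               "region", "region_name")
    e_table_map(                                    # enrich from a CSV file on OSS
        tab_parse_csv(
            res_oss_file(endpoint="oss-cn-hangzhou.aliyuncs.com",
                         ak_id="...", ak_key="...",
                         bucket="my-bucket", file="users.csv")),
        "user", ["department", "title"])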

Usage Scenario 3: Use the built-in expression functions

Log Service does not yet allow externally written plug-ins, but it provides roughly 150 built-in functions covering mainstream data processing needs (a few are sketched after this list), including:

  • Basic operation functions: field value retrieval, control flow, comparison, container membership tests, multi-field operations
  • Conversion functions: basic type conversion, numeric conversion
  • Arithmetic functions: multi-value comparison, mathematical computation, mathematical constants, and more
  • String functions: multi-field operations, encoding/decoding, sorting, reversing, replacing, regular normalization, lookup tests, splitting, formatting, character-set checks, and more
  • Date-time functions: smart date-time conversion, getting date-time properties, getting date-times, getting Unix timestamps, getting date-time strings, modifying date-times, comparing date-times
  • Regular expression functions: field extraction, match testing, replacement, splitting and classification
  • JSON and XML functions: extraction and filtering
  • Other higher-order functions: Syslog functions, etc.
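
As a minimal sketch, the snippet below composes a few expression functions inside orchestration steps; the function names (v, ct_int, op_mul, dt_parsetimestamp, str_replace, regex_match) follow the DSL documentation, while the field names are hypothetical:

    # Expression functions compose inside steps such as e_set and e_keep.
    e_set("latency_ms", op_mul(ct_int(v("latency")), 1000))  # type conversion + arithmetic
    e_set("ts", dt_parsetimestamp(v("time")))                # smart date-time -> Unix timestamp
    e_set("msg", str_replace(v("msg"), "passwd", "******"))  # string replacement
    e_keep(regex_match(v("level"), r"ERROR|WARN"))           # regex match test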

Usage Scenario 4: Extensions and UDFs

The underlying engine is currently Python, so in theory any Python library can be brought into Log Service data processing with a small amount of wrapping. Custom UDFs are not yet open to external customers, but the 150+ (and growing) built-in functions complete most jobs; when they cannot meet a requirement, you can raise a ticket and get timely support.

Further references

You are welcome to join the official DingTalk group (ID: 11775223) for timely updates and direct real-time support from Alibaba Cloud engineers.


Source: yq.aliyun.com/articles/704937