Elasticsearch: Automatically set date field with server time and update timezone

In most cases, your data contains a date field such as create_date. Even when it does, handling dates in various formats and time zones is a significant challenge for data warehouses. Likewise, if you want to detect changed data, the date field must be set accurately.

Elasticsearch can set the server's date as a field automatically at ingest time.

We'll use the set and date processors of an ingest pipeline.

Create an ingestion pipeline

First we need to set a timestamp field with the set processor. Then we use a date processor to convert that value to another time zone.

The date processor has several options, and target_field is one of them. If target_field is not defined, the processor parses the source field and writes the result to a field called @timestamp. Here we want the result in a field of our own, so we set target_field explicitly.

PUT _ingest/pipeline/sales-timestamp
{
  "description": "Set two different timestamp fields.",
  "processors": [
    {
      "set": {
        "field": "timestamp",
        "value": "{{{_ingest.timestamp}}}"
      }
    },
    {
      "date": {
          "field": "timestamp",
          "target_field": "tr_timestamp",
          "timezone": "+0300",
          "formats": [ "ISO8601"]
      }
    }
  ]
} 
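
To make the effect of the two processors concrete, here is a rough Python analogue of what the pipeline computes for each document. It is only a sketch and not Elasticsearch code: the function name apply_pipeline_sketch and its now parameter are made up for illustration, and the output formats are modeled on the sample results shown in this article.

```python
from datetime import datetime, timedelta, timezone

def apply_pipeline_sketch(doc, now=None):
    """Hypothetical stand-in for the set + date processors (illustration only)."""
    if now is None:
        now = datetime.now(timezone.utc)
    out = dict(doc)
    # "set" processor: store the ingest time in `timestamp` as ISO 8601 UTC text.
    out["timestamp"] = now.isoformat(timespec="microseconds").replace("+00:00", "Z")
    # "date" processor: parse `timestamp`, express it at the +03:00 offset,
    # and write the result to the target field `tr_timestamp`.
    parsed = datetime.fromisoformat(out["timestamp"].replace("Z", "+00:00"))
    local = parsed.astimezone(timezone(timedelta(hours=3)))
    out["tr_timestamp"] = local.isoformat(timespec="milliseconds")
    return out

# A fixed ingest time makes the result reproducible.
fixed = datetime(2023, 9, 1, 4, 6, 35, 783682, tzinfo=timezone.utc)
result = apply_pipeline_sketch({"name": "example"}, now=fixed)
print(result["timestamp"])
print(result["tr_timestamp"])
```

Both fields describe the same instant; only the offset used to display it differs.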

After running the request above, Elasticsearch responds with "acknowledged": true:

{
  "acknowledged": true
}

You can also inspect existing pipelines with GET or remove the pipeline with DELETE. To update the pipeline, re-running the PUT request is sufficient.

GET _ingest/pipeline

DELETE _ingest/pipeline/sales-timestamp

The next step is to create an index that uses the pipeline. Setting index.default_pipeline on the index applies the pipeline to every indexing request that does not name a pipeline explicitly.

Even though the pipeline populates the date fields automatically, you should still define them in the index's mapping.

# Create "sales" Index
PUT sales
{
  "settings": {
    "index.default_pipeline": "sales-timestamp"
  },
  "mappings": {
     "properties": {
       "timestamp": { "type": "date" },
       "tr_timestamp": { "type": "date" },
       "name": { "type": "text" },
       "authour": { "type": "keyword" }
     }
   }
}

Data input

When you want to add multiple documents to an index at once, you can use the Bulk API. It reads the request body line by line as newline-delimited JSON, so each action and each document must sit on a single line. This means you cannot use pretty-printed JSON in bulk requests.

Because the index actions below do not specify an ID, Elasticsearch assigns one to each document automatically.

POST sales/_bulk
{"index":{}}
{"name":"The Lord of the Rings: The Fellowship of the Ring","authour":"J. R. R. Tolkien"}
{"index":{}}
{"name":"The Lord of the Rings 2: The Two Towers","authour":"J. R. R. Tolkien"}
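
If the documents come from a program rather than from hand-written console input, the single-line requirement is easy to satisfy by serializing each object without indentation. The following is a minimal sketch using only the Python standard library; the helper name build_bulk_body is hypothetical, and sending the body to Elasticsearch is left out.

```python
import json

def build_bulk_body(docs):
    """Return an NDJSON string: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        # An empty index action lets Elasticsearch generate the document ID.
        lines.append(json.dumps({"index": {}}))
        # json.dumps without `indent` keeps each document on a single line.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"

books = [
    {"name": "The Lord of the Rings: The Fellowship of the Ring", "authour": "J. R. R. Tolkien"},
    {"name": "The Lord of the Rings 2: The Two Towers", "authour": "J. R. R. Tolkien"},
]
body = build_bulk_body(books)
print(body, end="")
```

The documents carry no timestamp fields of their own; the default pipeline on the index adds them at ingest time.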

Next, we can list or search the indexed data.

GET sales/_search?filter_path=**.hits
{
  "size": 5, 
  "query": {
    "match_all": {}
  }
}

The result of the above operation is:

{
  "hits": {
    "hits": [
      {
        "_index": "sales",
        "_id": "rVjrTooBxPLM4Lwr4CwQ",
        "_score": 1,
        "_source": {
          "name": "The Lord of the Rings: The Fellowship of the Ring",
          "authour": "J. R. R. Tolkien",
          "tr_timestamp": "2023-09-01T07:06:35.783+03:00",
          "timestamp": "2023-09-01T04:06:35.783682Z"
        }
      },
      {
        "_index": "sales",
        "_id": "rljrTooBxPLM4Lwr4CwQ",
        "_score": 1,
        "_source": {
          "name": "The Lord of the Rings 2: The Two Towers",
          "authour": "J. R. R. Tolkien",
          "tr_timestamp": "2023-09-01T07:06:35.792+03:00",
          "timestamp": "2023-09-01T04:06:35.792130Z"
        }
      }
    ]
  }
}

Pay close attention to the tr_timestamp and timestamp values here. Both describe the same moment, but timestamp is in UTC (it ends with "Z"), while tr_timestamp is shifted by three hours and ends with "+03:00".
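
As a quick sanity check, the two values from the first hit can be parsed and compared outside Elasticsearch. This is a small sketch with the Python standard library; the helper parse_iso is hypothetical, and the comparison assumes, as the sample output suggests, that tr_timestamp keeps millisecond precision while timestamp keeps microseconds.

```python
from datetime import datetime

def parse_iso(value):
    # Python versions before 3.11 reject a trailing "Z", so normalize it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

utc_value = parse_iso("2023-09-01T04:06:35.783682Z")
tr_value = parse_iso("2023-09-01T07:06:35.783+03:00")

# Drop the sub-millisecond digits from the UTC value before comparing.
utc_ms = utc_value.replace(microsecond=utc_value.microsecond // 1000 * 1000)

# Timezone-aware datetimes compare by instant, not by wall-clock text.
same_instant = utc_ms == tr_value
offset_hours = tr_value.utcoffset().total_seconds() / 3600
print(same_instant, offset_hours)
```

If the pipeline worked as intended, the comparison is true and the offset is three hours.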

You can add more data to the index.

POST sales/_bulk
{"index":{}}
{"name":"The Lord of the Rings 3: The Return of the King", "authour":"J. R. R. Tolkien"}

Searching again shows the new document with both timestamp fields populated:

      {
        "_index": "sales",
        "_id": "r1j0TooBxPLM4LwreSyl",
        "_score": 1,
        "_source": {
          "name": "The Lord of the Rings 3: The Return of the King",
          "authour": "J. R. R. Tolkien",
          "tr_timestamp": "2023-09-01T07:15:59.397+03:00",
          "timestamp": "2023-09-01T04:15:59.397509Z"
        }
      }

For more on how to use pipelines, please read the article "Elasticsearch: ingest pipelines - tips and tricks".

Origin blog.csdn.net/UbuntuTouch/article/details/132619539