Some considerations for creating Elasticsearch indexes

When creating an Elasticsearch (ES) index, there are several things to keep in mind. Here are the key considerations, each of which I will illustrate with a specific example.

  1. Clarify indexing requirements: Before creating an index, we need a clear understanding of our data and our query requirements. This includes the type of data (text, number, date, etc.), the size of the data (whether a large amount of data needs to be indexed), and the query requirements (whether full-text search is required, whether aggregations are required, etc.). All of these factors affect how we set up the mappings and settings of the index.

    For example, suppose we have a dataset of user information in which each user has attributes such as name, age, birthday, and address. If we need full-text search on names, range queries on ages, and aggregations on birthdays, then we need to set a suitable type (and, for text fields, an analyzer) for each field when creating the index.

  2. Index mapping: A mapping defines how the fields in an index are stored and searched. For each field we can define a type (such as text, keyword, date, long, etc.), as well as an analyzer, a date format, and so on.

    For example, we can create the following mapping for the above user information dataset:

    PUT /user
    {
      "mappings": {
        "properties": {
          "name": { "type": "text" },
          "age": { "type": "integer" },
          "birthday": { "type": "date", "format": "yyyy-MM-dd" },
          "address": { "type": "keyword" }
        }
      }
    }
    

    In this mapping, the name field is set to type text so that it can be searched in full text. The age field is set to type integer to enable range queries. The birthday field is set to type date, with the date format defined explicitly. The address field is set to type keyword to enable exact-match searches.
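    To make the link between field types and queries concrete, here is a minimal sketch in Python that only builds and prints the search request bodies; it does not contact a cluster. The sample values ("john smith", the age bounds, the address string) and the aggregation name users_per_birth_year are made up for illustration, and calendar_interval assumes Elasticsearch 7.2 or later.

```python
import json

# Full-text search on the analyzed "name" field (type: text).
name_search = {"query": {"match": {"name": "john smith"}}}

# Range query on the numeric "age" field (type: integer).
age_range = {"query": {"range": {"age": {"gte": 18, "lte": 30}}}}

# Exact-value lookup on "address" (type: keyword): the whole stored
# string must match, because keyword fields are not analyzed.
address_exact = {"query": {"term": {"address": "221B Baker Street"}}}

# Aggregation on "birthday" (type: date): count users per birth year.
# "size": 0 suppresses the hits so only the buckets are returned.
birthday_agg = {
    "size": 0,
    "aggs": {
        "users_per_birth_year": {
            "date_histogram": {"field": "birthday", "calendar_interval": "year"}
        }
    },
}

examples = [
    ("full-text search on name", name_search),
    ("range query on age", age_range),
    ("exact match on address", address_exact),
    ("aggregation on birthday", birthday_agg),
]

for label, body in examples:
    # Each body would be sent as the JSON payload of POST /user/_search.
    print(f"POST /user/_search  # {label}")
    print(json.dumps(body, indent=2))
```

    Each printed body can be pasted into a console session against the user index defined above.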

  3. Index settings: When creating an index, we can also define settings such as the number of shards, the number of replicas, and the refresh interval.

    For example, assuming our user information dataset is very large, we can set the number of shards to 5 and the number of replicas to 1 to improve search performance and data availability:

    PUT /user
    {
      "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 1
      },
      ...
    }
    

    It should be noted that the number of shards must be defined when the index is created and cannot be changed in place afterwards; changing it requires reindexing into a new index or using the split or shrink APIs. The number of replicas can be modified at any time.
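    The difference between settings that are fixed at creation and settings that can be changed later can be sketched as follows. The helper build_settings_update and the two sets are hypothetical names introduced for this example, and they cover only the settings mentioned in this article. The code builds the body for the update index settings request (PUT /user/_settings) and prints it without contacting a cluster.

```python
import json

# Settings discussed in this article, split by when they may be changed.
# This is an illustrative subset, not a complete catalogue of ES settings.
STATIC_SETTINGS = {"number_of_shards"}                      # fixed at creation
DYNAMIC_SETTINGS = {"number_of_replicas", "refresh_interval"}  # changeable later


def build_settings_update(**changes):
    """Return the JSON body for PUT /<index>/_settings.

    Raises ValueError for settings that cannot be updated on an
    existing index, or that this sketch does not know about.
    """
    for name in changes:
        if name in STATIC_SETTINGS:
            raise ValueError(f"{name} is fixed at index creation and cannot be updated")
        if name not in DYNAMIC_SETTINGS:
            raise ValueError(f"setting not covered by this sketch: {name}")
    return {"index": dict(changes)}


# Raise the replica count and slow down refreshes on the existing index.
body = build_settings_update(number_of_replicas=2, refresh_interval="30s")
print("PUT /user/_settings")
print(json.dumps(body, indent=2))

# Attempting to change the shard count is rejected by the helper.
try:
    build_settings_update(number_of_shards=10)
except ValueError as err:
    print("rejected:", err)
```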

  4. Dynamic mapping: ES enables dynamic mapping by default, which means that if a newly indexed document contains fields that are not yet in the mapping, ES automatically creates mappings for them. This is convenient in some situations, but it can also cause problems, for example when ES infers a type for a new field that differs from the one we intended.
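    One possible way to keep this behaviour under control is to set the dynamic parameter of the mapping explicitly. The sketch below is an assumption about how one might do this for the user index from the earlier example, not something the original setup prescribes. The helper build_user_index_body is a hypothetical name, and it only builds and prints the request body for PUT /user. With "strict", ES rejects documents that contain unmapped fields; with false, unmapped fields are kept in the stored document but are not indexed or added to the mapping; true is the default behaviour described above.

```python
import json

# Values of the "dynamic" mapping parameter covered by this sketch.
ALLOWED_DYNAMIC = (True, False, "strict")


def build_user_index_body(dynamic="strict"):
    """Return the JSON body for PUT /user with an explicit dynamic policy."""
    if dynamic not in ALLOWED_DYNAMIC:
        raise ValueError(f"unsupported dynamic value in this sketch: {dynamic!r}")
    return {
        "mappings": {
            "dynamic": dynamic,
            "properties": {
                "name": {"type": "text"},
                "age": {"type": "integer"},
                "birthday": {"type": "date", "format": "yyyy-MM-dd"},
                "address": {"type": "keyword"},
            },
        }
    }


strict_body = build_user_index_body()
print("PUT /user")
print(json.dumps(strict_body, indent=2))
```

    Whether strict or false is the better choice depends on whether unexpected fields should cause indexing to fail loudly or be silently left unindexed.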

Origin blog.csdn.net/i042416/article/details/132409203