[Elasticsearch Beginners - notes] Elasticsearch core concepts: NRT, index, shard, replica, etc.

Disclaimer: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/mr_zhuqiang/article/details/88642377

Elasticsearch core concepts


  1. Lucene and Elasticsearch: past and present
  2. Elasticsearch core concepts
  3. Elasticsearch core concepts vs. database core concepts

Lucene and Elasticsearch: past and present

Lucene is among the most advanced and powerful search libraries. Developing directly against Lucene is very complex, however: its API is complicated (even simple features take a lot of Java code), and using it well requires an in-depth understanding of its principles (the various index structures).

Elasticsearch is built on Lucene and hides that complexity behind simple, easy-to-use interfaces: a RESTful API, a Java API, and API clients for other languages. It is:

  1. A distributed document storage engine
  2. A distributed search engine and analytics engine
  3. A distributed system that supports PB-scale data

It works out of the box: the default parameters are sensible, no additional settings are required to get started, and it is completely open source.

There is a legend about Elasticsearch. An unemployed programmer accompanied his wife to London, England, where she was taking cookery courses. While out of work, he wanted to write a recipe search engine for her. He found Lucene too complicated, so he developed Compass, an open-source project that wraps Lucene. Later the programmer found a job building distributed, high-performance systems, found that Compass was not enough, and wrote Elasticsearch, which turned Lucene into a distributed system.

Elasticsearch core concepts

  1. Near Realtime (NRT)

    Two meanings:

    There is a small delay (about 1 second) between writing data and that data becoming searchable;

    Search and analysis based on ES can return results within seconds

  2. Cluster

    A cluster contains multiple nodes. Which cluster a node belongs to is determined by configuration (the cluster name, which defaults to elasticsearch). For small and medium-sized applications, starting with a single-node cluster is perfectly normal.

  3. Node

    A node is a member of the cluster. Each node also has a name (randomly assigned by default), and the node name is very important when carrying out operations and maintenance tasks. By default a node joins the cluster named "elasticsearch", so if you start a bunch of nodes directly, they automatically form a cluster named elasticsearch. A single node can, of course, also form an elasticsearch cluster on its own.

  4. Document & field

    A document is the smallest unit of data in ES. One document can hold one customer record, one product category record, or one order record, and it is usually represented as a JSON data structure.

    Each type under an index can store multiple documents.

    A document contains multiple fields; each field is one data field.

    Product document:
    
    {
      "product_id": "1",
      "product_name": "Colgate toothpaste",
      "product_desc": "highly effective whitening",
      "category_id": "2",
      "category_name": "daily chemical products"
    }
    
  5. Index

    An index holds a collection of documents with similar data structures. For example, there could be a customer index, a product category index, and an order index. Every index has a name.

    An index contains many documents and represents one class of similar or identical documents. For example, a product index would store all product data, that is, all product documents.

  6. Type

    Each index can have one or more types. A type is a logical classification of the data within an index, and the documents under one type share the same fields. For a blog system, for example, a single index could define a user data type, a blog data type, and a comment data type. (This applies to Elasticsearch 5.x and earlier; 6.x allows only one type per index, and types were removed in later versions.)

    Take a product index that stores all product data, that is, product documents.

    Products are divided into several categories, and the fields of the documents in each category may differ. Electrical goods may carry special fields such as an after-sales service period; fresh goods may carry special fields such as a shelf life.

    So the index could have a daily chemical goods type, an electrical goods type, and a fresh goods type:

    Daily chemical goods type: product_id, product_name, product_desc, category_id, category_name

    Electrical goods type: product_id, product_name, product_desc, category_id, category_name, service_period

    Fresh goods type: product_id, product_name, product_desc, category_id, category_name, eat_period

    Each type contains a collection of documents:

    {
      "product_id": "2",
      "product_name": "Changhong TV",
      "product_desc": "4K HD",
      "category_id": "3",
      "category_name": "electrical appliances",
      "service_period": "1 year"
    }
    
    {
      "product_id": "3",
      "product_name": "greasyback shrimp",
      "product_desc": "all natural, from Iceland",
      "category_id": "4",
      "category_name": "fresh food",
      "eat_period": "7 days"
    }
    
    • index: can be seen as a database
    • type: can be seen as a table in the database
    • document: can be seen as a record (row) in the table

  7. Shard

    A single machine cannot store a large amount of data, so ES can split the data in an index into multiple shards and distribute them across multiple servers. With shards you can scale out horizontally to store more data, and operations such as search and analysis are distributed across multiple servers, which improves throughput and performance. Each shard is a Lucene index.

  8. Replica

    Any server can fail or go down at any time, and a shard could then be lost, so you can create multiple replica copies of each shard. A replica provides a backup when a shard fails, ensuring data is not lost, and multiple replicas also improve the throughput and performance of search operations.

    • primary shard (the number is set when the index is created and cannot be modified afterwards; default 5 in Elasticsearch versions before 7.0, 1 since 7.0)
    • replica shard (the number can be modified at any time; default 1, meaning one replica per primary shard)

    With the defaults of 5 and 1, each index has 10 shards: 5 primary shards and 5 replica shards. The minimum high-availability configuration is two servers, because a replica is never allocated to the same node as its primary shard.
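
The shard and replica numbers above can be made concrete with a small sketch. It assumes a simplified routing rule, shard = hash(routing key) % number of primary shards, which is the form the Elasticsearch documentation has described for versions of this era. `zlib.crc32` is only a stand-in for the hash Elasticsearch actually uses, and the function names `route_to_shard` and `total_shards` are hypothetical.

```python
import zlib


def route_to_shard(routing_key: str, num_primary_shards: int) -> int:
    """Pick the primary shard for a document from its routing key.

    Assumption: zlib.crc32 stands in for the real hash function, so the
    shard numbers here will not match a real cluster.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def total_shards(num_primary_shards: int, replicas_per_primary: int) -> int:
    """Count every shard copy of an index: primaries plus their replicas."""
    return num_primary_shards * (1 + replicas_per_primary)


doc_ids = ["1", "2", "3"]

# Placement with 5 primary shards versus 6 primary shards.
placement_5 = {doc_id: route_to_shard(doc_id, 5) for doc_id in doc_ids}
placement_6 = {doc_id: route_to_shard(doc_id, 6) for doc_id in doc_ids}
print(placement_5)
print(placement_6)

# 5 primary shards with 1 replica each gives 10 shards in total.
print(total_shards(5, 1))
```

Because the shard number depends on the primary shard count, changing that count would send existing document IDs to different shards. That is one way to see why the number of primary shards is fixed once the index is created, while the number of replicas can change freely.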

Elasticsearch core concepts vs. database core concepts

Elasticsearch    Database
Document         Row
Type             Table
Index            Database
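
To make the mapping concrete, here is a minimal sketch of how one document is addressed by index, type, and ID, much as a row is addressed by database, table, and primary key. The class `DocumentAddress` and the type name `electronics` are hypothetical; the `/{index}/{type}/{id}` path shape applies only to Elasticsearch versions that still support types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentAddress:
    index: str     # plays the role of a database
    doc_type: str  # plays the role of a table
    doc_id: str    # identifies one document, like a row's primary key

    def rest_path(self) -> str:
        """Build the REST-style path for this document."""
        return f"/{self.index}/{self.doc_type}/{self.doc_id}"


# The TV document from the product index, under a hypothetical type name.
address = DocumentAddress(index="product", doc_type="electronics", doc_id="2")
print(address.rest_path())  # /product/electronics/2
```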

GitHub: https://github.com/zq99299/note-book/blob/master/docs/elasticsearch-core/index.md
