A brief description of the core concepts of Elasticsearch

ES Core Concepts

ES is document-oriented, and everything is JSON. The table below compares it with a relational database.

Relational database (MySQL)    ES
database                       index
table                          type (deprecated in 7.0, removed in 8.0)
row                            document
column                         field

An ES cluster can contain multiple indexes (databases). Each index can contain multiple types (tables), each type contains multiple documents (rows), and each document contains multiple fields (columns).

Physical Design:

In the background, ES divides each index into multiple shards, and these shards can be migrated between different servers (nodes) in the cluster.
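
To make the idea of shards more concrete, here is a minimal sketch of how a document ID could be mapped to one of an index's shards. In its simplest documented form, Elasticsearch routes a document with hash(routing) % number_of_primary_shards. The real hash is Murmur3; this sketch uses zlib.crc32 purely as a stand-in, and the function name pick_shard is hypothetical.

```python
import zlib

def pick_shard(doc_id: str, num_primary_shards: int = 5) -> int:
    """Map a document ID to a shard number (simplified routing rule)."""
    # Assumption: CRC32 stands in for the Murmur3 hash that Elasticsearch uses.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

# The same ID always lands on the same shard, whichever server hosts that shard.
shard_of = {doc_id: pick_shard(doc_id) for doc_id in ["1", "2", "3", "4"]}
print(shard_of)
```

Because the shard number depends only on the ID and the number of primary shards, a shard can move to another server without any document changing shard.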

A single ES node on its own is already a cluster, and the default cluster name is elasticsearch.

Logic Design:

An index contains types, and a type contains multiple documents.

Document (row)

ES is document-oriented, which means that the smallest unit of indexing and searching data is a document. In ES, a document has several important attributes:

1 Self-contained: a document contains both the fields and their values, that is, key:value pairs.
2 Hierarchical: a document can contain sub-documents, which is how complex logical entities are built (in fact, a document is a JSON object, which in Java can be converted automatically by fastjson or jackson).
3 Flexible structure: a document does not depend on a pre-defined schema. In a relational database, fields must be defined before they can be used; in ES, fields are flexible, so a field can be omitted or a new one added dynamically.
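
The three attributes above can be sketched with an ordinary JSON object. The field names and values below are hypothetical and only illustrate the shape of a document; nothing here talks to an ES server.

```python
import json

# Hypothetical document: field names and values are illustrative only.
doc = {
    "name": "Alice",
    "age": 30,
    "address": {"city": "Beijing", "street": "Main St"},  # nested sub-document
}

# Self-contained: keys and values travel together in one JSON body.
body = json.dumps(doc)

# Flexible: a new field can be added later without declaring a schema first.
doc["hobbies"] = ["reading", "hiking"]

# Hierarchical: nested values survive the round trip through JSON.
restored = json.loads(body)
print(restored["address"]["city"])
```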

Type (compare column type definitions: name varchar, name int, etc.)

A type is the logical container of documents, just as a table is the container of rows in a relational database. The definition of the fields in a type is called the mapping. In ES, fields can be used without defining their types first: ES will guess the data type, but it may guess wrong. Explicitly defining the data types in advance is the safest approach.
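
As a rough illustration of why guessing can go wrong, the sketch below imitates a few of the default dynamic mapping rules (JSON boolean to boolean, integer to long, floating-point number to float, string to text, object to object). It is a simplification: date detection, arrays and the keyword sub-field are left out, and guess_field_type is a hypothetical helper, not part of any ES client.

```python
def guess_field_type(value) -> str:
    """Rough imitation of dynamic mapping defaults (simplified on purpose)."""
    if isinstance(value, bool):   # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    return "unknown"              # arrays, null, etc. are not handled here

# Hypothetical document: "age" was sent as a string by mistake.
doc = {"name": "Alice", "age": "30", "active": True, "score": 9.5}
guessed = {field: guess_field_type(value) for field, value in doc.items()}
print(guessed)
```

Here age ends up as text instead of a number because the value arrived as a string, which is exactly the kind of mistake an explicit mapping avoids.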

index (database)

An index corresponds to a database. Before version 7.0, an index was divided into 5 primary shards by default (since 7.0 the default is 1). Each shard is a Lucene index with its own inverted index, so one ES index is made up of multiple Lucene indexes.
The index is the container of mapping types and is a very large collection of documents. It stores the fields and other settings of its mapping types, which are then stored on each shard. Let's look at how sharding works.

Inverted index

ES uses a structure called an inverted index, with Lucene's inverted index as the underlying layer. This structure is well suited to fast full-text search. An inverted index consists of a list of all the unique words that appear in the documents and, for each word, a list of the documents that contain it.
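
A minimal sketch of that structure, assuming a naive whitespace tokenizer in place of a real Lucene analyzer. The two documents are hypothetical and the helper build_inverted_index is an illustrative name only.

```python
from collections import defaultdict

# Hypothetical documents, used only to show the shape of the structure.
docs = {
    1: "study every day",
    2: "study hard every week",
}

def build_inverted_index(documents):
    """Return a dict mapping each unique word to the sorted IDs of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():  # naive tokenizer, not a Lucene analyzer
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

inverted = build_inverted_index(docs)
print(inverted["study"])  # [1, 2]
print(inverted["day"])    # [1]
```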

For example, there are two documents

If you search for "to forever", document 1 has the higher weight (score), so it is ranked first. Baidu uses a similar mechanism.

With an inverted index, a query only touches the documents listed under its terms, so all data irrelevant to the query is filtered out, which is more efficient.
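
The filtering and ranking described here can be sketched as follows. The index contents are hypothetical, and the score is simply the number of query terms a document matches, a deliberate simplification that is not the relevance formula ES actually uses.

```python
# Hypothetical inverted index: term -> IDs of the documents that contain it.
inverted = {
    "python": [1, 2, 3],
    "java": [3, 5],
    "search": [1, 4],
}

def search(index, query):
    """Return (doc_id, score) pairs, best match first."""
    scores = {}
    for term in query.lower().split():
        # Only documents listed under a query term are ever visited.
        for doc_id in index.get(term, []):
            scores[doc_id] = scores.get(doc_id, 0) + 1  # score = matched terms
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

results = search(inverted, "python search")
print(results)  # [(1, 2), (2, 1), (3, 1), (4, 1)]
```

Document 5 never appears in the result because it is not listed under either query term, and document 1 comes first because it matches both.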

In summary, the core concepts are the following:

1 index
2 field type (mapping)
3 documents
4 shards (inverted index)

Origin blog.csdn.net/weixin_46713508/article/details/131366899