MySQL file table structure and ES index structure

I. The file table in MySQL

The business fields are listed below. Add common fields such as creator and modification time according to your own needs.

file_id_: file table primary key.

file_name_: The original file name as uploaded (a Chinese name in this project).

new_name_: The name the file is stored under on the server (a numeric timestamp). Original file names may collide once they are saved on the server, so a timestamp is used instead.

path_: The address of the file on the server.

ext_: File extension, which identifies the file format, such as docx, pdf, img, etc.

desc_: file description.

total_bytes_: The total number of bytes.

record_status_: Archive status.
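
As a minimal sketch of how new_name_ and ext_ could be derived from file_name_ at upload time: the class and method names below are hypothetical, and appending the extension to the timestamp is an assumption (the table description only says that new_name_ is a timestamp).

```java
import java.util.Locale;

// Hypothetical helper, not part of the original project.
final class StoredFileNames {

    /** Returns the lowercase extension without the dot (ext_), or "" if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Builds the stored name (new_name_) from a millisecond timestamp. */
    static String newName(String fileName, long epochMillis) {
        String ext = extensionOf(fileName);
        // Assumption: keep the extension so the stored file stays recognizable.
        return ext.isEmpty() ? Long.toString(epochMillis) : epochMillis + "." + ext;
    }

    public static void main(String[] args) {
        System.out.println(newName("grammar-notes.DOCX", System.currentTimeMillis()));
    }
}
```

Two uploads within the same millisecond would still receive the same name, so a real project may want to add a random suffix or a sequence number.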

II. The index in ES

The index I created is called grammar. Design the fields of the index according to which fields you want to search on.

There are two ways to create an index.

1. Create it with requests in the Kibana console.

2. Create it with annotations. After the index class is defined in the project, add annotations to the class, and the project will create the index automatically when it starts.

The first method needs little explanation: look up the create index API in the official documentation and run the request from Kibana.

The second method uses the @Document annotation. Here is a brief introduction to its attributes:

@Document(indexName = "grammar", type = "grammar", shards = 3, replicas = 1)

This annotation means that the index is named grammar, the type is grammar, the number of shards is 3, and the number of replicas is 1.
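
For comparison, the same shard and replica settings can be written as a create index request, which is what the first method sends from Kibana. The sketch below only builds the request with the JDK's HTTP client classes (Java 11+). The class name, the method names, and the http://localhost:9200 address are assumptions, and the type attribute has no counterpart here because it is not an index setting.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the original project.
final class GrammarIndexRequest {

    /** JSON body for the create index API with the given shard and replica counts. */
    static String settingsBody(int shards, int replicas) {
        return "{\"settings\":{\"number_of_shards\":" + shards
                + ",\"number_of_replicas\":" + replicas + "}}";
    }

    /** Builds, but does not send, PUT {baseUrl}/{indexName}. */
    static HttpRequest createIndex(String baseUrl, String indexName, int shards, int replicas) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + indexName))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(shards, replicas)))
                .build();
    }
}
```

In the Kibana console the equivalent is PUT /grammar followed by the JSON body. From Java, the request returned by createIndex("http://localhost:9200", "grammar", 3, 1) could be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), assuming an unsecured local node.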

ES terms such as shards and replicas will be covered later; for now the focus is on the business implementation.

The index class carries the @Document annotation, and each field in the index class should carry an @Field annotation.

The @Field annotation has many attributes, but only three are commonly used.

type: The field's data type, an enum value such as FieldType.Keyword or FieldType.Text. The default is FieldType.Auto, which detects the type automatically, but you will usually set it yourself.

index: Whether the field is indexed. The default is true; set it to false for a field that should not be indexed.

analyzer: The analyzer to use, such as the commonly used IK analyzer. If no analyzer is specified, the built-in ES analyzer is used by default, which does not work well for Chinese text.
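
Putting the two annotations together, an index class might look like the sketch below. It assumes Spring Data Elasticsearch 3.x (where @Document accepts type, shards, and replicas) and an ES node with the IK plugin installed, which provides the ik_max_word analyzer. The class name GrammarDoc and its fields are hypothetical and are not taken from the original project.

```java
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

// Hypothetical index class; the field choice is an assumption for illustration.
@Document(indexName = "grammar", type = "grammar", shards = 3, replicas = 1)
public class GrammarDoc {

    // Document id, for example the file_id_ value from MySQL as a string.
    @Id
    private String id;

    // Full-text searchable; tokenized by the IK plugin's ik_max_word analyzer.
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String fileName;

    // Exact-match only; Keyword fields are not tokenized.
    @Field(type = FieldType.Keyword)
    private String ext;

    // Kept in the document but not searchable.
    @Field(type = FieldType.Keyword, index = false)
    private String path;

    // Getters and setters omitted.
}
```

Text suits fields that are searched by content, Keyword suits fields that are filtered by exact value, and index = false suits fields that are only returned with the result.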

Word segmentation and analyzers are a big topic and will be covered in a separate blog post later.

Origin blog.csdn.net/numbbe/article/details/108485785