"ElasticSearch 6.x in Practice": Preparations and Basic Terminology

Chapter 1 - Preparations

To do a good job, one must first sharpen one's tools.

ElasticSearch installation

Download ElasticSearch 6.3.2 (the same package works on Linux, macOS, and Windows; downloading the zip package is enough): https://www.elastic.co/cn/downloads/past-releases/elasticsearch-6-3-2 . Earlier ES versions are listed on the past releases page: https://www.elastic.co/cn/downloads/past-releases#elasticsearch .

Before installing, make sure a JDK 8 environment is configured on your system.

macOS

After downloading elasticsearch-6.3.2.tar.gz, first create a Settings directory under the home directory of the currently logged-in user, then extract the archive into the current directory with tar -zxvf elasticsearch-6.3.2.tar.gz.

Enter the elasticsearch-6.3.2 directory and run ./bin/elasticsearch. After a short wait, open http://localhost:9200/?pretty in a browser; you should see the following response:

{
    "name": "x4x7wWJ",
    "cluster_name": "elasticsearch",
    "cluster_uuid": "sJ6LTYJ1TDmtR1kzl0M2Ig",
    "version": {
        "number": "6.3.2",
        "build_hash": "8bbedf5",
        "build_date": "2017-10-31T18:55:38.105Z",
        "build_snapshot": false,
        "lucene_version": "6.6.1"
    },
    "tagline": "You Know, for Search"
}
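
The same check can be scripted instead of done in a browser. The following is a minimal Python sketch, assuming the node listens on http://localhost:9200 as above; the helper names fetch_node_info and describe_node are hypothetical and not part of any ES client.

```python
import json
from urllib.request import urlopen


def fetch_node_info(url="http://localhost:9200/"):
    """Fetch the root endpoint of a running node and decode its JSON body."""
    with urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def describe_node(info):
    """Summarize the fields that matter when verifying an installation."""
    version = info["version"]
    return "{} (cluster {}) runs ES {} on Lucene {}".format(
        info["name"],
        info["cluster_name"],
        version["number"],
        version["lucene_version"],
    )
```

With a node running, describe_node(fetch_node_info()) returns a one-line summary built from the same fields as the response shown above.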

Linux

The installation process on Linux is the same as on macOS.

ES must be installed and started by an ordinary user. If you are logged in as root, create an ordinary user and start ES as that user rather than as root.

Chapter 2 - Basic Terms

A white horse is not a horse (白马非马)

ES is a search engine, but it is also a distributed document store (non-relational, of course). To keep the later hands-on chapters smooth, this chapter introduces some ES terminology by comparing it with the traditional relational database MySQL.

MySQL has the concepts of database (Database), table (Table), record (Row), and column (Column). ES has similar concepts: index (Index), type (Type), document (Document), and field (Field).

They can be understood as follows:

         Database    Table    Record      Column
MySQL    Database    Table    Row         Column
ES       Index       Type     Document    Field

Index

An ES index is not the "index" of a relational database. In ES, an index is the place where data is stored, similar to a database in a relational database.

Type

Some articles say that an ES type corresponds to a table in a relational database. When using ES we also meet another concept, the mapping (Mapping), and many articles say that it is the mapping that corresponds to a relational table. In a relational database, tables are physically separate: a column with the same name may exist in two tables with a different data type in each, which is perfectly reasonable. In ES it is not. Within the same index, if a field exists in several types, then even though the fields mean different things, as long as they share a name they must also share the same data type. The idea that an ES type corresponds to a relational table therefore holds in name only. In later versions of ES the type is being steadily weakened as a "table" concept. Ahead of its formal removal, later versions no longer allow an index to contain more than one type, and a future version will presumably remove types altogether.

(Note: ES 6 no longer allows an index to contain multiple types, see https://github.com/elastic/elasticsearch/pull/24317 )

If at this stage we must make sense of the ES type, it has to be understood together with the mapping. A type can be seen as the declaration of a table, and only the declaration, while the mapping defines the table structure (which columns exist and what data type each column has).
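
To make the split between type and mapping concrete, here is a sketch of the request body used to create an index in ES 6.x, built as a Python dict. The type name _doc and the field names are assumptions chosen for illustration, and build_index_body is a hypothetical helper.

```python
import json


def build_index_body(type_name, fields):
    """Build a create-index body: the type is only a name, the mapping holds the structure."""
    properties = {name: {"type": es_type} for name, es_type in fields.items()}
    return {"mappings": {type_name: {"properties": properties}}}


# Hypothetical fields: field name -> ES field data type.
body = build_index_body(
    "_doc",
    {"title": "text", "author": "keyword", "views": "integer"},
)
print(json.dumps(body, indent=2))
```

Sent as the body of a PUT request to an index name of your choice (for example /blog, also an assumption), this declares a single type and, under it, the fields and their data types.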

Document

Some non-relational databases are called "document databases". A document corresponds to a row in a relational database.

Field

A field corresponds to a column in a relational database.
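
The row-to-document and column-to-field correspondence can be sketched in a few lines of Python. The column names, the sample row, and the index name blog are made up for illustration; the path layout /{index}/{type}/{id} is how ES 6.x addresses a single document.

```python
def row_to_document(columns, row):
    """Pair column names with row values: each column becomes a field."""
    return dict(zip(columns, row))


def document_path(index, doc_type, doc_id):
    """Build the ES 6.x path of one document: /{index}/{type}/{id}."""
    return "/{}/{}/{}".format(index, doc_type, doc_id)


# Hypothetical MySQL row and its column names.
columns = ("id", "title", "author")
row = (1, "Hello ES", "alice")

document = row_to_document(columns, row)
path = document_path("blog", "_doc", document["id"])
```

Here document is the JSON-like object that would be stored, and path is where it would live inside the index.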

Node

One ES instance is called a node. A standalone ES deployment has one and only one node; a cluster deployment has multiple nodes, one of which is the master node.

Shard

ES can be deployed as a distributed cluster or as a standalone single node. Data in ES is stored spread across shards. ES hides the underlying shard implementation: we interact with the index, not directly with its shards. The number of shards has nothing to do with whether the deployment is a cluster or standalone. Even on a standalone node you can split an index into several shards when creating it (the default is 5 primary shards and 1 replica, and that one replica consists of 5 replica shards). Shards are either primary shards or replica shards. As the name implies, a replica shard is a backup of a primary shard: when the primary shard fails, the replica shard takes over as primary.
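
The arithmetic behind "1 replica means 5 replica shards" can be sketched as follows. The settings keys number_of_shards and number_of_replicas are the index settings ES uses; the helper names are hypothetical.

```python
def build_settings(primary_shards=5, replicas=1):
    """Index settings body; the defaults mirror the ES 6.x defaults mentioned above."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def shard_counts(primary_shards, replicas):
    """Return (replica shards, total shards) for one index."""
    replica_shards = primary_shards * replicas  # every replica copies every primary
    return replica_shards, primary_shards + replica_shards
```

For the defaults, shard_counts(5, 1) gives 5 replica shards and 10 shards in total.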

For standalone deployments

In a standalone deployment ES has one and only one node. If the number of primary shards and replicas is not specified when an index is created, ES creates 5 primary shards and 1 replica (5 replica shards) by default. For a standalone ES service, however, multiple primary shards bring no benefit: the point of sharding is to store data on several nodes so that ES can query them at the same time, and several shards on a single node do not achieve that. Replica shards make no sense on a standalone node either: the point of a replica is to keep serving when the primary shard fails, but when primary and replica live on the same node, whatever takes down the primary takes down the replica as well.

For cluster deployments

In a cluster deployment there are multiple nodes, so how primary and replica shards are allocated becomes especially important (it affects both query performance and service availability). For example, with three nodes, allocating only one primary shard when creating an index is somewhat wasteful (note: the number of primary shards cannot be changed once the index has been created).

There is no universal rule for choosing the number of primary shards; it depends on the amount of data, the number of ES nodes, and so on. A common view is that more shards are better, because data spread over many shards is easier to scale: when new nodes are added later, ES automatically redistributes the shards evenly. But this is not an absolute rule. If you have only three nodes and configure 100 shards, each node holds about 33 shards, and when one search request is dispatched to several shards on the same node, they compete for hardware resources (CPU) and cause performance problems. Conversely, if three nodes are given only three shards, then as the business grows and the data volume increases, a single shard may no longer be able to carry its share of the data. Adding nodes does not help at that point, because there are still only three shards: the shard count was fixed when the index was created and cannot be modified, so the only option is to recreate the index.

You therefore need both a judgment about reasonable data growth (plan for somewhat more shards) and a sense of proportion (keep the number of shards reasonable). The official advice is that each shard should hold roughly 20 GB to 40 GB of data. This means that with four nodes and an estimated maximum of about 200 GB of data, 5 to 10 shards is appropriate; 7 or 8 is a good middle value, giving each node about two shards. (Recommendation from the official blog: https://www.elastic.co/cn/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster )
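
The estimate above is simple division, sketched here in Python. The 20 GB and 40 GB bounds come from the advice quoted above; the function names are hypothetical, and the result is a rough planning aid under those assumptions, not an official formula.

```python
import math


def primary_shard_range(total_gb, min_shard_gb=20, max_shard_gb=40):
    """Return (fewest, most) primary shards that keep each shard inside the size band."""
    fewest = math.ceil(total_gb / max_shard_gb)  # largest allowed shards
    most = math.ceil(total_gb / min_shard_gb)    # smallest allowed shards
    return fewest, most


def shards_per_node(shards, nodes):
    """Average number of primary shards each node would hold."""
    return shards / nodes
```

For 200 GB, primary_shard_range(200) yields the 5 to 10 range from the text, and shards_per_node(8, 4) gives two shards per node.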

Everything above concerns primary shards; planning replica shards is just as important. Without replica shards, the failure of a primary shard means data loss: part of the data can no longer be queried. Too many replicas, on the other hand, consume extra storage. By default ES creates one replica when an index is created. One replica does not mean one replica shard: with five primary shards, one replica consists of five replica shards, and in the same way two replicas consist of ten replica shards in total. Note that the number of replicas can be modified later, so it is fine to start with the default of one replica.
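
Because the replica count can be changed on an existing index, the change amounts to one settings update. Below is a sketch of that request described as plain Python data; the index name blog is an assumption and replica_update_request is a hypothetical helper, while the _settings endpoint and the number_of_replicas key belong to the ES index settings API.

```python
def replica_update_request(index, replicas):
    """Describe the update-settings call that changes the replica count of an existing index."""
    return {
        "method": "PUT",
        "path": "/{}/_settings".format(index),
        "body": {"index": {"number_of_replicas": replicas}},
    }


# Hypothetical example: raise the replica count of index "blog" to 2.
request = replica_update_request("blog", 2)
```

The number of primary shards has no such request, which is why it must be planned before the index is created.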

Origin www.cnblogs.com/yulinfeng/p/11204169.html