Essential for beginners! Elasticsearch 7.x glossary of terms explained

Glossary of terms

For more information, refer to the official documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/7.5/glossary.html

analysis

Analysis is the process of converting full text to terms. Depending on which analyzer is used, these phrases: FOO BAR, Foo-Bar, foo,bar will probably all result in the terms foo and bar. These terms are what is actually stored in the index.

A full text query (not a term query) for FoO:bAR will also be analyzed to the terms foo,bar and will thus match the terms stored in the index.

It is this process of analysis (both at index time and at search time) that allows Elasticsearch to perform full text queries.

Also see text and term.
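
The effect described above can be sketched with a toy analyzer. This is only an illustration, not Elasticsearch's actual analysis chain: the `analyze` function is hypothetical and simply splits on non-alphanumeric characters and lowercases each token.

```python
import re


def analyze(text):
    """Toy analyzer (assumption): split on non-alphanumerics, then lowercase."""
    tokens = re.split(r"[^0-9A-Za-z]+", text)
    return [token.lower() for token in tokens if token]


# Index time: different spellings of the phrase produce the same terms.
for phrase in ["FOO BAR", "Foo-Bar", "foo,bar"]:
    print(phrase, "->", analyze(phrase))

# Search time: the full text query string goes through the same analysis,
# so it yields the same terms that were stored in the index.
query_terms = analyze("FoO:bAR")
print("query ->", query_terms)
```

Because both sides pass through the same function, the query terms line up with the indexed terms, which is the point the entry makes about index-time and search-time analysis.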

cluster

A cluster consists of one or more nodes which share the same cluster name. Each cluster has a single master node which is chosen automatically by the cluster and which can be replaced if the current master node fails.

cross-cluster replication (CCR)

The cross-cluster replication feature enables you to replicate indices in remote clusters to your local cluster. For more information, see Cross-cluster replication.

cross-cluster search (CCS)

The cross-cluster search feature enables any node to act as a federated client across multiple clusters. See Search across clusters.

document

A document is a JSON document which is stored in Elasticsearch. It is like a row in a table in a relational database. Each document is stored in an index and has a type and an id.

A document is a JSON object (also known in other languages as a hash / hashmap / associative array) which contains zero or more fields, or key-value pairs.

The original JSON document that is indexed will be stored in the _source field, which is returned by default when getting or searching for a document.

field

A document contains a list of fields, or key-value pairs. The value can be a simple (scalar) value (e.g. a string, integer, or date), or a nested structure like an array or an object. A field is similar to a column in a table in a relational database.

The mapping for each field has a field type (not to be confused with document type) which indicates the type of data that can be stored in that field, eg integer, string, object. The mapping also allows you to define (amongst other things) how the value for a field should be analyzed.

filter

A filter is a non-scoring query, meaning that it does not score documents. It is only concerned about answering the question - “Does this document match?”. The answer is always a simple, binary yes or no. This kind of query is said to be made in a filter context, hence it is called a filter. Filters are simple checks for set inclusion or exclusion. In most cases, the goal of filtering is to reduce the number of documents that have to be examined.
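
As a rough illustration of filter context, here is a search body written as a Python dict. The `bool` query with `must` and `filter` clauses is part of the Elasticsearch query DSL; the field names (`title`, `status`, `publish_date`) and values are made up for this sketch.

```python
import json

# Hypothetical field names and values, for illustration only.
search_body = {
    "query": {
        "bool": {
            # Query context: these clauses contribute to the relevance score.
            "must": [{"match": {"title": "elasticsearch glossary"}}],
            # Filter context: these clauses only answer "does it match?".
            "filter": [
                {"term": {"status": "published"}},
                {"range": {"publish_date": {"gte": "2020-01-01"}}},
            ],
        }
    }
}

# The dict serializes to the JSON body that a search request would carry.
print(json.dumps(search_body, indent=2))
```

Documents that fail either filter clause are excluded before scoring matters, which reflects the goal of reducing the number of documents that have to be examined.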

follower index

Follower indices are the target indices for cross-cluster replication. They exist in your local cluster and replicate leader indices.

id

The ID of a document identifies a document. The index/id of a document must be unique. If no ID is provided, then it will be auto-generated. (also see routing)

index

An index is like a table in a relational database. It has a mapping which contains a type, which contains the fields in the index.

An index is a logical namespace which maps to one or more primary shards and can have zero or more replica shards.

index alias

An index alias is a secondary name used to refer to one or more existing indices.

Most Elasticsearch APIs accept an index alias in place of an index name.

See Add index alias.
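
A minimal sketch of an alias that refers to more than one index. The `actions` body follows the shape accepted by the `_aliases` API; the index names, the alias name, and the helper `indices_behind` are hypothetical and exist only for this example.

```python
import json

# Hypothetical index and alias names.
alias_actions = {
    "actions": [
        {"add": {"index": "logs-2020-08", "alias": "logs"}},
        {"add": {"index": "logs-2020-09", "alias": "logs"}},
    ]
}


def indices_behind(alias, actions):
    """Return the indices an alias would refer to after the "add" actions."""
    return [
        action["add"]["index"]
        for action in actions["actions"]
        if "add" in action and action["add"]["alias"] == alias
    ]


print(json.dumps(alias_actions))
print(indices_behind("logs", alias_actions))
```

A request sent to `logs` would then be served by both underlying indices, which is why most APIs can take the alias in place of an index name.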

leader index

Leader indices are the source indices for cross-cluster replication. They exist on remote clusters and are replicated to follower indices.

mapping

A mapping is like a schema definition in a relational database. Each index has a mapping, which defines a type, plus a number of index-wide settings.

A mapping can either be defined explicitly, or it will be generated automatically when a document is indexed.
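
An explicit mapping might look like the following sketch, written as the body of a create-index request in the typeless format used by 7.x. The field names are invented for illustration; `text`, `keyword`, `integer`, and `date` are real field types.

```python
import json

# Hypothetical field names; only the overall body shape is the point here.
create_index_body = {
    "mappings": {
        "properties": {
            # Analyzed into terms, searchable as full text.
            "title": {"type": "text", "analyzer": "standard"},
            # Stored as a single exact value, not analyzed.
            "status": {"type": "keyword"},
            "views": {"type": "integer"},
            "publish_date": {"type": "date"},
        }
    }
}

print(json.dumps(create_index_body, indent=2))
```

If no such body is supplied, the mapping is generated from the first documents indexed, so defining it up front is mainly a way to control field types and analysis.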

node

A node is a running instance of Elasticsearch which belongs to a cluster. Multiple nodes can be started on a single server for testing purposes, but usually you should have one node per server.

At startup, a node will use unicast to discover an existing cluster with the same cluster name and will try to join that cluster.

primary shard

Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of the primary shard.

By default, an index has one primary shard. You can specify more primary shards to scale the number of documents that your index can handle.

You cannot change the number of primary shards in an index, once the index is created. However, an index can be split into a new index using the split API.

See also routing

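ps: This explains why the number of primary shards has to be decided when the index is created and is never changed afterwards: if the number changed, every previously computed routing value would become invalid and the documents could no longer be found.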

query

A request for information from Elasticsearch. You can think of a query as a question, written in a way Elasticsearch understands. A search consists of one or more queries combined.

There are two types of queries: scoring queries and filters. For more information about query types, see Query and filter context.

recovery

Shard recovery is the process of syncing a replica shard from a primary shard. Upon completion, the replica shard is available for search.

Recovery automatically occurs during the following processes:
Node startup or failure. This type of recovery is called a local store recovery.
Primary shard replication.
Relocation of a shard to a different node in the same cluster.
Snapshot restoration.

reindex

To cycle through some or all documents in one or more indices, re-writing them into the same or new index in a local or remote cluster. This is most commonly done to update mappings, or to upgrade Elasticsearch between two incompatible index versions.
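
The idea can be sketched in memory. The `source`/`dest` body mirrors the shape used by the `_reindex` API, but the `reindex` function below is a hypothetical stand-in that copies documents between plain dicts; it does not talk to Elasticsearch. Index names are made up.

```python
# Body in the shape of a reindex request (hypothetical index names).
reindex_body = {
    "source": {"index": "articles-v1"},
    "dest": {"index": "articles-v2"},
}


def reindex(indices, body):
    """Copy every document from the source index into the destination index.

    `indices` maps index name -> {document id -> document}. Returns the
    number of documents written.
    """
    source = indices[body["source"]["index"]]
    dest = indices.setdefault(body["dest"]["index"], {})
    for doc_id, doc in source.items():
        dest[doc_id] = dict(doc)  # re-write a copy into the destination
    return len(source)


indices = {"articles-v1": {"1": {"title": "a"}, "2": {"title": "b"}}}
copied = reindex(indices, reindex_body)
print(copied, sorted(indices["articles-v2"]))
```

In a real cluster the destination index would typically be created first with the new mapping, so that the re-written documents are indexed under the updated definitions.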

replica shard

Each primary shard can have zero or more replicas. A replica is a copy of the primary shard, and has two purposes:

increase failover: a replica shard can be promoted to a primary shard if the primary fails
increase performance: get and search requests can be handled by primary or replica shards.

By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index. A replica shard will never be started on the same node as its primary shard.
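
A small sketch of the two settings involved. `number_of_shards` and `number_of_replicas` are real index settings; the specific values and the helper `total_shards` are assumptions made for this example.

```python
# Body for creating an index: the primary shard count is fixed at creation.
create_settings = {
    "settings": {"index": {"number_of_shards": 3, "number_of_replicas": 1}}
}

# Body for updating settings on the existing index: only the replica
# count is changed here, which is allowed dynamically.
update_settings = {"index": {"number_of_replicas": 2}}


def total_shards(primaries, replicas):
    """Total shard copies: each primary plus its replicas."""
    return primaries * (1 + replicas)


index_settings = create_settings["settings"]["index"]
before = total_shards(index_settings["number_of_shards"],
                      index_settings["number_of_replicas"])
after = total_shards(index_settings["number_of_shards"],
                     update_settings["index"]["number_of_replicas"])
print(before, after)
```

Since a replica is never started on the same node as its primary, raising the replica count only helps if the cluster has enough nodes to place the extra copies.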

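ps: A replica shard is just a copy of a primary shard. It protects against data loss caused by hardware failure, and it can also serve read requests such as searching or retrieving a document from another shard.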

routing

When you index a document, it is stored on a single primary shard. That shard is chosen by hashing the routing value. By default, the routing value is derived from the ID of the document or, if the document has a specified parent document, from the ID of the parent document (to ensure that child and parent documents are stored on the same shard).

This value can be overridden by specifying a routing value at index time, or a routing field in the mapping.
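
A simplified sketch of shard selection, assuming the rule "hash the routing value, then take it modulo the number of primary shards". The hash used here (`zlib.crc32`) is a stand-in chosen for this example and is not the hash Elasticsearch uses; `pick_shard` and the document IDs are hypothetical.

```python
import zlib


def pick_shard(routing, number_of_primary_shards):
    """Map a routing value to a shard number in [0, number_of_primary_shards)."""
    # zlib.crc32 is a stand-in hash for this sketch (assumption).
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = ["1", "2", "3", "user-42"]

# By default the routing value is the document ID.
with_three = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}
# With a different primary shard count the same IDs may map elsewhere.
with_four = {doc_id: pick_shard(doc_id, 4) for doc_id in doc_ids}
print(with_three)
print(with_four)

# A custom routing value shared by related documents keeps them together.
parent_shard = pick_shard("user-42", 3)
child_shard = pick_shard("user-42", 3)  # child indexed with routing="user-42"
print(parent_shard == child_shard)
```

The modulo step is why the shard count matters: the same routing value only leads back to the same shard as long as the divisor stays the same.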

shard

A shard is a single Lucene instance. It is a low-level “worker” unit which is managed automatically by Elasticsearch. An index is a logical namespace which points to primary and replica shards.

Other than defining the number of primary and replica shards that an index should have, you never need to refer to shards directly. Instead, your code should deal only with an index.

Elasticsearch distributes shards amongst all nodes in the cluster, and can move shards automatically from one node to another in the case of node failure, or the addition of new nodes.

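ps: This is similar to splitting databases and tables in MySQL, except that MySQL needs third-party components to do it, while ES implements this capability internally.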

source field

By default, the JSON document that you index will be stored in the _source field and will be returned by all get and search requests. This gives you access to the original object directly from search results, rather than requiring a second step to retrieve the object from an ID.

term

A term is an exact value that is indexed in Elasticsearch. The terms foo, Foo, FOO are NOT equivalent. Terms (i.e. exact values) can be searched for using term queries.

See also text and analysis.

text

Text (or full text) is ordinary unstructured text, such as this paragraph. By default, text will be analyzed into terms, which is what is actually stored in the index.

Text fields need to be analyzed at index time in order to be searchable as full text, and keywords in full text queries must be analyzed at search time to produce (and search for) the same terms that were generated at index time.

See also term and analysis.
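
The difference between searching for an exact term and searching full text can be sketched with a toy inverted index. Everything here is a simplification: `analyze`, `term_query`, and `match_query` are hypothetical helpers that imitate the behavior described above, not Elasticsearch APIs.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): split on non-alphanumerics, then lowercase."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]


# Index time: text is analyzed and the resulting terms are stored.
docs = {1: "Foo Bar", 2: "bar baz"}
inverted_index = defaultdict(set)
for doc_id, text in docs.items():
    for term in analyze(text):
        inverted_index[term].add(doc_id)


def term_query(value):
    """Look up the exact value; no analysis is applied."""
    return sorted(inverted_index.get(value, set()))


def match_query(text):
    """Analyze the query text, then match documents containing any term."""
    hits = set()
    for term in analyze(text):
        hits |= inverted_index.get(term, set())
    return sorted(hits)


print(term_query("Foo"))    # exact value "Foo" was never stored
print(term_query("foo"))
print(match_query("FOO"))   # analyzed to "foo" before the lookup
```

The term lookup for `Foo` finds nothing because only the lowercase term was stored, while the full text lookup succeeds because the query is analyzed into the same terms that were generated at index time.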

type

A type was used to represent the kind of document, e.g. an email, a user, or a tweet. Types are deprecated and are in the process of being removed. See Removal of mapping types.

Reposted from blog.csdn.net/qq_34168515/article/details/108315484