ES distributed full-text search engine


ES document operations:

  1.ES is document-oriented, which means it can store an entire object or document. However, it does not just store documents: it also indexes the content of each document so that it can be searched. In ES, you index, search, sort, and filter documents (rather than rows and columns of data).

  2.ES document elements:

    1._index : the index, similar to a "database" in a relational database. It is the place where we store and index the associated data.

    2._type : the category of object the document represents. In an application we use objects to represent "things", and each type groups documents of the same kind.

    3._id : combined with _index and _type, it uniquely identifies a document in Elasticsearch.

    4._source : the original document data.

    5._all : a single string that concatenates the values of all fields.

CRUD operations on documents (RESTful style):

  1.PUT {index}/{type}/{id} : create or replace the document with the given ID

  2.POST {index}/{type} : create a document and let ES generate the ID

  3.GET itsource/employee/1?employee
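The three request shapes above can be sketched in Java with the JDK's own HTTP classes (Java 11 or later). This is only an illustration: the requests are built but not sent, the base address http://localhost:9200 is an assumption (9200 is the default HTTP port), and the sample document body is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds (but does not send) the three REST request shapes.
// BASE is an assumed local node address, not something taken from these notes.
class RestCrudSketch {
    static final String BASE = "http://localhost:9200";

    // PUT {index}/{type}/{id}: create or replace the document with a caller-chosen ID
    static HttpRequest put(String index, String type, String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/" + type + "/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // POST {index}/{type}: create a document and let ES generate the ID
    static HttpRequest post(String index, String type, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/" + type))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // GET {index}/{type}/{id}: fetch one document by ID
    static HttpRequest get(String index, String type, String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/" + type + "/" + id))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = put("itsource", "employee", "1", "{\"name\":\"Tom\"}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send a request, pass it to java.net.http.HttpClient.send with a running node behind the assumed address.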

DSL queries and filters:

  What is query filtering: finding just the information you want and screening out redundant information.

  A DSL (Domain Specific Language) query is sent as JSON in the request body.

   GET _search

  

Performance difference between DSL filters and DSL queries:

Filter results can be cached and applied to subsequent requests.

A query computes a relevance score for each document it matches, so it is more time-consuming, and its results are not cached.

 Used together with a query, a filter clause can do the document filtering efficiently.
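A request body that combines the two might look like the sketch below: a bool query whose must clause is scored and whose filter clause is not. The field names "name" and "age" are hypothetical, and the string is built by hand without escaping, so it is only suitable for illustration.

```java
// Sketch only: builds a search body that combines a scored query with a non-scored filter.
// The field names "name" and "age" are hypothetical; input is not JSON-escaped.
class BoolQuerySketch {
    static String searchBody(String nameText, int minAge) {
        return "{"
             + "\"query\":{\"bool\":{"
             // must: matching documents get a relevance score
             + "\"must\":{\"match\":{\"name\":\"" + nameText + "\"}},"
             // filter: a yes/no check with no scoring, so the result can be cached
             + "\"filter\":{\"range\":{\"age\":{\"gte\":" + minAge + "}}}"
             + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(searchBody("Tom", 20));
    }
}
```

The resulting string would be sent as the body of a GET _search request.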

Word segmentation (analysis) and mapping:

  Why segmentation and mapping:

In full-text retrieval theory, a document is found by matching the keywords of the query against the index built from the documents. Dividing text into meaningful words is therefore essential for accurate search results, so text strings must be segmented into words both when indexing and when searching.
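To make the idea concrete, here is a toy inverted index in Java. It is not how ES is implemented: the tokenizer simply lowercases and splits on whitespace, whereas a real analyzer does far more, and all class and method names are made up for this sketch. It does show why the query has to be segmented the same way as the documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only: a whitespace tokenizer feeding an inverted index.
class InvertedIndexSketch {
    // term -> IDs of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Split text into lowercase terms.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Indexing: segment the text and record the document under every term.
    void index(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Searching: segment the query the same way, then look each term up.
    Set<Integer> search(String query) {
        Set<Integer> result = new TreeSet<>();
        for (String term : tokenize(query)) {
            result.addAll(postings.getOrDefault(term, Collections.emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.index(1, "Full Text Search");
        idx.index(2, "search engine");
        System.out.println(idx.search("SEARCH"));
    }
}
```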

 

For a field that ES segments into words, details such as the specific analyzer must be specified, so they need to be declared explicitly in the document mapping.

 

  IK analyzer: source download

    https://github.com/medcl/elasticsearch-analysis-ik
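Once the IK plugin is installed, a field is tied to it through the mapping. The sketch below only builds the JSON body of such a mapping. The field name "name" is hypothetical, and the analyzer names ik_max_word and ik_smart are the ones the IK plugin registers; which one to use at index time and at search time is a choice for your own data, not something these notes prescribe.

```java
// Sketch only: builds the body of a put-mapping request for one analyzed text field.
// The field name is hypothetical; ik_max_word / ik_smart are provided by the IK plugin.
class IkMappingSketch {
    static String textFieldMapping(String field, String indexAnalyzer, String searchAnalyzer) {
        return "{\"properties\":{\"" + field + "\":{"
             + "\"type\":\"text\","
             + "\"analyzer\":\"" + indexAnalyzer + "\","              // used when indexing
             + "\"search_analyzer\":\"" + searchAnalyzer + "\""       // used when searching
             + "}}}";
    }

    public static void main(String[] args) {
        System.out.println(textFieldMapping("name", "ik_max_word", "ik_smart"));
    }
}
```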

Cluster operations:

  Why use a cluster:

    When we have to deal with high concurrency, large data volumes, and single points of failure, we need a cluster.

 

There are three types of node:

       master node: maintains the cluster and handles index-level (index database) operations

       data node: stores data and performs document CRUD

       client node: only responsible for handling user requests

 

 

    1. By default, every node is eligible to become master, stores data, and also handles client requests. In a production environment, if you do not change these Elasticsearch node roles, the cluster is prone to split brain and other problems under high data volume and high concurrency.

    2. In a production cluster, we can divide responsibilities among the nodes. It is recommended to set three or more nodes as dedicated master nodes [ node.master: true node.data: false ]; these nodes are only responsible for acting as master and maintaining the state of the entire cluster.

    3. Then set up a number of data nodes according to the data volume [ node.master: false node.data: true ]; these nodes are only responsible for storing data and later serve indexing and query requests, so if user requests are frequent, the load on these nodes will be relatively high.

    4. It is also recommended to set up a group of client nodes in the cluster [ node.master: false node.data: false ]; these nodes are only responsible for processing user requests and forwarding them, which provides load balancing and similar functions.
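The division of roles can be summarized in a small Java sketch: a dedicated master has node.master: true and node.data: false, a data node has node.master: false and node.data: true, and a client node has both set to false. The quorum helper is an addition not covered in these notes: in ES 5.x the usual guard against split brain is to set discovery.zen.minimum_master_nodes to a majority of the master-eligible nodes. The class and enum names are made up for illustration.

```java
// Sketch only: maps the two role flags to the node types described in the list.
class ClusterRolesSketch {
    enum Role { DEDICATED_MASTER, DATA, CLIENT, DEFAULT }

    static Role classify(boolean nodeMaster, boolean nodeData) {
        if (nodeMaster && !nodeData) return Role.DEDICATED_MASTER; // only manages the cluster
        if (!nodeMaster && nodeData) return Role.DATA;             // only stores and serves data
        if (!nodeMaster && !nodeData) return Role.CLIENT;          // only routes user requests
        return Role.DEFAULT;                                       // both true: the out-of-the-box setup
    }

    // Majority of master-eligible nodes; an assumed value for
    // discovery.zen.minimum_master_nodes, which the notes do not mention.
    static int minimumMasterNodes(int masterEligibleNodes) {
        return masterEligibleNodes / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, false));
        System.out.println(minimumMasterNodes(3));
    }
}
```

With three dedicated master nodes the majority is two, so a node that is cut off on its own cannot elect itself master.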

 

Operating the cluster with the Java API:

  What is the Java API: ES provides a Java toolkit for operating on the index database, the Java API. All ES operations are performed through a Client object.

First: add the dependencies

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>transport</artifactId>
    <version>5.2.2</version>
</dependency>

<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>2.7</version>
</dependency>

<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>2.7</version>
</dependency>

 

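Second: obtain a client object connected to ES

Settings settings = Settings.builder()
        // let the client discover the other nodes of the cluster
        .put("client.transport.sniff", true).build();
// host and port are assumptions (9300 is the default transport port); without an address the client has no node to talk to
TransportClient client = new PreBuiltTransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("localhost"), 9300));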

 

Third: index (create) a document:

import static org.elasticsearch.common.xcontent.XContentFactory.*;

// index "crm", type "vip", ID "1"; jsonDataText is the document body as a JSON string
IndexResponse response = client.prepareIndex("crm", "vip", "1")
        .setSource(jsonDataText).get();
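The notes do not show where jsonDataText comes from. One possible way to produce such a string, using only the JDK, is sketched below; the class name, the helper methods, and the sample fields are all hypothetical. It handles flat maps of strings, numbers, and booleans only, and it does not escape control characters, so a real project would more likely use a JSON library or the XContentFactory helpers imported above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: turns a flat map into a JSON object string.
// Values other than Number and Boolean are written as quoted strings.
class JsonTextSketch {
    // Escape backslashes first, then double quotes.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String toJson(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object value = e.getValue();
            if (value instanceof Number || value instanceof Boolean) {
                sb.append(value);
            } else {
                sb.append('"').append(escape(String.valueOf(value))).append('"');
            }
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> vip = new LinkedHashMap<>();
        vip.put("name", "Tom");
        vip.put("sex", 1);
        String jsonDataText = toJson(vip);
        System.out.println(jsonDataText);
    }
}
```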

 

Fourth: get a document:

GetResponse response = client.prepareGet("crm", "vip", "1").get();

 

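Below is the code for updating, deleting, and bulk-adding documents:

  Update:

// partial update: merge the given fields into the existing document
client.prepareUpdate("crm", "vip", "1").setDoc("{\"sex\":0}").get();

// scripted update
client.prepareUpdate("crm", "vip", "1")
        .setScript(new Script("ctx._source.sex = 1", ScriptService.ScriptType.INLINE, null, null))
        .get();

// upsert: apply the update if the document exists, otherwise index indexRequest
IndexRequest indexRequest = new IndexRequest("crm", "vip", "1")
        .source(originalJsonData);
UpdateRequest updateRequest = new UpdateRequest("crm", "vip", "1")
        .doc(updateJsonData).upsert(indexRequest);
client.update(updateRequest).get();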
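The upsert behaviour of the last request can be illustrated with an in-memory map instead of a cluster. This is a sketch of the semantics only, not of the ES API: if the document exists, the partial document is merged into it; if it does not, the upsert document is stored as is and the partial document is not applied. The class name is made up, and the merge here is flat, whereas ES merges nested objects recursively.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: in-memory model of "update with upsert" semantics.
class UpsertSketch {
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    Map<String, Object> update(String id, Map<String, Object> partialDoc, Map<String, Object> upsertDoc) {
        Map<String, Object> existing = store.get(id);
        if (existing == null) {
            // no such document yet: index the upsert document unchanged
            existing = new HashMap<>(upsertDoc);
            store.put(id, existing);
        } else {
            // document exists: merge the partial document into it
            existing.putAll(partialDoc);
        }
        return existing;
    }

    public static void main(String[] args) {
        UpsertSketch sketch = new UpsertSketch();
        Map<String, Object> partial = Map.of("sex", 0);
        Map<String, Object> full = Map.of("name", "Tom", "sex", 1);
        System.out.println(sketch.update("1", partial, full)); // first call stores the upsert document
        System.out.println(sketch.update("1", partial, full)); // second call merges the partial document
    }
}
```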

 

  Delete:

DeleteResponse response = client.prepareDelete("crm", "vip", "1").get();

 

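  Bulk add:

BulkRequestBuilder bulkRequest = client.prepareBulk();
bulkRequest.add(client.prepareIndex("crm", "vip", "1")
        .setSource(vip1JsonData));
bulkRequest.add(client.prepareIndex("crm", "vip", "2")
        .setSource(vip2JsonData));
BulkResponse bulkResponse = bulkRequest.get();
// individual items can fail even when the bulk request itself succeeds
if (bulkResponse.hasFailures()) {
    System.err.println(bulkResponse.buildFailureMessage());
}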

 


Origin www.cnblogs.com/1999wang/p/11517626.html