Interview Series 8: how es writes data under the hood

(1) The es write process

 

1) The client sends a write request to some node; that node acts as the coordinating node

2) The coordinating node routes the document and forwards the request to the node that holds the corresponding primary shard

3) The primary shard on that node processes the request, then the data is synchronized to the replica shards

4) Once the coordinating node sees that the primary shard and all replica shards are done, it returns the response to the client (a minimal indexing sketch follows below)
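To make the steps concrete, here is a minimal write sketch, assuming the official 8.x Python client and a local cluster at http://localhost:9200; the index name "orders", the doc id and the fields are made up for illustration:

    from elasticsearch import Elasticsearch

    # Assumed local single-node cluster; adjust the address for your setup.
    es = Elasticsearch("http://localhost:9200")

    # Index a document with an explicit doc id (e.g. an order id). The
    # coordinating node hashes this id to pick the primary shard.
    resp = es.index(
        index="orders",
        id="order-1001",
        document={"user": "alice", "amount": 42.5},
    )
    print(resp["result"])  # "created" on the first write, "updated" afterwards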

 

(2) The es read process

 

Querying, i.e. GETting a single piece of data: when you write a document, es automatically assigns it a globally unique id, the doc id, and also routes it to a primary shard by hashing that doc id. You can also specify the doc id manually, for example using an order id or a user id.

 

When you query by doc id, es hashes the doc id to work out which shard the document was allocated to, and queries that shard.
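In simplified form the routing rule is a hash of the routing value (the doc id, unless you set one yourself) modulo the number of primary shards. A toy sketch of the idea, not es's actual internal hash (es uses murmur3):

    # Simplified routing: which primary shard owns a given doc id?
    def pick_shard(doc_id: str, num_primary_shards: int) -> int:
        # Stand-in for es's murmur3 hash of the _routing value.
        return hash(doc_id) % num_primary_shards

    # The same id always maps to the same shard (within one process here).
    print(pick_shard("order-1001", 5))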

 

1) The client sends the request to an arbitrary node, which becomes the coordinating node

2) The coordinating node routes the document and forwards the request to the corresponding node; it uses round-robin to randomly pick among the primary shard and all of its replicas, so that read requests are load balanced

3) The node that receives the request returns the document to the coordinating node

4) The coordinating node returns the document to the client (a minimal GET-by-id sketch follows below)
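A minimal GET-by-id sketch under the same assumptions (8.x Python client, made-up index and id):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # es hashes the doc id to find the owning shard group, picks the primary
    # or one of its replicas, and fetches the document from it.
    doc = es.get(index="orders", id="order-1001")
    print(doc["_source"])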

 

(3) The es search process

 

The most powerful thing es does is full-text search. Say you have three documents:

 

java is really fun

java is so hard to learn

j2ee is especially awesome

 

If you search by the keyword java, the documents containing java will be found.

 

es will return: "java is really fun" and "java is so hard to learn".

 

1) The client sends the search request to a coordinating node

2) The coordinating node forwards the search request to all corresponding shards; for each shard this may be the primary shard or a replica shard

3) Query phase: each shard returns its own search results (in fact, just some doc ids) to the coordinating node, which merges, sorts and pages the data to produce the final result

4) Fetch phase: the coordinating node then pulls the actual document data from the respective nodes according to the doc ids, and finally returns it to the client (a minimal full-text search sketch follows below)
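A minimal sketch of the java example above, assuming the 8.x Python client; the index name "articles" and the field name "content" are made up:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Index the three example documents.
    texts = ["java is really fun",
             "java is so hard to learn",
             "j2ee is especially awesome"]
    for i, text in enumerate(texts):
        es.index(index="articles", id=str(i), document={"content": text})

    es.indices.refresh(index="articles")  # make them searchable right away

    # Full-text search: the query phase + fetch phase happen behind this call.
    resp = es.search(index="articles", query={"match": {"content": "java"}})
    for hit in resp["hits"]["hits"]:
        print(hit["_source"]["content"])  # prints the two java documents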

 

(4) The underlying principle of search: the inverted index. A diagram would illustrate the difference between a traditional database and an inverted index.
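The core idea of an inverted index is a map from each term to the documents that contain it, so a keyword lookup does not have to scan every row the way a traditional database would. A toy Python sketch of the idea (not es's actual on-disk structure):

    from collections import defaultdict

    docs = {
        0: "java is really fun",
        1: "java is so hard to learn",
        2: "j2ee is especially awesome",
    }

    # Build the inverted index: term -> set of doc ids containing it.
    inverted = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            inverted[term].add(doc_id)

    print(sorted(inverted["java"]))  # [0, 1] -- a lookup, not a full scan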

 

(5) The underlying principles of writing data

 

1) Data is first written to the in-memory buffer; while it sits in the buffer it cannot be searched. At the same time, the data is written to the translog log file.

 

2) If the buffer is almost full, or a certain time has passed, the buffer data is refreshed into a new segment file. At this point the data does not go straight into the segment file on disk; it first enters the os cache. This process is the refresh.

 

Every second, es writes the data in the buffer into a new segment file, so every second a new segment file is generated, and that segment file stores the data written to the buffer during the last second.

 

However, if the buffer has no data at that moment, of course no refresh is performed and no empty segment file is created for that second; if there is data in the buffer, a refresh is performed every 1 second by default, flushing it into a new segment file.
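The 1-second default is the index.refresh_interval setting, which can be tuned per index. A hedged sketch with the assumed 8.x Python client (the index name is made up):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Slow refresh down to 30s during a bulk load (fewer, larger segments),
    # then restore the 1s default afterwards.
    es.indices.put_settings(index="articles",
                            settings={"index": {"refresh_interval": "30s"}})
    # ... bulk indexing here ...
    es.indices.put_settings(index="articles",
                            settings={"index": {"refresh_interval": "1s"}})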

 

At the operating-system level, disk files have something called the os cache, the operating-system cache: before data is written to a disk file, it first enters the os cache, an OS-level memory cache.

 

As soon as the buffer data has been refreshed into the os cache, that data can be searched.

 

Why is es called near real-time? NRT, near real-time. The default refresh interval is 1 second, so es is near real-time: data can only be seen about one second after it is written.

 

Through the es RESTful API or the Java API, you can also perform a refresh manually, that is, manually flush the buffer data into the os cache so that the data can be found immediately.
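A one-line sketch of the manual refresh with the assumed Python client (the REST equivalent is POST /articles/_refresh):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Force a refresh: flush the in-memory buffer into a new segment in the
    # os cache so newly indexed documents become searchable immediately.
    es.indices.refresh(index="articles")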

 

As soon as the data has entered the os cache, the buffer is cleared, because the buffer no longer needs to keep that data: it has already been recorded in the translog, which is persisted to disk.

 

3) As soon as the data has entered the os cache, this segment file's data can be made available for search.

 

4) Steps 1 through 3 repeat: new data keeps entering the buffer and the translog, and the buffer data keeps being written into one new segment file after another; each refresh clears the buffer, while the translog is kept. As this process goes on, the translog grows bigger and bigger. When the translog reaches a certain size, the commit operation is triggered.

 

The data in the buffer is fine: every second it is flushed into the os cache and the buffer is emptied, so the buffer can never fill up the memory of the es process.

 

Every time data is written to the buffer, a log entry is also written to the translog log file, so the translog file keeps growing; when the translog file reaches a certain size, the commit operation is performed.

 

5) The first step of the commit operation is to refresh any data still in the buffer into the os cache and empty the buffer

 

6) A commit point is then written to a disk file; it identifies all the segment files that this commit point corresponds to

 

7) All current data in the os cache is then forced (fsync) down to disk files

 

What is the translog log file for? Before the commit operation runs, data sits either in the buffer or in the os cache; both are memory, so if the machine dies, all of that in-memory data is lost.

 

That is why every write operation must also be recorded in a dedicated log file, the translog: if the machine goes down, then on restart es automatically reads the data in the translog log file and restores it into the memory buffer and the os cache.

 

commit: 1. write a commit point; 2. fsync the os cache data down to disk; 3. clear the translog log file

 

8) The existing translog is emptied and a new translog is started; at that point the commit operation is complete. By default a commit is performed automatically every 30 minutes, but it is also triggered whenever the translog grows too large. This entire commit process is called the flush operation. You can also perform a flush manually; it flushes all os cache data to disk files.

 

In es it is not called commit but flush; the flush operation corresponds to this whole commit process. You can also flush manually through the es api: force-fsync the data in the os cache to disk, record a commit point, and empty the translog log file.
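A one-line sketch of the manual flush with the assumed Python client (the REST equivalent is POST /articles/_flush):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Force a flush (the "commit" described above): fsync the segments in
    # the os cache to disk, write a commit point, and clear the translog.
    es.indices.flush(index="articles")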

 

9) The translog itself is actually written to the os cache first and, by default, flushed to disk every 5 seconds. So by default, up to 5 seconds of data may exist only in the buffer or in the translog's os cache, with nothing yet on disk; if the machine goes down at that moment, those 5 seconds of data are lost. But performance is better this way, and at most 5 seconds of data can be lost. The translog can also be configured so that every write operation is fsynced directly to disk, but then performance is much worse.

 

At this point, even if the interviewer has not asked whether es can lose data, you can impress them: es is, first of all, near real-time, data can be searched one second after it is written; and yes, it can lose data, up to 5 seconds' worth sitting in the buffer, the translog os cache and the segment file os cache but not yet on disk, so a crash at that moment loses up to 5 seconds of data.

 

If you absolutely must not lose data, you can change the settings (see the official documentation): every piece of data written goes into the buffer and, at the same time, the translog is fsynced to disk. But this hurts write performance and write throughput by an order of magnitude: where you could write 2,000 documents per second before, you may only manage 200 per second (a settings sketch follows below).
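This trade-off corresponds to the index.translog.durability setting: "async" fsyncs the translog in the background (every index.translog.sync_interval, 5s by default), while "request" fsyncs before acknowledging each write. A hedged sketch with the assumed Python client; which value your cluster defaults to depends on the es version:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # "async": fsync the translog roughly every 5s -- fast, but a crash can
    # lose up to ~5 seconds of acknowledged writes, as described above.
    es.indices.put_settings(index="articles",
                            settings={"index.translog.durability": "async"})

    # "request": fsync the translog before acknowledging each request --
    # no acknowledged write is lost, but throughput drops sharply.
    es.indices.put_settings(index="articles",
                            settings={"index.translog.durability": "request"})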

 

10) If it is a delete operation, at commit time a .del file is generated in which the doc is marked as deleted; at search time, the .del file tells es that this doc has been deleted

 

11) If it is an update operation, the original doc is marked as deleted, and a new piece of data is written (minimal delete and update sketches follow below)
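Minimal delete and update sketches with the assumed Python client (made-up ids); under the hood both just mark the old doc as deleted, as described above:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Delete: the doc is marked deleted (via a .del file) and filtered out
    # of searches; physical removal happens later, during segment merging.
    es.delete(index="orders", id="order-1001")

    # Update: the original doc is marked deleted and a new version is indexed.
    es.update(index="orders", id="order-1002", doc={"amount": 99.0})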

 

12) Every refresh of the buffer produces a segment file, so by default there is one segment file per second; segment files therefore keep piling up, and merges are performed periodically

 

13) Each merge combines multiple segment files into one and physically removes the docs that were marked as deleted; the new segment file is written to disk, a commit point is written that identifies all the new segment files, the new segment file is opened for search, and the old segment files are deleted.

 

In es's write path there are four core underlying concepts: refresh, flush, translog and merge.

 

When the number of segment files grows to a certain point, es automatically triggers the merge operation, merging multiple segment files into one (a manual force-merge sketch follows below).
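Merging normally happens automatically, but it can also be forced through the api; a hedged sketch with the assumed Python client (the REST equivalent is POST /articles/_forcemerge):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # assumed local cluster

    # Force-merge the index down to a single segment; docs marked as deleted
    # are physically removed in the process. Usually only worthwhile on
    # indices that are no longer being written to.
    es.indices.forcemerge(index="articles", max_num_segments=1)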


Origin www.cnblogs.com/xiufengchen/p/11258901.html