Inverted index architecture
In an advertising system the inverted index plays a vital role: when a request arrives, the right ads must be matched from the inverted index based on targeting information. Our inverted index uses ElasticSearch (referred to below as ES). We chose it because its community is active and the surrounding components for data collection, visualization, monitoring, and alerting are fairly mature. ES is also developed in Java, so tuning and secondary development are relatively easy for us.
Here is the architecture diagram of our inverted index:
The architecture in the figure above was reached through the iterations and considerations described below.
Indexing problems and optimization
Single-point stability
Multi-node deployment
Builder A and builder B are two nodes, one primary and one backup. Which of them is primary is decided by competing for a lock (implemented with ZooKeeper).
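The primary/backup election can be sketched as below. This is only an illustration: an in-process `threading.Lock` stands in for the ZooKeeper lock, and the `Builder` class and its method names are hypothetical, not taken from our code.

```python
import threading


class Builder:
    """An index builder that becomes primary only if it wins the shared lock."""

    def __init__(self, name, lock):
        self.name = name
        self.lock = lock
        self.is_primary = False

    def try_become_primary(self):
        # Non-blocking acquire: only one contender can hold the lock.
        if not self.is_primary:
            self.is_primary = self.lock.acquire(blocking=False)
        return self.is_primary

    def resign(self):
        # Simulates the primary going away and giving up the lock.
        if self.is_primary:
            self.is_primary = False
            self.lock.release()


election_lock = threading.Lock()  # stand-in for the ZooKeeper lock (assumption)
builder_a = Builder("A", election_lock)
builder_b = Builder("B", election_lock)

a_won_first = builder_a.try_become_primary()   # A wins and builds the index
b_won_first = builder_b.try_become_primary()   # B loses and stays on standby
builder_a.resign()                             # A goes away
b_won_later = builder_b.try_become_primary()   # B takes over as primary
```

With a real ZooKeeper lock the release on failure would come from the coordination service rather than from an explicit `resign()` call.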
Data inconsistencies introduced by multiple nodes
- Multiple producers and multiple consumers cause message ordering problems
Make the messages stateless
On receiving a message, query the database to obtain the latest data (orders and creatives are updated infrequently, so this puts little pressure on the database)
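A minimal sketch of the stateless-message idea, assuming the message carries only an identifier and the consumer always re-reads the current row. The dictionaries `database` and `index` and the field names are hypothetical stand-ins.

```python
# Hypothetical stand-ins for the orders/creatives database and the index.
database = {"ad-1": {"bid": 5, "status": "active"}}
index = {}


def handle_message(message):
    """Apply a change notification; the message carries only the ad id."""
    ad_id = message["ad_id"]
    latest = database.get(ad_id)
    if latest is None:
        index.pop(ad_id, None)       # row deleted: remove it from the index
    else:
        index[ad_id] = dict(latest)  # always index the current row


# Two updates hit the database, then their notifications arrive out of order.
database["ad-1"] = {"bid": 7, "status": "active"}   # update 1
database["ad-1"] = {"bid": 9, "status": "paused"}   # update 2
handle_message({"ad_id": "ad-1", "seq": 2})
handle_message({"ad_id": "ad-1", "seq": 1})  # late message, same outcome
```

Because no message carries data of its own, the order in which they are consumed does not change the final state of the index.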
- Data inconsistency caused by exceptions
Use retries (made idempotent) and scheduled tasks to handle exceptions
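The retry half of this can be sketched as follows. The helper `retry`, the simulated failure, and the dictionary `index` are illustrative assumptions. The point is that writing a document by its id is idempotent, so repeating the write after a failure is safe.

```python
import time


def retry(operation, attempts=3, delay_seconds=0.0):
    """Run operation until it succeeds or the attempts are exhausted."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as error:  # a real system would catch narrower types
            last_error = error
            time.sleep(delay_seconds)
    raise last_error


index = {}            # stand-in for the inverted index
calls = {"count": 0}  # counts how often the write was attempted


def flaky_upsert():
    # Writing by document id is idempotent: repeating it leaves one document.
    index["ad-1"] = {"bid": 9}
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated failure after the write")
    return "ok"


result = retry(flaky_upsert)
```

Anything that still fails after the last attempt would be left for the scheduled task to pick up, which is not shown here.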
- A full index update affects the online index search function
Use a standby index
Standby index switching process: update the standby index -> verify the standby index -> switch primary and standby -> update the main index
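One possible way to implement the switch step is an ES index alias, so that searches always target the alias and the swap is a single request. The text above does not say how the switch is done, so the alias approach, the index names, and the document-count check used as "verification" are all assumptions in this sketch.

```python
import json


def alias_switch_body(alias, current_index, standby_index):
    """Build the body for POST /_aliases that moves alias to standby_index."""
    return {
        "actions": [
            {"remove": {"index": current_index, "alias": alias}},
            {"add": {"index": standby_index, "alias": alias}},
        ]
    }


def plan_switch(alias, current_index, standby_index, standby_docs, expected_docs):
    """Return the switch request body, or None if verification fails."""
    # Hypothetical verification: the standby index must hold every document.
    if standby_docs != expected_docs:
        return None
    return alias_switch_body(alias, current_index, standby_index)


body = plan_switch("ads", "ads_a", "ads_b", standby_docs=1200, expected_docs=1200)
rejected = plan_switch("ads", "ads_a", "ads_b", standby_docs=900, expected_docs=1200)
payload = json.dumps(body)  # what would be sent to POST /_aliases
```

Both actions sit in one request, so searches against the alias never see a moment with no index behind it.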
Query optimization and index rebuild problems
In load testing, ES showed low QPS, high CPU load, frequent YGC (young-generation GC), and time-consuming index rebuilds
We approached this from two directions: queries and rebuilds
Query
- YGC ran once per second with an STW (stop-the-world) pause of about 10 ms, which has a large impact on a low-latency system
Adjust -Xmn from 3g to 7g; after the adjustment YGC runs once every 10 s with an STW pause of about 12 ms
Before the adjustment YGC was frequent and had a large impact on a low-latency system, so we wanted to lengthen the interval between YGCs and reduce performance jitter. YGC uses a copying algorithm: each collection scans the young generation for surviving objects and copies them. Scanning costs much less than copying, so YGC time depends mainly on the number of live objects. Since object lifetimes did not change significantly, YGC time does not change significantly either
After the adjustment the interval between YGCs grew substantially, while GC time did not grow linearly with it
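The effect of the change can be checked with simple arithmetic on the figures quoted above (a 10 ms pause every 1 s before, a 12 ms pause every 10 s after). The function name is only for illustration.

```python
def pause_fraction(pause_ms, interval_s):
    """Fraction of wall-clock time spent in stop-the-world YGC pauses."""
    return pause_ms / (interval_s * 1000.0)


before = pause_fraction(pause_ms=10, interval_s=1)    # -Xmn3g
after = pause_fraction(pause_ms=12, interval_s=10)    # -Xmn7g
improvement = before / after

print(f"before: {before:.2%} of time paused")
print(f"after:  {after:.2%} of time paused")
print(f"pause time reduced by a factor of about {improvement:.1f}")
```

The pause itself grew by only 2 ms, while the share of time spent paused dropped from about 1% to about 0.12%.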
- Adjust the shard and replica counts to reduce thread overhead and IO
The ES default number of primary shards is 5 (in versions before 7.0). By default the index is spread across different nodes so that each node holds only part of it, which means one request has to merge data from several nodes and the number of IO operations multiplies
As shown in the figure, suppose there are three nodes and two primary shards, each with one replica, and a query arrives
The query process is roughly as follows: node3 receives the request first; it may forward the request to R0 or P0 on node2 or node1; node3 then collects the retrieved data and finally returns the result. Inside each index the data is stored in multiple segments, and a query visits those segments serially
Our scenario has a high request volume and a small index (less than 100 MB), so we set the number of primary shards to 1 and the number of replicas to the number of nodes minus 1. This ensures that every node stores the complete index, so a query needs only one IO operation, as shown in the figure
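The shard layout described above corresponds to index settings like the ones built below. The function name and the three-node example are assumptions; `number_of_shards` and `number_of_replicas` are the standard ES index settings.

```python
import json


def full_copy_settings(node_count):
    """Settings that place one complete copy of the index on every node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,                 # a single primary shard
                "number_of_replicas": node_count - 1,  # one copy per other node
            }
        }
    }


# Body for creating the index on a three-node cluster (hypothetical size).
create_body = json.dumps(full_copy_settings(3), indent=2)
print(create_body)
```

This trades storage and replication cost for query locality, which is reasonable only because the index is small.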
- ES (Lucene) reads all segments serially
Index updates increase the number of segments, and ES queries the segments serially, so a task scheduled every minute uses _forcemerge to reduce the segment count to 1
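A sketch of the request such a scheduled task would send. The host `http://localhost:9200` and the index name `ads` are assumptions. The request is only constructed here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def forcemerge_request(base_url, index_name, max_segments=1):
    """Build POST /<index>/_forcemerge?max_num_segments=<n>."""
    query = urlencode({"max_num_segments": max_segments})
    url = f"{base_url}/{quote(index_name)}/_forcemerge?{query}"
    return Request(url, method="POST")


request = forcemerge_request("http://localhost:9200", "ads")
print(request.get_method(), request.full_url)

# A scheduler would send this once a minute, for example with
# urllib.request.urlopen(request), and log any failure.
```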
- Hot-method profiling showed that JSON deserialization accounted for 50% of CPU
Disable _source and keep only the necessary fields as stored fields
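A mapping along these lines might look as follows. The field names are hypothetical, and the typeless mapping layout shown is the one used by ES 7 and later; older versions nest the mapping under a type name.

```python
import json

# Hypothetical mapping: _source is disabled, and only ad_id is kept as a
# stored field so that hits can return it without parsing a JSON document.
mapping_body = {
    "mappings": {
        "_source": {"enabled": False},
        "properties": {
            "ad_id": {"type": "keyword", "store": True},
            "region": {"type": "keyword"},    # searchable, not stored
            "age_min": {"type": "integer"},   # searchable, not stored
        },
    }
}

# A search then asks for the stored field explicitly.
search_body = {
    "query": {"term": {"region": "east"}},
    "stored_fields": ["ad_id"],
}

print(json.dumps(mapping_body, indent=2))
print(json.dumps(search_body, indent=2))
```

With _source disabled, the original document can no longer be retrieved or reindexed from ES, so the database has to remain the source of truth.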
- Make queries prefer the local node
Set preference: _local
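On the REST API the preference is passed as a query parameter of the search request. The host and index name below are assumptions.

```python
from urllib.parse import urlencode


def local_search_url(base_url, index_name):
    """Build a search URL that prefers shard copies on the receiving node."""
    return f"{base_url}/{index_name}/_search?" + urlencode({"preference": "_local"})


url = local_search_url("http://localhost:9200", "ads")
print(url)
```

Since every node holds a full copy of the index in our setup, a local shard copy is always available to serve the query.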
Rebuild
- Before a full rebuild, turn off replicas and disable real-time index refresh
number_of_replicas: 0, refresh_interval: -1
This reduces the cost of synchronizing the index to replicas during the rebuild
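The two settings can be applied before the rebuild and restored afterwards through the index settings endpoint (PUT /<index>/_settings). The restore values below (two replicas, a 1 s refresh interval) are assumptions for a three-node cluster, not figures from the text.

```python
import json


def rebuild_settings():
    """Settings applied before a full rebuild: no replicas, no refresh."""
    return {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}


def serving_settings(node_count, refresh_interval="1s"):
    """Settings restored after the rebuild (values are assumptions)."""
    return {
        "index": {
            "number_of_replicas": node_count - 1,
            "refresh_interval": refresh_interval,
        }
    }


before_rebuild = json.dumps(rebuild_settings())
after_rebuild = json.dumps(serving_settings(node_count=3))
print("PUT /ads_standby/_settings", before_rebuild)
print("PUT /ads_standby/_settings", after_rebuild)
```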
- Rebuild the index in batches
Use the bulk API to rebuild the index in batches, which improves index build performance
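The bulk API takes newline-delimited JSON: one action line followed by one document line per item, with a trailing newline. The sketch below only builds the request bodies. The index name, document shape, and batch size are assumptions.

```python
import json


def bulk_bodies(index_name, docs, batch_size=500):
    """Yield NDJSON bodies for POST /_bulk, batch_size documents each."""
    lines = []
    count = 0
    for doc_id, doc in docs:
        # Action line, then the document itself.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"  # final partial batch


ads = [(f"ad-{n}", {"bid": n}) for n in range(5)]
bodies = list(bulk_bodies("ads_standby", ads, batch_size=2))
print(bodies[0])
```

Each body would be sent with the `Content-Type: application/x-ndjson` header. The best batch size depends on document size and is worth measuring rather than guessing.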
Postscript
Some of the approaches we use are uncommon in the industry and do not follow the recommended practice, but they fit our own business. So be sure to design a plan around your own team's business: there is no best solution, only the most suitable one