Article Directory
- foreword
- One data aggregation
- Two auto-completion
- Three data synchronization
- 3.1 Thought Analysis
- 3.2 Solution 1: Synchronous call
- 3.3 Solution 2: Asynchronous notification
- 3.4 Monitor binlog
- 3.5 Comparison and summary of three schemes
- 3.6 Summary of Data Synchronization Cases
- 3.7 Test of data synchronization case
- 3.8 Supplement: Installation of vue Devtools plugin
- 3.9 Explanation of the vue devtools tool
- Four elasticsearch clusters
- 4.1 ES cluster related concepts
- 4.2 Build an ES cluster
- 4.4 Cluster status monitoring
- 4.4.1 Installing cerebro on Windows [not recommended]
- 4.4.2 Crash-on-start problem [unresolved]
- 4.4.3 Install cerebro in linux
- 4.4.5 Create an index library
- 4.4.6 Using kibana's DevTools to create an index library [non-practical operation]
- 4.4.7 Using cerebro to create an index library [practical operation]
- 4.4.8 View the sharding effect
- 4.9 Cluster split-brain problem
- 4.10 Cluster Distributed Storage
foreword
- This article is based on the Heima (黑马) course; after careful study it was organized and summarized, not copied!
- For the hands-on parts, learners are advised to practice a lot and analyze carefully!
- This part is somewhat demanding on your machine, so mind your computer's configuration!
One data aggregation
- Aggregations enable statistics, analysis, and computation over document data.
- There are three common types of aggregation:
- Bucket aggregation: used to group documents
- TermAggregation: group by document field value
- Date Histogram: Group by date ladder, for example, a week as a group, or a month as a group
- Metric aggregation: used to calculate some values, such as: maximum value, minimum value, average value, etc.
- Avg: Average
- Max: find the maximum value
- Min: Find the minimum value
- Stats: Simultaneously seek max, min, avg, sum, etc.
- Pipeline (pipeline) aggregation: aggregation based on the results of other aggregations
- **Note:** The fields participating in aggregation must be of keyword, date, numeric, or boolean type
1.1 Implementing aggregation with DSL
1.1.1 Bucket aggregation syntax
GET /hotel/_search
{
"size": 0, // set size to 0 so the result contains no documents, only aggregation results
"aggs": {
// define the aggregations
"brandAgg": {
// give the aggregation a name
"terms": {
// aggregation type: we group by brand value, so use terms
"field": "brand", // the field to aggregate on
"size": 20 // number of aggregation buckets to return
}
}
}
}
- Example:
# bucket aggregation
GET /hotel/_search
{
  "size": 0,
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand",
        "size": 20
      }
    }
  }
}
- result:
1.1.2 Sorting aggregation results
- By default, Bucket aggregation counts the number of documents in each bucket, records it as _count, and sorts the buckets by _count in descending order.
- Specify the order attribute to customize how the aggregation results are sorted
- Demo:
# customize the aggregation's sort order
# sort by _count in ascending order
GET /hotel/_search
{
  "size": 0,
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand",
        "order": {
          "_count": "asc"
        },
        "size": 20
      }
    }
  }
}
- result:
1.1.3 Limit aggregation scope
- By default, Bucket aggregation runs over all documents in the index library. In real scenarios, however, users enter search conditions, so the aggregation should cover only the matching search results; the aggregation range must therefore be limited.
- To limit the range of documents to be aggregated, just add query conditions:
- Demo:
# limit the aggregation scope
# aggregate only documents priced at 200 yuan or below
GET /hotel/_search
{
  "query": {
    "range": {
      "price": {
        "lte": 200
      }
    }
  },
  "size": 0,
  "aggs": {
    "brandAgg": {
      "terms": {
        "field": "brand",
        "size": 20
      }
    }
  }
}
- result:
1.2 Metric aggregation syntax
- Metric aggregation: used to calculate some values, such as: maximum value, minimum value, average value, etc.
- Avg: Average
- Max: find the maximum value
- Min: Find the minimum value
- Stats: Simultaneously seek max, min, avg, sum, etc.
- Demo:
GET /hotel/_search
{
"size": 0,
"aggs": {
"brandAgg": {
"terms": {
"field": "brand",
"size": 20
},
"aggs": {
// sub-aggregation of brandAgg: computed per bucket after grouping
"score_stats": {
// aggregation name
"stats": {
// aggregation type; stats computes min, max, avg, etc.
"field": "score" // the field to aggregate, here score
}
}
}
}
}
}
- result:
1.3 Summary
- aggs stands for aggregations and sits at the same level as query. What, then, is the role of query here?
- It scopes the documents to be aggregated
- The three required elements of an aggregation:
- aggregation name
- aggregation type
- aggregation field
- The configurable aggregation properties are:
- size: specify the number of aggregation results
- order: specify how the aggregation results are sorted
- field: specify the field to aggregate on
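For reference, the three required elements and the configurable properties all fit in one request (a sketch reusing the hotel example from the sections above):

```json
GET /hotel/_search
{
  "size": 0,
  "aggs": {
    "brandAgg": {                     // aggregation name
      "terms": {                      // aggregation type
        "field": "brand",             // aggregation field
        "size": 10,                   // number of results
        "order": { "_count": "asc" }  // sort order
      }
    }
  }
}
```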
1.4 Implementing aggregation with the RestAPI
1.5 API Syntax
- Aggregation conditions are at the same level as query conditions, so request.source() needs to be used to specify aggregation conditions.
- Syntax for aggregate conditions:
- The aggregation result also differs from the query result, and its API is a little special; still, the JSON is parsed layer by layer:
1.7 case
- Requirement: The brand, city and other information of the search page should not be hard-coded on the page, but obtained by aggregated hotel data in the index library:
- Analysis:
- Use the aggregation feature with Bucket aggregation to group the documents in the search results by brand and city, yielding the brands and cities they contain.
- Because this aggregates search results, it is a limited-range aggregation; the aggregation's limiting conditions must be consistent with the search conditions.
- The return value type is the final result to be displayed on the page:
- The result is a Map structure:
- key is a string, city, star, brand, price
- value is a collection, such as the names of multiple cities
- Important implementation code
- Add a method in the Controller that meets the following requirements:
- Request method: POST
- Request path: /hotel/filters
- Request parameters: RequestParams, consistent with the search-document parameters
- Return value type: Map<String, List<String>>
@PostMapping("filters")
public Map<String, List<String>> getFilters(@RequestBody RequestParams params){
    return hotelService.filters(params);
}
- Define the new method in the Service interface:
Map<String, List<String>> filters(RequestParams params);
- Implement this method in the HotelService implementation class:
@Override
public Map<String, List<String>> filters(RequestParams params) {
    try {
        // 1. prepare the Request
        SearchRequest request = new SearchRequest("hotel");
        // 2. prepare the DSL
        // 2.1. query
        buildBasicQuery(params, request);
        // 2.2. set size
        request.source().size(0);
        // 2.3. aggregations
        buildAggregation(request);
        // 3. send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. parse the result
        Map<String, List<String>> result = new HashMap<>();
        Aggregations aggregations = response.getAggregations();
        // 4.1. get the brand results by aggregation name
        List<String> brandList = getAggByName(aggregations, "brandAgg");
        result.put("品牌", brandList);
        // 4.2. get the city results by aggregation name
        List<String> cityList = getAggByName(aggregations, "cityAgg");
        result.put("城市", cityList);
        // 4.3. get the star-rating results by aggregation name
        List<String> starList = getAggByName(aggregations, "starAgg");
        result.put("星级", starList);
        return result;
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

private void buildAggregation(SearchRequest request) {
    request.source().aggregation(AggregationBuilders
            .terms("brandAgg")
            .field("brand")
            .size(100)
    );
    request.source().aggregation(AggregationBuilders
            .terms("cityAgg")
            .field("city")
            .size(100)
    );
    request.source().aggregation(AggregationBuilders
            .terms("starAgg")
            .field("starName")
            .size(100)
    );
}

private List<String> getAggByName(Aggregations aggregations, String aggName) {
    // 1. get the aggregation result by name
    Terms brandTerms = aggregations.get(aggName);
    // 2. get the buckets
    List<? extends Terms.Bucket> buckets = brandTerms.getBuckets();
    // 3. iterate over the buckets
    List<String> brandList = new ArrayList<>();
    for (Terms.Bucket bucket : buckets) {
        // 4. collect the key
        String key = bucket.getKeyAsString();
        brandList.add(key);
    }
    return brandList;
}
- Notice:
- In the case section, the code does not all have to be typed out by hand, but you must run it yourself to verify the final result!
- And work hard through the problems you encounter!
Two auto-completion
- The effect is as shown in the figure:
- Prompting complete entries based on the letters the user has typed is the auto-completion feature
- Because completion must be inferred from pinyin letters, the pinyin analysis feature is used
2.1 Installation of Pinyin word breaker
- To implement completion from letters, documents must be segmented by pinyin. There happens to be a pinyin analysis plugin for elasticsearch on GitHub (see the linked address)
- Download and unzip
- Upload it to the virtual machine, into elasticsearch's plugin directory
- To install the plugin, you need the location of elasticsearch's plugins directory. Since we use a data-volume mount, look up the volume's directory with the following command:
docker volume inspect es-plugins
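The command prints the volume's metadata; the Mountpoint field is the directory to upload into. An abridged, illustrative output (the exact path can differ by host):

```json
[
    {
        "CreatedAt": "...",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/es-plugins/_data",
        "Name": "es-plugins",
        "Options": null,
        "Scope": "local"
    }
]
```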
- Then upload the decompressed files to this directory, e.g. the pinyin plugin decompressed and renamed to py
- Restart elasticsearch
docker restart es
- Test
POST /_analyze
{
  "text": "如家酒店还不错",
  "analyzer": "pinyin"
}
2.2 Custom tokenizer
- The default pinyin analyzer converts each Chinese character into pinyin, but we want each term to form one group of pinyin, so we need to customize the pinyin setup into a custom analyzer.
- The analyzer in elasticsearch is composed of three parts:
- character filters: process the text before the tokenizer, e.g. deleting or replacing characters
- tokenizer: cuts the text into terms according to certain rules; for example, keyword performs no tokenization, and ik_smart is another option
- token filter: further processes the terms output by the tokenizer, e.g. case conversion, synonyms, pinyin
- When a document is analyzed, it passes through these three parts in turn:
- Demo:
# pinyin analyzer
DELETE /test
PUT /test
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "ik_max_word",
"filter": "py"
}
},
"filter": {
"py": {
"type": "pinyin",
"keep_full_pinyin": false,
"keep_joined_full_pinyin": true,
"keep_original": true,
"limit_first_letter_length": 16,
"remove_duplicated_term": true,
"none_chinese_pinyin_tokenize": false
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text",
"analyzer": "my_analyzer",
"search_analyzer": "ik_smart"
}
}
}
}
POST /test/_analyze
{
"text": ["如家酒店还不错"],
"analyzer": "my_analyzer"
}
- result:
- Precautions for the pinyin analyzer
- To avoid matching homophones, do not use the pinyin analyzer at search time
2.3 Autocomplete query
- Elasticsearch provides the Completion Suggester query for auto-completion. It matches terms beginning with the user's input and returns them. To make completion queries efficient, there are constraints on the field types in the document:
- The fields participating in the completion query must be of completion type.
- The content of the field is generally an array formed by multiple entries for completion.
- Demo:
# 自动补全查询
DELETE /test02
## 创建索引库
PUT /test02
{
"mappings": {
"properties": {
"title": {
"type": "completion"
}
}
}
}
## 示例数据
POST test02/_doc
{
"title": ["Sony", "WH-1000XM3"]
}
POST test02/_doc
{
"title": ["SK-II", "PITERA"]
}
POST test02/_doc
{
"title": ["Nintendo", "switch"]
}
## 自动补全查询
GET /test02/_search
{
"suggest": {
"title_suggest": {
"text": "s", # the keyword
"completion": {
"field": "title", # field for the completion query
"skip_duplicates": true, # skip duplicates
"size": 10 # return the top 10 results
}
}
}
}
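For reference, an abridged, illustrative response shape: the completed texts sit in the options array under suggest.title_suggest. With the example data above, "s" matches all three documents, since the completion field's default analyzer lowercases the input:

```json
{
  "suggest": {
    "title_suggest": [
      {
        "text": "s",
        "offset": 0,
        "length": 1,
        "options": [
          { "text": "SK-II" },
          { "text": "Sony" },
          { "text": "switch" }
        ]
      }
    ]
  }
}
```

Real responses also carry fields such as _index, _id and _score on each option; they are omitted here for brevity.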
2.4 Java API for auto-completion query
- First, look at the API for building the request parameters:
- Next, look at how the result is parsed:
- Key code:
@Override
public List<String> getSuggestions(String prefix) {
    try {
        // 1. prepare the Request
        SearchRequest request = new SearchRequest("hotel");
        // 2. prepare the DSL
        request.source().suggest(new SuggestBuilder().addSuggestion(
                "suggestions",
                SuggestBuilders.completionSuggestion("suggestion")
                        .prefix(prefix)
                        .skipDuplicates(true)
                        .size(10)
        ));
        // 3. send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. parse the result
        Suggest suggest = response.getSuggest();
        // 4.1. get the completion result by suggestion name
        CompletionSuggestion suggestions = suggest.getSuggestion("suggestions");
        // 4.2. get the options
        List<CompletionSuggestion.Entry.Option> options = suggestions.getOptions();
        // 4.3. iterate
        List<String> list = new ArrayList<>(options.size());
        for (CompletionSuggestion.Entry.Option option : options) {
            String text = option.getText().toString();
            list.add(text);
        }
        return list;
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
Three data synchronization
- Introduction:
3.1 Thought Analysis
- There are three common data synchronization schemes:
- Solution 1: Synchronous call
- Solution 2: Asynchronous notification
- Solution 3: Monitor binlog
3.2 Solution 1: Synchronous call
- The basic steps are as follows:
- hotel-demo provides an interface to modify the data in elasticsearch
- After the hotel management service completes the database operation, it directly calls the interface provided by hotel-demo
3.3 Solution 2: Asynchronous notification
- The process is as follows:
- Hotel-admin sends MQ message after adding, deleting and modifying mysql database data
- Hotel-demo listens to MQ and completes elasticsearch data modification after receiving the message
3.4 Monitor binlog
- The process is as follows:
- Enable the binlog function for mysql
- The addition, deletion, and modification operations of mysql will be recorded in the binlog
- Hotel-demo monitors binlog changes based on canal, and updates the content in elasticsearch in real time
3.5 Comparison and summary of three schemes
| plan | synchronous call | asynchronous notification | monitor binlog |
| --- | --- | --- | --- |
| advantages | simple, straightforward to implement | low coupling, average implementation difficulty | completely decoupled services |
| disadvantages | high business coupling | relies on the reliability of MQ | enabling binlog adds load on the database, and the implementation is complex |
3.6 Summary of Data Synchronization Cases
- When hotel data is added, deleted, or modified, the same operation must be applied to the data in elasticsearch.
- Steps:
- Import the hotel-admin project provided in the course materials; start it and test CRUD on hotel data
- Complete message sending in the add, delete, and change business in hotel-admin
- Use annotations to declare exchange, queue, and RoutingKey in hotel-demo, complete message monitoring, and update data in elasticsearch
- Start and test the data sync function
- The MQ structure is shown in the figure:
- Dependency:
<!--amqp-->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-amqp</artifactId>
</dependency>
- Declare the queue and exchange names
public class MqConstants {
    /**
     * exchange
     */
    public final static String HOTEL_EXCHANGE = "hotel.topic";
    /**
     * queue listening for inserts and updates
     */
    public final static String HOTEL_INSERT_QUEUE = "hotel.insert.queue";
    /**
     * queue listening for deletes
     */
    public final static String HOTEL_DELETE_QUEUE = "hotel.delete.queue";
    /**
     * RoutingKey for insert or update
     */
    public final static String HOTEL_INSERT_KEY = "hotel.insert";
    /**
     * RoutingKey for delete
     */
    public final static String HOTEL_DELETE_KEY = "hotel.delete";
}
- Send MQ messages
@PostMapping
public void saveHotel(@RequestBody Hotel hotel){
    hotelService.save(hotel);
    rabbitTemplate.convertAndSend(MqConstants.HOTEL_EXCHANGE, MqConstants.HOTEL_INSERT_KEY, hotel.getId());
}

@PutMapping()
public void updateById(@RequestBody Hotel hotel){
    if (hotel.getId() == null) {
        throw new InvalidParameterException("id must not be null");
    }
    hotelService.updateById(hotel);
    rabbitTemplate.convertAndSend(MqConstants.HOTEL_EXCHANGE, MqConstants.HOTEL_INSERT_KEY, hotel.getId());
}

@DeleteMapping("/{id}")
public void deleteById(@PathVariable("id") Long id) {
    hotelService.removeById(id);
    rabbitTemplate.convertAndSend(MqConstants.HOTEL_EXCHANGE, MqConstants.HOTEL_DELETE_KEY, id);
}
- Receive MQ messages
- When hotel-demo receives an MQ message, it needs to:
- Insert message: query the hotel by the given hotel id, then add a document to the index library
- Delete message: delete the document in the index library matching the given hotel id
- Define the insert and delete operations in the service interface:
void deleteById(Long id);
void insertById(Long id);
- Implement the business in its implementation class:
@Override
public void deleteById(Long id) {
    try {
        // 1. prepare the Request
        DeleteRequest request = new DeleteRequest("hotel", id.toString());
        // 2. send the request
        client.delete(request, RequestOptions.DEFAULT);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

@Override
public void insertById(Long id) {
    try {
        // 0. query the hotel data by id
        Hotel hotel = getById(id);
        // convert to the document type
        HotelDoc hotelDoc = new HotelDoc(hotel);
        // 1. prepare the Request object
        IndexRequest request = new IndexRequest("hotel").id(hotel.getId().toString());
        // 2. prepare the JSON document
        request.source(JSON.toJSONString(hotelDoc), XContentType.JSON);
        // 3. send the request
        client.index(request, RequestOptions.DEFAULT);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
- Write a listener
import cn.itcast.hotel.constants.MqConstants;
import cn.itcast.hotel.service.IHotelService;
import org.springframework.amqp.core.ExchangeTypes;
import org.springframework.amqp.rabbit.annotation.Exchange;
import org.springframework.amqp.rabbit.annotation.Queue;
import org.springframework.amqp.rabbit.annotation.QueueBinding;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
public class HotelListener {

    @Autowired
    private IHotelService hotelService;

    /**
     * listen for hotel insert or update messages
     * @param id hotel id
     */
    @RabbitListener(bindings = @QueueBinding(
            value = @Queue(name = MqConstants.HOTEL_INSERT_QUEUE),
            exchange = @Exchange(name = MqConstants.HOTEL_EXCHANGE, type = ExchangeTypes.TOPIC, autoDelete = "false", durable = "true"),
            key = {MqConstants.HOTEL_INSERT_KEY}
    ))
    public void listenHotelInsertOrUpdate(Long id){
        hotelService.insertById(id);
    }

    /**
     * listen for hotel delete messages
     * @param id hotel id
     */
    @RabbitListener(bindings = @QueueBinding(
            value = @Queue(name = MqConstants.HOTEL_DELETE_QUEUE),
            exchange = @Exchange(name = MqConstants.HOTEL_EXCHANGE, type = ExchangeTypes.TOPIC, autoDelete = "false", durable = "true"),
            key = {MqConstants.HOTEL_DELETE_KEY}
    ))
    public void listenHotelDelete(Long id){
        hotelService.deleteById(id);
    }
}
3.7 Test of data synchronization case
- In rabbitMq you can see that the queues have been registered
- Verify through the exchange's bindings that the project is running normally
- Modify the price of 上海希尔顿酒店 (Shanghai Hilton Hotel)
- View the document ID of 上海希尔顿酒店 through the vue devtools plugin
- Then edit its price
- View the queue's message record
- Check the price
3.8 Supplement: Installation of vue Devtools plugin
- I do not recommend that anyone download the source code and compile and install it manually! You will run into many errors, and the effort is thankless!
3.8.1 Edge browser installation method
- In the edge extension store, search and install
- The current version is stable version 6.5.0
3.8.2 How to install chrome browser
- Because chrome cannot normally access the Chrome Web Store from China, you need a third-party plugin website; see the Vue Devtools download address on the minimalist plugin website
- Download Vue Devtools and unzip it
- Turn on the developer mode of chrome: Settings->Extensions->Developer mode
- Drag the .crx file from the decompressed folder onto the chrome browser's extensions page, and click Add extension.
3.9 Explanation of the vue devtools tool
- Post-installation testing
- The console shows this plugin only when a vue front-end page is running locally; it does not appear on other web pages
- So the correct way to check the installation is: start a local vue project, open the console, and look for vue devtools
- In the plugin's default configuration file, persistent is already true, so no modification is needed
Four elasticsearch clusters
- Stand-alone elasticsearch for data storage will inevitably face two problems: massive data storage and single point of failure.
- Massive data storage problem: Logically split the index library into N shards (shards) and store them in multiple nodes
- Single point of failure problem: back up fragmented data on different nodes (replica)
4.1 ES cluster related concepts
- Cluster (cluster): A group of nodes with a common cluster name.
- Node (node) : an Elasticsearch instance in the cluster
- Shard : Indexes can be split into different parts for storage, called shards. In a cluster environment, different shards of an index can be split into different nodes
- Solve the problem: the amount of data is too large and the storage capacity of a single point is limited.
- Primary shard (Primary shard): the original shard, defined in contrast to replica shards.
- Replica shard (Replica shard): each primary shard can have one or more replicas whose data is identical to the primary's.
- In order to find a balance between high availability and cost, we can do this:
- First shard the data and store it in different nodes
- Then back up each shard and put it on the other node to complete mutual backup
- Now, each shard has 1 backup, stored on 3 nodes:
- node0: holds shards 0 and 1
- node1: holds shards 0 and 2
- node2: holds shards 1 and 2
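The layout above can be checked mechanically. A minimal sketch in plain Java (the class name and the node-to-shard table are illustrative, hard-coded from the list above) that verifies each shard has exactly two copies and that no node holds the same shard twice:

```java
import java.util.*;

public class ShardLayout {
    // node -> shards it holds, copied from the example above
    static Map<String, int[]> layout() {
        Map<String, int[]> m = new LinkedHashMap<>();
        m.put("node0", new int[]{0, 1});
        m.put("node1", new int[]{0, 2});
        m.put("node2", new int[]{1, 2});
        return m;
    }

    // true if every shard has exactly 2 copies (1 primary + 1 replica)
    // and no node stores two copies of the same shard
    static boolean balanced(Map<String, int[]> layout, int shardCount) {
        int[] copies = new int[shardCount];
        for (int[] shards : layout.values()) {
            Set<Integer> seen = new HashSet<>();
            for (int s : shards) {
                if (!seen.add(s)) return false; // same shard twice on one node
                copies[s]++;
            }
        }
        for (int c : copies) {
            if (c != 2) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(balanced(layout(), 3));
    }
}
```

If a shard kept both of its copies on one node, losing that node would lose the shard entirely; this is exactly what the layout avoids.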
4.2 Build an ES cluster
- We use docker containers to run multiple es instances on one machine to simulate an es cluster. In production, however, it is recommended to deploy only one es instance per server node.
4.3.1 Create es cluster
- First write a docker-compose.yml file with the following content:
version: '2.2'
services:
  es01:
    image: elasticsearch:7.12.1
    container_name: es01
    environment:
      - node.name=es01
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es02,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic
  es02:
    image: elasticsearch:7.12.1
    container_name: es02
    environment:
      - node.name=es02
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es03
      - cluster.initial_master_nodes=es01,es02,es03
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - data02:/usr/share/elasticsearch/data
    ports:
      - 9201:9200
    networks:
      - elastic
  es03:
    image: elasticsearch:7.12.1
    container_name: es03
    environment:
      - node.name=es03
      - cluster.name=es-docker-cluster
      - discovery.seed_hosts=es01,es02
      - cluster.initial_master_nodes=es01,es02,es03
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - data03:/usr/share/elasticsearch/data
    networks:
      - elastic
    ports:
      - 9202:9200
volumes:
  data01:
    driver: local
  data02:
    driver: local
  data03:
    driver: local
networks:
  elastic:
    driver: bridge
- Running es requires adjusting some linux kernel settings; edit the /etc/sysctl.conf file:
vi /etc/sysctl.conf
- Then execute the command to make the configuration take effect:
sysctl -p
- Start the cluster through docker-compose
docker-compose up -d
[root@kongyue tmp]# docker-compose up -d
Starting es01 ... done
Creating es03 ... done
Creating es02 ... done
4.4 Cluster status monitoring
4.4.1 Installing cerebro on Windows [not recommended]
- Kibana can monitor es clusters, but newer versions depend on es's x-pack feature, which is complicated to configure.
- It is recommended to use cerebro to monitor the es cluster's status instead. The directory after unzipping the official release looks like this:
- Enter the corresponding bin directory:
- Double-click the cerebro.bat file to start the service.
4.4.2 Crash-on-start problem [unresolved]
Oops, cannot start the server.
com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: Unable to load cache item
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051)
at com.google.common.cache.LocalCache.get(LocalCache.java:3951)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
at com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4964)
at com.google.inject.internal.FailableCache.get(FailableCache.java:54)
at com.google.inject.internal.ConstructorInjectorStore.get(ConstructorInjectorStore.java:49)
at com.google.inject.internal.ConstructorBindingImpl.initialize(ConstructorBindingImpl.java:155)
at com.google.inject.internal.InjectorImpl.initializeBinding(InjectorImpl.java:592)
at com.google.inject.internal.AbstractBindingProcessor$Processor.initializeBinding(AbstractBindingProcessor.java:173)
at com.google.inject.internal.AbstractBindingProcessor$Processor.lambda$scheduleInitialization$0(AbstractBindingProcessor.java:160)
at com.google.inject.internal.ProcessedBindingData.initializeBindings(ProcessedBindingData.java:49)
at com.google.inject.internal.InternalInjectorCreator.initializeStatically(InternalInjectorCreator.java:124)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:108)
at com.google.inject.Guice.createInjector(Guice.java:87)
at com.google.inject.Guice.createInjector(Guice.java:78)
at play.api.inject.guice.GuiceBuilder.injector(GuiceInjectorBuilder.scala:200)
at play.api.inject.guice.GuiceApplicationBuilder.build(GuiceApplicationBuilder.scala:155)
at play.api.inject.guice.GuiceApplicationLoader.load(GuiceApplicationLoader.scala:21)
at play.core.server.ProdServerStart$.start(ProdServerStart.scala:54)
at play.core.server.ProdServerStart$.main(ProdServerStart.scala:30)
at play.core.server.ProdServerStart.main(ProdServerStart.scala)
Caused by: java.lang.IllegalStateException: Unable to load cache item
at com.google.inject.internal.cglib.core.internal.$LoadingCache.createEntry(LoadingCache.java:79)
at com.google.inject.internal.cglib.core.internal.$LoadingCache.get(LoadingCache.java:34)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator$ClassLoaderData.get(AbstractClassGenerator.java:119)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator.create(AbstractClassGenerator.java:294)
at com.google.inject.internal.cglib.reflect.$FastClass$Generator.create(FastClass.java:65)
at com.google.inject.internal.BytecodeGen.newFastClassForMember(BytecodeGen.java:258)
at com.google.inject.internal.BytecodeGen.newFastClassForMember(BytecodeGen.java:207)
at com.google.inject.internal.DefaultConstructionProxyFactory.create(DefaultConstructionProxyFactory.java:49)
at com.google.inject.internal.ProxyFactory.create(ProxyFactory.java:156)
at com.google.inject.internal.ConstructorInjectorStore.createConstructor(ConstructorInjectorStore.java:94)
at com.google.inject.internal.ConstructorInjectorStore.access$000(ConstructorInjectorStore.java:30)
at com.google.inject.internal.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:38)
at com.google.inject.internal.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:34)
at com.google.inject.internal.FailableCache$1.load(FailableCache.java:43)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
... 21 more
Caused by: java.lang.ExceptionInInitializerError
at com.google.inject.internal.cglib.core.$DuplicatesPredicate.evaluate(DuplicatesPredicate.java:104)
at com.google.inject.internal.cglib.core.$CollectionUtils.filter(CollectionUtils.java:52)
at com.google.inject.internal.cglib.reflect.$FastClassEmitter.<init>(FastClassEmitter.java:69)
at com.google.inject.internal.cglib.reflect.$FastClass$Generator.generateClass(FastClass.java:77)
at com.google.inject.internal.cglib.core.$DefaultGeneratorStrategy.generate(DefaultGeneratorStrategy.java:25)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator.generate(AbstractClassGenerator.java:332)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator$ClassLoaderData$3.apply(AbstractClassGenerator.java:96)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator$ClassLoaderData$3.apply(AbstractClassGenerator.java:94)
at com.google.inject.internal.cglib.core.internal.$LoadingCache$2.call(LoadingCache.java:54)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at com.google.inject.internal.cglib.core.internal.$LoadingCache.createEntry(LoadingCache.java:61)
... 38 more
Caused by: com.google.inject.internal.cglib.core.$CodeGenerationException: java.lang.reflect.InaccessibleObjectException-->Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module @6a988392
at com.google.inject.internal.cglib.core.$ReflectUtils.defineClass(ReflectUtils.java:464)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator.generate(AbstractClassGenerator.java:339)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator$ClassLoaderData$3.apply(AbstractClassGenerator.java:96)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator$ClassLoaderData$3.apply(AbstractClassGenerator.java:94)
at com.google.inject.internal.cglib.core.internal.$LoadingCache$2.call(LoadingCache.java:54)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
at com.google.inject.internal.cglib.core.internal.$LoadingCache.createEntry(LoadingCache.java:61)
at com.google.inject.internal.cglib.core.internal.$LoadingCache.get(LoadingCache.java:34)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator$ClassLoaderData.get(AbstractClassGenerator.java:119)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator.create(AbstractClassGenerator.java:294)
at com.google.inject.internal.cglib.core.$KeyFactory$Generator.create(KeyFactory.java:221)
at com.google.inject.internal.cglib.core.$KeyFactory.create(KeyFactory.java:174)
at com.google.inject.internal.cglib.core.$KeyFactory.create(KeyFactory.java:157)
at com.google.inject.internal.cglib.core.$KeyFactory.create(KeyFactory.java:149)
at com.google.inject.internal.cglib.core.$KeyFactory.create(KeyFactory.java:145)
at com.google.inject.internal.cglib.core.$MethodWrapper.<clinit>(MethodWrapper.java:23)
... 49 more
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make protected final java.lang.Class java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain) throws java.lang.ClassFormatError accessible: module java.base does not "opens java.lang" to unnamed module @6a988392
at java.base/java.lang.reflect.AccessibleObject.throwInaccessibleObjectException(AccessibleObject.java:387)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:363)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:311)
at java.base/java.lang.reflect.Method.checkCanSetAccessible(Method.java:201)
at java.base/java.lang.reflect.Method.setAccessible(Method.java:195)
at com.google.inject.internal.cglib.core.$ReflectUtils$1.run(ReflectUtils.java:61)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:569)
at com.google.inject.internal.cglib.core.$ReflectUtils.<clinit>(ReflectUtils.java:52)
at com.google.inject.internal.cglib.reflect.$FastClassEmitter.<init>(FastClassEmitter.java:67)
... 46 more
- Because the JDK version is too high, it is not compatible with cerebro, and there seems to be no solution at present!!!
- Of course, the author's level is limited! If you know of a solution, please let me know, thank you very much
4.4.3 Install cerebro in Linux
- Download cerebro
- Suggestion: install a GitHub accelerator plugin in the browser [skip this if you already have a VPN]
- Then upload the package to Linux and install it:
rpm -ivh cerebro-0.9.4-1.noarch.rpm
- Modify the configuration file
vim /usr/share/cerebro/conf/application.conf
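The main thing to change in the configuration file is usually the `hosts` block, which pre-registers your ES cluster so cerebro can connect to it. The snippet below is a sketch; the IP address is an assumption for this tutorial's environment, and the exact keys may differ between cerebro versions:

```conf
# /usr/share/cerebro/conf/application.conf (excerpt)
hosts = [
  {
    host = "http://192.168.150.101:9200"  # address of any ES node (assumed IP, adjust to yours)
    name = "es-cluster"                   # display name shown in the cerebro UI
  }
]
```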
- Start, stop, and check cerebro via systemd: [not recommended]
- When started this way, cerebro cannot be accessed from external devices
# stop
systemctl stop cerebro
# start
systemctl start cerebro
# check status
systemctl status cerebro
- Start command:
- In order to facilitate troubleshooting, you can directly use the command to start cerebro
/usr/share/cerebro/bin/cerebro
- On startup you should see:
[info] play.api.Play - Application started (Prod) (no global state)
[info] p.c.s.AkkaHttpServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9000
- Visit:
http://ip:9000
- Enter the address and port of any elasticsearch node, then click connect
- A green bar means the cluster status is green (healthy)
4.4.5 Create an index library
4.4.6 Using kibana's DevTools to create an index library [non-practical operation]
- Enter the command in DevTools:
PUT /itcast
{
  "settings": {
    "number_of_shards": 3,     // number of shards
    "number_of_replicas": 1    // number of replicas
  },
  "mappings": {
    "properties": {
      // mapping definitions
      ...
    }
  }
}
4.4.7 Using cerebro to create an index library [practical operation]
- You can also create an index library with cerebro:
- Fill in the index library information:
- Click the create button in the lower right corner
4.4.8 View the Shard Distribution
- Go back to the home page, where you can see how the index library's shards are distributed across the nodes:
4.9 Cluster split-brain problem
4.9.1 Division of Cluster Responsibilities
- Cluster nodes in elasticsearch have different responsibilities, and these responsibilities should be separated:
- master node: high CPU requirements, relatively low memory requirements
- data node: high CPU and memory requirements
- coordinating node: high network bandwidth and CPU requirements
- Separating duties lets you match hardware to each node's needs and prevents the workloads from interfering with each other.
- Each node role in elasticsearch has its own responsibilities, so during cluster deployment it is recommended that every node plays a single, independent role.
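As a sketch of what role separation can look like in `elasticsearch.yml` (the `node.roles` syntax applies to ES 7.9 and later; older versions use `node.master` / `node.data` booleans instead):

```yaml
# elasticsearch.yml -- one role per node (ES 7.9+ syntax)
node.roles: [ master ]    # dedicated master-eligible node
# node.roles: [ data ]    # dedicated data node
# node.roles: [ ]         # empty list = coordinating-only node
```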
4.9.2 Split brain problem
- By default, every node is a master-eligible node, so once the master node goes down, the other candidate nodes elect one of themselves as the new master. A split-brain problem can occur when the network between the master node and the other nodes fails.
- For example, if node1 is the master and loses contact with node2 and node3: node2 and node3 elect node3 and continue to provide service as one cluster, while node1 keeps running as a cluster by itself. The data of the two clusters is no longer synchronized, leading to inconsistencies; this is the split-brain situation.
- To avoid split brain, a candidate must receive at least (number of master-eligible nodes + 1) / 2 votes to be elected master, so the number of eligible nodes is best kept odd.
- The corresponding configuration item is discovery.zen.minimum_master_nodes; since es7.0 this quorum is handled automatically by default, so split brain generally no longer occurs.
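The quorum rule above can be sketched in a few lines. This is illustrative arithmetic, not Elasticsearch source code: a candidate needs a strict majority of the master-eligible nodes, which is why an odd node count is recommended.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Minimum number of votes required to elect a master:
    at least (n + 1) / 2, i.e. a strict majority of n."""
    return master_eligible_nodes // 2 + 1

# With 3 eligible nodes, 2 votes are required. If the network splits
# 1 vs 2, only the 2-node side can elect a master: no split brain.
print(quorum(3))  # 2
# With an even count (e.g. 4), a 2-vs-2 split leaves NEITHER side able
# to reach the 3-vote quorum, so the cluster stalls instead of splitting.
print(quorum(4))  # 3
```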
4.9.3 Summary
- The role of the master-eligible node:
- Participate in master election
- As the master: manage the cluster state, manage shard information, and handle requests to create and delete index libraries
- The role of the data node:
- CRUD operations on document data
- The role of the coordinating node:
- Route requests to other nodes
- Merge query results and return them to the user
4.10 Cluster Distributed Storage
- When new documents are added, they should be saved in different shards to ensure data balance, so how does the coordinating node determine which shard the data should be stored in?
4.10.1 Shard storage test
- The tool used for testing is Insomnia. Like Postman, it is a free cross-platform desktop application for API testing (see the Insomnia official website); if you want to download it, mirror sites may be faster than the official one
- Here the author provides the latest version, "Insomnia.Core-2023.1.0.exe"
- You can see from the test that the three pieces of data are in different shards:
4.10.2 Shard storage principle
- Elasticsearch uses a hash algorithm to calculate which shard a document should be stored in: shard = hash(_routing) % number_of_shards
- Explanation:
- _routing defaults to the id of the document
- The algorithm depends on the number of shards, so once the index library is created, the number of shards cannot be changed
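The routing formula can be sketched as below. Elasticsearch actually uses a murmur3 hash internally; Python's `zlib.crc32` stands in here purely to show the mechanics:

```python
import zlib

def route(doc_id: str, number_of_shards: int) -> int:
    """shard = hash(_routing) % number_of_shards (illustrative hash)."""
    routing = doc_id  # _routing defaults to the document id
    return zlib.crc32(routing.encode()) % number_of_shards

# The same id always lands on the same shard...
print(route("1", 3) == route("1", 3))  # True
# ...but with a different shard count documents map to different shards,
# which is why the shard count is fixed once the index is created.
print([route(str(i), 3) for i in range(5)])
```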
- The process of adding a new document is as follows
- Interpretation:
- 1) Add a document with id=1
- 2) Do a hash operation on the id, if the result is 2, it should be stored in shard-2
- 3) The primary shard of shard-2 is on node3, and the data is routed to node3
- 4) Save the document
- 5) Synchronize to replica-2 of shard-2, on the node2 node
- 6) Return the result to the coordinating-node node
4.10.3 Cluster Distributed Query
- An elasticsearch query runs in two phases:
- scatter phase: the coordinating node distributes the request to every shard
- gather phase: the coordinating node collects the search results from the data nodes, merges them into the final result set, and returns it to the user
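The two phases above can be sketched as a toy simulation (not the real ES implementation): each shard returns its local top-k, and the coordinating node merges the partial results into the global top-k.

```python
import heapq

def scatter(shards, top_k):
    # scatter phase: each shard computes its own local top-k
    # (here "top" means smallest values, kept sorted)
    return [heapq.nsmallest(top_k, shard) for shard in shards]

def gather(per_shard_results, top_k):
    # gather phase: merge the sorted per-shard results into the global top-k
    return heapq.nsmallest(top_k, heapq.merge(*per_shard_results))

shards = [[9, 3, 7], [4, 8, 1], [6, 2, 5]]
partial = scatter(shards, 2)      # [[3, 7], [1, 4], [2, 5]]
print(gather(partial, 3))         # [1, 2, 3]
```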
4.10.4 Cluster failover
- Failover: the cluster's master node monitors the status of the nodes in the cluster. If a node is found to be down, its shard data is immediately migrated to other nodes to ensure data safety.
- For example, a cluster structure is shown in the figure: node1 is the master node, and the other two nodes are slave nodes
- node1 has failed
- The first thing after the downtime is to re-elect the master, for example, select node2
- After node2 becomes the master node, it checks the cluster health and finds that shard-1 and shard-0 are missing copies. Therefore, the data that was on node1 needs to be rebuilt on node2 and node3
- Animation demo:
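The failover behaviour can be sketched as a toy simulation (purely illustrative, not Elasticsearch's allocation logic): when a node dies, any shard that lost a copy gets a new copy on a surviving node that does not already hold one.

```python
def failover(copies, nodes, dead, replication=2):
    """copies: {shard_id: set of nodes holding a copy}.
    Remove the dead node and recreate lost copies on surviving nodes."""
    survivors = [n for n in nodes if n != dead]
    for shard, holders in copies.items():
        holders.discard(dead)
        if len(holders) < replication:
            # pick the least-loaded survivor that doesn't already hold this shard
            target = min((n for n in survivors if n not in holders),
                         key=lambda n: sum(n in h for h in copies.values()))
            holders.add(target)
    return copies

# 3 shards, 2 copies each (1 primary + 1 replica), spread over 3 nodes
copies = {0: {"node1", "node2"}, 1: {"node2", "node3"}, 2: {"node3", "node1"}}
result = failover(copies, ["node1", "node2", "node3"], "node1")
# every shard still has 2 copies, none of them on the dead node1
print(result)
```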