Troubleshooting a SkyWalking problem with Arthas: id is too long, must be no longer than 512 bytes

id is too long, must be no longer than 512 bytes

The deployed SkyWalking kept crashing and the CPU was maxed out. Checking skywalking-oap-server.log turned up many error entries like the following:

2021-02-20 17:27:18,699 - org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker - 105 [DataCarrier.REGISTER_L2.BulkConsumePool.0.Thread] ERROR [] - Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 642;
org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 642;
	at org.elasticsearch.action.ValidateActions.addValidationError(ValidateActions.java:26) ~[elasticsearch-6.3.2.jar:6.3.2]
	at org.elasticsearch.action.index.IndexRequest.validate(IndexRequest.java:183) ~[elasticsearch-6.3.2.jar:6.3.2]
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:515) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:508) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
	at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:348) ~[elasticsearch-rest-high-level-client-6.3.2.jar:6.3.2]
	at org.apache.skywalking.oap.server.library.client.elasticsearch.ElasticSearchClient.forceInsert(ElasticSearchClient.java:241) ~[library-client-6.3.0.jar:6.3.0]
	at org.apache.skywalking.oap.server.storage.plugin.elasticsearch.base.RegisterEsDAO.forceInsert(RegisterEsDAO.java:51) ~[storage-elasticsearch-plugin-6.3.0.jar:6.3.0]
	at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker.lambda$onWork$0(RegisterPersistentWorker.java:102) ~[server-core-6.3.0.jar:6.3.0]
	at java.util.HashMap$Values.forEach(HashMap.java:981) [?:1.8.0_221]
	at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker.onWork(RegisterPersistentWorker.java:84) [server-core-6.3.0.jar:6.3.0]
	at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker.access$100(RegisterPersistentWorker.java:35) [server-core-6.3.0.jar:6.3.0]
	at org.apache.skywalking.oap.server.core.register.worker.RegisterPersistentWorker$PersistentConsumer.consume(RegisterPersistentWorker.java:141) [server-core-6.3.0.jar:6.3.0]
	at org.apache.skywalking.apm.commons.datacarrier.consumer.MultipleChannelsConsumer.consume(MultipleChannelsConsumer.java:82) [apm-datacarrier-6.3.0.jar:6.3.0]
	at org.apache.skywalking.apm.commons.datacarrier.consumer.MultipleChannelsConsumer.run(MultipleChannelsConsumer.java:53) [apm-datacarrier-6.3.0.jar:6.3.0]
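
To get a feel for how often the error occurs and how large the rejected ids are, the log can be scanned for the validation message. The following is only a minimal sketch: the class name IdTooLongLogScan and its helper are made up for illustration, and the pattern is taken from the message text shown above. Each failure appears twice in the log (once on the ERROR line and once on the exception line), so the count is roughly double the number of failed writes.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SkyWalking or Arthas.
final class IdTooLongLogScan {
    // Captures the size reported at the end of the validation message.
    private static final Pattern TOO_LONG =
        Pattern.compile("id is too long, must be no longer than 512 bytes but was: (\\d+)");

    /** Returns {number of matching lines, largest reported id size in bytes}. */
    static int[] summarize(List<String> lines) {
        int count = 0;
        int max = 0;
        for (String line : lines) {
            Matcher m = TOO_LONG.matcher(line);
            if (m.find()) {
                count++;
                max = Math.max(max, Integer.parseInt(m.group(1)));
            }
        }
        return new int[] {count, max};
    }

    public static void main(String[] args) throws Exception {
        // With an argument, read that log file; otherwise fall back to a sample line.
        List<String> lines = args.length > 0
            ? Files.readAllLines(Paths.get(args[0]))
            : Arrays.asList("Validation Failed: 1: id is too long, "
                + "must be no longer than 512 bytes but was: 642;");
        int[] result = summarize(lines);
        System.out.println("matching lines: " + result[0]
            + ", largest id: " + result[1] + " bytes");
    }
}
```

Run it with the path to skywalking-oap-server.log as the only argument.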

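I found this article online: https://www.cnblogs.com/kebibuluan/p/13633037.html
Problem analysis:

As the timestamps on the exception output above show, the log is being flooded with this error at a furious rate. The exception message tells us the cause: the index id is too long when SkyWalking writes to Elasticsearch. Here is the relevant Elasticsearch source code: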

        if (id != null && id.getBytes(StandardCharsets.UTF_8).length > 512) {
            validationException = addValidationError("id is too long, must be no longer than 512 bytes but was: " +
                            id.getBytes(StandardCharsets.UTF_8).length, validationException);
        }
For details, see: elasticsearch/action/index/IndexRequest.java#L240
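
The check counts UTF-8 bytes, not characters, so an id made of non-ASCII text hits the limit sooner than its character count suggests. The sketch below reproduces the same comparison outside Elasticsearch so it can be tried in isolation. The class IdLengthCheck and its methods are illustrative names, not Elasticsearch API, and the method returns a plain string rather than the real exception type.

```java
import java.nio.charset.StandardCharsets;

// Illustrative stand-in for the length check quoted above.
final class IdLengthCheck {
    static final int MAX_ID_BYTES = 512; // limit quoted in the error message

    /** Length of the id in UTF-8 bytes, which is what the check measures. */
    static int utf8Length(String id) {
        return id.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Returns null when the id passes, otherwise a message shaped like the logged one. */
    static String validate(String id) {
        if (id != null && utf8Length(id) > MAX_ID_BYTES) {
            return "id is too long, must be no longer than 512 bytes but was: "
                + utf8Length(id);
        }
        return null;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 513; i++) {
            sb.append('a');
        }
        System.out.println(validate(sb.toString()));    // one byte over the limit
        System.out.println(utf8Length("\u4e2d\u6587")); // 2 characters, 6 UTF-8 bytes
    }
}
```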

The article gives the approach but not the concrete steps. Here they are.

Install Arthas

# Download
curl -O https://alibaba.github.io/arthas/arthas-boot.jar

# Run
java -Dfile.encoding=UTF-8 -jar arthas-boot.jar
# Note: -Dfile.encoding=UTF-8 is a Java setting that keeps Chinese characters in the Arthas output from being garbled

Output when run

[root@skywailing-aliyun tools]# java -Dfile.encoding=UTF-8 -jar arthas-boot.jar
[INFO] arthas-boot version: 3.4.5
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 22721 org.elasticsearch.bootstrap.Elasticsearch
  [2]: 18622 /data/skywalking/webapp/skywalking-webapp.jar
  [3]: 18863 org.apache.skywalking.oap.server.starter.OAPServerStartUp
3     # Enter 3 here to select the SkyWalking service (the OAP server)
[INFO] arthas home: /root/.arthas/lib/3.4.6/arthas
[INFO] Try to attach process 18863
[INFO] Attach process 18863 success.
[INFO] arthas-client connect 127.0.0.1 3658
  ,---.  ,------. ,--------.,--.  ,--.  ,---.   ,---.                           
 /  O  \ |  .--. ''--.  .--'|  '--'  | /  O  \ '   .-'                          
|  .-.  ||  '--'.'   |  |   |  .--.  ||  .-.  |`.  `-.                          
|  | |  ||  |\  \    |  |   |  |  |  ||  | |  |.-'    |                         
`--' `--'`--' '--'   `--'   `--'  `--'`--' `--'`-----'                          
                                                                                

wiki      https://arthas.aliyun.com/doc                                         
tutorials https://arthas.aliyun.com/doc/arthas-tutorials.html                   
version   3.4.6                                                                 
pid       18863                                                                 
time      2021-02-20 17:38:28                                                   

[arthas@18863]$ 

Search by keyword

The error log contains the following stack frame, in which the index request is validated. The problem lies in the request, so this is the class to look at.

at org.elasticsearch.action.index.IndexRequest.validate(IndexRequest.java:183) ~[elasticsearch-6.3.2.jar:6.3.2]

Run sc to search for classes
Run sm to search for methods

[arthas@18863]$ sc org.elasticsearch.action.index.*
org.apache.skywalking.oap.server.library.client.elasticsearch.ElasticSearchInsertRequest
org.elasticsearch.action.index.IndexRequest
org.elasticsearch.action.index.IndexResponse
org.elasticsearch.action.index.IndexResponse$$Lambda$340/154425708
org.elasticsearch.action.index.IndexResponse$Builder
Affect(row-cnt:5) cost in 15 ms.
# The wildcard search finds org.elasticsearch.action.index.IndexRequest
# List its methods
[arthas@18863]$ sm org.elasticsearch.action.index.IndexRequest
org.elasticsearch.action.index.IndexRequest <init>()V
org.elasticsearch.action.index.IndexRequest <init>(Ljava/lang/String;Ljava/lang/String;)V
org.elasticsearch.action.index.IndexRequest <init>(Ljava/lang/String;)V
org.elasticsearch.action.index.IndexRequest <init>(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)V
org.elasticsearch.action.index.IndexRequest validate()Lorg/elasticsearch/action/ActionRequestValidationException;
org.elasticsearch.action.index.IndexRequest getContentType()Lorg/elasticsearch/common/xcontent/XContentType;
org.elasticsearch.action.index.IndexRequest process(Lorg/elasticsearch/Version;Lorg/elasticsearch/cluster/metadata/MappingMetaData;Ljava/lang/String;)V
org.elasticsearch.action.index.IndexRequest writeTo(Lorg/elasticsearch/common/io/stream/StreamOutput;)V
org.elasticsearch.action.index.IndexRequest source()Lorg/elasticsearch/common/bytes/BytesReference;
org.elasticsearch.action.index.IndexRequest source(Ljava/util/Map;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source(Lorg/elasticsearch/common/bytes/BytesReference;Lorg/elasticsearch/common/xcontent/XContentType;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source([BIILorg/elasticsearch/common/xcontent/XContentType;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source(Lorg/elasticsearch/common/xcontent/XContentType;[Ljava/lang/Object;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source([Ljava/lang/Object;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source(Lorg/elasticsearch/common/xcontent/XContentBuilder;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source(Ljava/lang/String;Lorg/elasticsearch/common/xcontent/XContentType;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source(Ljava/util/Map;Lorg/elasticsearch/common/xcontent/XContentType;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest source([BLorg/elasticsearch/common/xcontent/XContentType;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest routing(Ljava/lang/String;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest routing(Ljava/lang/String;)Ljava/lang/Object;
org.elasticsearch.action.index.IndexRequest routing()Ljava/lang/String;
org.elasticsearch.action.index.IndexRequest opType(Ljava/lang/String;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest opType()Lorg/elasticsearch/action/DocWriteRequest$OpType;
org.elasticsearch.action.index.IndexRequest opType(Lorg/elasticsearch/action/DocWriteRequest$OpType;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest versionType(Lorg/elasticsearch/index/VersionType;)Ljava/lang/Object;
org.elasticsearch.action.index.IndexRequest versionType(Lorg/elasticsearch/index/VersionType;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest versionType()Lorg/elasticsearch/index/VersionType;
org.elasticsearch.action.index.IndexRequest isRetry()Z
org.elasticsearch.action.index.IndexRequest setPipeline(Ljava/lang/String;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest getPipeline()Ljava/lang/String;
org.elasticsearch.action.index.IndexRequest sourceAsMap()Ljava/util/Map;
org.elasticsearch.action.index.IndexRequest resolveVersionDefaults()J
org.elasticsearch.action.index.IndexRequest resolveRouting(Lorg/elasticsearch/cluster/metadata/MetaData;)V
org.elasticsearch.action.index.IndexRequest readFrom(Lorg/elasticsearch/common/io/stream/StreamInput;)V
org.elasticsearch.action.index.IndexRequest onRetry()V
org.elasticsearch.action.index.IndexRequest getAutoGeneratedTimestamp()J
org.elasticsearch.action.index.IndexRequest setShardId(Lorg/elasticsearch/index/shard/ShardId;)Lorg/elasticsearch/action/support/replication/ReplicationRequest;
org.elasticsearch.action.index.IndexRequest setShardId(Lorg/elasticsearch/index/shard/ShardId;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest parent(Ljava/lang/String;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest parent()Ljava/lang/String;
org.elasticsearch.action.index.IndexRequest type(Ljava/lang/String;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest type()Ljava/lang/String;
org.elasticsearch.action.index.IndexRequest toString()Ljava/lang/String;
org.elasticsearch.action.index.IndexRequest create(Z)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest version(J)Ljava/lang/Object;
org.elasticsearch.action.index.IndexRequest version(J)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest version()J
org.elasticsearch.action.index.IndexRequest id(Ljava/lang/String;)Lorg/elasticsearch/action/index/IndexRequest;
org.elasticsearch.action.index.IndexRequest id()Ljava/lang/String;
org.apache.skywalking.oap.server.library.client.elasticsearch.ElasticSearchInsertRequest <init>(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)V
org.apache.skywalking.oap.server.library.client.elasticsearch.ElasticSearchInsertRequest source(Lorg/elasticsearch/common/xcontent/XContentBuilder;)Lorg/apache/skywalking/oap/server/library/client/elasticsearch/ElasticSearchInsertRequest;
org.apache.skywalking.oap.server.library.client.elasticsearch.ElasticSearchInsertRequest source(Lorg/elasticsearch/common/xcontent/XContentBuilder;)Lorg/elasticsearch/action/index/IndexRequest;
Affect(row-cnt:52) cost in 19 ms.

The list includes a validate method. Next, watch that method directly.

Locating the problem

[arthas@18863]$ watch org.elasticsearch.action.index.IndexRequest validate 
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 1) cost in 115 ms, listenerId: 9
method=org.elasticsearch.action.index.IndexRequest.validate location=AtExit
ts=2021-02-20 17:09:48; [cost=1.001631ms] result=@ArrayList[
    @Object[][isEmpty=true;size=0],
    @ElasticSearchInsertRequest[index {
    
    [es_endpoint_inventory][type][21_/api/platform-merchants-by-merchant-no/394/8628888495116643,5182073620535000,3443172328875926,2250786343073631,8576092672333003,6763668243141332,6492958354264545,6371978301346528,4396232269781811,7076446563756034,6243473869242754,3209554209322923,2048713085087784,3578049506120495,9089969774634252,9584107801344760,1963194682021308,6372911380583361,6052625865198842,5360949050957454,6821866195213903,5793037639974033,1524789348846301,3028126582081557,8458758411205068,3949114501690683,5959222159114331,5451554830330944,3385326099946619,6865603164164470,4112671470578858,8784451002706920,5688016656337076,2982511945470498,1435713064978363_0], source[{
    
    "sequence":9870,"last_update_time":0,"heartbeat_time":1613812188462,"service_id":21,"name":"/api/platform-merchants-by-merchant-no/394/8628888495116643,5182073620535000,3443172328875926,2250786343073631,8576092672333003,6763668243141332,6492958354264545,6371978301346528,4396232269781811,7076446563756034,6243473869242754,3209554209322923,2048713085087784,3578049506120495,9089969774634252,9584107801344760,1963194682021308,6372911380583361,6052625865198842,5360949050957454,6821866195213903,5793037639974033,1524789348846301,3028126582081557,8458758411205068,3949114501690683,5959222159114331,5451554830330944,3385326099946619,6865603164164470,4112671470578858,8784451002706920,5688016656337076,2982511945470498,1435713064978363","detect_point":0,"register_time":1613812188462}]}],
    @ActionRequestValidationException[org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 642;],
]
method=org.elasticsearch.action.index.IndexRequest.validate location=AtExit

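Here you can see the offending endpoint directly: /api/platform-merchants-by-merchant-no. The endpoint name carries a long comma-separated list of numbers in the URL path, and that name is part of the id, which is why the id reaches 642 bytes.
The next step is to work with the developers to find the service that exposes this endpoint.
Workaround: temporarily remove the SkyWalking agent from the startup script of the application that was identified.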
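
The 642 bytes in the error can be accounted for from the watch output. The sketch below assumes the id is laid out as service_id, endpoint name and detect_point joined by underscores. That layout is only inferred from the captured request (21_..._0), not taken from the SkyWalking source, and the 16-digit value is a placeholder rather than real data. EndpointIdSize and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that rebuilds an id shaped like the one in the watch output.
final class EndpointIdSize {
    /** Assumed id layout, inferred from the watch output: serviceId_name_detectPoint. */
    static String buildId(int serviceId, String endpointName, int detectPoint) {
        return serviceId + "_" + endpointName + "_" + detectPoint;
    }

    /** Endpoint name shaped like the captured one: a fixed prefix plus comma-separated numbers. */
    static String sampleName(int numberCount) {
        StringBuilder name = new StringBuilder("/api/platform-merchants-by-merchant-no/394/");
        for (int i = 0; i < numberCount; i++) {
            if (i > 0) {
                name.append(',');
            }
            name.append("1234567890123456"); // placeholder 16-digit value, not real data
        }
        return name.toString();
    }

    static int idBytes(int numberCount) {
        return buildId(21, sampleName(numberCount), 0)
            .getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // The captured request carried 35 numbers in the path.
        int bytes = idBytes(35);
        System.out.println(bytes + " bytes, over limit: " + (bytes > 512));
    }
}
```

With 35 numbers the sketch yields 642 bytes, matching the size in the log. Under the same assumed layout, 27 such numbers would still fit (506 bytes) and 28 would not (523 bytes).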

Exit debugging

[arthas@18863]$ stop
Resetting all enhanced classes ...
Affect(class count: 2 , method count: 0) cost in 106 ms, listenerId: 0
Arthas Server is going to shut down...
[arthas@18863]$ session (16ba375e-c26d-4b1c-9398-2866ec3eed9b) is closed because server is going to shutdown.

Reference articles:
Using Arthas to troubleshoot an unavailable production SkyWalking:
https://www.cnblogs.com/kebibuluan/p/13633037.html

Arthas watch command usage guide:
https://my.oschina.net/u/3874284/blog/4306792

Arthas class/classloader commands: sc and sm:
https://blog.csdn.net/a772304419/article/details/108432685

Using the Arthas watch command to observe a method's input and output:
https://www.cnblogs.com/doit8791/p/12040642.html


Origin blog.csdn.net/lswzw/article/details/113887840