Problem Background
In the previous chapter we used Sleuth + Zipkin + Kafka + Elasticsearch to build a simple distributed tracing system. If you tried the interface calls yourself, you will have noticed that the global dependency graph is not generated automatically by default. The official documentation explains that when Elasticsearch is used as the storage backend, a separate "spark job" must be run to aggregate the global dependency links.
https://github.com/openzipkin/zipkin
See: https://github.com/openzipkin/zipkin-dependencies
Running zipkin-dependencies.jar
In the Elasticsearch Storage section of the README, the following commands are provided:
$ STORAGE_TYPE=elasticsearch ES_HOSTS=host1,host2 java -jar zipkin-dependencies.jar
# To override the http port, add it to the host string
$ STORAGE_TYPE=elasticsearch ES_HOSTS=host1:9201 java -jar zipkin-dependencies.jar
The inline `VAR=value` syntax above does not work on Windows (at least it did not for me), so I ran the job on a Linux machine instead:
STORAGE_TYPE=elasticsearch ES_HOSTS=localhost:9200 nohup java -jar zipkin-dependencies-2.0.0.jar &
The log output was:
18/08/07 15:56:07 INFO ElasticsearchDependenciesJob: Processing spans from zipkin-2018-08-07/span
18/08/07 15:56:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/08/07 15:56:08 WARN Utils: Your hostname, localhost.localdomain resolves to a loopback address: 127.0.0.1; using 10.208.204.46 instead (on interface ens33)
18/08/07 15:56:08 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
18/08/07 15:56:08 WARN Java7Support: Unable to load JDK7 types (annotations, java.nio.file.Path): no Java7 support added
18/08/07 15:56:09 INFO ElasticsearchDependenciesJob: No spans found at zipkin-2018-08-07/span
18/08/07 15:56:09 INFO ElasticsearchDependenciesJob: Processing spans from zipkin:span-2018-08-07/span
18/08/07 15:56:10 INFO ElasticsearchDependenciesJob: Saving dependency links to zipkin:dependency-2018-08-07/dependency
18/08/07 15:56:10 INFO ElasticsearchDependenciesJob: Done
As the log shows, launching the jar starts an ElasticsearchDependenciesJob, which generates the dependency links and then prints "Done". After that, the Java process is nowhere to be found: the jar runs the aggregation exactly once, the job finishes, and the whole process exits. So to keep the dependency graph up to date you would have to rerun the jar by hand every time, which is rather absurd. Out of sheer stubbornness, I decided to dig into the source code and find out what is going on.
Source Code Analysis
Finding the Entry Point
Since zipkin-dependencies.jar is an executable jar, we can open it with a decompiler and locate its entry class by inspecting META-INF/MANIFEST.MF:
Manifest-Version: 1.0
Built-By: travis
Created-By: Apache Maven 3.5.0
Build-Jdk: 1.8.0_144
Main-Class: zipkin.dependencies.ZipkinDependenciesJob
The Main-Class entry shows that the startup class is zipkin.dependencies.ZipkinDependenciesJob, so let's find that class next.
ZipkinDependenciesJob
public static void main(String[] args) throws UnsupportedEncodingException {
  String jarPath = pathToUberJar();
  // day to aggregate: an optional date argument, defaulting to "now"
  long day = args.length == 1 ? parseDay(args[0]) : System.currentTimeMillis();
  // the storage backend is chosen via the STORAGE_TYPE environment variable
  String storageType = System.getenv("STORAGE_TYPE");
  if (storageType == null) {
    throw new IllegalArgumentException("STORAGE_TYPE not set");
  }
  String zipkinLogLevel = System.getenv("ZIPKIN_LOG_LEVEL");
  if (zipkinLogLevel == null) {
    zipkinLogLevel = "INFO";
  }
  Runnable logInitializer = LogInitializer.create(zipkinLogLevel);
  logInitializer.run();
  switch (storageType) {
    case "cassandra":
      CassandraDependenciesJob.builder().logInitializer(logInitializer)
          .jars(new String[] { jarPath }).day(day).build().run();
      break;
    case "mysql":
      MySQLDependenciesJob.builder().logInitializer(logInitializer)
          .jars(new String[] { jarPath }).day(day).build().run();
      break;
    case "elasticsearch":
      ElasticsearchDependenciesJob.builder().logInitializer(logInitializer)
          .jars(new String[] { jarPath }).day(day).build().run();
      break;
    default:
      throw new UnsupportedOperationException("Unsupported STORAGE_TYPE: " + storageType);
  }
}
This main method is the entry point of ZipkinDependenciesJob: when we launch the jar with java -jar, it runs first. The logic is straightforward: read the storage type from the environment, then dispatch to the corresponding job. Since we use Elasticsearch as storage, the job that runs here is ElasticsearchDependenciesJob; below is its run method:
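One detail worth noting in the main method is the `args.length == 1 ? parseDay(args[0]) : ...` branch: the job accepts an optional date argument, so days other than "today" can be reprocessed. For example, to backfill yesterday's links (reusing the host and jar name from the earlier commands):

```shell
# Re-aggregate yesterday's spans; with GNU date, `date -u ... +%F`
# prints a UTC date in yyyy-MM-dd form, e.g. 2018-08-06.
STORAGE_TYPE=elasticsearch ES_HOSTS=localhost:9200 \
  java -jar zipkin-dependencies-2.0.0.jar "$(date -u -d '1 day ago' +%F)"
```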
public void run() {
run( // multi-type index
index + "-" + dateStamp + "/span",
index + "-" + dateStamp + "/dependencylink",
SpanBytesDecoder.JSON_V1);
run( // single-type index
index + ":span-" + dateStamp + "/span",
index + ":dependency-" + dateStamp + "/dependency",
SpanBytesDecoder.JSON_V2);
log.info("Done");
}
As you can see, once the two run calls complete, the thread simply finishes and the process exits. In production, if you need the global dependency links to be regenerated automatically, you can start from this source code and write your own scheduled task that invokes the job periodically.
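A lighter-weight alternative to writing your own scheduler is to let cron rerun the jar once a day, since each run is a self-terminating one-shot process anyway. The paths, log file, and 01:00 schedule below are only placeholders to adapt to your environment:

```shell
# /etc/cron.d/zipkin-dependencies  (example paths and schedule)
# Run the aggregation daily at 01:00; with no date argument the job
# processes the current day, as seen in the main method above.
0 1 * * * root STORAGE_TYPE=elasticsearch ES_HOSTS=localhost:9200 \
  java -jar /opt/zipkin/zipkin-dependencies-2.0.0.jar >> /var/log/zipkin-dependencies.log 2>&1
```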