Understanding APM: How to Add Extensions to the OpenTelemetry Java Agent

By David Hope

Without code access, SREs and IT operations cannot always get the visibility they need

As an SRE, have you ever been in a situation where you're working on an application written with a non-standard framework, or you want to get some interesting business data from an application (such as the number of orders processed), but you don't have access to the source code?

We all know this can be a challenging scenario, leading to gaps in visibility, inability to fully trace code end-to-end, and loss of business-critical monitoring data that helps understand the true impact of an issue.

How can we solve this problem? One approach, discussed in three earlier blogs, is to develop a plugin for the Elastic® APM agent that accesses key business data for monitoring and adds traces where none exist.
  
In this blog we will discuss how to do the same with the OpenTelemetry Java agent using the extension framework.

Basic Concepts: How APM Works

Before proceeding, let's first understand some basic concepts and terminology.

  • Java Agent: This is a tool that can be used to inspect (or modify) the bytecode of class files in the Java Virtual Machine (JVM). Java agents can be used for a variety of purposes such as performance monitoring, logging, security, and more.
  • Bytecode: This is the intermediate code generated by the Java compiler from the Java source code. This code is interpreted or compiled on the fly by the JVM to produce executable machine code.
  • Byte Buddy: Byte Buddy is a Java code generation and manipulation library. It is used to create, modify or tune Java classes at runtime. In the context of Java agents, Byte Buddy provides a powerful and flexible way to modify bytecode. Both the Elastic APM agent and the OpenTelemetry agent use Byte Buddy behind the scenes.

Now, let's talk about how auto-instrumentation works with Byte Buddy:

Auto-instrumentation is the process by which an agent modifies the bytecode of an application class, usually to insert monitoring code. The agent does not directly modify the source code, but instead modifies the bytecode loaded into the JVM. This is done when the JVM loads the class, so the modification is valid at runtime.

Here is a simplified description of the process:

1) Start the JVM with an agent: When starting a Java application, you can use the -javaagent command line option to specify the Java agent. This instructs the JVM to load the agent before calling the application's main method. At this point, the agent has the opportunity to set up class transformers.

2) Register a class transformer with Byte Buddy: Your agent registers a class transformer with Byte Buddy. A transformer is a piece of code that is called every time a class is loaded into the JVM. It receives the bytecode of the class and can modify that bytecode before the class is actually used.

3) Transform bytecode: When the transformer is called, it uses Byte Buddy's API to modify the bytecode. Byte Buddy lets you specify transformations in a high-level, expressive way instead of writing complex bytecode by hand. For example, you can specify a certain class and method within that class to instrument, and provide an "interceptor" to add new behavior to that method.

  • For example, suppose you want to measure the execution time of a method. You'll instruct Byte Buddy to target specific classes and methods, then provide an interceptor that wraps the method call with timing code. Every time this method is called, it will first call your interceptor, measure the start time, then call the original method, and finally measure the end time and print the duration.

4) Use transformed classes: Once the agent has set up its transformers, the JVM continues loading classes as usual. Transformers are called every time a class is loaded, allowing them to modify the bytecode. Your application will then use these transformed classes as if they were the original classes, but they now have the extra behavior you injected via the interceptor.
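The timing example from step 3 can be pictured in plain Java. The sketch below is not Byte Buddy code and does no bytecode rewriting; TimingSketch and timed are names made up for illustration. It only shows the shape of the behavior an interceptor adds: record the start time, call the original method, then report the duration, even if the call throws.

```java
import java.util.function.Supplier;

// Hypothetical illustration only: a hand-written wrapper that behaves the way
// an injected timing interceptor would, without any bytecode manipulation.
final class TimingSketch {

    // Duration of the most recent call, kept so callers can inspect it.
    static long lastDurationNanos = -1;

    // Runs the "original method" and measures how long it took.
    static <T> T timed(String methodName, Supplier<T> originalMethod) {
        long start = System.nanoTime();              // measure the start time
        try {
            return originalMethod.get();             // call the original method
        } finally {
            // Runs on normal return and on exception: measure the end time.
            lastDurationNanos = System.nanoTime() - start;
            System.out.println(methodName + " took " + lastDurationNanos + " ns");
        }
    }

    public static void main(String[] args) {
        int words = timed("countWords", () -> "hello open telemetry".split("\\s+").length);
        System.out.println("word count = " + words);
    }
}
```

With a real agent, the caller never writes the wrapper call; the transformer weaves the equivalent logic into the target method when its class is loaded.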

Essentially, auto-instrumentation with Byte Buddy modifies the behavior of Java classes at runtime without requiring direct source code changes. This is especially useful for cross-cutting concerns such as logging, monitoring, or security, as it allows you to centralize this code in the Java agent instead of spreading it throughout the application.
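To make the four steps concrete, here is a minimal sketch of an agent built only on the JDK's java.lang.instrument API, which is the layer Byte Buddy sits on top of. SketchAgent and RecordingTransformer are hypothetical names, and the transformer deliberately leaves the bytecode untouched: it only records that the target class passed through it. The main method simulates the JVM calling the transformer so the sketch can run without the -javaagent option.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical minimal agent using only the JDK's java.lang.instrument API.
// A real agent would rewrite the bytes (for example with Byte Buddy); this one
// only records which target classes pass through the transformer.
final class SketchAgent {

    // The JVM passes class names in internal form, with '/' instead of '.'.
    static final String TARGET = "org/davidgeorgehope/Main";
    static final List<String> seen = new CopyOnWriteArrayList<>();

    // Step 1: when started with -javaagent and a jar manifest entry
    // "Premain-Class: SketchAgent", the JVM calls premain before main.
    public static void premain(String agentArgs, Instrumentation inst) {
        // Step 2: register the transformer.
        inst.addTransformer(new RecordingTransformer());
    }

    // Steps 3 and 4: invoked for each class as it is loaded.
    static final class RecordingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (TARGET.equals(className)) {
                seen.add(className);
                // A real agent would return modified bytecode here.
            }
            return null; // null tells the JVM to keep the class unchanged
        }
    }

    public static void main(String[] args) {
        // Simulate the JVM handing two classes to the transformer.
        RecordingTransformer transformer = new RecordingTransformer();
        transformer.transform(null, "java/lang/String", null, null, new byte[0]);
        transformer.transform(null, TARGET, null, null, new byte[0]);
        System.out.println("matched: " + seen);
    }
}
```

Returning null from transform is how a transformer says "no change", which is what happens for the vast majority of classes an agent sees.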

Applications, Prerequisites and Configuration

There is a very simple application in this GitHub repository that is used throughout the blog. All it does is ask you to enter some text and then count the words.

It is also listed below:

package org.davidgeorgehope;
import java.util.Scanner;
import java.util.logging.Logger;

public class Main {
    private static Logger logger = Logger.getLogger(Main.class.getName());

    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        while (true) {
            System.out.println("Please enter your sentence:");
            String input = scanner.nextLine();
            Main main = new Main();
            int wordCount = main.countWords(input);
            System.out.println("The input contains " + wordCount + " word(s).");
        }
    }
    public int countWords(String input) {

        try {
            Thread.sleep(10000);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }

        if (input == null || input.isEmpty()) {
            return 0;
        }

        String[] words = input.split("\\s+");
        return words.length;
    }
}

For the purposes of this blog, we'll be using Elastic Cloud to capture data generated by OpenTelemetry. Follow the instructions here to get started with Elastic Cloud.

Once you start using Elastic Cloud, get the OpenTelemetry configuration from the APM page:

You will need this later. If you want to deploy on your own computer, you can refer to the article "Elastic: A Developer's Guide" to deploy Elasticsearch and Kibana.

Finally, download the OpenTelemetry Agent.

Start the application and OpenTelemetry

If you're starting with this simple application, build it and run it with the OpenTelemetry Agent as shown below, replacing the XX placeholders with the values obtained earlier.

java -javaagent:opentelemetry-javaagent.jar -Dotel.exporter.otlp.endpoint=XX -Dotel.exporter.otlp.headers=XX -Dotel.metrics.exporter=otlp -Dotel.logs.exporter=otlp -Dotel.resource.attributes=XX -Dotel.service.name=your-service-name -jar simple-java-1.0-SNAPSHOT.jar

You will find that nothing happens. The reason is that the OpenTelemetry Agent has no way of knowing what to monitor. APM with auto-instrumentation works because the agent "knows" about standard frameworks (such as Spring or HTTPClient) and can gain visibility by automatically "injecting" tracing code into them.

It doesn't know about org.davidgeorgehope.Main in our simple Java application.

Fortunately, we can add this functionality using the OpenTelemetry Extensions framework.

OpenTelemetry Extensions

In the above repository, in addition to the simple-java application, there is an Elastic APM plugin and an OpenTelemetry extension. The relevant files for the OpenTelemetry Extension are located here: WordCountInstrumentation.java and WordCountInstrumentationModule.java.

You'll notice that both OpenTelemetry Extensions and Elastic APM Plugins use Byte Buddy, a common library for code instrumentation. However, there are some key differences in how the code is bootstrapped.

The WordCountInstrumentationModule class extends the OpenTelemetry-specific class InstrumentationModule, whose purpose is to describe a set of TypeInstrumentations that need to be applied together to correctly instrument a particular library. The WordCountInstrumentation class is one such instance of TypeInstrumentation.

Type instrumentations grouped in a module share helper classes, muzzle runtime checks, and applicable class loader criteria, and can only be enabled or disabled as a group.

This is a bit different from how the Elastic APM plugin works. With OpenTelemetry, the default way to inject code is inline, and you can use the InstrumentationModule configuration to inject dependencies into the core application class loader (as shown below). The Elastic APM approach, which is safer because it isolates helper classes and is easier to debug with a normal IDE, was contributed to OpenTelemetry. Here we inject the TypeInstrumentation class and the WordCountInstrumentation class into the class loader.

    @Override
    public List<String> getAdditionalHelperClassNames() {
        return List.of(
            WordCountInstrumentation.class.getName(),
            "io.opentelemetry.javaagent.extension.instrumentation.TypeInstrumentation");
    }

Another interesting part of the module class is its setup.

Here we give our instrumentation group a name. An InstrumentationModule needs to have at least one name. Users of the javaagent can suppress a chosen instrumentation by referencing one of its names. By convention, instrumentation module names use kebab-case (lowercase words joined by dashes).

    public WordCountInstrumentationModule() {
        super("wordcount-demo", "wordcount");
    }
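To make suppression by name concrete: the agent reads settings of the form otel.instrumentation.[name].enabled, with an equivalent environment variable. The sketch below builds both forms for the names passed to the constructor above. SuppressionFlags and its methods are hypothetical helpers for illustration, not part of OpenTelemetry.

```java
import java.util.Locale;

// Hypothetical helper, not part of OpenTelemetry: builds the settings that
// disable an instrumentation module by one of its names.
final class SuppressionFlags {

    // System property form, passed on the java command line.
    static String systemProperty(String moduleName) {
        return "-Dotel.instrumentation." + moduleName + ".enabled=false";
    }

    // Environment variable form: upper case, with '.' and '-' replaced by '_'.
    static String environmentVariable(String moduleName) {
        String key = "otel.instrumentation." + moduleName + ".enabled";
        return key.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_') + "=false";
    }

    public static void main(String[] args) {
        System.out.println(systemProperty("wordcount-demo"));
        System.out.println(environmentVariable("wordcount-demo"));
    }
}
```

Adding the first printed flag to the java command line should switch this extension off without removing its jar, which is handy when comparing behavior with and without the custom span.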

Other methods in this class can optionally specify a load order relative to other instrumentation, and name the class that implements TypeInstrumentation and is responsible for most of the instrumentation.

Let's look at the WordCountInstrumentation class, which implements TypeInstrumentation:

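import static net.bytebuddy.matcher.ElementMatchers.namedOneOf;

import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.StatusCode;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;
import io.opentelemetry.javaagent.extension.instrumentation.TypeInstrumentation;
import io.opentelemetry.javaagent.extension.instrumentation.TypeTransformer;
import java.util.logging.Logger;
import net.bytebuddy.asm.Advice;
import net.bytebuddy.description.type.TypeDescription;
import net.bytebuddy.matcher.ElementMatcher;
import net.bytebuddy.matcher.ElementMatchers;

// The WordCountInstrumentation class implements the TypeInstrumentation interface.
// This allows us to specify which types of classes (based on some matching criteria) will have their methods instrumented.

public class WordCountInstrumentation implements TypeInstrumentation {

    // Logger used by the debug statements below (assumed to be java.util.logging).
    private static final Logger logger = Logger.getLogger(WordCountInstrumentation.class.getName());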

    // The typeMatcher method is used to define which classes the instrumentation should apply to.
    // In this case, it's the "org.davidgeorgehope.Main" class.
    @Override
    public ElementMatcher<TypeDescription> typeMatcher() {
        logger.info("TEST typeMatcher");
        return ElementMatchers.named("org.davidgeorgehope.Main");
    }

    // In the transform method, we specify which methods of the classes matched above will be instrumented, 
    // and also the advice (a piece of code) that will be added to these methods.
    @Override
    public void transform(TypeTransformer typeTransformer) {
        logger.info("TEST transform");
        typeTransformer.applyAdviceToMethod(namedOneOf("countWords"),this.getClass().getName() + "$WordCountAdvice");
    }

    // The WordCountAdvice class contains the actual pieces of code (advices) that will be added to the instrumented methods.
    @SuppressWarnings("unused")
    public static class WordCountAdvice {
        // This advice is added at the beginning of the instrumented method (OnMethodEnter).
        // It creates and starts a new span, and makes it active.
        @Advice.OnMethodEnter(suppress = Throwable.class)
        public static Scope onEnter(@Advice.Argument(value = 0) String input, @Advice.Local("otelSpan") Span span) {
            // Get a Tracer instance from OpenTelemetry.
            Tracer tracer = GlobalOpenTelemetry.getTracer("instrumentation-library-name","semver:1.0.0");
            System.out.print("Entering method");

            // Start a new span with the name "mySpan".
            span = tracer.spanBuilder("mySpan").startSpan();

            // Make this new span the current active span.
            Scope scope = span.makeCurrent();

            // Return the Scope instance. This will be used in the exit advice to end the span's scope.
            return scope; 
        }

        // This advice is added at the end of the instrumented method (OnMethodExit).
        // It first closes the span's scope, then checks if any exception was thrown during the method's execution.
        // If an exception was thrown, it sets the span's status to ERROR and ends the span.
        // If no exception was thrown, it sets a custom attribute "wordCount" on the span, and ends the span.
        @Advice.OnMethodExit(onThrowable = Throwable.class, suppress = Throwable.class)
        public static void onExit(@Advice.Return(readOnly = false) int wordCount,
                                  @Advice.Thrown Throwable throwable,
                                  @Advice.Local("otelSpan") Span span,
                                  @Advice.Enter Scope scope) {
            // Close the scope to end it.
            scope.close();

            // If an exception was thrown during the method's execution, set the span's status to ERROR.
            if (throwable != null) {
                span.setStatus(StatusCode.ERROR, "Exception thrown in method");
            } else {
                // If no exception was thrown, set a custom attribute "wordCount" on the span.
                span.setAttribute("wordCount", wordCount);
            }

            // End the span. This makes it ready to be exported to the configured exporter (e.g. Elastic).
            span.end();
        }
    }
}

The target class we instrument is defined in the typeMatcher method, and the method we want to instrument is defined in the transform method. Our target is the Main class and its countWords method.

As you can see, we have a nested class here that does most of the work by defining the onEnter and onExit methods, which tell the agent what to do when we enter the countWords method and when we exit it.

In the onEnter method, we set up a new OpenTelemetry span, and in the onExit method, we end the span. If the method ends successfully, we also take the word count and add it to the span as an attribute.

Now let's see what happens when we run it. The good news is we've made this really easy by providing a single Dockerfile that does all the work.
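One way to read the enter and exit advice is to picture what a call to countWords effectively looks like once both are woven in. The sketch below models that flow in plain Java. AdviceFlowSketch and SketchSpan are hypothetical stand-ins (the real types are OpenTelemetry's Span and Scope, and the scope handling is left out), so treat it as a mental model rather than the agent's actual output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical model of the control flow that the enter/exit advice produces.
final class AdviceFlowSketch {

    // Minimal stand-in for a span: a name, attributes, a status and an end flag.
    static final class SketchSpan {
        final String name;
        final Map<String, Object> attributes = new LinkedHashMap<>();
        String status = "UNSET";
        boolean ended;

        SketchSpan(String name) { this.name = name; }
    }

    // The most recently created span, kept so callers can inspect it.
    static SketchSpan lastSpan;

    // The original method body, simplified (no sleep).
    static int countWords(String input) {
        if (input == null || input.isEmpty()) {
            return 0;
        }
        return input.split("\\s+").length;
    }

    // What a call looks like with the advice applied around the original method.
    static int instrumented(String input, ToIntFunction<String> original) {
        SketchSpan span = new SketchSpan("mySpan");        // onEnter: start the span
        lastSpan = span;
        try {
            int wordCount = original.applyAsInt(input);    // original method body
            span.attributes.put("wordCount", wordCount);   // onExit, normal return
            return wordCount;
        } catch (RuntimeException e) {
            // onExit, exception path (the real advice reacts to any Throwable).
            span.status = "ERROR";
            throw e;
        } finally {
            span.ended = true;                             // onExit: always end the span
        }
    }

    public static void main(String[] args) {
        int n = instrumented("one two three", AdviceFlowSketch::countWords);
        System.out.println(n + " word(s), attributes = " + lastSpan.attributes);
    }
}
```

The finally block mirrors the onThrowable = Throwable.class setting on the exit advice: the span is ended whether the method returns or throws.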

Putting it all together

Clone the GitHub repository if you haven't already, and before continuing, let's take a quick look at the Dockerfile we're using.

# Build stage
FROM maven:3.8.7-openjdk-18 as build

COPY simple-java /home/app/simple-java
COPY opentelemetry-custom-instrumentation /home/app/opentelemetry-custom-instrumentation

WORKDIR /home/app/simple-java
RUN mvn install

WORKDIR /home/app/opentelemetry-custom-instrumentation
RUN mvn install

# Package stage
FROM maven:3.8.7-openjdk-18
COPY --from=build /home/app/simple-java/target/simple-java-1.0-SNAPSHOT.jar /usr/local/lib/simple-java-1.0-SNAPSHOT.jar
COPY --from=build /home/app/opentelemetry-custom-instrumentation/target/opentelemetry-custom-instrumentation-1.0-SNAPSHOT.jar /usr/local/lib/opentelemetry-custom-instrumentation-1.0-SNAPSHOT.jar

WORKDIR /

RUN curl -L -o opentelemetry-javaagent.jar https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/latest/download/opentelemetry-javaagent.jar

COPY start.sh /start.sh
RUN chmod +x /start.sh

ENTRYPOINT ["/start.sh"]

The Dockerfile has two stages. The build stage compiles the simple-java application from source and then builds the custom instrumentation. The package stage copies the resulting jars and downloads the latest OpenTelemetry Java agent. At runtime, the container simply executes the start.sh file shown below:

#!/bin/sh
java \
-javaagent:/opentelemetry-javaagent.jar \
-Dotel.exporter.otlp.endpoint=${SERVER_URL} \
-Dotel.exporter.otlp.headers="Authorization=Bearer ${SECRET_KEY}" \
-Dotel.metrics.exporter=otlp \
-Dotel.logs.exporter=otlp \
-Dotel.resource.attributes=service.name=simple-java,service.version=1.0,deployment.environment=production \
-Dotel.service.name=your-service-name \
-Dotel.javaagent.extensions=/usr/local/lib/opentelemetry-custom-instrumentation-1.0-SNAPSHOT.jar \
-Dotel.javaagent.debug=true \
-jar /usr/local/lib/simple-java-1.0-SNAPSHOT.jar

There are two important things to note about this script. The first is the -javaagent parameter, set to opentelemetry-javaagent.jar. It launches the OpenTelemetry Java agent, which starts before any application code is executed.

In this jar, there must be a class with a premain method that the JVM will look for. This will bootstrap the Java agent. As mentioned above, any compiled bytecode is essentially filtered through the javaagent code, so it can modify classes before they are executed.

The second important thing is the otel.javaagent.extensions setting, which loads the extension we built to instrument our simple-java application.

Now run the following command:

docker build -t djhope99/custom-otel-instrumentation:1 .
docker run -it -e 'SERVER_URL=XXX' -e 'SECRET_KEY=XX' djhope99/custom-otel-instrumentation:1

If you use the SERVER_URL and SECRET_KEY obtained earlier here, you should see the application connect to Elastic.

When it starts up, it will ask you to enter a sentence. Type one and press Enter. Do this a few times; there is a sleep in the code to force a long-running transaction:

 

Eventually you will see the service displayed in the service map:

A trace will appear:

 

Inside that span, you'll see the wordCount attribute we collected:

This can be used for further dashboarding and AI/ML, including anomaly detection (if required), which is easy to do as shown below.

First click on the Elastic icon on the left and select Dashboard to create a new dashboard:

From here, click Create Visualization.

 

Search for the wordCount label in the APM index, as shown below:

As you can see, because we created this attribute in the span code as shown below, with wordCount as an integer, Elastic automatically mapped it as a numeric field:

span.setAttribute("wordCount", wordCount);

From here we can drag and drop it into the visualization to display on our dashboard. Super easy.

In summary

This blog sheds light on the invaluable role of OpenTelemetry Java agents in bridging visibility gaps and obtaining business-critical monitoring data, especially when access to source code is not available.

This blog covers a basic understanding of Java agents, bytecode, and Byte Buddy, followed by a walkthrough of how auto-instrumentation works with Byte Buddy.

The implementation of the OpenTelemetry Java agent using the extension framework is demonstrated with the help of a simple Java application, which highlights the agent's ability to inject tracing code into the application for monitoring purposes.

It details how to configure the agent and integrate the OpenTelemetry Extension, and outlines the operation of a sample application to help users understand the practical application of the information discussed. This blog post is a useful resource for SREs and IT operations teams looking to use OpenTelemetry's auto-instrumentation capabilities in their work with applications.

Original: Understanding APM: How to add extensions to the OpenTelemetry Java Agent | Elastic Blog
