A Complete Explanation of How javaagent Works (JVM Source Code Analysis)

Most Java developers have heard of javaagent, and many have used it. A typical invocation looks like this:

java -javaagent:myagent.jar=mode=test Test

The -javaagent option specifies the path of the agent jar we wrote (./myagent.jar) and the parameters to pass to the agent (mode=test), so that the agent can do whatever we want when it starts.
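As a rough sketch of what the agent side of that command line could look like, the class below exposes a `premain` method and splits the options string into a map. The class name `MyAgent` and the `key=value,key=value` option format are assumptions made for illustration; the JVM simply hands the agent everything after the `=` as one opaque string.

```java
import java.lang.instrument.Instrumentation;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical agent class (usually declared public in its own file);
// the jar's MANIFEST.MF must name it in Premain-Class.
class MyAgent {
    // Parsed options, kept so the rest of the agent can consult them.
    static Map<String, String> options = new LinkedHashMap<>();

    // Called by the JVM before the application's main method.
    // agentArgs is everything after '=' in -javaagent:myagent.jar=mode=test.
    public static void premain(String agentArgs, Instrumentation inst) {
        options = parseOptions(agentArgs);
        System.out.println("agent started with options " + options);
    }

    // Splits "k1=v1,k2=v2" into a map; this format is our own convention.
    static Map<String, String> parseOptions(String agentArgs) {
        Map<String, String> result = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.isEmpty()) {
            return result;
        }
        for (String pair : agentArgs.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                result.put(pair, "");                      // bare flag without a value
            } else {
                result.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return result;
    }
}
```

With the command line shown earlier, `agentArgs` would be the string `mode=test`.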

The main functions of javaagent are as follows:

  • Intercept and modify a class's bytecode before the class is loaded
  • Change the bytecode of an already loaded class at runtime, although many restrictions apply in this case, as described in detail later
  • Some other niche features:
    • Get all loaded classes
    • Get all classes for which a given class loader is an initiating loader
    • Get the size of an object
    • Add a jar to the bootstrap classpath so that its classes are loaded by the bootstrap class loader
    • Add a jar to the classpath for the AppClassLoader to load
    • Set a prefix for native methods, which is used for rule matching when native methods are resolved

Being able to make a program run according to logic that we inject is an appealing idea.

JVMTI

JVMTI stands for JVM Tool Interface, a set of interfaces that the JVM exposes for users to extend. JVMTI is event-driven: whenever the JVM reaches certain points in its execution, it invokes the event callbacks registered for them (if any), and developers can use these callbacks to plug in their own logic.

For example, the most common case is wanting to modify a class's bytecode after its class file has been read but before the class is defined, so that the resulting class object reflects our modified bytecode. To do this, we assign a callback function to ClassFileLoadHook in the callback set of the jvmtiEnv (at runtime one JVMTIAgent usually corresponds to one jvmtiEnv, although it can correspond to several), and that function is then called whenever a class file is loaded. It is implemented roughly as follows:

    jvmtiEventCallbacks callbacks;
    jvmtiEnv *          jvmtienv = jvmti(agent);
    jvmtiError          jvmtierror;

    memset(&callbacks, 0, sizeof(callbacks));
    callbacks.ClassFileLoadHook = &eventHandlerClassFileLoadHook;
    jvmtierror = (*jvmtienv)->SetEventCallbacks( jvmtienv,
                                                 &callbacks,
                                                 sizeof(callbacks));

JVMTIAgent

A JVMTIAgent is simply a dynamic library that uses the interfaces exposed by JVMTI to do things that we want to do but cannot do under normal circumstances. To distinguish it from an ordinary dynamic library, it generally implements one or more of the following functions:

JNIEXPORT jint JNICALL
Agent_OnLoad(JavaVM *vm, char *options, void *reserved);

JNIEXPORT jint JNICALL
Agent_OnAttach(JavaVM* vm, char* options, void* reserved);

JNIEXPORT void JNICALL
Agent_OnUnload(JavaVM *vm);

  • Agent_OnLoad: if the agent is loaded at startup, that is, specified with -agentlib in the VM arguments, the agent's Agent_OnLoad function is executed during startup.
  • Agent_OnAttach: if the agent is not loaded at startup, but we instead attach to the target process and then send it a load command, Agent_OnAttach is called while the agent is being loaded.
  • Agent_OnUnload: called when the agent is unloaded, though it is rarely implemented.

In fact we deal with JVMTIAgents every day, perhaps without realizing it. For example, debugging Java code in a tool such as Eclipse relies on the jdwp agent that ships with the JRE; the tool simply hides this by automatically adding the relevant parameter (something like -agentlib:jdwp=transport=dt_socket,suspend=y,address=localhost:61349) to the program's startup arguments. The -agentlib option is followed by the name of the agent to load, here jdwp. This is not the file name of the dynamic library: the JVM expands the name, so on Linux it looks for libjdwp.so, that is, it adds the prefix lib and the suffix .so. The name is followed by a string of parameters, which is passed as the options argument of Agent_OnLoad or Agent_OnAttach.
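The name expansion can be observed from Java itself. The JVM does its own mapping internally when it resolves -agentlib, but `System.mapLibraryName` applies the same platform naming convention, so the following sketch (the class name `AgentLibName` is made up for illustration) prints the file name that corresponds to `jdwp` on the current platform.

```java
// Illustrative only: shows the platform naming convention,
// not the JVM's internal agent lookup code.
class AgentLibName {
    // Returns the platform-specific library file name for an agent name,
    // for example "libjdwp.so" on Linux for the name "jdwp".
    static String fileNameFor(String agentName) {
        return System.mapLibraryName(agentName);
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("jdwp"));
    }
}
```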

javaagent

Any discussion of javaagent has to cover a JVMTIAgent called instrument (on Linux the corresponding dynamic library is libinstrument.so), because javaagent is implemented by it. The instrument agent is also known as JPLISAgent (Java Programming Language Instrumentation Services Agent), a name that reflects its essential purpose: supporting instrumentation services written in the Java language.

instrument agent

The instrument agent implements both Agent_OnLoad and Agent_OnAttach, which means it can be loaded either at startup or dynamically at runtime. At startup it can also be loaded indirectly with an option such as -javaagent:myagent.jar. Dynamic loading at runtime relies on the JVM's attach mechanism: the agent is loaded by sending a load command to the target JVM.

The core data structure of the instrument agent is as follows:

struct _JPLISAgent {
    JavaVM *                mJVM;                   /* handle to the JVM */
    JPLISEnvironment        mNormalEnvironment;     /* for every thing but retransform stuff */
    JPLISEnvironment        mRetransformEnvironment;/* for retransform stuff only */
    jobject                 mInstrumentationImpl;   /* handle to the Instrumentation instance */
    jmethodID               mPremainCaller;         /* method on the InstrumentationImpl that does the premain stuff (cached to save lots of lookups) */
    jmethodID               mAgentmainCaller;       /* method on the InstrumentationImpl for agents loaded via attach mechanism */
    jmethodID               mTransform;             /* method on the InstrumentationImpl that does the class file transform */
    jboolean                mRedefineAvailable;     /* cached answer to "does this agent support redefine" */
    jboolean                mRedefineAdded;         /* indicates if can_redefine_classes capability has been added */
    jboolean                mNativeMethodPrefixAvailable; /* cached answer to "does this agent support prefixing" */
    jboolean                mNativeMethodPrefixAdded;     /* indicates if can_set_native_method_prefix capability has been added */
    char const *            mAgentClassName;        /* agent class name */
    char const *            mOptionsString;         /* -javaagent options string */
};

struct _JPLISEnvironment {
    jvmtiEnv *              mJVMTIEnv;              /* the JVM TI environment */
    JPLISAgent *            mAgent;                 /* corresponding agent */
    jboolean                mIsRetransformer;       /* indicates if special environment */
};

A few of the important fields are explained here:

  • mNormalEnvironment: mainly provides the normal transform and redefine functions.
  • mRetransformEnvironment: mainly provides the retransform function.
  • mInstrumentationImpl: a very important object, and the entry point through which our Java agent interacts with the JVM. Anyone who has written a javaagent will have noticed the Instrumentation parameter of the `premain` and `agentmain` methods; it is this object.
  • mPremainCaller: points to the `sun.instrument.InstrumentationImpl.loadClassAndCallPremain` method, which is called if the agent is loaded at startup.
  • mAgentmainCaller: points to the `sun.instrument.InstrumentationImpl.loadClassAndCallAgentmain` method, which is called when the agent is loaded dynamically via attach.
  • mTransform: points to the `sun.instrument.InstrumentationImpl.transform` method.
  • mAgentClassName: the agent class named in our javaagent's MANIFEST.MF (`Premain-Class` when loading at startup, `Agent-Class` when loading via attach).
  • mOptionsString: the parameters passed to the agent.
  • mRedefineAvailable: whether redefine is enabled; set `Can-Redefine-Classes: true` in the javaagent's MANIFEST.MF to turn it on.
  • mNativeMethodPrefixAvailable: whether setting a native method prefix is supported; likewise set `Can-Set-Native-Method-Prefix: true` in the javaagent's MANIFEST.MF.
  • mIsRetransformer: if `Can-Retransform-Classes: true` is defined in the javaagent's MANIFEST.MF, the mIsRetransformer of mRetransformEnvironment is set to true.
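To make the manifest attributes concrete, here is a small sketch that parses a sample MANIFEST.MF with `java.util.jar.Manifest` and reads the attributes named in the list. The class name `AgentManifest`, the sample content and `com.example.MyAgent` are assumptions for illustration; the instrument agent itself does this parsing in native code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class AgentManifest {
    // A sample manifest; com.example.MyAgent is a made-up class name.
    static final String SAMPLE =
        "Manifest-Version: 1.0\n" +
        "Premain-Class: com.example.MyAgent\n" +
        "Agent-Class: com.example.MyAgent\n" +
        "Can-Redefine-Classes: true\n" +
        "Can-Retransform-Classes: true\n" +
        "Can-Set-Native-Method-Prefix: false\n";

    // Parses manifest text and returns its main attributes.
    static Attributes parse(String text) {
        try {
            byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
            return new Manifest(new ByteArrayInputStream(bytes)).getMainAttributes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A capability counts as enabled only when the attribute is "true"
    // (case-insensitive); a missing attribute means false.
    static boolean enabled(Attributes attrs, String name) {
        return Boolean.parseBoolean(attrs.getValue(name));
    }
}
```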

Load the instrument agent at startup

As mentioned above, the instrument agent can be loaded at startup. The process is implemented in the `Agent_OnLoad` function of `InvocationAdapter.c`. In brief:

  • Create and initialize the JPLISAgent
  • Register for the VMInit event and, once VM initialization completes, do the following:
    • Create the InstrumentationImpl object
    • Register for the ClassFileLoadHook event
    • Call the `loadClassAndCallPremain` method of InstrumentationImpl, which in turn calls the premain method of the `Premain-Class` specified in the javaagent's MANIFEST.MF
  • Parse the attributes in the javaagent's MANIFEST.MF and set the corresponding fields of the JPLISAgent

Load the instrument agent at runtime

The way to load at runtime is roughly as follows:

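VirtualMachine vm = VirtualMachine.attach(pid);   // com.sun.tools.attach.VirtualMachine; pid is the target JVM's process id
vm.loadAgent(agentPath, agentArgs);               // agentPath is the agent jar, agentArgs its options string
vm.detach();                                      // release the connection to the target JVM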

The code above asks the target JVM, through the JVM's attach mechanism, to load the given agent. The process is roughly as follows:

  • Create and initialize the JPLISAgent
  • Parse the attributes in the javaagent's MANIFEST.MF
  • Create the InstrumentationImpl object
  • Register for the ClassFileLoadHook event
  • Call the loadClassAndCallAgentmain method of InstrumentationImpl, which in turn calls the agentmain method of the Agent-Class specified in the javaagent's MANIFEST.MF
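The two loading paths differ mainly in which entry point ends up being called. A minimal sketch of an agent class that supports both could look like this; the class name `DualEntryAgent` and its `loadedVia` field are hypothetical and exist only to make the two paths visible.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical agent usable both with -javaagent and via attach.
class DualEntryAgent {
    // Records which entry point ran, purely for illustration.
    static String loadedVia = "none";

    // Startup path: named by Premain-Class, reached through loadClassAndCallPremain.
    public static void premain(String agentArgs, Instrumentation inst) {
        init("premain", agentArgs, inst);
    }

    // Attach path: named by Agent-Class, reached through loadClassAndCallAgentmain.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        init("agentmain", agentArgs, inst);
    }

    // Shared setup so that both paths behave the same way.
    private static void init(String entryPoint, String agentArgs, Instrumentation inst) {
        loadedVia = entryPoint;
        System.out.println(entryPoint + " called with options: " + agentArgs);
        // inst is null only when this sketch is exercised outside a real agent load.
        if (inst != null) {
            System.out.println("retransform supported: " + inst.isRetransformClassesSupported());
        }
    }
}
```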

ClassFileLoadHook callback implementation of instrument agent

Whether the instrument agent is loaded at startup or at runtime, it subscribes to the same JVMTI event, ClassFileLoadHook. The callback for this event runs after a class file's bytecode has been read, so that the original bytecode can be modified. How is this implemented?

void JNICALL
eventHandlerClassFileLoadHook(  jvmtiEnv *              jvmtienv,
                                JNIEnv *                jnienv,
                                jclass                  class_being_redefined,
                                jobject                 loader,
                                const char*             name,
                                jobject                 protectionDomain,
                                jint                    class_data_len,
                                const unsigned char*    class_data,
                                jint*                   new_class_data_len,
                                unsigned char**         new_class_data) {

    JPLISEnvironment * environment  = NULL;

    environment = getJPLISEnvironment(jvmtienv);

    /* if something is internally inconsistent (no agent), just silently return without touching the buffer */

    if ( environment != NULL ) {

        jthrowable outstandingException = preserveThrowable(jnienv);
        transformClassFile( environment->mAgent,
                            jnienv,
                            loader,
                            name,
                            class_being_redefined,
                            protectionDomain,
                            class_data_len,
                            class_data,
                            new_class_data_len,
                            new_class_data,
                            environment->mIsRetransformer);

        restoreThrowable(jnienv, outstandingException);
    }

}

First, the JPLISEnvironment corresponding to the jvmtiEnv is obtained. As mentioned above, there are actually two JPLISEnvironments (and two jvmtiEnvs): one dedicated to retransform and the other used for everything else. The registration of ClassFileTransformers is split along the same lines: a ClassFileTransformer used for retransform is registered in a separate TransformerManager.

Then transformClassFile is called. The function is fairly long, so its code is not shown here. In essence it calls the transform method of the InstrumentationImpl object, and the last parameter determines which TransformerManager's ClassFileTransformer objects carry out the transform.

private byte[]
    transform(  ClassLoader         loader,
                String              classname,
                Class               classBeingRedefined,
                ProtectionDomain    protectionDomain,
                byte[]              classfileBuffer,
                boolean             isRetransformer) {

        TransformerManager mgr = isRetransformer?

                                        mRetransfomableTransformerManager :
                                        mTransformerManager;

        if (mgr == null) {

            return null; // no manager, no transform

        } else {

            return mgr.transform(   loader,
                                    classname,
                                    classBeingRedefined,
                                    protectionDomain,
                                    classfileBuffer);

        }

    }


  public byte[]

    transform(  ClassLoader         loader,
                String              classname,
                Class               classBeingRedefined,
                ProtectionDomain    protectionDomain,
                byte[]              classfileBuffer) {

        boolean someoneTouchedTheBytecode = false;
        TransformerInfo[]  transformerList = getSnapshotTransformerList();
        byte[]  bufferToUse = classfileBuffer;

        // order matters, gotta run 'em in the order they were added

        for ( int x = 0; x < transformerList.length; x++ ) {

            TransformerInfo         transformerInfo = transformerList[x];
            ClassFileTransformer    transformer = transformerInfo.transformer();
            byte[]                  transformedBytes = null;

            try {

                transformedBytes = transformer.transform(   loader,
                                                            classname,
                                                            classBeingRedefined,
                                                            protectionDomain,
                                                            bufferToUse);

            }

            catch (Throwable t) {

                // don't let any one transformer mess it up for the others.
                // This is where we need to put some logging. What should go here? FIXME

            }


            if ( transformedBytes != null ) {
                someoneTouchedTheBytecode = true;
                bufferToUse = transformedBytes;
            }

        }


        // if someone modified it, return the modified buffer.
        // otherwise return null to mean "no transforms occurred"

        byte [] result;

        if ( someoneTouchedTheBytecode ) {
            result = bufferToUse;
        }
        else {
            result = null;
        }

        return result;

    }   

The above is the Java code that is finally reached, and you can see that it calls into the javaagent code we wrote. We generally implement a ClassFileTransformer class, create an instance, and register it in the corresponding TransformerManager.
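As an illustration of the agent-side half of this call chain, here is a minimal ClassFileTransformer sketch. It only records the names of the classes it is offered and returns null, which TransformerManager treats as "no change". The class name `LoggingTransformer` and the prefix filter are assumptions for the example.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical transformer that only observes classes under a given prefix.
class LoggingTransformer implements ClassFileTransformer {
    private final String prefix;   // internal-form prefix, e.g. "com/example/"
    final List<String> seen = Collections.synchronizedList(new ArrayList<>());

    LoggingTransformer(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // className uses '/' separators and can be null for some generated classes.
        if (className != null && className.startsWith(prefix)) {
            seen.add(className);
        }
        // Returning null signals that this transformer left the bytecode untouched.
        return null;
    }
}
```

An agent would typically register such a transformer from premain or agentmain with `inst.addTransformer(new LoggingTransformer("com/example/"))`.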

Implementation of Class Transform

The class transform discussed here is transform in a narrow sense: it covers the case where a class file is to be transformed when it is first loaded. When the class file is loaded, a ClassFileLoadHook event is issued and handed to the instrument agent, which calls the ClassFileTransformer implementations registered by the javaagent to modify the bytecode.

Implementation of Class Redefine

Class redefinition is one of the basic functions provided by Instrumentation and applies to classes that have already been loaded. To modify such a class we must know two things: which class to modify, and what the class should look like afterwards. With these two pieces of information, the change can be made through the redefineClasses method of InstrumentationImpl:

public void redefineClasses(ClassDefinition[] definitions) throws ClassNotFoundException {
    if (!isRedefineClassesSupported()) {
        throw new UnsupportedOperationException("redefineClasses is not supported in this environment");
    }
    if (definitions == null) {
        throw new NullPointerException("null passed as 'definitions' in redefineClasses");
    }
    for (int i = 0; i < definitions.length; ++i) {
        if (definitions[i] == null) {
            throw new NullPointerException("element of 'definitions' is null in redefineClasses");
        }
    }
    if (definitions.length == 0) {
        return; // short-circuit if there are no changes requested
    }

    redefineClasses0(mNativeAgent, definitions);
}

In the JVM this is implemented by creating a VM_Operation of type VM_RedefineClasses; note that executing it causes a stop-the-world pause:

jvmtiError
JvmtiEnv::RedefineClasses(jint class_count, const jvmtiClassDefinition* class_definitions) {
//TODO: add locking
  VM_RedefineClasses op(class_count, class_definitions, jvmti_class_load_kind_redefine);
  VMThread::execute(&op);
  return (op.check_error());
} /* end RedefineClasses */

I will describe the process in words as clearly as I can rather than post the code, because there is quite a lot of it:

  • Iterate over the jvmtiClassDefinitions to be redefined, one by one
  • Read the new bytecode. If any agent has subscribed to the ClassFileLoadHook event, the corresponding transform is also run to modify the new bytecode
  • Parse the bytecode and create a klassOop object
  • Compare the old and new classes, which must satisfy the following requirements:
    • The parent class is the same
    • The implemented interfaces are the same, in number and identity
    • The class access flags are the same
    • The number of fields and the field names are the same
    • New methods must be private static/final
    • Existing methods can be modified or deleted
  • Verify the bytecode of the new class
  • Merge the constant pools of the old and new classes
  • If there are breakpoints on the old class, clear them all
  • Deoptimize JIT-compiled code that depends on the old class
  • Match old and new methods and update the jmethodIDs so that each old jmethodID refers to the new method
  • Point the holder of the new class's constant pool at the old class
  • Swap some attributes of the new and old classes, such as the constant pool, methods and inner classes
  • Initialize the new vtable and itable
  • Swap the annotations on methods, fields and parameters
  • Traverse all subclasses of the current class and update their vtables and itables

That is the basic process. Overall, only the content inside the class is updated; in other words, only the content the pointer refers to is changed, not the pointer itself, which avoids the cost of traversing a large number of existing class objects to update them.
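The comparison rules for the old and new class can be approximated in plain Java with reflection. The sketch below is only an illustration of the kind of check involved, not HotSpot's implementation: `RedefineShapeCheck` and the nested sample classes are made up, and two differently named classes stand in for the old and new version of one class.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class RedefineShapeCheck {
    // True when newClass keeps the superclass, interfaces, class modifiers
    // and declared fields of oldClass; method bodies are free to differ.
    static boolean sameShape(Class<?> oldClass, Class<?> newClass) {
        if (oldClass.getSuperclass() != newClass.getSuperclass()) return false;
        if (!Arrays.equals(oldClass.getInterfaces(), newClass.getInterfaces())) return false;
        if (oldClass.getModifiers() != newClass.getModifiers()) return false;
        return fieldSignatures(oldClass).equals(fieldSignatures(newClass));
    }

    // Describes each declared field as "modifiers type name".
    static Set<String> fieldSignatures(Class<?> c) {
        Set<String> result = new TreeSet<>();
        for (Field f : c.getDeclaredFields()) {
            result.add(f.getModifiers() + " " + f.getType().getName() + " " + f.getName());
        }
        return result;
    }

    // Sample "versions": V2 only changes a method body, V3 adds a field.
    static class V1 implements Runnable { int count; public void run() { } }
    static class V2 implements Runnable { int count; public void run() { count++; } }
    static class V3 implements Runnable { int count; long extra; public void run() { } }
}
```

Here `sameShape(V1.class, V2.class)` is true because only a method body changed, while `sameShape(V1.class, V3.class)` is false because a field was added.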

Implementation of Class Retransform

Retransforming a class can be understood simply as a rollback operation. Which version it rolls back to depends on the situation. All of the cases below share one premise: the javaagent has requested the retransform capability.

  • If the class was transformed when it was first loaded, a retransform rolls the code back to the code as it was after that transform.
  • If the class was loaded for the first time without any changes, a retransform rolls the code back to the bytecode in the original class file.
  • If the class has already been loaded and has been redefined several times (for example, by another agent), and a newly loaded agent then requests the retransform capability and redefines the class, a retransform rolls the code back to the bytecode as it was after the previous agent's last redefine.

The parameters of the retransformClasses method of InstrumentationImpl also suggest that a rollback is being performed, because only the classes are specified, with no new bytecode:

    public void retransformClasses(Class<?>[] classes) {
        if (!isRetransformClassesSupported()) {
            throw new UnsupportedOperationException("retransformClasses is not supported in this environment");
        }
        retransformClasses0(mNativeAgent, classes);
    }

Retransform is in fact implemented on top of redefine. The difference lies in the class loading step, specifically in which transformers are applied: during a retransform, the ClassFileTransformers registered in the normal TransformerManager are skipped and only those in the TransformerManager dedicated to retransform are run. Otherwise the bytecode would be silently changed to some intermediate state.

private:
  void post_all_envs() {
    if (_load_kind != jvmti_class_load_kind_retransform) {
      // for class load and redefine,
      // call the non-retransformable agents
      JvmtiEnvIterator it;
      for (JvmtiEnv* env = it.first(); env != NULL; env = it.next(env)) {
        if (!env->is_retransformable() && env->is_enabled(JVMTI_EVENT_CLASS_FILE_LOAD_HOOK)) {
          // non-retransformable agents cannot retransform back,
          // so no need to cache the original class file bytes
          post_to_env(env, false);
        }
      }
    }
    JvmtiEnvIterator it;
    for (JvmtiEnv* env = it.first(); env != NULL; env = it.next(env)) {
      // retransformable agents get all events
      if (env->is_retransformable() && env->is_enabled(JVMTI_EVENT_CLASS_FILE_LOAD_HOOK)) {
        // retransformable agents need to cache the original class file
        // bytes if changes are made via the ClassFileLoadHook
        post_to_env(env, true);
      }
    }
  }
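The rollback behaviour that follows from post_all_envs can be pictured with a toy model. In the sketch below, strings stand in for class file bytes and functions stand in for transformers; `RetransformModel` and all of its members are invented for illustration and greatly simplify what the JVM does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model: strings stand in for class file bytes, functions for transformers.
class RetransformModel {
    final List<UnaryOperator<String>> normal = new ArrayList<>();          // not retransform-capable
    final List<UnaryOperator<String>> retransformable = new ArrayList<>(); // retransform-capable
    private String cached;   // bytes as they were before the retransform-capable group ran

    // Class load or redefine: both groups run, the normal group first.
    String loadOrRedefine(String classBytes) {
        String bytes = classBytes;
        for (UnaryOperator<String> t : normal) {
            bytes = t.apply(bytes);
        }
        cached = bytes;                       // what a later retransform starts from
        return applyRetransformable(bytes);
    }

    // Retransform: start again from the cached bytes; only the
    // retransform-capable group runs, the normal group is skipped.
    String retransform() {
        return applyRetransformable(cached);
    }

    private String applyRetransformable(String bytes) {
        for (UnaryOperator<String> t : retransformable) {
            bytes = t.apply(bytes);
        }
        return bytes;
    }
}
```

If a normal transformer appends `+A` and a retransform-capable one appends `+R1`, loading `orig` gives `orig+A+R1`. Removing the retransform-capable transformer and calling `retransform()` then gives `orig+A`, which is the "rollback" described earlier.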

Other niche features of javaagent

In addition to modifying bytecode, javaagent has a few smaller features that are sometimes quite useful.

  • Get all loaded classes: Class[] getAllLoadedClasses();
  • Get all classes for which a given loader is an initiating loader: Class[] getInitiatedClasses(ClassLoader loader);
  • Get the size of an object: long getObjectSize(Object objectToSize);
  • Add a jar to the bootstrap classpath so that its classes are loaded by the bootstrap class loader: void appendToBootstrapClassLoaderSearch(JarFile jarfile);
  • Add a jar to the classpath for the AppClassLoader to load: void appendToSystemClassLoaderSearch(JarFile jarfile);
  • Set a prefix for native methods, which is used for rule matching when native methods are resolved: void setNativeMethodPrefix(ClassFileTransformer transformer, String prefix);
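A short sketch of how two of these methods might be used once an agent holds the Instrumentation object it received in premain or agentmain. The helper class `LoadedClassQueries` and its method names are hypothetical; only `getAllLoadedClasses` and `getObjectSize` come from the Instrumentation interface.

```java
import java.lang.instrument.Instrumentation;
import java.util.ArrayList;
import java.util.List;

class LoadedClassQueries {
    // Names of all currently loaded classes whose name starts with the given prefix.
    static List<String> loadedClassNames(Instrumentation inst, String prefix) {
        List<String> names = new ArrayList<>();
        for (Class<?> c : inst.getAllLoadedClasses()) {
            if (c.getName().startsWith(prefix)) {
                names.add(c.getName());
            }
        }
        return names;
    }

    // Size of one object as reported by the JVM; the value is an
    // implementation-specific approximation and excludes referenced objects.
    static long shallowSize(Instrumentation inst, Object o) {
        return inst.getObjectSize(o);
    }
}
```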
