JVM Sandbox Introductory Tutorial and Principles

In day-to-day business development we often work with AOP, most famously Spring AOP. We use it for cross-cutting concerns such as login verification, logging, performance monitoring, and global filters. But Spring AOP has a limitation: not every class is managed by the Spring container. Much middleware code, third-party library code, and the JDK's own code cannot be proxied by Spring AOP. So as soon as the aspect logic you want falls outside Spring's jurisdiction, or you want aspects that are not tied to Spring at all, Spring AOP cannot deliver it.

So is there a more general AOP approach for Java backend applications? The answer is yes. Java itself provides JVM TI, Instrumentation, and related facilities that let users exert fine-grained control over the JVM through a set of APIs. Many well-known frameworks have been built on top of them, such as BTrace and Arthas, helping developers implement increasingly sophisticated functionality.

JVM Sandbox is one of them. Different frameworks have different design goals, of course; JVM-Sandbox aims to provide an AOP solution that requires neither restarting the target JVM application nor intruding into its code.

If this still sounds abstract, don't worry. Here are a few typical JVM-Sandbox application scenarios:

  • Traffic recording and replay: How do you record the input and output parameters of every interface request of an online application? Changing the application code would work, but the cost is too high. With JVM-Sandbox you can capture the interface's inputs and outputs directly, without modifying the code.

  • Security vulnerability hotfix: Suppose a third-party package (such as the well-known fastjson) has yet another vulnerability. If the many applications in the group each have to release a new version to fix it one by one, the vulnerability may already have caused a lot of damage by the time they finish. With JVM-Sandbox you can modify or replace the vulnerable code directly and stop the loss in time.

  • Interface failure simulation: JVM-Sandbox can easily simulate an interface that times out for 5 s and then returns false.

  • Fault localization: functionality similar to Arthas.

  • Interface rate limiting: dynamically limit the rate of a specified interface.

  • Log printing

  • ...

It can be seen that with the help of JVM-Sandbox, you can achieve many things that you could not do in business code before, greatly expanding the scope of operation.

This article revolves around JVM SandBox and mainly introduces the following contents:

  • The birth background of JVM SandBox

  • JVM SandBox architecture design

  • JVM SandBox code practice

  • JVM SandBox underlying technology

  • Summary and Outlook

The birth background of JVM Sandbox

The technical background of JVM Sandbox was covered in the introduction; what follows is the business background behind the author's development of the framework. JVM Sandbox is a form of AOP implementation, so some readers may ask: why did Alibaba "reinvent the wheel" when a mature solution like Spring AOP already existed? The answer lies in how JVM SandBox came about. In mid-2016, Tmall Double Eleven drove a large number of changes to Alibaba's internal business systems. This coincided with a reorganization of the team of Xu Dongchen (an Alibaba test development expert). Test resources were seriously insufficient, which forced the team to look for a more precise and more convenient way to run regression verification on old business logic. The development team was facing a newly inherited legacy system whose code architecture could hardly meet testability requirements, and many existing test frameworks could not be applied to it, so new testing ideas and a new testing framework were needed.

Why not use Spring AOP? Its pain point is that not all business code is managed by the Spring container, so lower-level middleware code and third-party library code could not be brought into the scope of regression testing. Worse, a test framework pulls in the class libraries it depends on, and these often conflict with the class libraries of the business code. This is how JVM SandBox came into being.

JVM Sandbox Overall Architecture

This chapter does not describe the whole architecture of JVM SandBox in detail; it covers only a few of the most important features. For the detailed architecture design, please refer to the wiki of the framework's original code repository.

Class isolation

Many frameworks achieve class isolation by breaking the parent delegation model (I prefer to call it "direct-ancestor delegation"), and SandBox is no exception. It breaks the parent delegation convention with a custom SandboxClassLoader and provides several isolation guarantees:

  • Class isolation from the target application: you don't need to worry that loading the sandbox will pollute or conflict with the classes of the original application.

  • Class isolation between modules: modules do not interfere with one another, with the sandbox, or with the application.
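To make "breaking parent delegation" concrete, here is a minimal sketch of a child-first class loader. It is not the real SandboxClassLoader; the class name ChildFirstClassLoader and the rule of always delegating java.* classes to the parent are assumptions made for illustration only.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical illustration of a child-first loader; not the real SandboxClassLoader.
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                if (name.startsWith("java.")) {
                    // JDK core classes always come from the parent chain.
                    loaded = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: search our own URLs before asking the parent.
                        loaded = findClass(name);
                    } catch (ClassNotFoundException notLocal) {
                        // Not in our URLs: fall back to the usual parent-first lookup.
                        loaded = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

Because each loader instance searches its own jars first, two modules loaded by two such loaders can carry different versions of the same library without seeing each other's classes, which is the effect the isolation features above describe.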

Non-intrusive AOP and event-driven

JVM-SANDBOX is an AOP framework based on dynamic class weaving through Instrumentation. With carefully constructed bytecode enhancement logic, a sandbox module can intercept the methods of the target application at runtime, non-intrusively, without violating JDK constraints.

From the figure above, you can see that the entire execution cycle of a method is "enhanced" by the woven code. The benefit is that, when using JVM SandBox, you only need to handle the events of a method.

// BEFORE
try {

    /*
     * do something...
     */

    // RETURN
    return;

} catch (Throwable cause) {
    // THROWS
}

In the sandbox's view of the world, any Java method call can be decomposed into three phases, BEFORE, RETURN, and THROWS, and each phase gives rise to its own event detection and flow control mechanism.

By separating the BEFORE, RETURN, and THROWS events of these three phases, a sandbox module can perform many kinds of AOP operations:

  1. It can observe and change the input parameters of a method call

  2. It can observe and change the return value and the thrown exception of a method call

  3. It can change the flow of method execution:

       • Return a custom result object directly before the method body executes, so that the original method code is not executed

       • Construct a new result object before the method body returns, or even switch to throwing an exception

       • Throw a new exception after the method body throws one, or even switch to returning normally
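The flow control described above can be imitated in plain Java, which may help before reading real module code. The sketch below is not the JVM-Sandbox API: EventFlowSketch, Listener, and invoke are hypothetical names, and the "original method" is just a Function wrapped by hand rather than woven bytecode.

```java
import java.util.function.Function;

// Hypothetical names throughout; this mimics the idea, not the JVM-Sandbox API.
class EventFlowSketch {

    interface Listener {
        // BEFORE: may edit args in place; a non-null result skips the original method.
        default Object before(Object[] args) { return null; }
        // RETURN: may replace the returned value (or throw instead).
        default Object onReturn(Object result) { return result; }
        // THROWS: may rethrow, throw something else, or return a value instead.
        default Object onThrows(RuntimeException cause) { throw cause; }
    }

    static Object invoke(Function<Object[], Object> original, Listener listener, Object... args) {
        Object early = listener.before(args);          // BEFORE
        if (early != null) {
            return early;                              // original method never runs
        }
        Object result;
        try {
            result = original.apply(args);             // original method body
        } catch (RuntimeException cause) {
            return listener.onThrows(cause);           // THROWS
        }
        return listener.onReturn(result);              // RETURN
    }

    public static void main(String[] unused) {
        Function<Object[], Object> divide = a -> (Integer) a[0] / (Integer) a[1];
        Listener safeDivide = new Listener() {
            @Override public Object onThrows(RuntimeException cause) { return 0; }
        };
        System.out.println(invoke(divide, safeDivide, 6, 3)); // normal path
        System.out.println(invoke(divide, safeDivide, 6, 0)); // exception turned into a return value
    }
}
```

Each of the three callbacks corresponds to one phase, and each can either let the call proceed or redirect it, which covers the cases in the list: changing arguments, returning early, replacing the result, and converting between an exception and a normal return.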

Everything is event-driven. This may sound confusing for now, but the hands-on section below should make it clearer.

JVM Sandbox code practice

I have moved the hands-on chapter forward so that everyone can quickly see how comfortable development with JVM SandBox is (compared with working directly with bytecode replacement tools).

Version used: JVM-Sandbox 1.2.0

Official source code: https://github.com/alibaba/jvm-sandbox

Let's implement a small tool. In daily work we often encounter huge Spring projects containing many beans and a lot of business code. Starting such a project may take 5 minutes or more, which seriously hurts development efficiency.

We will use JVM Sandbox to build a tool that collects statistics on the startup time of the application's Spring beans. With it, the main cause of a slow startup can be found at a glance, and blind optimization can be avoided.

The final effect is as follows:

The figure shows the time spent on each Spring bean from application launch until all beans have started, sorted from high to low. Because this is a demo application, the beans take little time (and there are not many business beans), but in a real application many beans take several seconds or even more than ten seconds to initialize, and those can be optimized in a targeted way.

How do we implement this tool with JVM SandBox? It is actually very simple.

Here is the overall approach first:

First, create a new Maven project and reference JVM SandBox in its Maven dependencies. For standalone projects, the official recommendation is to use the parent approach.

<parent>
    <groupId>com.alibaba.jvm.sandbox</groupId>
    <artifactId>sandbox-module-starter</artifactId>
    <version>1.2.0</version>
</parent>

Create a new class as a JVM SandBox module, as shown below:

Use @Information to declare the mode as AGENT. There are two modes, Agent and Attach:

  • Agent: started together with the JVM

  • Attach: dynamically injected into an already running JVM process

Since we are monitoring JVM startup data, we need AGENT mode.

Second, implement com.alibaba.jvm.sandbox.api.Module and com.alibaba.jvm.sandbox.api.ModuleLifecycle.

Of these, ModuleLifecycle contains the lifecycle callbacks of the whole module:

  • onLoad: called before the module starts loading. Module loading is the beginning of the module lifecycle and is invoked only once in the module's lifetime. Throwing an exception here is the only way to prevent the module from being loaded. If loading is judged to have failed, all resources requested in advance are released and the sandbox will not be aware of the module.

  • onUnload: called before the module starts unloading. Module unloading is the end of the module lifecycle and is invoked only once in the module's lifetime. Throwing an exception here is the only way to prevent the module from being unloaded. If unloading is judged to have failed, no resources are closed or released early, and the module keeps working normally.

  • onActive: called when the module is activated. After activation, the classes enhanced by the module become active and every com.alibaba.jvm.sandbox.api.listener.EventListener starts receiving the corresponding events.

  • onFrozen: called when the module is frozen. After freezing, every com.alibaba.jvm.sandbox.api.listener.EventListener held by the module goes silent and no longer receives events. Note that although events are no longer delivered, the enhancement code that the sandbox wove into the corresponding classes is still in place.

  • loadCompleted: called after the module has finished loading. It is the callback fired once the module has completed loading and allocating all its resources, and it is invoked only once in the module's lifetime. Throwing an exception here does not change the fact that the module was loaded successfully. Any module-based operation can be performed in this callback.

The most commonly used one is loadCompleted, so we override the loadCompleted method and start the thread of our monitoring class, SpringBeanStartMonitor, in it.

The core code of SpringBeanStartMonitor is as follows:

Use Sandbox's doClassFilter to select the matching classes; here that is BeanFactory.

Use doMethodFilter to select the method to monitor; here that is initializeBean.

initializeBean serves as the entry point for timing. The specific reason for choosing this method has to do with the startup lifecycle of Spring beans, which is beyond the scope of this article. (Author of this article: Man Sandaojiang)

Then call moduleEventWatcher.watch(springBeanFilter, springBeanInitListener, Event.Type.BEFORE, Event.Type.RETURN);

This binds our springBeanInitListener listener to the observed method, so every call to initializeBean goes through our listener logic.

The main logic of the listener is as follows:

The code is a bit long, so there is no need to read it closely. The key point is that the aspect logic described above runs on the BeforeEvent (before entering) and the ReturnEvent (after returning normally) of the original method. Here I use a Map to store the initialization start and end time of each bean, and compute the initialization time at the end.
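The bookkeeping behind such a listener can be sketched independently of the sandbox API. The class below is a hypothetical reconstruction, not the author's listener: BeanInitTimer and its method names are invented here, and it assumes the listener can obtain the bean name and a timestamp from each BEFORE and RETURN event.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: records per-bean initialization time from BEFORE/RETURN callbacks.
class BeanInitTimer {
    private final Map<String, Long> startNanos = new ConcurrentHashMap<>();
    private final Map<String, Long> costNanos = new ConcurrentHashMap<>();

    // Call from the BEFORE event of the monitored method.
    void onBefore(String beanName, long nowNanos) {
        startNanos.put(beanName, nowNanos);
    }

    // Call from the RETURN event; ignores beans that have no recorded start.
    void onReturn(String beanName, long nowNanos) {
        Long start = startNanos.remove(beanName);
        if (start != null) {
            costNanos.merge(beanName, nowNanos - start, Long::sum);
        }
    }

    // Snapshot of (bean name, elapsed nanoseconds), slowest bean first.
    List<Map.Entry<String, Long>> slowestFirst() {
        List<Map.Entry<String, Long>> rows = new ArrayList<>();
        costNanos.forEach((name, cost) -> rows.add(new SimpleImmutableEntry<>(name, cost)));
        rows.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return rows;
    }
}
```

In a real module the two callbacks would be driven by the event listener, passing System.nanoTime() as the timestamp; slowestFirst() then yields the kind of high-to-low ranking shown in the result figure.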

Finally, we need a way to know that the original Spring application has finished starting, so that we can unload our Sandbox module ourselves. After all, it has fulfilled its mission and no longer needs to stay attached to the main process.

We use a simple method: check whether http://127.0.0.1:8080/ returns a status code below 500 to decide whether the Spring container has started. Of course, if your Spring application does not use a web framework, this method cannot tell you when startup is complete; you may be able to achieve that through Spring's own lifecycle hooks instead. I am being lazy here.
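A startup check along these lines can be written with the JDK's HttpURLConnection. This is a sketch under assumptions, not the code from the author's repository: StartupProbe, its method names, and the timeout and polling parameters are all invented for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper that polls an HTTP endpoint until the application answers.
class StartupProbe {

    // Any real HTTP status below 500 is treated as "the container is up".
    static boolean looksStarted(int statusCode) {
        return statusCode >= 100 && statusCode < 500;
    }

    // One probe; connection failures simply mean "not started yet".
    static boolean probe(String url, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            return looksStarted(conn.getResponseCode());
        } catch (IOException notReadyYet) {
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    // Polls up to 'attempts' times, sleeping between probes.
    static boolean waitUntilStarted(String url, int attempts, long intervalMillis)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (probe(url, 1000)) {
                return true;
            }
            Thread.sleep(intervalMillis);
        }
        return false;
    }
}
```

A monitoring thread could call waitUntilStarted("http://127.0.0.1:8080/", 600, 1000) and, once it returns true, print the statistics and unload the module.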

That completes the SpringBean monitoring module. As you can see, developing it feels almost no different from everyday business development, and that is the biggest benefit JVM Sandbox brings.

The source code above is in my GitHub repository:

https://github.com/monitor4all/javaMonitor

JVM Sandbox underlying technology

That mostly wraps up the introduction to using JVM Sandbox. Several JVM technical terms came up above that readers may have heard of without fully understanding. Below is a brief explanation of the most important concepts and how they relate to one another, so that the underlying implementation of JVM Sandbox is easier to understand.

JVMTI

JVMTI (JVM Tool Interface) is a native programming interface provided by the Java virtual machine. JVMTI can be used to develop and monitor the virtual machine, view the internal state of the JVM, and control the execution of the JVM application. The functions that can be realized include but are not limited to: debugging, monitoring, thread analysis, coverage analysis tools, etc.

Many Java monitoring and diagnostic tools are built on it. Arthas, jinfo, BTrace, and others all have JVM TI at the bottom, but they also use a higher-level facility, JavaAgent.

JavaAgent and Instrumentation

javaagent is an option of the java command; it specifies a jar package.

-agentlib:<libname>[=<options>]    load native agent library <libname>, e.g. -agentlib:hprof; see also -agentlib:jdwp=help and -agentlib:hprof=help
-agentpath:<pathname>[=<options>]  load native agent library by full pathname
-javaagent:<jarpath>[=<options>]   load Java programming language agent, see java.lang.instrument

The java.lang.instrument mentioned in the -javaagent option above is a package defined in rt.jar (in JDK 8 and earlier; since JDK 9 it lives in the java.instrument module). It provides tools that let developers dynamically modify classes in the system while a Java program is running. A key component for using this package is Javaagent. Judging by the name it looks like a "Java agent", but in practice it works more like a class transformer: it can accept new external requests and modify classes at runtime.

The underlying implementation of Instrumentation depends on JVMTI.

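An agent class may declare its entry method with two different signatures (listed in the Attach and Agent section below). The JVM first tries to load the method whose signature takes an Instrumentation parameter. If that succeeds, the second, one-argument method is ignored; if the first is not available, the second is loaded.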

Interfaces supported by Instrumentation:

public interface Instrumentation {

    // Add a ClassFileTransformer;
    // classes loaded afterwards pass through this transformer
    void addTransformer(ClassFileTransformer transformer, boolean canRetransform);

    void addTransformer(ClassFileTransformer transformer);

    // Remove a ClassFileTransformer
    boolean removeTransformer(ClassFileTransformer transformer);

    boolean isRetransformClassesSupported();

    // Run already-loaded classes through the registered ClassFileTransformers again.
    // Retransformation may change method bodies, but may not change method
    // signatures or add/remove methods or fields
    void retransformClasses(Class<?>... classes) throws UnmodifiableClassException;

    boolean isRedefineClassesSupported();

    // Redefine a class
    void redefineClasses(ClassDefinition... definitions)
        throws ClassNotFoundException, UnmodifiableClassException;

    boolean isModifiableClass(Class<?> theClass);

    @SuppressWarnings("rawtypes")
    Class[] getAllLoadedClasses();

    @SuppressWarnings("rawtypes")
    Class[] getInitiatedClasses(ClassLoader loader);

    long getObjectSize(Object objectToSize);

    void appendToBootstrapClassLoaderSearch(JarFile jarfile);

    void appendToSystemClassLoaderSearch(JarFile jarfile);

    boolean isNativeMethodPrefixSupported();

    void setNativeMethodPrefix(ClassFileTransformer transformer, String prefix);
}

Limitations of Instrumentation:

  • You cannot define a class that does not already exist by supplying a bytecode file and a custom class name

  • The enhanced class and the original class must obey many restrictions: for example, they must have the same parent class; they must implement the same number of interfaces, and the same ones; their access modifiers must be consistent; they must have the same number of fields with the same names; and any methods added or removed must be private static or private final.

Talk about Attach and Agent

The difference between Attach and Agent was mentioned in the hands-on chapter above; let's look at it more closely here.

In Instrumentation, Agent mode is enabled with -javaagent:<jarpath>[=<options>] and instruments the application from the moment it starts. It requires the specified class to have a premain() method, whose signature must match one of the following two forms:

public static void premain(String agentArgs, Instrumentation inst)
public static void premain(String agentArgs)
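As an illustration of the first signature, here is a minimal Agent-mode sketch that uses only java.lang.instrument. The names LoadCountingAgent and CountingTransformer and the default prefix com/example/ are hypothetical. The transformer only counts classes and returns null, which tells the JVM to leave the bytecode unchanged. The class is package-private here so that the snippet compiles under any file name; a real agent class would normally be declared public in its own file and named in the Premain-Class attribute of the agent jar's manifest.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical agent: counts the classes under a package prefix as they are loaded.
class LoadCountingAgent {
    static final AtomicLong SEEN = new AtomicLong();

    static final class CountingTransformer implements ClassFileTransformer {
        private final String internalPrefix; // internal form, e.g. "com/example/"

        CountingTransformer(String internalPrefix) {
            this.internalPrefix = internalPrefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null && className.startsWith(internalPrefix)) {
                SEEN.incrementAndGet();
            }
            return null; // null means "no change to the class bytes"
        }
    }

    // Invoked by the JVM before main() when started with -javaagent:<jar>=<prefix>.
    public static void premain(String agentArgs, Instrumentation inst) {
        String prefix = (agentArgs == null || agentArgs.isEmpty()) ? "com/example/" : agentArgs;
        inst.addTransformer(new CountingTransformer(prefix));
        Runtime.getRuntime().addShutdownHook(new Thread(
                () -> System.out.println("classes seen under " + prefix + ": " + SEEN.get())));
    }
}
```

A framework that actually enhances methods would return modified bytes from transform instead of null; the registration step through premain stays the same.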

There is no limit on the number of -javaagent options a java program can take, so any number of Java agents can be added. All agents are executed in the order in which you specify them, for example:

java -javaagent:agent1.jar -javaagent:agent2.jar -jar MyProgram.jar

The Agent-mode Instrumentation described above was introduced in JDK 1.5. JDK 1.6 added Attach-mode Instrumentation, which requires an agentmain method with the following signatures:

public static void agentmain (String agentArgs, Instrumentation inst)
public static void agentmain (String agentArgs)
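In Attach mode the target classes are usually loaded already, so an agent typically registers a retransform-capable transformer and then asks the JVM to retransform the classes it cares about. The sketch below shows that pattern with java.lang.instrument only. AttachAgentSketch, selectTargets, and the default prefix com.example. are hypothetical, and the transformer merely prints the class name and returns null (no bytecode change). It assumes the agent jar's manifest declares Agent-Class and Can-Retransform-Classes: true.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

// Hypothetical attach-mode agent: re-runs a transformer over already-loaded classes.
class AttachAgentSketch {

    // Picks the loaded classes whose binary name starts with the given package prefix.
    static List<Class<?>> selectTargets(Class<?>[] loaded, String packagePrefix) {
        List<Class<?>> targets = new ArrayList<>();
        for (Class<?> candidate : loaded) {
            if (candidate.getName().startsWith(packagePrefix)) {
                targets.add(candidate);
            }
        }
        return targets;
    }

    // Invoked by the JVM when the agent jar is loaded into a running process.
    public static void agentmain(String agentArgs, Instrumentation inst)
            throws UnmodifiableClassException {
        String prefix = (agentArgs == null || agentArgs.isEmpty()) ? "com.example." : agentArgs;
        ClassFileTransformer observer = new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                System.out.println("retransforming " + className);
                return null; // leave the bytecode unchanged
            }
        };
        inst.addTransformer(observer, true); // true = may be used for retransformation
        try {
            for (Class<?> target : selectTargets(inst.getAllLoadedClasses(), prefix)) {
                if (inst.isModifiableClass(target)) {
                    inst.retransformClasses(target);
                }
            }
        } finally {
            inst.removeTransformer(observer);
        }
    }
}
```

The retransformation limits quoted earlier (method bodies may change, signatures and members may not) apply to whatever bytes such a transformer returns.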

The two modes serve different purposes. Generally speaking, Attach mode suits dynamically modifying code behavior and is often used when troubleshooting. Agent mode starts with the application, so it is often used for enhancements put in place ahead of time, such as the startup observation in my hands-on example above, application firewalls, rate limiting strategies, and so on.

Summary

This article has briefly covered the capabilities, practical usage, and basic principles of JVM Sandbox. By wrapping the low-level JVM control facilities, it makes AOP development at the JVM level extremely simple. As its author put it, "JVM-SANDBOX can help you do a lot more; it all depends on how far your imagination goes."

I have also used it to build many small tools at my company, such as the application startup observation above (the company has a more stable and more complex version that also monitors a large amount of middleware data), which has helped colleagues in many departments speed up the startup of their applications. So if you are interested in the JVM, open your mind and think about where JVM Sandbox could help your own work.


Origin blog.csdn.net/dongjia9/article/details/130105535