Douyin BoostMultiDex Optimization in Practice: Cutting First-Launch Time on Low Android Versions by 80% (Part 1)

As we know, devices running low Android versions (4.x and below, SDK < 21) use the Dalvik virtual machine as their Java runtime. Compared with higher versions, its biggest problem is that the first cold start after an install or upgrade takes a very long time. It often takes tens of seconds or even minutes, and the user has to stare at a blank screen the whole time before the app becomes usable.

This is a very poor user experience. Our online data also shows that devices on Android 4.x and below still account for a certain share of new users, yet far fewer of these users are retained than are added. Overseas in particular, for example in Southeast Asia and Latin America, a considerable number of low-end devices remain in use. Users on 4.x and below are a relatively small share, but for apps with hundreds of millions of users such as Douyin and TikTok, even a 10% share amounts to tens of millions of users. So if we want to expand into lower-tier markets, the usage and upgrade experience of these users absolutely cannot be ignored.

The root cause is that MultiDex takes too long the first time it runs after an install or upgrade. To solve this, we dug into the underlying system mechanisms, redesigned the DEX-related processing logic for the Dalvik virtual machine, and ultimately produced the BoostMultiDex solution. It cuts the black-screen wait by more than 80% and rescues the upgrade and install experience for users on low Android versions.

Let's first take a quick look at comparative data for DEX loading time on the first cold start after installation:

Android version   Manufacturer   Model      Original MultiDex (s)   BoostMultiDex (s)
4.4.2             LG             LGMS323    33.545                  5.014
4.3.0             Samsung        SGH-T999   30.331                  3.791
4.2.1             HUAWEI         G610-U00   36.465                  4.981
4.1.2             Samsung        I9100      30.962                  5.345

As you can see, the original MultiDex solution takes more than half a minute to finish loading DEX, while BoostMultiDex needs only about 4 to 5 seconds. The improvement is significant.
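To make the "more than 80%" figure concrete, the following is a minimal sketch that computes the fraction of time saved for each row of the table. The timings are copied from the table; the names Row, kRows and reduction are invented here for illustration only.

```cpp
#include <cstddef>

// One row of the comparison table (timings in seconds).
struct Row {
    const char* device;
    double originalSec;  // original MultiDex
    double boostSec;     // BoostMultiDex
};

constexpr Row kRows[] = {
    {"LG LGMS323 (4.4.2)",       33.545, 5.014},
    {"Samsung SGH-T999 (4.3.0)", 30.331, 3.791},
    {"HUAWEI G610-U00 (4.2.1)",  36.465, 4.981},
    {"Samsung I9100 (4.1.2)",    30.962, 5.345},
};

constexpr std::size_t kRowCount = sizeof(kRows) / sizeof(kRows[0]);

// Fraction of time saved when a task drops from `before` to `after` seconds.
constexpr double reduction(double before, double after) {
    return (before - after) / before;
}
```

Applied to the four rows, the reduction works out to roughly 83% to 88%, which is consistent with the headline claim.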

Next, we explain in detail how the BoostMultiDex solution was developed and how it works.

Cause

Let's start with the root cause of the problem. Several factors combine to produce it.

The first thing to understand is that before a Java class can be accessed, it must be loaded through a ClassLoader. On Android, the classes in an app are loaded by PathClassLoader. Classes live in DEX files, and a class can only be used once the DEX file that contains it has been loaded.

Because the early design of the Android DEX instruction format was imperfect, the total number of Java methods referenced in a single DEX file cannot exceed 65,536.

For today's apps, it is easy to hit this limit once the functional logic grows a little.
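As a rough illustration of how quickly the limit forces a split, here is a minimal sketch of the arithmetic. 65,536 is 2^16, which matches a 16-bit method index. The helper minDexFileCount and the constant name are hypothetical, and the result is only a lower bound, because real build tools also split on other constraints.

```cpp
#include <cstdint>

// A single DEX file can reference at most 2^16 = 65,536 methods.
constexpr std::uint32_t kMaxMethodRefsPerDex = 1u << 16;

// Lower bound on the number of DEX files needed for `methodRefs`
// referenced methods (ceiling division; at least one file).
constexpr std::uint32_t minDexFileCount(std::uint32_t methodRefs) {
    return methodRefs == 0
        ? 1u
        : (methodRefs + kMaxMethodRefsPerDex - 1) / kMaxMethodRefsPerDex;
}
```

For example, an app that references 400,000 methods needs at least seven DEX files under this simplified model.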

So if the number of Java methods in an app exceeds 65,536, the code cannot fit in a single DEX file, and multiple DEX files must be generated at build time. Unpacking the Douyin APK shows that it does contain several DEX files:

  8035972  00-00-1980 00:00   classes.dex
  8476188  00-00-1980 00:00   classes2.dex
  7882916  00-00-1980 00:00   classes3.dex
  9041240  00-00-1980 00:00   classes4.dex
  8646596  00-00-1980 00:00   classes5.dex
  8644640  00-00-1980 00:00   classes6.dex
  5888368  00-00-1980 00:00   classes7.dex

Android 4.4 and below use the Dalvik virtual machine. Under normal circumstances, Dalvik can only execute DEX files that have gone through OPT optimization, that is, what we usually call ODEX files.

When an APK is installed, its classes.dex is automatically ODEX-optimized, and at startup the system loads it directly into the app's PathClassLoader by default. The classes in classes.dex are therefore always directly accessible, and we do not need to worry about them.

The other DEX files, namely classes2.dex, classes3.dex, classes4.dex and so on (collectively called Secondary DEX files here), must be ODEX-optimized and loaded into the ClassLoader by the app itself before their classes can be used. Otherwise, accessing these classes throws a ClassNotFound exception and the app crashes.

For this reason, Android officially introduced MultiDex. The app only needs to call MultiDex.install directly in the earliest entry point of the program, Application.attachBaseContext. MultiDex then extracts the second and subsequent DEX files from the APK, ODEX-optimizes them, and loads them. With that, an APK with multiple DEX files can run normally.

This work happens on the first cold start after the app is installed or updated. Precisely because it takes so long, it causes the black-screen delay described at the beginning.

The original implementation

With this background, the implementation logic of MultiDex is fairly easy to follow.

First, classes2.dex, classes3.dex, classes4.dex and all other Secondary DEX files are extracted from the APK.

Then each DEX file is ZIP-compressed to produce a classesN.zip file.

Then each ZIP file is ODEX-optimized to produce a classesN.zip.odex file.

Specifically, we can see these files in the app's code_cache directory:

com.bytedance.app.boost_multidex-1.apk.classes2.dex
com.bytedance.app.boost_multidex-1.apk.classes2.zip
com.bytedance.app.boost_multidex-1.apk.classes3.dex
com.bytedance.app.boost_multidex-1.apk.classes3.zip
com.bytedance.app.boost_multidex-1.apk.classes4.dex
com.bytedance.app.boost_multidex-1.apk.classes4.zip

This is done by the DexFile.loadDex method. Given the path of the original ZIP file and the path of the ODEX file, it generates the ODEX output from the DEX inside the ZIP and finally returns a DexFile object.

Finally, the app adds these DexFile objects to the pathList of PathClassLoader, so that at run time the ClassLoader can load classes from these DEX files.

In this whole process, generating the ZIP and ODEX files is time-consuming, and the more Secondary DEX files an app has, the worse the problem gets. ODEX generation in particular requires the Dalvik virtual machine to traverse and scan the DEX file, then optimize and rewrite it into an ODEX file, which is the single biggest bottleneck.

Common optimizations

The industry already has several ways of optimizing MultiDex. Let's look at how this process is usually optimized.

Asynchronous loading

Put the basic classes needed during the startup phase into the main DEX as far as possible, and avoid depending on business code in the Secondary DEX files. Then call MultiDex.install asynchronously. At a later point, when the Secondary DEX is actually needed, if MultiDex has not finished, wait synchronously for it to complete before continuing.

This way, part of the code that follows can keep running while install executes, instead of being completely blocked. To do this, however, we must first sort out the startup logic and know exactly what can run in parallel. Moreover, the main DEX can only hold a limited amount of code, so if the startup phase has too many business dependencies, they cannot all fit in the main DEX, and the dependencies must be carefully stripped out.

In practice, therefore, the benefit of this approach is limited. If the startup phase involves too much business logic, not much code can run in parallel, and execution soon blocks on install.

Lazy loading by module

This approach was first seen at Meituan and can be regarded as an upgraded version of the previous one.

It also loads DEX asynchronously, but the difference is that the DEX files must be split by module at build time.

Generally, the code related to first-level screens (Activities, Services, Receivers and Providers) goes into the first DEX, while second- and third-level Activities and the code for infrequently used screens go into the Secondary DEX files.

Later, when a module needs to run, first check whether that module's classes have been loaded. If not, wait for install to complete before continuing.

Evidently, this approach requires heavy changes to the business code and already amounts to a rudimentary plug-in framework. In addition, to be able to check whether a module's classes are loaded, it has to inject its own Instrumentation into ActivityThread via reflection and insert its own check before an Activity is executed. This introduces device compatibility problems.

Multi-threaded loading

The native MultiDex ODEX-optimizes each DEX file in sequence. The multi-threaded idea is to run OPT on each DEX file in its own thread.

At first glance, this should make ODEX optimization run in parallel. However, our project has six Secondary DEX files in total, and we found that this approach brought almost no improvement. The reason may be that ODEX generation is itself a heavily I/O-bound operation. Running I/O from multiple threads concurrently brings no obvious benefit, and thread switching has a cost of its own.

Background-process loading

This approach mainly prevents an ANR caused by the main process spending too long on ODEX. When the app is launched, a separate non-main process is started first to do the ODEX work. Once it finishes, the main process is started, and because the ODEX files already exist, the main process can load them directly. However, this only avoids ANR in the main process; the overall wait on first launch is not reduced.

A more thorough optimization

The approaches above try to optimize at various levels, but on closer analysis none of them touches the fundamental problem, which is the MultiDex.install operation itself.

To generate the ODEX files, MultiDex.install calls DexFile.loadDex, which starts a dexopt process to convert the input DEX file to ODEX. So can the time spent on ODEX optimization be avoided altogether?

Our BoostMultiDex solution starts from exactly this point and reduces the cost of install at its root.

Our approach is to load the original DEX files directly on first launch, without OPT optimization, so that the app can start normally right away. A separate background process is then started to finish the OPT work on the DEX files at leisure, affecting normal foreground use of the app as little as possible.

The breakthrough

The difficulty, of course, is how to load the original DEX directly and avoid the blocking caused by ODEX optimization.

If we want to skip ODEX optimization and still run the app normally, the Dalvik virtual machine must execute original DEX files that have not gone through OPT. Does the virtual machine support executing DEX files directly? After all, Dalvik can execute raw DEX bytecode directly; compared with DEX, ODEX merely adds some extra analysis and optimization. So even an unoptimized DEX should, in theory, execute correctly.

Hard work pays off: after some digging, we found this hidden entry point in the Dalvik source code:

/*
 * private static int openDexFile(byte[] fileContents) throws IOException
 *
 * Open a DEX file represented in a byte[], returning a pointer to our
 * internal data structure.
 *
 * The system will only perform "essential" optimizations on the given file.
 *
 */
static void Dalvik_dalvik_system_DexFile_openDexFile_bytearray(const u4* args,
    JValue* pResult)
{
    ArrayObject* fileContentsObj = (ArrayObject*) args[0];
    u4 length;
    u1* pBytes;
    RawDexFile* pRawDexFile;
    DexOrJar* pDexOrJar = NULL;

    if (fileContentsObj == NULL) {
        dvmThrowNullPointerException("fileContents == null");
        RETURN_VOID();
    }

    /* TODO: Avoid making a copy of the array. (note array *is* modified) */
    length = fileContentsObj->length;
    pBytes = (u1*) malloc(length);

    if (pBytes == NULL) {
        dvmThrowRuntimeException("unable to allocate DEX memory");
        RETURN_VOID();
    }

    memcpy(pBytes, fileContentsObj->contents, length);

    if (dvmRawDexFileOpenArray(pBytes, length, &pRawDexFile) != 0) {
        ALOGV("Unable to open in-memory DEX file");
        free(pBytes);
        dvmThrowRuntimeException("unable to open in-memory DEX file");
        RETURN_VOID();
    }

    ALOGV("Opening in-memory DEX");
    pDexOrJar = (DexOrJar*) malloc(sizeof(DexOrJar));
    pDexOrJar->isDex = true;
    pDexOrJar->pRawDexFile = pRawDexFile;
    pDexOrJar->pDexMemory = pBytes;
    pDexOrJar->fileName = strdup("<memory>"); // Needs to be free()able.
    addToDexFileTable(pDexOrJar);

    RETURN_PTR(pDexOrJar);
}

This method loads an original DEX file without relying on an ODEX file. It does the following:

  1. It accepts a byte[] parameter, which is the bytecode of the original DEX file.
  2. It calls dvmRawDexFileOpenArray to process the byte[] and produce a RawDexFile object.
  3. It creates a DexOrJar object from the RawDexFile and registers it with the virtual machine through addToDexFileTable, so that it can be used normally afterwards.
  4. It returns the address of the DexOrJar to the upper layer, which uses it as a cookie to construct a valid DexFile object.

The upper layer can thus obtain DexFile objects for all Secondary DEX files, then call makeDexElements and insert them into the ClassLoader to complete the install operation. This avoids ODEX optimization entirely while still letting the app run normally.

Finding the entry point

This all looks promising, but we ran into an unexpected problem.

The name Dalvik_dalvik_system_DexFile_openDexFile_bytearray makes it clear that this function is a JNI method, and its Java prototype can be found in versions 4.0 through 4.3:

/*
 * Open a DEX file based on a {@code byte[]}. The value returned
 * is a magic VM cookie. On failure, a RuntimeException is thrown.
 */
native private static int openDexFile(byte[] fileContents);

In version 4.4, however, the Java layer has no corresponding native method, so we cannot call it directly from the upper layer.

An obvious idea is to use dlsym to look up the function's symbol and call it directly. Unfortunately, Dalvik_dalvik_system_DexFile_openDexFile_bytearray is static, so it is not exported. When we actually parsed libdvm.so, we did not find the symbol Dalvik_dalvik_system_DexFile_openDexFile_bytearray either.

However, since it is a JNI function, it is still registered with the virtual machine in the normal way. We can therefore find the registration table that contains it:

const DalvikNativeMethod dvm_dalvik_system_DexFile[] = {
    { "openDexFileNative",  "(Ljava/lang/String;Ljava/lang/String;I)I",
        Dalvik_dalvik_system_DexFile_openDexFileNative },
    { "openDexFile",        "([B)I",
        Dalvik_dalvik_system_DexFile_openDexFile_bytearray },
    { "closeDexFile",       "(I)V",
        Dalvik_dalvik_system_DexFile_closeDexFile },
    { "defineClassNative",  "(Ljava/lang/String;Ljava/lang/ClassLoader;I)Ljava/lang/Class;",
        Dalvik_dalvik_system_DexFile_defineClassNative },
    { "getClassNameList",   "(I)[Ljava/lang/String;",
        Dalvik_dalvik_system_DexFile_getClassNameList },
    { "isDexOptNeeded",     "(Ljava/lang/String;)Z",
        Dalvik_dalvik_system_DexFile_isDexOptNeeded },
    { NULL, NULL, NULL },
};

The dvm_dalvik_system_DexFile array has to be registered dynamically when the virtual machine starts up, so this symbol is certain to be exported.

We can therefore obtain this array through dlsym and match its elements one by one by string to find the Dalvik_dalvik_system_DexFile_openDexFile_bytearray function registered under the name openDexFile.

The code is as follows:

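    const char *name = "openDexFile";
    const char *signature = "([B)I";
    // dvm_dalvik_system_DexFile is the registration table exported by libdvm.so
    JNINativeMethod* func = (JNINativeMethod*) dlsym(handler, "dvm_dalvik_system_DexFile");
    if (func == nullptr) {
        return nullptr;  // symbol not found
    }
    while (func->name != nullptr) {
        // Compare whole strings: "openDexFile" is also a prefix of "openDexFileNative"
        if ((strcmp(name, func->name) == 0)
            && (strcmp(signature, func->signature) == 0)) {
            return reinterpret_cast<func_openDexFileBytes>(func->fnPtr);
        }
        func++;
    }
    return nullptr;  // no matching entry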
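The table scan can be tried out without a device. The sketch below uses a stand-in table instead of the real one from libdvm.so; NativeMethodEntry and findNativeMethod are hypothetical names, and the struct only assumes the same three-field layout (name, signature, function pointer) shown in the registration table above.

```cpp
#include <cstring>

// Stand-in for one entry of a native-method registration table.
struct NativeMethodEntry {
    const char* name;
    const char* signature;
    void* fnPtr;
};

// Walks a table terminated by an entry whose name is null and returns the
// function pointer registered under the exact name and signature, or nullptr.
void* findNativeMethod(const NativeMethodEntry* table,
                       const char* name,
                       const char* signature) {
    if (table == nullptr) {
        return nullptr;
    }
    for (const NativeMethodEntry* e = table; e->name != nullptr; ++e) {
        if (std::strcmp(e->name, name) == 0 &&
            std::strcmp(e->signature, signature) == 0) {
            return e->fnPtr;
        }
    }
    return nullptr;
}
```

Matching on both the full name and the signature matters here because openDexFileNative appears earlier in the table and shares the openDexFile prefix.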

Summary of the steps

To summarize, bypassing ODEX and loading DEX directly involves the following main steps:

  1. Extract the Secondary DEX files from the APK to obtain their original bytecode.
  2. Obtain the dvm_dalvik_system_DexFile array through dlsym.
  3. Search the array for the Dalvik_dalvik_system_DexFile_openDexFile_bytearray function.
  4. Call this function, passing in the DEX bytecode obtained from the APK one file at a time, to load each DEX and obtain a valid DexFile object.
  5. Add the DexFile objects to the pathList of the app's PathClassLoader.

Once these steps are complete, the classes in the Secondary DEX files can be accessed normally.

The getDex problem

However, just as we had injected the original DEX successfully and execution moved on, we immediately hit a crash on 4.4 devices:

JNI WARNING: JNI function NewGlobalRef called with exception pending
             in Ljava/lang/Class;.getDex:()Lcom/android/dex/Dex; (NewGlobalRef)
Pending exception is:
java.lang.IndexOutOfBoundsException: index=0, limit=0
 at java.nio.Buffer.checkIndex(Buffer.java:156)
 at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:157)
 at com.android.dex.Dex.create(Dex.java:129)
 at java.lang.Class.getDex(Native Method)
 at libcore.reflect.AnnotationAccess.getSignature(AnnotationAccess.java:447)
 at java.lang.Class.getGenericSuperclass(Class.java:824)
 at com.google.gson.reflect.TypeToken.getSuperclassTypeParameter(TypeToken.java:82)
 at com.google.gson.reflect.TypeToken.<init>(TypeToken.java:62)
 at com.google.gson.Gson$1.<init>(Gson.java:112)
 at com.google.gson.Gson.<clinit>(Gson.java:112)
... ...

As the trace shows, Gson uses the Class.getGenericSuperclass method, which eventually calls Class.getDex. That is a native method, implemented as follows:

JNIEXPORT jobject JNICALL Java_java_lang_Class_getDex(JNIEnv* env, jclass javaClass) {
    Thread* self = dvmThreadSelf();
    ClassObject* c = (ClassObject*) dvmDecodeIndirectRef(self, javaClass);

    DvmDex* dvm_dex = c->pDvmDex;
    if (dvm_dex == NULL) {
        return NULL;
    }
    // Already cached?
    if (dvm_dex->dex_object != NULL) {
        return dvm_dex->dex_object;
    }
    jobject byte_buffer = env->NewDirectByteBuffer(dvm_dex->memMap.addr, dvm_dex->memMap.length);
    if (byte_buffer == NULL) {
        return NULL;
    }

    jclass com_android_dex_Dex = env->FindClass("com/android/dex/Dex");
    if (com_android_dex_Dex == NULL) {
        return NULL;
    }

    jmethodID com_android_dex_Dex_create =
            env->GetStaticMethodID(com_android_dex_Dex,
                                   "create", "(Ljava/nio/ByteBuffer;)Lcom/android/dex/Dex;");
    if (com_android_dex_Dex_create == NULL) {
        return NULL;
    }

    jvalue args[1];
    args[0].l = byte_buffer;
    jobject local_ref = env->CallStaticObjectMethodA(com_android_dex_Dex,
                                                     com_android_dex_Dex_create,
                                                     args);
    if (local_ref == NULL) {
        return NULL;
    }

    // Check another thread didn't cache an object, if we've won install the object.
    ScopedPthreadMutexLock lock(&dvm_dex->modLock);

    if (dvm_dex->dex_object == NULL) {
        dvm_dex->dex_object = env->NewGlobalRef(local_ref);
    }
    return dvm_dex->dex_object;
}


Combining the stack trace with the code, the crash occurs while JNI is executing com.android.dex.Dex.create:

jobject local_ref = env->CallStaticObjectMethodA(com_android_dex_Dex,
                                                 com_android_dex_Dex_create,
                                                 args);

Because this is a JNI call, the exception raised by the call is not checked right away. When the subsequent env->NewGlobalRef call runs, it detects the exception that is still pending from the earlier call, and the error is raised there.

com.android.dex.Dex.create fails mainly because of a bad argument. The argument is a memory map taken from dvm_dex->memMap, and dvm_dex is in turn taken from the Class. In the virtual machine code, every Class corresponds to a ClassObject structure, which has this field:

struct ClassObject : Object {
... ...
    /* DexFile from which we came; needed to resolve constant pool entries */
    /* (will be NULL for VM-generated, e.g. arrays and primitive classes) */
    DvmDex*         pDvmDex;
... ...

pDvmDex is assigned during class loading here:

static void Dalvik_dalvik_system_DexFile_defineClassNative(const u4* args,
    JValue* pResult)
{
... ...

    if (pDexOrJar->isDex)
        pDvmDex = dvmGetRawDexFileDex(pDexOrJar->pRawDexFile);
    else
        pDvmDex = dvmGetJarFileDex(pDexOrJar->pJarFile);

... ...

pDvmDex is obtained from the dvmGetRawDexFileDex function, whose argument pDexOrJar->pRawDexFile was created earlier in openDexFile_bytearray; pDexOrJar is the cookie that was returned to the upper layer.

Now look at dvmGetRawDexFileDex:

INLINE DvmDex* dvmGetRawDexFileDex(RawDexFile* pRawDexFile) {
    return pRawDexFile->pDvmDex;
}

From this we can deduce that dvm_dex->memMap is the pDexOrJar->pRawDexFile->pDvmDex->memMap obtained back in openDexFile_bytearray. Could it be that memMap was never assigned when we loaded the DEX byte array?

Analyzing the code confirms this: the memMap field is assigned only in the ODEX case:

/*
 * Given an open optimized DEX file, map it into read-only shared memory and
 * parse the contents.
 *
 * Returns nonzero on error.
 */
int dvmDexFileOpenFromFd(int fd, DvmDex** ppDvmDex)
{
... ...

    // Construct memMap
    if (sysMapFileInShmemWritableReadOnly(fd, &memMap) != 0) {
        ALOGE("Unable to map file");
        goto bail;
    }

... ...

    // Assign memMap
    /* tuck this into the DexFile so it gets released later */
    sysCopyMap(&pDvmDex->memMap, &memMap);

... ...
}

Loading a bare DEX byte array never goes through this function, so memMap is never assigned. It appears that Android never properly supported openDexFile_bytearray from the start, and nothing in the system code uses it, so the problem surfaces when we force our way into this method.

Although this is a pit dug by the platform itself, we need this method, so we have to find a way to fill it.

Looking at Java_java_lang_Class_getDex again, we noticed this part:

    if (dvm_dex->dex_object != NULL) {
        return dvm_dex->dex_object;
    }

If dvm_dex->dex_object is not NULL, it is returned directly and execution never reaches the code below that reads memMap, so no exception is thrown. The fix is therefore clear: right after loading the DEX byte array, we create a dex_object ourselves and inject it into pDvmDex.

The code is as follows:

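jclass clazz = env->FindClass("com/android/dex/Dex");
// Build a com.android.dex.Dex from the raw bytes via its Dex(byte[]) constructor
jobject dex_object = env->NewGlobalRef(
        env->NewObject(clazz,
                       env->GetMethodID(clazz, "<init>", "([B)V"),
                       bytes));
// Pre-fill the cache so that Class.getDex returns early and never reads memMap
dexOrJar->pRawDexFile->pDvmDex->dex_object = dex_object;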

With this in place, the getDex exception no longer occurs.
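The control flow behind this fix can be modeled in a few lines. The sketch below is not Dalvik code: FakeDvmDex, GetDexResult and getDexModel are hypothetical stand-ins that only mirror the order of checks in Java_java_lang_Class_getDex (return the cached object first, otherwise build one from memMap), with the empty-map case standing in for the IndexOutOfBoundsException seen in the crash.

```cpp
#include <cstddef>

// Simplified stand-in for the DvmDex fields that Class.getDex touches.
struct FakeDvmDex {
    const void* mapAddr = nullptr;    // memMap.addr: only set on the ODEX path
    std::size_t mapLength = 0;        // memMap.length
    const void* dexObject = nullptr;  // cached Dex object (dex_object)
};

enum class GetDexResult { ReturnedCached, CreatedFromMap, FailedEmptyMap };

// Mirrors the order of checks in getDex: the cache is consulted first,
// so a pre-filled dexObject means memMap is never read.
GetDexResult getDexModel(FakeDvmDex& dex, const void* newlyCreated) {
    if (dex.dexObject != nullptr) {
        return GetDexResult::ReturnedCached;
    }
    if (dex.mapAddr == nullptr || dex.mapLength == 0) {
        return GetDexResult::FailedEmptyMap;  // models index=0, limit=0
    }
    dex.dexObject = newlyCreated;
    return GetDexResult::CreatedFromMap;
}
```

A DEX loaded from a byte array starts with an empty map and fails in this model until dexObject is filled in by hand, which is exactly what the injection above does.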

Summary

At this point, the scheme for loading DEX directly without waiting for ODEX optimization works end to end, and the app's first-launch time can be greatly reduced.

We are only a short distance from the final, complete solution, but that last stretch is the most dangerous and difficult. Bigger challenges lie ahead. We will break them down in detail in the next article and also show the gains that the final solution delivers. In the meantime, consider which problems have not yet been taken into account here.

The Douyin/TikTok Android infrastructure team pursues technical depth and excellence. We are hiring extensively in Shanghai, Beijing, Shenzhen and Hangzhou, and welcome you to join us in building a global app with hundreds of millions of users.

You can visit the ByteDance recruitment site to see Douyin Android openings, or contact [email protected] to ask for details or send your resume directly for an internal referral.

Stay tuned for Douyin BoostMultiDex Optimization in Practice: Cutting First-Launch Time on Low Android Versions by 80% (Part 2).


Origin juejin.im/post/5e5b9466518825494b3cd5aa