Using dynamic delivery of .so libraries to slim down Android APK install packages


As we all know, loading a .so file on Android is already a runtime operation that loads executable code, so delivering .so files dynamically carries no fundamental technical risk. Landing the technique stably in a real production project, however, still involves a lot of trouble. Based on actual project experience, this article shares the key technical points of dynamic .so delivery and the pits to avoid.

Value of the requirement

In general, the more mature an Android project is, the larger the share of Native code becomes. In the past, resource files made up most of an APK's size, but the .so files produced by Native code now account for a considerable share as well, so the value of delivering .so files dynamically is increasingly prominent. On the other hand, more and more Android projects support arm64, and Google Play makes arm64 support mandatory, so some projects have to bundle two or more ABIs (for example, one Bilibili client project supports arm32 / arm64 / x86, and used to support arm5 as well), which multiplies the size of the .so files. Delivering the .so files of non-essential ABIs dynamically has therefore become a top priority for slimming down Android projects.

In addition, some third-party SDKs ship with many .so libraries of their own (the Tencent video SDK is one example: when I integrated it, the project itself was only 15 MB while the SDK's .so files alone took 17 MB). Whether the goal is to trim the size a third-party SDK brings in, or to isolate the third-party SDK's API (business code depends only on the project's own API definitions, and the third-party SDK is plugged in through dependency injection, so replacing the SDK later only means switching the injected dependency), a dedicated dynamic .so solution is needed as technical support.

Problems that dynamic delivery must solve

At first glance, dynamic .so delivery just means loading the .so files dynamically at runtime and pulling them out of the APK, with little change to the workflow. In fact it is a complete plug-in technology, which means that every problem a plug-in framework faces has to be considered here as well. Below I go through the problems we met in actual production one by one.

1. Security issues

Whenever executable code is loaded dynamically at runtime, it must first be copied to a secure path (such as Android's /data/data directory); otherwise it is exposed to hijacking or corruption. Dynamic .so delivery has to consider the same security issue, and the best practice is to run a security check every time before a .so library is loaded. To limit the time cost of the check, you can treat the internal path as unconditionally trusted and check only once, when the .so files are copied into the internal path: if the check fails, discard the files and follow the failure path; if it passes, create a flag file, and later decide whether a check is still needed by testing whether the flag file exists. Note that this trust is an approximation. On a rooted device /data/data is not safe, and apart from hijacking, files in the internal path can also be damaged by the application's own improper file operations, leaving an incomplete plug-in. If you want absolute safety, plug-ins loaded from the internal path must be checked every time as well.

How should the security check be done? The simplest way is to record hash information for the .so files, such as a CRC or MD5 (the granularity can be each individual .so file or a compressed bundle of them), and store it either inside the APK or on the server (if it is stored on the server, the client must fetch it over a trusted channel such as HTTPS). The check then verifies that the file's hash matches the recorded value. The hash changes whenever the .so file changes, though, and adjusting the data every time is tedious. The optimization I came up with is to ensure security through signature verification similar to that of an APK install package: package the .so files as a plug-in package in APK format and sign it with an Android keystore, embed the keystore's fingerprint in the host package, and have the check verify only that the plug-in package's signature matches the built-in fingerprint. (A further optimization is to sign the plug-in package with the same keystore as the host, so the check only needs to confirm that the plug-in and the host carry the same signature.)
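As a rough illustration of the hash check plus flag file described above, here is a minimal Java sketch. The class name SoVerifier, the ".verified" flag suffix, and the choice of SHA-256 (swapped in for the CRC or MD5 mentioned in the text) are assumptions made for this example, not the project's actual implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: verifies a copied .so file once and records the result in a flag file. */
final class SoVerifier {

    /** Returns the lowercase hex SHA-256 digest of the given file. */
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Checks the file against the expected hash. On success a flag file is created next to it
     * so later loads can skip the check; on failure the file is deleted.
     */
    static boolean verifyOnce(Path soFile, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        // Assumed naming convention for the flag file: "<name>.verified"
        Path flag = soFile.resolveSibling(soFile.getFileName() + ".verified");
        if (Files.exists(flag)) {
            return true; // the internal path is treated as trusted after the first check
        }
        if (!sha256Hex(soFile).equalsIgnoreCase(expectedHex)) {
            Files.deleteIfExists(soFile); // discard the corrupt or tampered file
            return false;
        }
        Files.createFile(flag);
        return true;
    }
}
```

A caller would run verifyOnce right after copying the file into the internal path and take the failure path when it returns false.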


2. Version control issues

Like any plug-in scheme, dynamic .so delivery must deal with version control. Once the .so files are stripped out of the APK, we have to ensure not only that the files are safe, but also that they stay API-compatible with the host code that depends on them (ideally strictly the same version; at a minimum the .so must be forward compatible). If plug-in upgrades and downgrades are not a concern, the usual rule is that the .so version must match the APK version: after the host downloads the .so files for its own version, it installs them under a version-specific path; after the host is upgraded it must download the new version's .so files again and must not be affected by the old version's files already on disk. (If dynamic upgrades and downgrades are required, the last one or two versions of the .so files also need to be kept for fallback logic.)
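The version-specific install path can be sketched roughly as follows. The directory layout (one folder per host version under a common root) and the names SoVersionStore, installDir and staleVersionDirs are hypothetical choices for illustration only.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical layout: <root>/<hostVersion>/lib*.so, one directory per host APK version. */
final class SoVersionStore {
    private final File root;

    SoVersionStore(File root) {
        this.root = root;
    }

    /** Directory that holds the .so files matching exactly this host version. */
    File installDir(String hostVersion) {
        return new File(root, hostVersion);
    }

    /** A plug-in counts as installed only if the file exists under this version's directory. */
    boolean isInstalled(String hostVersion, String libName) {
        return new File(installDir(hostVersion), "lib" + libName + ".so").isFile();
    }

    /** Lists version directories that are not in the keep list, e.g. for cleanup after an upgrade. */
    List<File> staleVersionDirs(List<String> versionsToKeep) {
        List<File> stale = new ArrayList<>();
        File[] children = root.listFiles(File::isDirectory);
        if (children == null) {
            return stale; // the root does not exist yet
        }
        for (File dir : children) {
            if (!versionsToKeep.contains(dir.getName())) {
                stale.add(dir);
            }
        }
        return stale;
    }
}
```

Because the lookup is scoped to the current host version, files left over from an older version are never picked up by accident; keeping one or two previous versions in the keep list leaves room for fallback.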

Besides solving the API compatibility problem, version control also makes a "true revocation" strategy possible. Imagine we have released a host APK together with the matching .so plug-in package, and this version of the .so contains a bug that can crash the app. Through the version control flow we can disable this version of the plug-in on the server side, so that the client enters its ".so plug-in unavailable" logic instead of executing the faulty code. (If the plug-in supports dynamic upgrades and downgrades, the client can be configured to force an update to a fixed plug-in version, or to fall back to an older version on disk that does not have the problem.)

In terms of framework design, version control for dynamic .so delivery covers two stages: Update and Install.

3. ABI compatibility checks

ABI compatibility is a problem unique to dynamic .so plug-ins. Besides checking whether the plug-in is safe, we also need to check whether the ABI of the .so libraries in the plug-in package matches the ABI the host is currently running under. Consider this situation: the host APK bundles both ARM32 and ARM64 .so files, and the plug-in package bundles both as well. When the host APK is installed on an ARM32 device, the framework must unpack and load only the ARM32 .so files from the plug-in, and the same goes for ARM64 devices. In other words, with the same host APK and the same .so plug-in, the dynamic framework has to behave differently on devices with different ABIs.

This can also be seen as a branch of the version control problem above. For the sake of completeness, the framework itself should handle ABI compatibility automatically, rather than relying on the way the .so plug-in is packaged to sidestep the problem.
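A minimal sketch of the ABI matching step might look like this. AbiSelector is a hypothetical helper; how the host's ABI list is obtained is left to the caller (on Android, Build.SUPPORTED_ABIS is one possible source, but it would still need to be narrowed to the ABI the host process actually runs as).

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper that picks which ABI folder of a plug-in package to unpack. */
final class AbiSelector {

    /**
     * @param hostAbis   ABIs the host process can run, in priority order (assumed to be
     *                   supplied by the caller, already narrowed to the running process)
     * @param pluginAbis ABI folders present in the plug-in package
     * @return the first host ABI that the plug-in also provides, or empty if incompatible
     */
    static Optional<String> select(List<String> hostAbis, List<String> pluginAbis) {
        for (String abi : hostAbis) {
            if (pluginAbis.contains(abi)) {
                return Optional.of(abi);
            }
        }
        return Optional.empty();
    }
}
```

An empty result should be treated like a failed security check: the plug-in is not installed and the caller follows its "plug-in unavailable" path.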

4. Code intrusion caused by System#load

Code intrusion is another problem specific to .so plug-ins, and it comes from the way the Android Framework loads .so libraries. The Framework generally does not let users load a dynamic library directly through dlopen; instead it wraps two ways of loading a .so library (the second way ultimately resolves libName to a concrete .so file path and then loads the library by that path, so it ends up doing the same thing as the first):

public final class System {
    // Way 1: load the .so library by file path
    public static void load(String filename) {
        Runtime.getRuntime().load0(VMStack.getStackClass1(), filename);
    }

    // Way 2: load the .so library by library name
    public static void loadLibrary(String libname) {
        Runtime.getRuntime().loadLibrary0(VMStack.getCallingClassLoader(), libname);
    }
}

Normally we use way 2, System.loadLibrary("xxx"), to load libxxx.so. Once the .so files are delivered dynamically, however, they have to be installed into a secure internal path and loaded with way 1, System.load("{secure path}/libxxx.so"). Most projects that deliver .so files dynamically take this approach, and it has run stably, but we found that it brings a lot of trouble.

Using way 1 for dynamic delivery means hard-coding System.load("{secure path}/libxxx.so"). The first consequence is pain during development: Native code is debugged with the conventional built-in setup in the development phase and then packaged for dynamic delivery in the integration phase, so we have to switch the code back and forth between way 1 and way 2 all the time, which is a serious code intrusion problem. That is still not the most troublesome part. When a third-party SDK is made dynamic and the SDK loads its own .so libraries with way 2 (the normal development mode; SDKs that ship their own .so download logic probably use way 1, in which case there is little to worry about), you may need a roundabout approach such as using ASM to rewrite the SDK's loading code into way 1. Alternatively, as soon as the .so plug-in is ready, load all of its .so files into the host with way 1, which covers any later way-2 loading code (once the target .so library has been loaded, loading it again through way 2 becomes a no-op).

To solve the System#load code intrusion problem, we can borrow an idea from Android hot-fix solutions. With way 2, that is, when a library is loaded through System.loadLibrary("xxx"), the Android Framework walks the nativeLibraryDirectories array of the ClassLoader instance in the current context and looks for a file named libxxx.so under each directory in the array. So our idea is: after the .so plug-in is installed, inject its secure internal path into nativeLibraryDirectories, and the library can then be loaded through way 2. (The idea is clear and simple, but in practice there are still many problems; the specific solutions are described in detail later.)

The following sections give our analysis of, and answers to, the specific technical details of dynamic .so solutions.

Specific solutions

1. How the system loads .so libraries

What does the Android Framework do after we call System.loadLibrary("xxx")?

Simply put, Android's .so loading mechanism can be divided into the following four stages:

  1. PMS install: when the APK is installed, PackageManagerService copies the matching .so files out of the APK according to the ABI information of the current device.

  2. Native classpath: when the app starts, the Android Framework creates a ClassLoader instance for the application and injects all .so directories that belong to the application into the relevant fields of that ClassLoader.

  3. so loading: when System.loadLibrary("xxx") is called, the Android Framework looks for libxxx.so in the directory array of the ClassLoader instance of the current context (or of a ClassLoader specified by the caller) and loads it.

  4. jni calling: the JNI methods backed by the .so are called.

The overall flow is shown in the diagram below:

v2-b91cd6143ed0097b73c9e3ac54594692_720w.jpg

We will not go deep into the exact flow and method call chain here. Based on this flow and on the load-code intrusion problem mentioned above, dynamic .so techniques can be divided into two schemes, "JNI code isolation" and "JNI code built-in", depending on which ClassLoader instance holds the System.loadLibrary("xxx") loading code and the classes that declare the JNI methods (hereinafter "JNI code").

2. JNI code isolation scheme

v2-4167af6307752228aae3f93f040854ca_720w.jpg

As the name suggests, the code that involves JNI is split out into a separate module and packaged into the plug-in package together with the .so files. When the .so library is loaded dynamically at runtime, a plug-in ClassLoader is created for the .so plug-in, and both "so loading" and "jni calling" happen inside that plug-in ClassLoader. The advantage of the isolation scheme is compile-time isolation of the plug-in module: code in other modules cannot directly reference the JNI methods inside the plug-in, so it is hard to interfere with the lifecycle of JNI calls, and later maintenance is cheap (this is also the goal that plug-in schemes in general aim for). The drawback is just as obvious: depending on the project's historical baggage, the cost of splitting out the module may exceed the benefit of making it dynamic. The isolation scheme is therefore better suited to new Native modules, which can be designed for dynamic delivery and lazy loading from the start.

3. JNI code built-in scheme

v2-8600d1cf879c69927205ac8f7053ce14_720w.png

Given the cost of splitting out the JNI module, an alternative is to package only the .so files into the plug-in package and leave the JNI code inside the host. The .so plug-in shares the host's ClassLoader instance, and "so loading" and "jni calling" are still performed inside the host. This "lazy" built-in scheme is much easier to retrofit than the isolation scheme, but because the code is never split and cleaned up, code pollution is very likely and later maintenance is expensive. Considering the time cost, I believe most projects can only choose the JNI code built-in scheme. After all, code pollution can be mitigated by raising the bar for code admission through Code Review, Lint static checks and so on.

It should be emphasized that, compared with the isolation scheme, the built-in scheme has one specific technical problem that must be solved: concurrent modification of the collection when .so plug-in paths are injected into nativeLibraryDirectories. nativeLibraryDirectories is implemented as an ArrayList, whose reads and writes are not thread-safe. At the end of the plug-in loading stage, a Worker thread has to inject the new .so file path into this ArrayList. If another thread happens to be traversing the collection for a "so loading" operation at that moment, a ConcurrentModificationException is thrown (by ArrayList's internal implementation).

There are two ways to address the concurrent modification problem:

  1. Lock both "so loading" and ".so path injection", using the ClassLoader instance associated with the .so as the lock.

  2. Inject the reserved .so file paths in advance, before any "so loading" operation takes place (for example during cold-start initialization).

Idea 1 is relatively simple and reasonable, but the locking has to "invade" every other System.loadLibrary("xxx") call, which again tends to pollute the code. Idea 2 always feels like a violation of general programming principles (some .so plug-ins may never be used, so injecting their paths from the outset is wasteful). Which one to choose depends on the actual project. As a supplement, idea 1 can be optimized further. To avoid polluting the code with locks, ASM can be used at compile time to add the lock to every "so loading" call automatically. Alternatively, when injecting a path into the ClassLoader, do not modify the original nativeLibraryDirectories collection in place; create a new List instance, copy all paths into it, and finally swap the whole list back into the ClassLoader. This avoids the concurrent modification exception at the cost of letting concurrent readers see stale data. We tried both ideas, and the one actually in use is idea 2; apart from the pollution problem, the main reason is the "dlopen problem" discussed below.
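The copy-and-swap variant can be illustrated with a small stand-alone model. NativeDirHolder below is a hypothetical stand-in for the ClassLoader's internal path holder; on a real device the list lives in a private framework field and would have to be replaced via reflection, which this sketch deliberately leaves out.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the object that owns nativeLibraryDirectories. */
final class NativeDirHolder {
    // volatile so that readers see either the old list or the fully built new one
    private volatile List<File> nativeLibraryDirectories;

    NativeDirHolder(List<File> initial) {
        this.nativeLibraryDirectories = Collections.unmodifiableList(new ArrayList<>(initial));
    }

    /** Readers iterate over a snapshot that is never mutated in place. */
    List<File> snapshot() {
        return nativeLibraryDirectories;
    }

    /** Copy-on-write injection: build a new list, then swap the reference. */
    synchronized void inject(File pluginDir) {
        List<File> copy = new ArrayList<>(nativeLibraryDirectories);
        if (!copy.contains(pluginDir)) {
            copy.add(0, pluginDir); // searched first; the position is a design choice
        }
        nativeLibraryDirectories = Collections.unmodifiableList(copy);
    }
}
```

A thread that is already iterating keeps its old snapshot and simply does not see the new directory until its next lookup, which is the stale-read trade-off mentioned above.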

4. Dealing with the dlopen problem

dlopen is a function that Native developers know well: it loads the specified dynamic library in the specified mode (dlclose unloads a library that has been opened). In fact, when the Android Framework loads a .so library through System.loadLibrary("xxx"), the call ultimately ends in dlopen. The rough call path is as follows:

System#loadLibrary --> System#load --> Runtime#nativeLoad
                                         Java  +
                                               |  Native
                                        dvmLoadNativeCode --> dlopen

In NDK development, suppose we have two .so files, libxxx.so and liblog.so (the latter is a base library, and the former depends on the latter's API), and xxx links dynamically against log. In the CMake configuration this looks as follows:

...
TARGET_LINK_LIBRARIES(xxx liblog.so)
...

When we then call System.loadLibrary("xxx"), libxxx.so is eventually loaded through dlopen, and liblog.so is loaded automatically through dlopen as well, based on the dependency information in libxxx.so. This second load follows the call chain above but skips the first two steps: it does not go back through System#load and runs entirely at the Native level. Developers familiar with Native code may find this commonplace, but those who have only dealt with .so files inside third-party SDKs are probably not aware of it. Yet it is precisely this behaviour that makes dynamic .so delivery so difficult, and it cost us dearly in practice.

Judging from project experience, the techniques for loading .so files dynamically, whether in plug-in frameworks or in hot-fix frameworks, should be fairly mature by now, with most of the pits already stepped on; even if some pits remain, they should not seriously affect the feasibility of a project. So at first we focused our risk assessment on the cost of splitting module code and did not worry about technical risk. Indeed, before Android N, as long as the directories containing libxxx.so and liblog.so were injected into nativeLibraryDirectories of the current ClassLoader, both files could be found when the plug-in was loaded. From Android N on the situation is different: libxxx.so loads normally, but loading liblog.so fails. The exception is as follows:

E/ExceptionHandler: Uncaught Exception java.lang.UnsatisfiedLinkError: dlopen failed: library "liblog.so" not found
at java.lang.Runtime.loadLibrary0(Runtime.java:xxx)
at java.lang.System.loadLibrary(System.java:xxx)
...

The main reason is that the dlopen implementation in Android's native linker (Linker.cpp), which actually links .so libraries, changed considerably, chiefly through the introduction of the Namespace mechanism. In the old implementation, the linker searched every path in the ClassLoader instance's nativeLibraryDirectories for the required .so file. After the change, the linker's search paths are bound through the Namespace mechanism at the moment the ClassLoader instance is created. When we inject a new path later, the ClassLoader's path list grows, but the path set already bound in the linker's Namespace is not updated with it. The result is that libxxx.so can be found (through the ClassLoader's paths) while liblog.so cannot (the linker resolves it through the Namespace).

As for how the Namespace mechanism works, it can be thought of simply as a Map keyed by the ClassLoader instance's hashCode: the Native layer uses the ClassLoader instance to look up the Value stored in the Map, which is the set of .so file paths.

I used to wonder why Tinker never exposed the dlopen problem. The main reason is that Tinker is a hot-fix framework: the liblog.so that a patch needs is usually already built into the host, so the worst outcome is that part of the hot fix does not take effect, not that liblog.so cannot be found. As it happens, to solve the hot-fix failures caused by mixed compilation on Android N, Tinker creates a new AndroidNClassLoader instance to replace the app's own ClassLoader when it injects the plug-in's .so paths, and this replacement happens to cover the dlopen problem too. As for why other plug-in frameworks do not mention it, that is probably because the modules suited to plug-in conversion tend to be lightweight and usually contain no Native code (and even when they do, the .so files often have no dependencies).

There are mainly three ideas for solving the dlopen problem:

Custom System#load: before loading libxxx.so, parse its dependency information and recursively load the .so files it depends on first (recommended reference: the open-source project SoLoader [7]).

Custom Linker: take full control of the .so file lookup logic yourself (recommended reference: the open-source project ReLinker [8]).

Like Tinker, replace the ClassLoader instance at the right time (this is the approach we currently use).
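The first idea, loading dependencies before the library that needs them, can be sketched as a simple ordering step. SoLoadPlanner is a hypothetical class for illustration and is not how SoLoader itself is implemented; the dependency map is assumed to be given (in practice it would be read from the DT_NEEDED entries of each ELF file, which is out of scope here).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical planner: orders libraries so every dependency is loaded before its dependents. */
final class SoLoadPlanner {

    /**
     * @param lib          library file name to load, e.g. "libxxx.so"
     * @param dependencies library name -> names it depends on (assumed to be precomputed)
     * @return load order, dependencies first; names missing from the map are treated as
     *         having no plug-in dependencies
     */
    static List<String> loadOrder(String lib, Map<String, List<String>> dependencies) {
        Set<String> ordered = new LinkedHashSet<>();
        visit(lib, dependencies, ordered, new LinkedHashSet<>());
        return new ArrayList<>(ordered);
    }

    private static void visit(String lib, Map<String, List<String>> deps,
                              Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(lib) || !inProgress.add(lib)) {
            return; // already planned, or a dependency cycle: stop descending
        }
        for (String dep : deps.getOrDefault(lib, Collections.emptyList())) {
            visit(dep, deps, ordered, inProgress);
        }
        inProgress.remove(lib);
        ordered.add(lib);
    }
}
```

Each name in the returned order would then be loaded by absolute path with System.load, so that by the time the linker resolves the dependencies of libxxx.so they are already loaded.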

5. .so dependency analysis

The issues above are the specific technical problems of a dynamic .so scheme; what remains is a set of messy engineering problems (technical debt), such as the .so dependency analysis mentioned earlier. If you want to use dynamic .so delivery to slim down the APK, then besides analysing which .so files take up the most space, the best practice is to move every .so file they depend on into the plug-in package as well. How do we find out the dependency information of all the .so files in an APK? Analysing them by hand, on the command line and in code according to the .so file format, is certainly feasible, but that is work for the masters; we ordinary folk choose to stand on the shoulders of giants.

Here I recommend android-classyshark [9], an open-source APK analysis tool from Google. Besides analysing the dex and .so dependency information of an APK, it provides a GUI, which makes it ideal for getting started quickly.

v2-6a4ee7fd48c8f8ce1cd1e681d9e830e8_720w.jpg

Other problems

Pollution of JNI-related classes

JNI methods can only be called correctly after the corresponding .so library has been loaded, so many developers put code like System.loadLibrary("xxx") in a static block of the JNI class to guarantee that the library is loaded before any JNI access. This is actually far from best practice. On the one hand, loading a .so is itself a dynamic technique that can fail, and Native development on Android has produced plenty of gotchas, so it is best to allow for failure in every .so load and every JNI call. On the other hand, loading a .so file carries some performance cost, and doing it in a static block makes the performance problem worse. The most troublesome part is this: after the conversion to dynamic delivery, if later development carelessly references a JNI class (for example by accessing one of its static methods) before the .so plug-in has finished installing, the ClassLoader loads the JNI class early even though no JNI method is actually called, which triggers the System.loadLibrary("xxx") logic early and crashes the app.

For existing JNI code that loads its .so in a static block, it is best, when converting it to dynamic delivery, to move the loading code out of the static block and add onFail logic for load failures, so that no .so load or JNI method call can crash.
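A guarded load with an onFail hook could look roughly like this. SafeSoLoader and its callback signature are hypothetical names for this sketch; the only library call it relies on is the standard System.loadLibrary.

```java
import java.util.function.Consumer;

/** Hypothetical wrapper meant to replace System.loadLibrary calls in static blocks. */
final class SafeSoLoader {

    /**
     * Tries to load the library and reports failure through the callback instead of crashing.
     *
     * @return true if the library is loaded and its JNI methods may be called
     */
    static boolean tryLoad(String libName, Consumer<Throwable> onFail) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            onFail.accept(e); // e.g. log, trigger a plug-in download, or degrade the feature
            return false;
        }
    }
}
```

JNI entry points would check the return value (or a cached flag derived from it) before calling any native method, so that merely referencing the class never triggers a load.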

Ongoing code maintenance costs

This is the issue I currently find most troublesome. With the JNI code built-in scheme, the JNI code has no compile-time isolation, so during later maintenance it is very easy to call the JNI methods of a dynamic .so at the wrong point in its lifecycle, which raises the risk of crashes.

Based on past experience with dynamic projects, business modules that are relatively stable, rarely changed and cohesive within clear boundaries are the best candidates for conversion, and dynamic .so delivery should prefer this kind of module: both the conversion to the JNI code isolation scheme and the later maintenance cost much less. Modules with heavily coupled code and very active iteration are a typical case of "replacing the engine on a moving high-speed train". While the conversion is under way, the FT keeps iterating on the code in parallel, which is bound to produce many conflicts. For heavily coupled code, the input-output ratio usually leads to the JNI code built-in scheme, which has no compile-time isolation and easily leads to crashes. And after the conversion is complete, the FT's code keeps changing frequently, so the pressure of ongoing maintenance is likely to be high.

At present I think the more reliable solution is for the business teams and the quality team to look for a breakthrough in the project management process. The main direction is to let each FT take responsibility for converting its own modules, which lowers maintenance costs (although conflicts with performance targets may make this hard to push). We also need to keep improving the dynamic framework and the related guidance documents according to the project's actual needs, to reduce the FT's integration cost. As an aid, static checking cases should be added for the places where code is prone to conflict, so that problems are detected early.

Continuous integration and deployment (CI / CD)

Having stepped on the series of pits above, the technical solution looks mostly complete, but in fact we have only just begun!

First of all, building a .so plug-in package is skilled work in itself, and the right approach depends on the specific project (we use a Gradle plugin to extract the target resource files during the PackageApplication stage). This is a CI problem: we need a stable and flexible pipeline that reliably builds the plug-in package for a specified version, rather than clumsily assembling it by hand every time. Secondly, once the plug-in packages are built, they should not be uploaded to the back end by hand, with the version dependency configuration filled in manually. This is a CD (Continuous Deployment) problem: we should use automated means, even if only a script, so that the configuration information collected in the integration stage is uploaded automatically to a management platform in an internal environment (where the data of each version can be viewed), and in the test and release stages the configuration of the specified version is imported into the test or pre-release environment with one click. Manual operation should be avoided at every step.

From a project management perspective, therefore, a complete dynamic solution must cover three processes: integration, deployment and the loading framework. The first two are not mentioned by most dynamic projects or technical articles and are easily overlooked.

Play Store ban on dynamic code

For well-known reasons, an APK that contains dynamically delivered code cannot be uploaded to the Play Store. Google does not actually forbid dynamic code as such; what it forbids is delivering dynamic code that bypasses review by the Play channel. After consulting on this, we found that the APK Expansion Files service provided by Play can be used to extend resources: the client can receive the plug-in resource package through this service without policy risk. (The service is aimed mainly at games; it delivers to the client "one main resource package + one patch package" bound to the APK version, with a size limit of 1 GB. Note that the resource package must be bound before the specific APK version is released, and cannot be modified once released.)

Closing

This article draws on my experience with dynamic Android projects (SDK plug-ins, dynamic componentization) and on recent practice with dynamic .so delivery to share the issues that need to be considered when loading .so libraries dynamically in production. The main topics are the problems common to all plug-in schemes, ABI compatibility, code intrusion, concurrent modification, and the dlopen problem, which is the most important and the most easily overlooked. To sum it all up in one sentence:

Plug-ins carry risk; invest with care!


Origin blog.51cto.com/14557669/2478851