Introduction
Matrix is an APM (Application Performance Management) system developed and used by the WeChat client team. As part of Matrix, Matrix-ApkChecker is an analysis and detection tool for Android installation packages. It checks an apk against a set of predefined rules and outputs a detailed report that can be used for analysis, troubleshooting, and version tracking.
For usage details, see the Matrix Android ApkChecker documentation.
Features of Matrix-ApkChecker
Matrix-ApkChecker currently mainly includes the following functions:
- Read manifest information
Read the global information of apk from the AndroidManifest.xml file, such as packageName, versionCode, etc.
- List files included in apk sorted by file size
List files over a certain size, filter by file suffix, and sort by file size
- Count methods
Count the methods contained in the dex files; the results can be grouped by class name (class) or package name (package)
- Check for resource obfuscation (AndResGuard)
Check whether the apk has undergone resource obfuscation. It is recommended to use resource obfuscation to further reduce the size of the apk
- Search for png files without alpha channel
For png files without alpha channel, you can convert to jpg format to reduce file size
- Check if a dynamic library contains multiple ABI versions
The size of the so file may account for a large proportion of the apk file size. Consider including only one ABI version of the dynamic library in the apk
- Search for uncompressed file types
If every file of a given type is stored uncompressed, consider whether that type should be compressed
- Count the R classes included in the apk and the field count in the R class
After compilation, references to resources in the code will be optimized into int constants. Except for R.styleable, other R classes can actually be deleted.
- Search for redundant files
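Two files with identical content are redundant; one of them should be removed
- Check whether multiple dynamic libraries statically link the STL
If several dynamic libraries depend on the STL, they should link it dynamically instead of each statically linking its own copy
- Search for unused resources in the apk
Resources in the apk that are never used should be deleted
- Search for unused assets files in the apk
Assets files in the apk that are never used should be deleted
- Search for unstripped dynamic libraries in the apk
Stripping a dynamic library usually reduces its file size considerably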
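Using Matrix-ApkChecker
Using a Matrix config file
Download the Matrix source code and build the matrix-apk-canary module, which is a plain Java project. The example below drives it with a Matrix config file, which is simpler and more practical than passing every option on the command line.
Open ApkChecker.java and replace the body of the main function with: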
public static void main(String... args) {
    String[] arr = new String[2];
    arr[0] = "--config";
    // Path to the config file
    arr[1] = "/Users/codelang/Desktop/matrix/matrix/matrix-android/matrix-apk-canary/src/main/java/com/tencent/matrix/apk/config.json";
    // The original argument check is commented out so the hard-coded config is always used:
    // if (args.length > 0) {
    ApkChecker m = new ApkChecker();
    m.run(arr);
    // } else {
    //     System.out.println(INTRODUCT + HELP);
    //     System.exit(0);
    // }
}
Using ApkChecker.jar
Run it directly from the command line:
java -jar ApkChecker.jar
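1. Download ApkChecker.jar. At the time of writing, the latest version is matrix-apk-canary-2.0.5.jar.
2. Run the ApkChecker.jar file from the command line.
3. Set up a config file based on the one given in the official documentation. The config file is a .json file: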
config.json
{
  "--apk":"/Users/codelang/mesh/android-test/app/build/outputs/apk/onLine/release/onLine-release-v1.2.0.apk",
  "--mappingTxt":"/Users/codelang/mesh/android-test/app/build/outputs/mapping/onLine/release/mapping.txt",
  "--output":"/Users/codelang/Desktop/matrix/matrix/matrix-android/matrix-apk-canary/src/main/java/com/tencent/matrix/apk/",
  "--format":"mm.html,mm.json",
  "--formatConfig":
  [
    {
      "name":"-countMethod",
      "group":
      [
        {
          "name":"Android System",
          "package":"android"
        },
        {
          "name":"java system",
          "package":"java"
        },
        {
          "name":"com.tencent.test.$",
          "package":"com.tencent.test.$"
        }
      ]
    }
  ],
  "options": [
    {
      "name":"-manifest"
    },
    {
      "name":"-fileSize",
      "--min":"10",
      "--order":"desc",
      "--suffix":"png, jpg, jpeg, gif, arsc"
    },
    {
      "name":"-countMethod",
      "--group":"package"
    },
    {
      "name":"-checkResProguard"
    },
    {
      "name":"-findNonAlphaPng",
      "--min":"10"
    },
    {
      "name":"-checkMultiLibrary"
    },
    {
      "name":"-uncompressedFile",
      "--suffix":"png, jpg, jpeg, gif, arsc"
    },
    {
      "name":"-countR"
    },
    {
      "name":"-duplicatedFile"
    },
    {
      "name":"-checkMultiSTL",
      "--toolnm":"/Users/codelang/Library/Android/sdk/ndk-bundle/toolchains/aarch64-linux-android-4.9/prebuilt/darwin-x86_64/bin/aarch64-linux-android-nm"
    },
    {
      "name":"-unusedResources",
      "--rTxt":"/Users/codelang/mesh/android-test/app/build/intermediates/symbols/pre/release/R.txt",
      "--ignoreResources":[
        "R.raw.*",
        "R.style.*",
        "R.attr.*",
        "R.id.*",
        "R.string.ignore_*"
      ]
    },
    {
      "name":"-unusedAssets",
      "--ignoreAssets":["*.so"]
    },
    {
      "name":"-unstrippedSo",
      "--toolnm":"/Users/codelang/Library/Android/sdk/ndk-bundle/toolchains/aarch64-linux-android-4.9/prebuilt/darwin-x86_64/bin/aarch64-linux-android-nm"
    }
  ]
}
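Several entries in the config file need to be replaced with your own values:
- --apk: path of the apk file to analyze
- --mappingTxt: path of the mapping.txt file
- --output: output directory for the analysis results
- name and package under --formatConfig: replace with your own package name; the report then counts the methods of the classes under that package
- --toolnm: replace with the corresponding nm tool from your own NDK
- --rTxt: path of the R.txt file generated when the apk was built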
Once both files (the jar and the config file) are ready, run the following command in a terminal:
java -jar /(path to ApkChecker.jar)/matrix-apk-canary-2.0.5.jar --config /(path to apk_config.json)/apk_config.json
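For example: java -jar /Users/xianicai/Desktop/apm/matrix-apk-canary-2.0.5.jar --config /Users/xianicai/Desktop/apm/apk_config.json
Running ApkChecker.java generates a .json file and an .html file in the configured output directory.
The .json file is somewhat cumbersome to read, so open the .html file to view the analysis results.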
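Implementation
Let's first look at the overall workflow of Matrix-ApkChecker.
What each task does:
ApkTask
- ApkTask is the base class for executing each concrete apk check
- ApkTask implements the Callable interface, so it can run on a thread pool and return a TaskResult
- Each TaskResult is finally written to a file through JobResult
- ApkTask instances are created centrally by TaskFactory
- Class diagram of ApkTask and related classes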
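The Callable-plus-thread-pool pattern described above can be sketched as follows. This is only an illustration of the pattern, not the Matrix source: SketchTask, SketchResult, constant and runAll are hypothetical names standing in for ApkTask, TaskResult and the job runner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TaskPoolSketch {

    // Hypothetical stand-in for TaskResult: a task name plus a short summary.
    static final class SketchResult {
        final String taskName;
        final String summary;

        SketchResult(String taskName, String summary) {
            this.taskName = taskName;
            this.summary = summary;
        }
    }

    // Hypothetical stand-in for ApkTask: every check is a Callable that yields a result.
    static abstract class SketchTask implements Callable<SketchResult> {
        final String name;

        SketchTask(String name) {
            this.name = name;
        }
    }

    // Builds a trivial task that immediately returns a fixed summary.
    static SketchTask constant(String name, final String summary) {
        return new SketchTask(name) {
            @Override
            public SketchResult call() {
                return new SketchResult(this.name, summary);
            }
        };
    }

    // Runs all tasks on a fixed-size pool and collects the results in submission order.
    static List<SketchResult> runAll(List<SketchTask> tasks, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<SketchResult>> futures = pool.invokeAll(tasks);
            List<SketchResult> results = new ArrayList<>();
            for (Future<SketchResult> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<SketchTask> tasks = new ArrayList<>();
        tasks.add(constant("manifest", "parsed"));
        tasks.add(constant("fileSize", "listed"));
        for (SketchResult result : runAll(tasks, 2)) {
            System.out.println(result.taskName + ": " + result.summary);
        }
    }
}
```

Because invokeAll returns its futures in the same order as the submitted tasks, the report can be written in a stable order even though the checks run concurrently.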
UnzipTask
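- Unzips the apk and reads the class obfuscation mapping file and the resource obfuscation mapping file
ManifestAnalyzeTask
- Parses AndroidManifest.xml, mainly using the AXmlResourceParser class from apktool.jar
- Saves the result as key-value pairs in a TaskJsonResult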
CountClassTask
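- Analyzes the classes in all dex files
- Parses each dex file and, keyed by the dex file name, walks the hierarchy of packages and the classes in each package
- Uses the code obfuscation mapping file to recover the real class names
CountRTask
- Counts the R classes and the resource fields inside them
- Parses each dex file, finds the R classes (after resource obfuscation the name may instead end in .R), and counts the fields in each R class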
DuplicateFileTask
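- Detects duplicate files
- Walks the unzipped directory and computes the MD5 of each file's content, using the MD5 as the key and the set of file names as the value
- (This could probably be optimized: there is no need to hash every file. Comparing a few bytes at the head and tail first would filter out some of the files, and MD5 would then only be computed for the rest)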
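A minimal sketch of the MD5-keyed grouping described above. The class and method names are hypothetical and this is not the Matrix implementation. For the pre-filter it uses the file size rather than the head and tail bytes suggested in the note, since files of different sizes cannot be identical and the size is available without reading any content.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DuplicateFileSketch {

    // Streams the file through an MD5 digest and returns the hash as lowercase hex.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) != -1) {
                // Reading drives the digest; nothing else to do here.
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns MD5 -> files for every group of two or more files with identical content.
    static Map<String, List<Path>> findDuplicates(Path root)
            throws IOException, NoSuchAlgorithmException {
        // Pre-filter: only files that share a size with another file can be duplicates.
        Map<Long, List<Path>> bySize = new HashMap<>();
        try (Stream<Path> walk = Files.walk(root)) {
            List<Path> files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
            for (Path file : files) {
                bySize.computeIfAbsent(Files.size(file), k -> new ArrayList<>()).add(file);
            }
        }
        Map<String, List<Path>> byHash = new TreeMap<>();
        for (List<Path> sameSize : bySize.values()) {
            if (sameSize.size() < 2) {
                continue; // unique size, so no need to hash
            }
            for (Path file : sameSize) {
                byHash.computeIfAbsent(md5Hex(file), k -> new ArrayList<>()).add(file);
            }
        }
        // Drop hashes that belong to a single file.
        byHash.values().removeIf(group -> group.size() < 2);
        return byHash;
    }
}
```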
FindNonAlphaPngTask
- Finds png files without transparency
- Mainly scans .png and .9.png files
- Detection uses bufferedImage.getColorModel().hasAlpha()
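The hasAlpha() check mentioned above can be tried in isolation with the standard javax.imageio API. The sketch below is a hypothetical helper, not the task's actual code; it decodes one file and reports whether its color model lacks an alpha channel.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class NonAlphaPngSketch {

    // Returns true when the file decodes as an image whose color model has no alpha channel.
    static boolean isNonAlphaPng(File png) throws IOException {
        BufferedImage image = ImageIO.read(png);
        if (image == null) {
            return false; // ImageIO found no reader for this file
        }
        return !image.getColorModel().hasAlpha();
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + " -> non-alpha: " + isNonAlphaPng(new File(path)));
        }
    }
}
```

Files for which this returns true are the candidates for conversion to jpg.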
MethodCountTask
- Counts the methods of outer classes
- Counts the methods of inner classes
MultiLibCheckTask
- Detects so libraries built for multiple architectures
- Mainly checks whether the lib folder contains more than one directory
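A rough sketch of the directory check above. The task itself works on the unzipped output; as an assumption made for brevity, this version reads the entry names straight from the apk (which is a zip file) and collects the distinct directory names directly under lib/. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class MultiLibSketch {

    // Collects the ABI directory names (e.g. arm64-v8a) that contain at least one .so file.
    static Set<String> abiDirs(String apkPath) throws IOException {
        Set<String> abis = new TreeSet<>();
        try (ZipFile zip = new ZipFile(apkPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                String[] parts = name.split("/");
                // Expected layout: lib/<abi>/<library>.so
                if (parts.length >= 3 && parts[0].equals("lib") && name.endsWith(".so")) {
                    abis.add(parts[1]);
                }
            }
        }
        return abis;
    }

    // True when the apk ships native libraries for more than one ABI.
    static boolean hasMultipleAbis(String apkPath) throws IOException {
        return abiDirs(apkPath).size() > 1;
    }
}
```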
MultiSTLCheckTask
- Detects whether more than one so library statically links the STL
ResProguardCheckTask
- Detects whether resources are obfuscated
- Detects whether a resource obfuscation directory is configured
- Detects whether resource file names are obfuscated; if they are not, the resource obfuscation had no effect
ShowFileSizeTask
- Reports the size of each file entry in the unzipped directory
UnCompressedFileTask
- Collects file size statistics grouped by file suffix
- The --suffix parameter selects which suffixes are included
UnstrippedSoCheckTask
- Detects whether the symbol table of each so file has been stripped
- Runs the nm command on the file and checks whether a symbol table is present
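The nm-based check can be sketched as below. This is a sketch under stated assumptions, not the task's real code: it assumes the nm binary configured through --toolnm reports "no symbols" for a library without a symbol table (the exact wording can differ between nm implementations), and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class UnstrippedSoSketch {

    // Runs nm on the library and returns everything it printed (stdout and stderr merged).
    static List<String> runNm(String nmPath, String soPath)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(nmPath, soPath)
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        return lines;
    }

    // Assumption: a stripped library makes nm print a line ending in "no symbols".
    static boolean looksStripped(List<String> nmOutput) {
        for (String line : nmOutput) {
            if (line.trim().toLowerCase().endsWith("no symbols")) {
                return true;
            }
        }
        return false;
    }
}
```

A library would then be reported as unstripped when looksStripped(runNm(nmPath, soPath)) returns false.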
UnusedAssetsTask
- Finds unused asset files
- First collects the absolute paths of all asset files
- Reads the smali code, collects the strings declared with const-string, and marks as referenced every asset path that ends with such a string
- Removing the referenced paths from the full set of asset paths leaves the unreferenced assets
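The const-string matching described above might look roughly like this. The regular expression and the names are assumptions made for illustration, not taken from the Matrix source. The sketch also matches on a path boundary (the whole path, or a suffix that starts after a "/") so that a very short string does not accidentally mark unrelated assets as used.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnusedAssetsSketch {

    // Matches lines such as: const-string v0, "fonts/a.ttf"
    private static final Pattern CONST_STRING =
            Pattern.compile("const-string(?:/jumbo)?\\s+[vp]\\d+,\\s*\"([^\"]*)\"");

    // Extracts every string literal declared with const-string in one smali source.
    static Set<String> stringLiterals(String smali) {
        Set<String> literals = new TreeSet<>();
        Matcher matcher = CONST_STRING.matcher(smali);
        while (matcher.find()) {
            literals.add(matcher.group(1));
        }
        return literals;
    }

    // Returns the asset paths that no const-string literal refers to.
    static Set<String> unusedAssets(Set<String> assetPaths, Collection<String> smaliSources) {
        Set<String> literals = new TreeSet<>();
        for (String smali : smaliSources) {
            literals.addAll(stringLiterals(smali));
        }
        Set<String> unused = new TreeSet<>();
        for (String asset : assetPaths) {
            boolean referenced = false;
            for (String literal : literals) {
                if (literal.isEmpty()) {
                    continue;
                }
                // Referenced when the literal is the whole path or a trailing path segment.
                if (asset.equals(literal) || asset.endsWith("/" + literal)) {
                    referenced = true;
                    break;
                }
            }
            if (!referenced) {
                unused.add(asset);
            }
        }
        return unused;
    }
}
```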
UnusedResourceTask
- Finds unused resource files
- The approach is the same as in UnusedAssetsTask
- The difference is that, when scanning smali for references, it looks not only at const-string declarations but also at sget, sput, and array-data instructions to find resource ids
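As a partial illustration of the reference scan, the sketch below handles only sget and sput instructions that access an int field of an R class. The const-string and array-data cases are left out, and the regular expression and names are assumptions rather than the Matrix implementation. Resource names use the same R.type.name form as the --ignoreResources entries in the config file.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnusedResourceSketch {

    // Matches lines such as: sget v0, Lcom/example/R$drawable;->icon:I
    private static final Pattern R_FIELD_ACCESS =
            Pattern.compile("s(?:get|put)\\s+[vp]\\d+,\\s*L[\\w/$]*R\\$(\\w+);->(\\w+):I");

    // Returns the resources referenced through sget/sput, formatted as R.type.name.
    static Set<String> referenced(String smali) {
        Set<String> refs = new TreeSet<>();
        Matcher matcher = R_FIELD_ACCESS.matcher(smali);
        while (matcher.find()) {
            refs.add("R." + matcher.group(1) + "." + matcher.group(2));
        }
        return refs;
    }

    // Declared resources minus every resource referenced in the given smali sources.
    static Set<String> unused(Set<String> declared, Collection<String> smaliSources) {
        Set<String> result = new TreeSet<>(declared);
        for (String smali : smaliSources) {
            result.removeAll(referenced(smali));
        }
        return result;
    }
}
```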