Implementation principles of LeakCanary 2.0, the Android memory leak detection tool (Kotlin version)

This article explains how LeakCanary 2.0, the open source Android memory leak monitoring tool, works. It also covers the implementation of the new hprof file parsing module introduced in this version, including the hprof file format and selected source code.

I. Overview

LeakCanary is a widely used memory leak detection tool. After a series of changes and upgrades it reached version 2.0. The basic principle of leak monitoring in 2.0 is much the same as in earlier versions. The more important changes are that 2.0 ships its own hprof file parser and no longer depends on HAHA, and that the whole tool has been rewritten from Java to Kotlin. Drawing on the source code, this article gives a brief analysis of how leak monitoring works in 2.0 and how the hprof file parser is implemented.

LeakCanary official link: https://square.github.io/leakcanary/

1.1 Differences between old and new

1.1.1 Integration

New version: only a Gradle dependency is required.

dependencies {
  // debugImplementation because LeakCanary should only run in debug builds.
  debugImplementation 'com.squareup.leakcanary:leakcanary-android:2.5'
}

Old version: 1) add the Gradle dependency; 2) call LeakCanary.install(this) in the Application class.

Key points:

1) LeakCanary 2.0 initializes itself automatically when the app process starts;

2) Initialization source code:

internal sealed class AppWatcherInstaller : ContentProvider() {
 
  /**
   * [MainProcess] automatically sets up the LeakCanary code that runs in the main app process.
   */
  internal class MainProcess : AppWatcherInstaller()
 
  /**
   * When using the `leakcanary-android-process` artifact instead of `leakcanary-android`,
   * [LeakCanaryProcess] automatically sets up the LeakCanary code
   */
  internal class LeakCanaryProcess : AppWatcherInstaller()
 
  override fun onCreate(): Boolean {
    val application = context!!.applicationContext as Application
    AppWatcher.manualInstall(application)
    return true
  }
  //....
}

3) Principle: a ContentProvider's onCreate runs before the Application's onCreate, so AppWatcherInstaller's onCreate is called automatically when the app process starts. LeakCanary relies on this Android mechanism to initialize itself without any code in the app;

4) Caveat: a ContentProvider's onCreate is called on the main thread, so it must not perform time-consuming work, otherwise it slows down app startup.

1.1.2 Overall function

LeakCanary 2.0 open sourced its own module for parsing hprof files and searching for leaked reference chains, named shark. The following sections focus on how this part is implemented.

1.2 Overall architecture

Compared with earlier versions, LeakCanary 2.0 mainly adds the shark module.

II. Source code analysis

LeakCanary's automatic detection steps:

  1. Detect objects that may have leaked;

  2. Dump the heap to generate an hprof file;

  3. Analyze the hprof file;

  4. Classify the leaks.

2.1 Detection realization

The following four kinds of objects are detected automatically:

  • Destroyed Activity instances

  • Destroyed Fragment instances

  • Destroyed View instances

  • Cleared ViewModel instances

In addition, LeakCanary detects any object explicitly passed to AppWatcher:

AppWatcher.objectWatcher.watch(myDetachedView, "View was detached")

2.1.1 LeakCanary initialization

AppWatcher.config: contains the switches that control whether Activity, Fragment, and other instances are watched;

Activity lifecycle monitoring: register Application.ActivityLifecycleCallbacks;

Fragment lifecycle monitoring: likewise, register FragmentManager.FragmentLifecycleCallbacks. Fragments are more complicated because there are three Fragment classes (android.app.Fragment, androidx.fragment.app.Fragment, and android.support.v4.app.Fragment), so the FragmentManager.FragmentLifecycleCallbacks from each corresponding package must be registered;

ViewModel monitoring: since ViewModel is also an androidx feature, it relies on the monitoring of androidx.fragment.app.Fragment;

Application visibility monitoring: a heap dump is triggered when the app becomes invisible, to check whether any surviving watched objects have leaked. When an Activity triggers onActivityStarted the app is visible, and when an Activity triggers onActivityStopped the app is invisible, so visibility monitoring is also implemented by registering Application.ActivityLifecycleCallbacks.

// InternalAppWatcher initialization
fun install(application: Application) {

  ......

  val configProvider = { AppWatcher.config }
  ActivityDestroyWatcher.install(application, objectWatcher, configProvider)
  FragmentDestroyWatcher.install(application, objectWatcher, configProvider)
  onAppWatcherInstalled(application)
}

// InternalLeakCanary initialization
override fun invoke(application: Application) {
  _application = application
  checkRunningInDebuggableBuild()

  AppWatcher.objectWatcher.addOnObjectRetainedListener(this)

  val heapDumper = AndroidHeapDumper(application, createLeakDirectoryProvider(application))

  val gcTrigger = GcTrigger.Default

  val configProvider = { LeakCanary.config }

  // Run time-consuming work on a background thread
  val handlerThread = HandlerThread(LEAK_CANARY_THREAD_NAME)
  handlerThread.start()
  val backgroundHandler = Handler(handlerThread.looper)

  heapDumpTrigger = HeapDumpTrigger(
      application, backgroundHandler, AppWatcher.objectWatcher, gcTrigger, heapDumper,
      configProvider
  )

  // Listen for application visibility changes
  application.registerVisibilityListener { applicationVisible ->
    this.applicationVisible = applicationVisible
    heapDumpTrigger.onApplicationVisibilityChanged(applicationVisible)
  }
  registerResumedActivityListener(application)
  addDynamicShortcut(application)

  disableDumpHeapInTests()
}
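
The visibility rule described above (visible once an Activity is started, invisible once Activities are stopped) can be sketched without any Android dependency. The class below is a hypothetical helper, not LeakCanary's own code. It assumes visibility is derived by counting started Activities, and that its two methods are called from the matching Application.ActivityLifecycleCallbacks methods.

```kotlin
// Hypothetical helper (not part of LeakCanary): counts started activities and
// reports a change only when the app crosses the visible/invisible boundary.
class VisibilityTracker(private val onVisibilityChanged: (Boolean) -> Unit) {
  private var startedActivityCount = 0

  // Intended to be called from ActivityLifecycleCallbacks.onActivityStarted.
  fun onActivityStarted() {
    startedActivityCount++
    if (startedActivityCount == 1) onVisibilityChanged(true)
  }

  // Intended to be called from ActivityLifecycleCallbacks.onActivityStopped.
  fun onActivityStopped() {
    if (startedActivityCount == 0) return // unbalanced call, ignore it
    startedActivityCount--
    if (startedActivityCount == 0) onVisibilityChanged(false)
  }
}
```

With two Activities started and then stopped one after the other, the callback fires once with true and once with false, which is the point at which a heap dump check would be scheduled.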

2.1.2 How to detect leaks

1) ObjectWatcher, the object that watches other objects

The key code of ObjectWatcher:

@Synchronized fun watch(
    watchedObject: Any,
    description: String
  ) {
    if (!isEnabled()) {
      return
    }
    removeWeaklyReachableObjects()
    val key = UUID.randomUUID()
        .toString()
    val watchUptimeMillis = clock.uptimeMillis()
    val reference =
      KeyedWeakReference(watchedObject, key, description, watchUptimeMillis, queue)
    SharkLog.d {
      "Watching " +
          (if (watchedObject is Class<*>) watchedObject.toString() else "instance of ${watchedObject.javaClass.name}") +
          (if (description.isNotEmpty()) " ($description)" else "") +
          " with key $key"
    }

    watchedObjects[key] = reference
    checkRetainedExecutor.execute {
      moveToRetained(key)
    }
  }

Key class KeyedWeakReference: it combines a WeakReference with a ReferenceQueue; see the constructor of its parent class, WeakReference.

With this combination, when the object associated with the weak reference is collected, the weak reference itself is added to the queue. This mechanism makes it possible to determine later whether the object has been collected.
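
The WeakReference plus ReferenceQueue mechanism can be tried on a plain JVM. The sketch below is a simplified stand-in for illustration only: KeyedRef and SimpleWatcher are hypothetical names, and the real ObjectWatcher also tracks descriptions, timestamps, and listeners.

```kotlin
import java.lang.ref.ReferenceQueue
import java.lang.ref.WeakReference
import java.util.UUID

// Simplified stand-in for KeyedWeakReference: a weak reference that remembers its key.
class KeyedRef(referent: Any, val key: String, queue: ReferenceQueue<Any>) :
  WeakReference<Any>(referent, queue)

// Simplified stand-in for ObjectWatcher (hypothetical, for illustration only).
class SimpleWatcher {
  private val queue = ReferenceQueue<Any>()
  private val watched = mutableMapOf<String, KeyedRef>()

  // Start watching an object and return the key it is tracked under.
  fun watch(target: Any): String {
    val key = UUID.randomUUID().toString()
    watched[key] = KeyedRef(target, key, queue)
    return key
  }

  // Drain the queue: every reference found there had its referent collected.
  fun removeWeaklyReachableObjects() {
    var ref = queue.poll() as KeyedRef?
    while (ref != null) {
      watched.remove(ref.key)
      ref = queue.poll() as KeyedRef?
    }
  }

  // Keys whose objects have not been reported as collected yet.
  val retainedKeys: Set<String>
    get() {
      removeWeaklyReachableObjects()
      return watched.keys.toSet()
    }
}
```

An object that is still strongly referenced stays in retainedKeys no matter how many times the GC runs. An object with no remaining strong references normally disappears from the set after a GC, although the JVM gives no guarantee about when that happens.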

2) Detect remaining objects

private fun checkRetainedObjects(reason: String) {
    val config = configProvider()
    // A tick will be rescheduled when this is turned back on.
    if (!config.dumpHeap) {
      SharkLog.d { "Ignoring check for retained objects scheduled because $reason: LeakCanary.Config.dumpHeap is false" }
      return
    }

    // First removal of unreachable objects
    var retainedReferenceCount = objectWatcher.retainedObjectCount

    if (retainedReferenceCount > 0) {
      // Actively trigger a GC
      gcTrigger.runGc()
      // Second removal of unreachable objects
      retainedReferenceCount = objectWatcher.retainedObjectCount
    }

    // Check whether any watched objects are still alive and whether their number exceeds the threshold
    if (checkRetainedCount(retainedReferenceCount, config.retainedVisibleThreshold)) return

    ....

    SharkLog.d { "Check for retained objects found $retainedReferenceCount objects, dumping the heap" }
    dismissRetainedCountNotification()
    dumpHeap(retainedReferenceCount, retry = true)
  }

The main steps of detection:

  • First removal of unreachable objects: remove the KeyedWeakReference entries that have appeared in the ReferenceQueue (that is, those whose watched instance has been collected);

  • Actively trigger a GC to reclaim unreachable objects;

  • Second removal of unreachable objects: an object that is held only by a WeakReference can be collected by the GC, so after the GC the KeyedWeakReference entries recorded in the ReferenceQueue are removed again;

  • Check whether any watched objects are still alive, and whether the number of survivors exceeds the threshold;

  • If so, dump the hprof file. The actual call is Android's native Debug.dumpHprofData(heapDumpFile.absolutePath);

  • Start the asynchronous HeapAnalyzerService to analyze the hprof file and find the reference chain from a GC root to the leaked object. This is the main topic of the rest of the article.

// HeapDumpTrigger
private fun dumpHeap(
    retainedReferenceCount: Int,
    retry: Boolean
  ) {

   ....

    HeapAnalyzerService.runAnalysis(application, heapDumpFile)
  }
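
The detection steps listed above reduce to a small decision procedure. The following is a minimal sketch under stated assumptions: CheckResult and checkRetained are hypothetical names, the two lambdas stand in for ObjectWatcher and GcTrigger, and the threshold is applied unconditionally, whereas the real checkRetainedCount involves more conditions than shown here.

```kotlin
// Hypothetical outcome type for this sketch.
enum class CheckResult { NOTHING_RETAINED, BELOW_THRESHOLD, DUMP_HEAP }

// countRetained: drains the reference queue and returns how many watched objects remain.
// runGc: asks the runtime to collect garbage.
fun checkRetained(
  countRetained: () -> Int,
  runGc: () -> Unit,
  threshold: Int
): CheckResult {
  var retained = countRetained() // first removal of collected objects
  if (retained > 0) {
    runGc()                      // give weakly reachable objects a chance to be collected
    retained = countRetained()   // second removal
  }
  return when {
    retained == 0 -> CheckResult.NOTHING_RETAINED
    retained < threshold -> CheckResult.BELOW_THRESHOLD
    else -> CheckResult.DUMP_HEAP
  }
}
```

Running the GC between the two counts matters: objects that were only waiting for a collection are filtered out, so the heap is dumped only for objects that survive a GC.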

2.2 Hprof file analysis

Analysis entry:

// HeapAnalyzerService
private fun analyzeHeap(
    heapDumpFile: File,
    config: Config
  ): HeapAnalysis {
    val heapAnalyzer = HeapAnalyzer(this)

    val proguardMappingReader = try {
      // Parse the ProGuard mapping file
      ProguardMappingReader(assets.open(PROGUARD_MAPPING_FILE_NAME))
    } catch (e: IOException) {
      null
    }
    // Analyze the hprof file
    return heapAnalyzer.analyze(
        heapDumpFile = heapDumpFile,
        leakingObjectFinder = config.leakingObjectFinder,
        referenceMatchers = config.referenceMatchers,
        computeRetainedHeapSize = config.computeRetainedHeapSize,
        objectInspectors = config.objectInspectors,
        metadataExtractor = config.metadataExtractor,
        proguardMapping = proguardMappingReader?.readProguardMapping()
    )
  }

Understanding the details of hprof parsing requires the hprof binary file format, which is documented here:

http://hg.openjdk.java.net/jdk6/jdk6/jdk/raw-file/tip/src/share/demo/jvmti/hprof/manual.html#mozTocId848088

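According to the format document, an hprof file consists of a header (a null-terminated format string, a 4-byte identifier size, and an 8-byte timestamp) followed by a sequence of records, each made of a 1-byte tag, a 4-byte time offset, a 4-byte body length, and the body itself.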
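
As a rough illustration of that layout, the sketch below reads the file header and one record header as laid out in the format document linked above: a null-terminated version string, a 4-byte identifier size, an 8-byte timestamp, then records that each start with a 1-byte tag, a 4-byte time offset, and a 4-byte body length. HprofHeader, RecordHeader, and the two functions are hypothetical names for this sketch and are not shark's API.

```kotlin
import java.io.DataInputStream

// Hypothetical holder types for this sketch.
data class HprofHeader(val version: String, val identifierByteSize: Int, val timestampMillis: Long)
data class RecordHeader(val tag: Int, val timeOffset: Long, val bodyLength: Long)

// Reads the file header: null-terminated version string, 4-byte identifier size, 8-byte timestamp.
// DataInputStream is big-endian, which matches the hprof byte order.
fun readHprofHeader(input: DataInputStream): HprofHeader {
  val version = StringBuilder()
  while (true) {
    val b = input.readUnsignedByte()
    if (b == 0) break
    version.append(b.toChar())
  }
  val identifierByteSize = input.readInt()
  val timestampMillis = input.readLong()
  return HprofHeader(version.toString(), identifierByteSize, timestampMillis)
}

// Reads one record header: 1-byte tag, then two unsigned 4-byte values (time offset, body length).
fun readRecordHeader(input: DataInputStream): RecordHeader {
  val tag = input.readUnsignedByte()
  val timeOffset = input.readInt().toLong() and 0xFFFFFFFFL
  val bodyLength = input.readInt().toLong() and 0xFFFFFFFFL
  return RecordHeader(tag, timeOffset, bodyLength)
}
```

The identifier size read from the header determines how many bytes every object ID, class ID, and string ID occupies in the rest of the file, so a parser has to read it before anything else.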

Parsing process:

fun analyze(
   heapDumpFile: File,
   leakingObjectFinder: LeakingObjectFinder,
   referenceMatchers: List<ReferenceMatcher> = emptyList(),
   computeRetainedHeapSize: Boolean = false,
   objectInspectors: List<ObjectInspector> = emptyList(),
   metadataExtractor: MetadataExtractor = MetadataExtractor.NO_OP,
   proguardMapping: ProguardMapping? = null
 ): HeapAnalysis {
   val analysisStartNanoTime = System.nanoTime()
   if (!heapDumpFile.exists()) {
     val exception = IllegalArgumentException("File does not exist: $heapDumpFile")
     return HeapAnalysisFailure(
         heapDumpFile, System.currentTimeMillis(), since(analysisStartNanoTime),
         HeapAnalysisException(exception)
     )
   }
   return try {
     listener.onAnalysisProgress(PARSING_HEAP_DUMP)
     Hprof.open(heapDumpFile)
         .use { hprof ->
           // Build the graph
           val graph = HprofHeapGraph.indexHprof(hprof, proguardMapping)
           val helpers =
             FindLeakInput(graph, referenceMatchers, computeRetainedHeapSize, objectInspectors)
           // Analyze the graph
           helpers.analyzeGraph(
               metadataExtractor, leakingObjectFinder, heapDumpFile, analysisStartNanoTime
           )
         }
   } catch (exception: Throwable) {
     HeapAnalysisFailure(
         heapDumpFile, System.currentTimeMillis(), since(analysisStartNanoTime),
         HeapAnalysisException(exception)
     )
   }
 }

When building the object graph, LeakCanary mainly parses the following record tags:

| TAG | Meaning | Content |
| --- | --- | --- |
| STRING | String | String ID, string content |
| LOAD CLASS | Loaded class | Serial number, class object ID, stack trace serial number, class name string ID |
| CLASS DUMP | Class snapshot | Class object ID, stack trace serial number, superclass object ID, class loader object ID, signers object ID, protection domain object ID, 2 reserved fields, instance size (bytes), constant pool, static fields, instance fields |
| INSTANCE DUMP | Object instance snapshot | Object ID, stack trace serial number, class object ID, size of the instance field data (bytes), value of each instance field |
| OBJECT ARRAY DUMP | Object array snapshot | Array object ID, stack trace serial number, number of elements, array class object ID, ID of each element object |
| PRIMITIVE ARRAY DUMP | Primitive array snapshot | Array object ID, stack trace serial number, number of elements, element type, each element |
| Each GCRoot | | |


The GCRoot objects involved are as follows:

| TAG | Remarks | Content |
| --- | --- | --- |
| ROOT UNKNOWN | | Object ID |
| ROOT JNI GLOBAL | Global variables in JNI | Object ID, ID of the JNI global reference |
| ROOT JNI LOCAL | Local variables and parameters in JNI | Object ID, thread serial number, stack frame number |
| ROOT JAVA FRAME | Java stack frame | Object ID, thread serial number, stack frame number |
| ROOT NATIVE STACK | Input and output parameters of native methods | Object ID, thread serial number |
| ROOT STICKY CLASS | Sticky class | Object ID |
| ROOT THREAD BLOCK | Thread block | Object ID, thread serial number |
| ROOT MONITOR USED | Objects on which wait() or notify() has been called, or that are synchronized on | Object ID |
| ROOT THREAD OBJECT | Threads that have been started and not stopped | Thread object ID, thread serial number, stack trace serial number |
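
Because every GC root record has a fixed set of fields, a parser can compute its body size from the identifier size alone. The sketch below encodes the content column of the table above. GcRootTag is a hypothetical enum for illustration, and the tag byte values are taken from the hprof format document linked earlier, not from this article.

```kotlin
// Hypothetical enum for this sketch. idFields = number of ID-sized fields,
// intFields = number of 4-byte fields (serial numbers, frame numbers).
enum class GcRootTag(val tag: Int, val idFields: Int, val intFields: Int) {
  ROOT_UNKNOWN(0xff, 1, 0),
  ROOT_JNI_GLOBAL(0x01, 2, 0),
  ROOT_JNI_LOCAL(0x02, 1, 2),
  ROOT_JAVA_FRAME(0x03, 1, 2),
  ROOT_NATIVE_STACK(0x04, 1, 1),
  ROOT_STICKY_CLASS(0x05, 1, 0),
  ROOT_THREAD_BLOCK(0x06, 1, 1),
  ROOT_MONITOR_USED(0x07, 1, 0),
  ROOT_THREAD_OBJECT(0x08, 1, 2);

  // Number of bytes that follow the tag byte for this record.
  fun bodySize(identifierByteSize: Int): Int = idFields * identifierByteSize + intFields * 4

  companion object {
    // Returns null for tags that are not one of the GC roots listed here.
    fun fromTag(tag: Int): GcRootTag? = values().firstOrNull { it.tag == tag }
  }
}
```

For example, with 4-byte identifiers a ROOT JNI LOCAL record carries 4 + 4 + 4 = 12 bytes after its tag. Knowing the size lets a parser skip records it does not need without decoding them.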


2.2.1 Build memory index (Graph content index)

LeakCanary will build a HprofHeapGraph object based on the Hprof file, which records the following member variables:

interface HeapGraph {
  val identifierByteSize: Int

  /**
   * In memory store that can be used to store objects this [HeapGraph] instance.
   */
  val context: GraphContext

  /**
   * All GC roots which type matches types known to this heap graph and which point to non null
   * references. You can retrieve the object that a GC Root points to by calling [findObjectById]
   * with [GcRoot.id], however you need to first check that [objectExists] returns true because
   * GC roots can point to objects that don't exist in the heap dump.
   */
  val gcRoots: List<GcRoot>

  /**
   * Sequence of all objects in the heap dump.
   *
   * This sequence does not trigger any IO reads.
   */
  val objects: Sequence<HeapObject> // All objects: class objects, instances, object arrays, primitive arrays

  val classes: Sequence<HeapClass> // Sequence of class objects

  val instances: Sequence<HeapInstance> // Sequence of instance objects

  val objectArrays: Sequence<HeapObjectArray> // Sequence of object arrays

  val primitiveArrays: Sequence<HeapPrimitiveArray> // Sequence of primitive arrays
}

To locate objects in the hprof file quickly, LeakCanary builds an in-memory index, HprofInMemoryIndex:

  1. String index hprofStringCache (key-value): the key is the string ID and the value is the string;

    Purpose: look up a string ID by class name, or a class name by string ID.

  2. Class name index classNames (key-value): the key is the class object ID and the value is the string ID of the class name;

    Purpose: look up the class name string ID by class object ID.

  3. Instance index instanceIndex (key-value): the key is the instance object ID and the value is the position of the object in the hprof file plus its class object ID;

    Purpose: locate an instance quickly so that the values of its fields can be parsed.

  4. Class object index classIndex (key-value): the key is the class object ID and the value is a binary combination of other fields (superclass ID, instance size, etc.);

    Purpose: locate a class object quickly so that the types of its fields can be parsed.

  5. Object array index objectArrayIndex (key-value): the key is the array object ID and the value is a binary combination of other fields (position in the hprof file, etc.);

    Purpose: locate an object array quickly so that the objects it references can be parsed.

  6. Primitive array index primitiveArrayIndex (key-value): the key is the array object ID and the value is a binary combination of other fields (position in the hprof file, element type, etc.);
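
The first three indexes in the list above can be modelled with ordinary maps. This is a heavily simplified, hypothetical sketch: where the article describes values as a binary combination of fields, SimpleHprofIndex stores plain objects, and the method names are invented for illustration.

```kotlin
// Hypothetical value type: where an instance lives in the file and which class it belongs to.
data class IndexedInstance(val filePosition: Long, val classId: Long)

// Hypothetical, simplified index built while scanning the hprof records once.
class SimpleHprofIndex {
  private val stringsById = mutableMapOf<Long, String>()          // string ID -> string
  private val classNameIds = mutableMapOf<Long, Long>()           // class object ID -> class name string ID
  private val instances = mutableMapOf<Long, IndexedInstance>()   // instance ID -> position + class ID

  fun onString(id: Long, value: String) { stringsById[id] = value }
  fun onLoadClass(classId: Long, nameStringId: Long) { classNameIds[classId] = nameStringId }
  fun onInstance(id: Long, filePosition: Long, classId: Long) {
    instances[id] = IndexedInstance(filePosition, classId)
  }

  // Class object ID -> class name, going through the string index.
  fun className(classId: Long): String? = classNameIds[classId]?.let { stringsById[it] }

  // Class name -> class object ID, the lookup needed to find a class such as KeyedWeakReference.
  fun classId(className: String): Long? {
    val stringId = stringsById.entries.firstOrNull { it.value == className }?.key ?: return null
    return classNameIds.entries.firstOrNull { it.value == stringId }?.key
  }

  fun instance(id: Long): IndexedInstance? = instances[id]
}
```

Storing only the file position for each instance keeps the index small: field values are read from the hprof file on demand instead of being loaded into memory up front.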

2.2.2 Find the leaked object

1) Because every watched object is held by a com.squareup.leakcanary.KeyedWeakReference, the class object ID can be looked up from the class name com.squareup.leakcanary.KeyedWeakReference;

2) Parse the instance fields of the instances of that class to find the field that holds the watched object and the object ID it references. That ID is the ID of the leaked object;
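
The two steps above can be sketched against a simplified view of the parsed heap. Everything here is hypothetical: ParsedInstance is an invented type, and the sketch assumes the watched object sits in the referent field that KeyedWeakReference inherits from java.lang.ref.Reference, with ID 0 standing for a null reference.

```kotlin
// Hypothetical simplified view of a parsed instance: its class name and its
// reference fields (field name -> referenced object ID).
data class ParsedInstance(
  val objectId: Long,
  val className: String,
  val referenceFields: Map<String, Long>
)

fun findLeakingObjectIds(
  instances: List<ParsedInstance>,
  weakRefClassName: String = "com.squareup.leakcanary.KeyedWeakReference"
): Set<Long> =
  instances
    .filter { it.className == weakRefClassName }     // step 1: instances of the weak reference class
    .mapNotNull { it.referenceFields["referent"] }   // step 2: the object each one still points to
    .filter { it != 0L }                             // ID 0 is null: that object was already collected
    .toSet()
```

A KeyedWeakReference whose referent is null corresponds to an object that was collected normally, so only the non-null referents are treated as leak candidates.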

2.2.3 Find the shortest GCRoot reference chain

Given the parsed GC root objects and the leaked objects, the shortest reference chain between them is searched for in the graph. A breadth-first traversal is used for the search:

// PathFinder
private fun State.findPathsFromGcRoots(): PathFindingResults {
    enqueueGcRoots() // 1

    val shortestPathsToLeakingObjects = mutableListOf<ReferencePathNode>()
    visitingQueue@ while (queuesNotEmpty) {
      val node = poll() // 2

      if (checkSeen(node)) { // 2
        throw IllegalStateException(
            "Node $node objectId=${node.objectId} should not be enqueued when already visited or enqueued"
        )
      }

      if (node.objectId in leakingObjectIds) { // 3
        shortestPathsToLeakingObjects.add(node)
        // Found all refs, stop searching (unless computing retained size)
        if (shortestPathsToLeakingObjects.size == leakingObjectIds.size) { // 4
          if (computeRetainedHeapSize) {
            listener.onAnalysisProgress(FINDING_DOMINATORS)
          } else {
            break@visitingQueue
          }
        }
      }

      when (val heapObject = graph.findObjectById(node.objectId)) { // 5
        is HeapClass -> visitClassRecord(heapObject, node)
        is HeapInstance -> visitInstance(heapObject, node)
        is HeapObjectArray -> visitObjectArray(heapObject, node)
      }
    }
    return PathFindingResults(shortestPathsToLeakingObjects, dominatedObjectIds)
  }

1) All GC root objects are enqueued;

2) Objects are dequeued in order and checked to see whether they have already been visited. If one has, an exception is thrown; otherwise the search continues;

3) Check whether the dequeued object's ID is one of the objects being looked for. If it is, record it; otherwise continue;

4) Check whether the number of recorded object IDs equals the number of leaked objects. If it does, the search ends; otherwise it continues;

5) Depending on the object type (class object, instance, or object array), visit the object in the appropriate way, enqueue the objects it references, and repeat from 2).


Each enqueued element is a ReferencePathNode. It is essentially a linked list node that points back to its parent, so the reference chain can be reconstructed from it.
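
The search and the parent-linked nodes can be illustrated on a toy graph of object IDs. This is a minimal sketch under stated assumptions: PathNode and findShortestPaths are hypothetical names, the graph is a plain map instead of an hprof-backed HeapGraph, and objects are marked as seen when they are enqueued.

```kotlin
import java.util.ArrayDeque

// Hypothetical simplified node: each node links to its parent, so walking the
// parents from a leaking object back to a root yields the reference chain.
class PathNode(val objectId: Long, val parent: PathNode?)

fun findShortestPaths(
  gcRoots: List<Long>,
  references: Map<Long, List<Long>>, // object ID -> IDs of the objects it references
  leakingIds: Set<Long>
): Map<Long, List<Long>> {
  val queue = ArrayDeque<PathNode>()
  val seen = mutableSetOf<Long>()
  val result = mutableMapOf<Long, List<Long>>()

  // 1) Enqueue all GC roots.
  for (root in gcRoots) if (seen.add(root)) queue.add(PathNode(root, null))

  // 4) Stop as soon as every leaking object has a path.
  while (result.size < leakingIds.size) {
    val node = queue.poll() ?: break // 2) dequeue in FIFO order
    if (node.objectId in leakingIds) { // 3) record the path for a leaking object
      result[node.objectId] =
        generateSequence(node) { it.parent }.map { it.objectId }.toList().reversed()
    }
    // 5) Enqueue every referenced object that has not been seen yet.
    for (child in references[node.objectId].orEmpty()) {
      if (seen.add(child)) queue.add(PathNode(child, node))
    }
  }
  return result
}
```

Because the queue is processed in FIFO order, the first time a leaking object is dequeued it has been reached through the fewest possible references, so the chain rebuilt from its parents is a shortest path from a GC root.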


III. Summary

The biggest changes in LeakCanary 2.0 compared with earlier versions are that it is implemented in Kotlin and that it open sourced its own hprof parsing code. The overall idea is to parse the file content into a graph data structure according to the binary format of the hprof file, and then traverse the graph to find the shortest path. The path starts at a GC root object and ends at the leaked object. The graph structure requires a lot of detailed design that this article does not cover in full. The way leaked objects are identified is the same as in earlier versions.

Author: vivo Internet Client team -Li Peidong



Origin blog.51cto.com/14291117/2677216