1. Introduction
Dart, the language used by Flutter, has a garbage collector. Even with garbage collection, memory leaks cannot be completely avoided. On Android there is a memory leak detection tool, LeakCanary [1], which makes it easy to detect in a debug build whether the current page has leaked. This article walks through building a LeakCanary-like tool that Flutter can use, and describes how I used it to find two leaks in the 1.9.1 Framework.
2. Weak references in Dart
In languages with garbage collection, weak references are a good way to detect leaks. We only need to hold a weak reference to the object under observation and wait for the next full GC. If the reference is null after the GC, the object was reclaimed; if it is not null, the object may have leaked.
Dart also has weak references, in the form of Expando<T>. Take a look at its API:
class Expando<T> {
  external T operator [](Object object);
  external void operator []=(Object object, T value);
}
You may be wondering where the weak reference shows up in the code above. It is in the assignment expando[key] = value: the Expando holds key through a weak reference, and that is where the weak reference lives.
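A minimal sketch of that assignment, assuming a standalone Dart program. The names Page, tags and attachTag are made up for illustration and are not part of the tool described in this article.

```dart
// Hypothetical class standing in for any object we want to observe.
class Page {
  Page(this.name);
  final String name;
}

// The Expando references each key weakly and keeps the associated value
// only for as long as the key is alive.
final Expando<String> tags = Expando<String>('tags');

// Associates `tag` with `page` without keeping `page` alive.
String? attachTag(Page page, String tag) {
  tags[page] = tag; // `page` is the weakly held key
  return tags[page]; // a lookup is only possible while we still hold the key
}
```

Note that the public API offers lookup by key only; there is no way to ask the Expando which keys it still holds.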
This raises a problem: the Expando holds key weakly, but it does not provide an API such as getKey(), so we have no way of knowing whether the key object has been reclaimed.
To solve this problem, let's look at how Expando is actually implemented. The code is in expando_patch.dart [2]:
@patch
class Expando<T> {
  // ...
  T operator [](Object object) {
    var mask = _size - 1;
    var idx = object._identityHashCode & mask;
    // The SDK stores the key in a _data array; this wp is a _WeakProperty.
    var wp = _data[idx];
    // ... (some code omitted)
    return wp.value;
    // ... (some code omitted)
  }
}
Note: this patch code does not apply to the web platform.
We can see that the key object is stored in the _data array, wrapped in a _WeakProperty. So _WeakProperty is the class that matters. Its implementation is in weak_property.dart [3]:
@pragma("vm:entry-point")
class _WeakProperty {
  get key => _getKey();
  // ... (some code omitted)
  _getKey() native "WeakProperty_getKey";
  // ... (some code omitted)
}
This class has the key we want, which can be used to determine whether the object is still alive.
But how do we read such private fields? Dart in Flutter does not support reflection (it is turned off to reduce package size). Is there another way to get at this private field?
The answer is yes. To solve this problem, I will introduce a service that ships with Dart: the Dart VM Service.
3. Dart vm_service
Dart VM Service [4] (hereafter vm_service) is a set of web services provided by the Dart virtual machine. Its data transfer protocol is JSON-RPC 2.0. We do not need to implement request and response parsing ourselves, though: an official Dart SDK has already been written for us, `vm_service` [5].
The roles of ObjRef, Obj, and id
First, the core concepts of vm_service: ObjRef, Obj, and id.
The data returned by vm_service falls into two main categories: ObjRef (reference type) and Obj (object instance type). An Obj contains all the data of its ObjRef plus additional information (an ObjRef contains only basic information such as id and name).
Almost all APIs return ObjRef data. When the information in an ObjRef is not enough, call getObject(...) to get the Obj.
About id: both Obj and ObjRef contain an id, which is the identifier of the object instance inside vm_service. Nearly all vm_service APIs need an id to operate, for example getIsolate(isolateId), getObject(isolateId, objectId, ...), and getInstances(isolateId, classId, ...).
How to use the vm_service service
When it starts, vm_service opens a local WebSocket service. The service URI can be obtained on each platform:
Android: from FlutterJNI.getObservatoryUri();
iOS: from FlutterEngine.observatoryUrl.
Once we have the URI, we can use the vm_service service. The official SDK, vm_service [6], helps here: its vmServiceConnectUri function returns a ready-to-use VmService object.
The argument to vmServiceConnectUri must be a URI with the ws:// scheme. The URI obtained above uses the http scheme by default, so it has to be converted with the convertToWebSocketUrl method.
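The connection step might look like the sketch below. It assumes a recent, null-safe release of package:vm_service and that the http URI reported by the engine has already been passed over to Dart as a string. The function names toWsUri and connectToVmService are made up for this example.

```dart
import 'package:vm_service/utils.dart' show convertToWebSocketUrl;
import 'package:vm_service/vm_service.dart';
import 'package:vm_service/vm_service_io.dart' show vmServiceConnectUri;

/// Converts the http:// service URI into the ws:// form expected by
/// vmServiceConnectUri.
String toWsUri(String observatoryUri) {
  return convertToWebSocketUrl(
    serviceProtocolUrl: Uri.parse(observatoryUri),
  ).toString();
}

/// Connects to the VM service. `observatoryUri` is whatever the platform side
/// reported; how it reaches Dart is left out of this sketch.
Future<VmService> connectToVmService(String observatoryUri) {
  return vmServiceConnectUri(toWsUri(observatoryUri));
}
```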
4. Implementing leak detection
With vm_service available, we can use it to make up for what Expando lacks. Based on the earlier analysis, we need to read the private field _data of Expando. For this we can use the getObject(isolateId, objectId) [7] API. Its return value is an Instance [8], whose fields field holds all of the object's current fields. By traversing those fields we can get at _data, which gives us the effect of reflection.
The question now is what the API parameters isolateId and objectId are. As described in the discussion of id above, they are the identifiers of objects inside vm_service. In other words, these two parameters can only be obtained through vm_service.
Getting the isolateId
Isolate is a very important concept in Dart. An isolate is roughly equivalent to a thread, but unlike the threads we usually deal with, different isolates do not share memory.
Because of this, we also need to supply the isolateId when looking up an object. The getVM() API of vm_service returns data about the virtual machine, and its isolates field lists all isolates of the current virtual machine.
So how do we pick the isolate we want? For simplicity, only the main isolate is selected here. For more on filtering isolates, see the source code of dev_tools [9]: the service_manager.dart#_initSelectedIsolate [10] function.
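A minimal sketch of the simple selection described above, assuming a connected VmService and a recent, null-safe package:vm_service. Picking the isolate by the name "main" is an assumption of this sketch; the dev_tools function mentioned above applies more careful rules.

```dart
import 'package:vm_service/vm_service.dart';

/// Returns the id of the isolate named "main".
/// Matching on the name is a simplification assumed by this sketch.
Future<String> findMainIsolateId(VmService service) async {
  final VM vm = await service.getVM();
  for (final IsolateRef isolate in vm.isolates ?? const <IsolateRef>[]) {
    final String? id = isolate.id;
    if (isolate.name == 'main' && id != null) {
      return id;
    }
  }
  throw StateError('No isolate named "main" was found');
}
```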
Getting the objectId
The objectId we want is the id of the expando inside vm_service. The problem can be generalized:
How do we obtain the vm_service id of a given object?
This is rather troublesome. vm_service has no API that converts an object instance to an id. There is the getInstances(isolateId, classId, limit) API, which returns all instances, including subclass instances, of the class with the given classId. But leaving aside how to obtain the classId, both the performance of this API and its limit parameter are worrying.
Is there really no good way? In fact, we can use a Library's top-level functions (functions written directly in a file rather than in a class, such as main) to achieve this.
A brief note on what a Library is: Dart's package management is based on Libraries, and class names within a Library must be unique. In general, one .dart file is one Library, although there are exceptions such as part of and export.
vm_service has the invoke(isolateId, targetId, selector, argumentIds) [11] API, which can execute an ordinary function (getters, setters, constructors, and private functions are not ordinary functions). If targetId is the id of a Library, invoke executes a top-level function of that Library.
With invoke able to call a Library's top-level functions, converting an object to its id can be implemented as follows:
import 'package:vm_service/vm_service.dart';

int _key = 0;

/// Top-level function (must be an ordinary function); generates a new key.
String generateNewKey() {
  return "${++_key}";
}

Map<String, dynamic> _objCache = {};

/// Top-level function; returns the object registered under [key].
dynamic keyToObj(String key) {
  return _objCache[key];
}

/// Converts an object to its vm_service id.
Future<String?> obj2Id(VmService service, dynamic obj) async {
  // Find the isolateId, using the approach described earlier (helper not shown).
  final String isolateId = await findMainIsolateId(service);
  // Find the current Library: traverse the isolate's `libraries` field and
  // filter by uri (helper not shown).
  final String libraryId = await findLibraryId(service, isolateId);
  // Use the VM service to execute the generateNewKey function.
  final keyRef = await service.invoke(
    isolateId,
    libraryId,
    "generateNewKey",
    // No arguments, so pass an empty list.
    [],
  ) as InstanceRef;
  // Get the String value of keyRef.
  // This is the only API that turns an ObjRef into a plain value.
  final String key = keyRef.valueAsString!;
  _objCache[key] = obj;
  try {
    // Call the keyToObj top-level function with the key to get obj.
    final valueRef = await service.invoke(
      isolateId,
      libraryId,
      "keyToObj",
      // Note: vm_service expects ids here, not values.
      [keyRef.id!],
    ) as InstanceRef;
    // This id is the id that corresponds to obj.
    return valueRef.id;
  } finally {
    _objCache.remove(key);
  }
}
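The library lookup that the listing leaves out could be sketched as follows, assuming a recent, null-safe package:vm_service. The constant helperLibraryUri is a hypothetical URI; in a real project it would be the URI of the file that declares generateNewKey and keyToObj.

```dart
import 'package:vm_service/vm_service.dart';

// Hypothetical URI of the file that declares generateNewKey and keyToObj.
const String helperLibraryUri = 'package:my_leak_detector/obj_id.dart';

/// Returns the vm_service id of the library whose uri is [helperLibraryUri].
Future<String> findLibraryId(VmService service, String isolateId) async {
  final Isolate isolate = await service.getIsolate(isolateId);
  for (final LibraryRef library in isolate.libraries ?? const <LibraryRef>[]) {
    final String? id = library.id;
    if (library.uri == helperLibraryUri && id != null) {
      return id;
    }
  }
  throw StateError('Library $helperLibraryUri not found');
}
```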
Determining whether an object has leaked
Now that we can get the id of the expando instance in vm_service, the rest is simple.
First obtain the Instance through vm_service and traverse its fields to find the _data field (note that _data is of type ObjRef). In the same way, convert the _data field into an Instance (_data is an array, and the Obj carries the array's elements).
Then traverse the elements of _data. If they are all null, the key object we are observing has been released. If an item is not null, convert the item into an Instance and read its propertyKey (the item is a _WeakProperty, and Instance exposes this field specifically for _WeakProperty).
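The traversal just described can be sketched as below. It assumes the Expando internals shown earlier (a _data list of _WeakProperty entries), a recent, null-safe package:vm_service, and that null values arrive over the service protocol as an InstanceRef of kind Null. The names isLiveRef and expandoHasLiveKey are made up for this example.

```dart
import 'package:vm_service/vm_service.dart';

/// True when [ref] refers to a real object rather than to null.
/// Assumption: null is reported as an InstanceRef whose kind is Null.
bool isLiveRef(dynamic ref) =>
    ref is InstanceRef && ref.kind != InstanceKind.kNull;

/// Returns true if the Expando with id [expandoId] still holds a live key.
Future<bool> expandoHasLiveKey(
  VmService service,
  String isolateId,
  String expandoId,
) async {
  final expando = await service.getObject(isolateId, expandoId) as Instance;
  // Locate the private `_data` field among the Expando's fields.
  dynamic dataRef;
  for (final BoundField field in expando.fields ?? const <BoundField>[]) {
    if (field.decl?.name == '_data') {
      dataRef = field.value;
    }
  }
  if (!isLiveRef(dataRef)) return false;
  // `_data` is a list; fetch it as a full Instance to read its elements.
  final data = await service.getObject(
    isolateId,
    (dataRef as InstanceRef).id!,
  ) as Instance;
  for (final dynamic item in data.elements ?? const <dynamic>[]) {
    if (!isLiveRef(item)) continue; // empty slot
    // Each non-null element is expected to be a _WeakProperty.
    final weakProperty = await service.getObject(
      isolateId,
      (item as InstanceRef).id!,
    ) as Instance;
    if (isLiveRef(weakProperty.propertyKey)) return true; // key still alive
  }
  return false;
}
```

If this returns true after a full GC, the observed object is a leak candidate.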
Force GC
As mentioned at the beginning of the article, to determine whether an object has leaked we need to check whether the weak reference is still alive after a full GC. Is there a way to trigger GC manually?
The answer is yes. Although vm_service has no dedicated API to force GC, there is a GC button at the top right of the memory chart in Dev Tools, and we can do what it does. Dev Tools calls the getAllocationProfile(isolateId, gc: true) [12] API of vm_service to trigger GC manually.
Whether this API triggers a full GC is not documented. In my tests it always triggered a full GC. If you want to be sure that leak detection runs after a full GC, you can listen to the GC event stream, which vm_service also provides.
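A rough sketch of triggering GC and waiting for a GC event, assuming a recent, null-safe package:vm_service and that nothing else has already subscribed to the GC stream on this connection. The function name forceGc and the timeout value are choices made for this example. The sketch does not inspect the event to tell a full GC from a partial one.

```dart
import 'package:vm_service/vm_service.dart';

/// Asks the VM to collect garbage, then waits for the next GC event.
Future<void> forceGc(VmService service, String isolateId) async {
  // Subscribe first so the event triggered below is not missed.
  await service.streamListen(EventStreams.kGC);
  try {
    final Future<Event> nextGc = service.onGCEvent.first;
    await service.getAllocationProfile(isolateId, gc: true);
    // The 5-second timeout is an arbitrary value chosen for this sketch.
    await nextGc.timeout(const Duration(seconds: 5));
  } finally {
    await service.streamCancel(EventStreams.kGC);
  }
}
```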
At this point we can detect leaks and also obtain the vm_service id of the leaked object. Next, we look at how to obtain the leak path.
5. Getting the leak path
For obtaining the leak path, vm_service provides an API called getRetainingPath(isolateId, objectId, limit) [13]. Calling this API directly returns the reference chain from the leaked object to the GC roots. Sounds simple? It is not quite enough, because of the following pitfalls:
Expando holding issues
If the leaked object is held by an expando when getRetainingPath is executed, two problems arise:
Because the API returns only one reference chain, the returned chain passes through the expando, which makes it impossible to obtain the real leak node information;
A native crash occurs on ARM devices; the error occurs during UTF-8 character decoding.
This problem is easy to solve: once the preceding leak detection has finished, release the expando before calling getRetainingPath.
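A minimal sketch of fetching the path, assuming a recent, null-safe package:vm_service and that the expando entry for the object has already been released. The function name describeRetainingPath, the label helper, and the default limit are made up for this example.

```dart
import 'package:vm_service/vm_service.dart';

/// Short label for a node: the class name for instances, else the raw type.
String labelOf(ObjRef? ref) {
  if (ref is InstanceRef) {
    return ref.classRef?.name ?? 'Instance';
  }
  return ref?.type ?? 'null';
}

/// Fetches the retaining path of [objectId] and formats one line per node.
Future<List<String>> describeRetainingPath(
  VmService service,
  String isolateId,
  String objectId, {
  int limit = 500, // hypothetical upper bound on the path length
}) async {
  final RetainingPath path =
      await service.getRetainingPath(isolateId, objectId, limit);
  return [
    for (final RetainingObject node
        in path.elements ?? const <RetainingObject>[])
      '${labelOf(node.value)} (parentField: ${node.parentField})',
  ];
}
```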
The id expiration problem
Unlike the ids of Class, Library, and Isolate, the id of an Instance expires. vm_service keeps such temporary ids in a cache with a default capacity of 8192, organized as a circular queue.
Because of this, when we detect a leak we cannot just save the id of the leaked object. We need to save the original object, and we must not hold it through a strong reference. So we again use an expando to hold the leaked objects we detect, and only convert an object to its id when the leak path needs to be analyzed.
6. Memory leaks in the 1.9.1 Framework
With leak detection and path retrieval in place, I had a simple LeakCanary tool. When I tested it on Framework version 1.9.1, it reported that the page I was observing had leaked!
Judging from the objects dumped by dev_tools, the page had indeed leaked.
In other words, the 1.9.1 Framework contains a leak, and it leaks the entire page.
I then started investigating the cause and ran into a problem: the leak path was too long. The chain returned by getRetainingPath had more than 300 nodes, and after an afternoon of investigation I still had not found the root cause.
Conclusion: the raw data returned by vm_service makes it hard to find the source of the problem, so the leak path information needs further processing.
How to shorten the reference chain
First, why is the leak path so long? Looking at the returned chain, most of the nodes are Flutter UI component nodes (such as widget, element, state, and renderObject).
In other words, the reference chain passes through Flutter's widget tree. Developers familiar with Flutter know that the widget tree is very deep. Since the chain is long because it contains the widget tree, and widget tree nodes mostly appear in contiguous blocks, we only need to classify the nodes in the chain by type and aggregate them to shorten the leak path considerably.
Classification
Based on Flutter's component types, nodes are divided into the following types:
element: a node corresponding to an Element;
widget: a node corresponding to a Widget;
renderObject: a node corresponding to a RenderObject;
state: a node corresponding to a State<T extends StatefulWidget>;
collection: a node corresponding to a collection type, for example List, Map, or Set;
other: any other node.
Aggregation
Once the nodes are classified, nodes of the same type can be aggregated. Here is my aggregation method:
Treat collection-type nodes as connecting nodes. Adjacent nodes of the same type are merged into one group; if two groups of the same type are connected by a collection node, those two groups are merged into one as well, and so on recursively.
With classification and aggregation, a chain of 300+ nodes can be shortened to 100+.
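The aggregation rule can be sketched in plain Dart as follows. This is one possible reading of the method described above, not the tool's actual code. It assumes the path has already been classified into a list of NodeType values; the names NodeType, Group and aggregate are made up for this example.

```dart
enum NodeType { element, widget, renderObject, state, collection, other }

/// A run of path nodes that has been folded into one entry.
class Group {
  Group(this.type, this.count);
  final NodeType type;
  int count; // number of raw path nodes folded into this group
}

/// Folds a classified retaining path into groups.
List<Group> aggregate(List<NodeType> path) {
  final List<Group> groups = [];
  for (final NodeType type in path) {
    // Rule 1: an adjacent node of the same type joins the previous group.
    if (groups.isNotEmpty && groups.last.type == type) {
      groups.last.count++;
    } else {
      groups.add(Group(type, 1));
    }
    // Rule 2: "A, collection, A" collapses into a single A group. The loop
    // repeats so that newly adjacent groups are merged as well.
    while (groups.length >= 3 &&
        groups.last.type != NodeType.collection &&
        groups[groups.length - 2].type == NodeType.collection &&
        groups[groups.length - 3].type == groups.last.type) {
      final Group tail = groups.removeLast();
      final Group bridge = groups.removeLast();
      groups.last.count += bridge.count + tail.count;
    }
  }
  return groups;
}
```

For example, the path element, element, collection, element, widget, state folds into three groups: element (4 nodes), widget (1), and state (1).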
Continuing to investigate the 1.9.1 Framework leak: with the shortened path, the problem can be roughly located at the FocusManager node! The specific problem is still hard to pinpoint, though, mainly for two reasons:
The reference chain nodes lack code locations: RetainingObject only has the three fields parentField, parentIndex, and parentKey to describe how the current object references the next one, and finding the code location from this information is inefficient;
The information of the current Flutter component node is unknown: for example, the text of a Text, the widget of an element, the lifecycle status of a state, which page the current component belongs to, and so on.
Given these two pain points, the information of the leak nodes also needs to be extended:
Code location: to resolve the code location of a reference node, parsing parentField is enough. Use vm_service to parse the class, take the matching field from it, and find the corresponding script information. This yields the source code;
Component node information: Flutter UI components all inherit from Diagnosticable, so any node of type Diagnosticable can provide very detailed information (when debugging with dev_tools, the component tree information is obtained through the Diagnosticable.debugFillProperties method). Beyond this, the route of the current component also needs to be included, which is very important for determining the page the component belongs to.
Troubleshoot the root cause of the 1.9.1 Framework leak
After all the optimizations above, the tool produced the following result, and I found a problem at the two _InkResponseState nodes:
In the leak path, the two _InkResponseState nodes belong to different routes, which means they are on two different pages. The description of the upper _InkResponseState shows its lifecycle as not mounted, which means the component has been destroyed but is still referenced by FocusManager! That is where the problem is. Look at this part of the code:
The code clearly shows a misunderstanding of the StatefulWidget lifecycle at the point where addListener is called. didChangeDependencies can be called many times, while dispose is called only once, so the listeners are not completely removed.
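The lifecycle mismatch can be reproduced with a small simulation in plain Dart. This is a simplified model, not the Framework code: Notifier and FakeState are made-up stand-ins, and the sketch assumes the notifier keeps one entry per addListener call.

```dart
/// Minimal stand-in for a long-lived listenable (hypothetical).
class Notifier {
  final List<void Function()> _listeners = [];

  void addListener(void Function() listener) => _listeners.add(listener);

  // Removes a single registration, like one removeListener call.
  void removeListener(void Function() listener) => _listeners.remove(listener);

  int get listenerCount => _listeners.length;
}

/// Simulates a State that registers in didChangeDependencies.
class FakeState {
  FakeState(this.notifier);
  final Notifier notifier;

  void _onChange() {}

  // May run many times during the lifetime of the state.
  void didChangeDependencies() => notifier.addListener(_onChange);

  // Runs exactly once, so it removes only one registration.
  void dispose() => notifier.removeListener(_onChange);
}
```

In this model, calling didChangeDependencies three times and dispose once leaves two registrations behind, and each of them keeps the disposed state reachable from the notifier.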
After fixing this leak, I found another one. Investigation showed that its source is in TransitionRoute:
When a new page is opened, the new page's Route (nextRoute in the code) is held by the animation of the previous page. If every page navigation uses a TransitionRoute, then all Routes leak!
The good news is that both leaks were fixed after version 1.12.
After fixing these two leaks and testing again, Route and Widget can both be reclaimed. This completes the investigation of the 1.9.1 Framework.
Author: Qi Gengxin
Currently working in the Flutter team of the Kuaishou application development platform group, responsible for the development and research of the APM direction. I have been exposed to Flutter since 2018 and have a lot of experience in Flutter hybrid stack, engineering landing, UI components, etc.
Contact: [email protected]
"Flutter Chinese Community Tutorial" is contributed by community developers, and the content is simultaneously published to the flutter.cn website and various social platforms of the "Flutter Community". During the internal test of this project, submissions will be open after the preparation is completed. Please click "Read the original text" to view the links contained in the corner of the text.