Go garbage collection

I. What is garbage collection
Memory management was once a major problem for programmers developing applications. In traditional system-level programming languages (mainly C/C++), the programmer has to handle memory carefully and control both its allocation and its release. The slightest mistake can cause a memory leak, a kind of bug that is hard to find and hard to locate and that has long been a nightmare for developers. How can this headache be solved?
Two approaches have generally been used in the past:
Memory leak detection tools. Such tools either scan the source code statically for segments that may leak or, like Valgrind, track allocations while the program runs. However, any detection tool will inevitably miss some cases, so it can only play a supporting role.
Smart pointers. This is the automatic memory management method introduced in C++: objects are referenced through a pointer object that manages memory automatically, so the programmer does not need to worry much about releasing memory. This is the most widely adopted practice, but it has a learning cost for programmers (it is not supported natively at the language level), and wherever someone forgets to use it, memory leaks still cannot be avoided.
To solve this problem, almost all languages developed later (Java, Python, PHP, etc.) introduced automatic memory management at the language level. Users of the language only need to care about allocating memory, not about releasing it; memory is released automatically by the virtual machine or the runtime. This automatic reclamation of memory that is no longer in use is known as garbage collection.
 
II. Common garbage collection algorithms
① Reference counting
This is the simplest garbage collection algorithm and is similar to the smart pointers mentioned earlier. A reference count is maintained for each object: the count is incremented when a reference to the object is created or assigned to another variable, and decremented when a reference to it is destroyed or overwritten. When the count reaches zero, the object is reclaimed immediately.
The advantages of this method are that it is simple to implement and reclaims memory promptly. It is widely used in systems where memory is tight and real-time requirements are relatively high, for example the iOS Cocoa framework, PHP and Python.
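The counting rule above can be sketched in a few lines of Go. This is only an illustration: the `RefCounted` type and its `Retain` and `Release` methods are hypothetical names invented for this example, and the `released` flag merely stands in for actually freeing memory.

```go
package main

import "fmt"

// RefCounted is a hypothetical object that carries its own reference count.
type RefCounted struct {
	name     string
	refs     int
	released bool // stand-in for "the memory has been freed"
}

// NewRefCounted creates an object held by exactly one reference.
func NewRefCounted(name string) *RefCounted {
	return &RefCounted{name: name, refs: 1}
}

// Retain records one more reference to the object.
func (o *RefCounted) Retain() { o.refs++ }

// Release drops one reference and reclaims the object
// as soon as the count reaches zero.
func (o *RefCounted) Release() {
	o.refs--
	if o.refs == 0 {
		o.released = true
	}
}

func main() {
	obj := NewRefCounted("buffer")
	obj.Retain()  // a second reference is taken
	obj.Release() // the first reference goes away
	fmt.Println(obj.refs, obj.released)
	obj.Release() // the last reference goes away
	fmt.Println(obj.refs, obj.released)
}
```

The object is reclaimed at the exact moment the last reference is dropped, which is the "timely recovery" the text refers to.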
However, simple reference counting has significant disadvantages:
Frequent reference count updates reduce performance. A simple mitigation is for the compiler to merge adjacent reference count updates into one. Another is to not count the frequent references held by temporary variables at all, and instead scan the stack when a count reaches 0 to check whether a temporary still references the object before deciding to release it. There are many other techniques.
Circular references. When objects reference each other in a cycle, none of the objects in the cycle can be released. The most obvious solution is to avoid circular references, which is why Cocoa introduced two pointer types, strong and weak. Alternatively, the system can detect reference cycles and actively break them, although this increases the complexity of garbage collection.
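The circular reference problem is easy to reproduce with a toy reference counter. The sketch below assumes a hypothetical `node` type with a single outgoing reference; `pointTo`, `release` and `buildCycle` are illustrative names, not part of any real library.

```go
package main

import "fmt"

// node is a hypothetical reference-counted object with one outgoing reference.
type node struct {
	name  string
	refs  int
	next  *node
	freed bool
}

// pointTo stores a reference from n to target and increments target's count.
func (n *node) pointTo(target *node) {
	n.next = target
	target.refs++
}

// release drops one reference; at zero the node is freed
// and in turn releases the node it points to.
func (n *node) release() {
	n.refs--
	if n.refs == 0 {
		n.freed = true
		if n.next != nil {
			n.next.release()
		}
	}
}

// buildCycle creates two nodes that reference each other.
// Each node is also held by one external reference.
func buildCycle() (a, b *node) {
	a = &node{name: "a", refs: 1}
	b = &node{name: "b", refs: 1}
	a.pointTo(b)
	b.pointTo(a)
	return a, b
}

func main() {
	a, b := buildCycle()
	a.release() // drop the external reference to a
	b.release() // drop the external reference to b
	// Nothing outside the cycle references a or b any more,
	// yet both counts are stuck at 1 and neither node is freed.
	fmt.Println(a.refs, a.freed, b.refs, b.freed)
}
```

A weak pointer, in this model, would be a `next` field that does not increment the target's count, so one side of the cycle no longer keeps the other alive.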
② Mark and sweep
This method works in two steps. Starting from the root variables, it iteratively traverses all referenced objects and marks every object that can be reached as "referenced". After marking is complete, the sweep step reclaims the memory of every object that was not marked (reclamation may be accompanied by defragmentation).
This approach solves the shortcomings of reference counting, but it has an obvious problem of its own: every collection suspends all normal code execution, which greatly reduces the responsiveness of the system. Many variants of the mark & sweep algorithm (such as tri-color marking) were later developed to reduce this problem.
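As a rough illustration of the two steps, here is a minimal mark-and-sweep pass over a toy heap. The `object` type, the `mark` and `collect` functions and the slice-based "heap" are all assumptions made for this sketch; a real collector works on raw memory rather than on Go slices.

```go
package main

import "fmt"

// object is a hypothetical heap cell: a mark bit plus outgoing references.
type object struct {
	name   string
	marked bool
	refs   []*object
}

// mark flags every object reachable from o.
func mark(o *object) {
	if o == nil || o.marked {
		return
	}
	o.marked = true
	for _, r := range o.refs {
		mark(r)
	}
}

// collect runs one mark-and-sweep cycle and returns the surviving objects.
func collect(roots, heap []*object) []*object {
	// Step 1: mark everything reachable from the roots.
	for _, r := range roots {
		mark(r)
	}
	// Step 2: sweep. Unmarked objects are dropped, i.e. reclaimed.
	var live []*object
	for _, o := range heap {
		if o.marked {
			o.marked = false // clear the mark for the next cycle
			live = append(live, o)
		}
	}
	return live
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"}
	d := &object{name: "d"}
	a.refs = []*object{b}
	c.refs = []*object{d}
	d.refs = []*object{c} // c and d form a cycle that no root can reach

	live := collect([]*object{a}, []*object{a, b, c, d})
	for _, o := range live {
		fmt.Println(o.name)
	}
}
```

Note that the unreachable cycle between `c` and `d` is reclaimed here, which is exactly the case that plain reference counting cannot handle.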
③ Generational collection
Extensive observation of real programs shows that in object-oriented programming languages the vast majority of objects have a very short life cycle. The basic idea of generational collection is to divide the heap into two or more spaces called generations. Newly created objects are stored in the young generation (which is generally much smaller than the old generation). As garbage collection runs repeatedly, objects with a long life cycle are promoted to the old generation. This gives rise to two different kinds of collection, young-generation collection and old-generation collection, each of which collects the objects in its own space. A young-generation collection is very fast, much faster than an old-generation collection. Even though it runs more frequently, it is still more efficient than old-generation collection, because most objects are short-lived and never need to be promoted to the old generation.
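The promotion idea can be sketched with a toy two-generation heap. Everything here is hypothetical: the `obj` and `heap` types, the `alive` flag (a stand-in for reachability) and the `promoteAfter` threshold are assumptions chosen only to keep the example short.

```go
package main

import "fmt"

// obj is a hypothetical heap object; age counts the young collections it survived.
type obj struct {
	id    int
	alive bool // stand-in for "still reachable"
	age   int
}

// heap is split into a young and an old generation.
type heap struct {
	young        []*obj
	old          []*obj
	promoteAfter int // survive this many young collections to be promoted
}

// minorGC collects only the young generation.
func (h *heap) minorGC() {
	var survivors []*obj
	for _, o := range h.young {
		if !o.alive {
			continue // dead object: reclaimed
		}
		o.age++
		if o.age >= h.promoteAfter {
			h.old = append(h.old, o) // promotion to the old generation
		} else {
			survivors = append(survivors, o)
		}
	}
	h.young = survivors
}

func main() {
	h := &heap{promoteAfter: 2}
	h.young = append(h.young, &obj{id: 0, alive: true}) // one long-lived object

	for i := 1; i <= 3; i++ {
		// A short-lived object that is already dead by the next collection.
		h.young = append(h.young, &obj{id: i})
		h.minorGC()
		fmt.Println("young:", len(h.young), "old:", len(h.old))
	}
}
```

Each `minorGC` only walks the small young generation, so the short-lived objects are reclaimed cheaply while the long-lived one moves to the old generation after two collections and is no longer scanned by later minor collections.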
 
III. Go's GC mechanism
Go's GC has been criticized by many people ever since the language was released, but after years of development it has become very good. The milestones of Go's GC algorithm are:
v1.1 STW (stop the world)
v1.3 Mark with STW, sweep in parallel
v1.5 Tri-color marking
v1.8 Hybrid write barrier
 
Go's GC algorithm is based mainly on the Mark and Sweep algorithm, with improvements and optimizations built on top of it.
 
① Mark and Sweep
 
The Mark and Sweep algorithm consists of two main steps:
Mark: find all reachable objects and mark them
Sweep: reclaim the objects that were not marked
 
The algorithm works as follows:
1. Marking starts. The program is suspended (STW), which freezes the reference relationships between its objects
2. Find all reachable objects and mark them
3. After marking completes, start sweeping the unmarked objects
4. After sweeping completes, only the reachable objects remain
 
The mark and sweep algorithm has the following problems:
1. STW (stop the world): the program must be suspended while objects are marked, which makes it stutter (this is the main problem)
2. Marking needs to scan the entire heap
3. Sweeping objects produces heap fragmentation
STW means that the runtime freezes all goroutines, so user logic is suspended. No object can change during this time, which makes scanning absolutely safe.
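To get a feel for how much time a real Go program spends in STW pauses, the standard `runtime` package exposes counters through `runtime.MemStats`. The sketch below is just one way to read them; the helper name `gcStats` and the amount of garbage allocated are arbitrary choices for this example, and the numbers printed will differ from machine to machine.

```go
package main

import (
	"fmt"
	"runtime"
)

// gcStats forces one collection, then reports how many GC cycles have
// completed and the cumulative stop-the-world pause time in nanoseconds.
func gcStats() (cycles uint32, pauseNs uint64) {
	runtime.GC() // blocks until a full collection has finished
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC, m.PauseTotalNs
}

func main() {
	// Produce some garbage so the collector has work to do.
	var sink []byte
	for i := 0; i < 1000; i++ {
		sink = make([]byte, 1<<16)
	}
	_ = sink

	cycles, pause := gcStats()
	fmt.Printf("GC cycles: %d, total STW pause: %d ns\n", cycles, pause)
}
```

Running a program with the `GODEBUG=gctrace=1` environment variable set is another way to see per-cycle GC information without changing the code.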
 
② Tri-color Marking
To solve the problems of the mark and sweep algorithm, Go builds on it with the Tri-color Marking algorithm to optimize the GC process. The general process is as follows:
 
1. Initially, all objects are white
2. The GC starts scanning from the roots and marks the objects it reaches gray
3. For each gray object, find the objects it references and mark them gray, then mark the gray object itself black
4. Monitor modifications to objects, and repeat step 3 until no gray objects remain
5. The GC reclaims the white objects
6. Finally, all black objects are reset to white
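The steps above can be sketched as follows. This is a simplified, single-threaded model, not Go's runtime implementation: the `object` type, the `markTriColor` and `sweep` functions and the slice used as a gray worklist are assumptions for illustration, and step 4 (monitoring concurrent modifications) is left out because nothing else runs while this code marks.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not visited yet
	gray               // visited, but its references are not scanned yet
	black              // visited and all of its references scanned
)

// object is a hypothetical heap cell with a color and outgoing references.
type object struct {
	name  string
	color color
	refs  []*object
}

// markTriColor turns every object reachable from roots black.
func markTriColor(roots []*object) {
	var grayList []*object
	shade := func(o *object) {
		if o != nil && o.color == white {
			o.color = gray
			grayList = append(grayList, o)
		}
	}
	for _, r := range roots { // step 2: roots become gray
		shade(r)
	}
	for len(grayList) > 0 { // step 3, repeated until no gray objects remain
		o := grayList[len(grayList)-1]
		grayList = grayList[:len(grayList)-1]
		for _, r := range o.refs {
			shade(r)
		}
		o.color = black
	}
}

// sweep reclaims white objects (step 5) and resets survivors to white (step 6).
func sweep(heap []*object) []*object {
	var live []*object
	for _, o := range heap {
		if o.color == black {
			o.color = white
			live = append(live, o)
		}
	}
	return live
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"} // unreachable
	a.refs = []*object{b}

	heap := []*object{a, b, c}
	markTriColor([]*object{a})
	for _, o := range sweep(heap) {
		fmt.Println(o.name)
	}
}
```

The gray set is what makes the algorithm interruptible: at any moment it records exactly which objects still need scanning, so marking can in principle be paused and resumed.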
 
How does tri-color marking reduce the STW problem? Mainly in the following two ways:
1. Marking runs in parallel with user logic. User logic frequently creates objects or changes object references, so how can marking and user logic run in parallel? Go solves this by introducing a write barrier, which monitors modifications to objects in memory during GC and re-marks the affected objects. User logic can keep running during this time (in practice there is a very short STW, after which the objects are re-marked), so marking runs in parallel with user logic to a certain extent.
2. Sweeping runs in parallel with user logic. When tri-color marking finishes, only black and white objects are left. The black objects are the ones the program will continue to use. As long as the sweep touches only white objects and leaves black ones alone, it cannot affect program logic, so sweeping white objects can run in parallel with user logic.
By letting the mark and sweep operations run in parallel with user logic, the STW time is shortened and overall GC performance improves.
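To see why a write barrier is needed, consider user code that moves a pointer while marking is in progress. The sketch below uses a simple insertion-style barrier (shade the target of every pointer write gray). This is an assumption chosen for clarity; the hybrid write barrier Go has used since v1.8 is more involved. The `collector` type and the `shade`, `writePointer` and `drain` methods are hypothetical names.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

// object is a hypothetical heap cell with a single outgoing reference.
type object struct {
	name  string
	color color
	ref   *object
}

// collector holds the gray worklist of a marking phase in progress.
type collector struct {
	grayList []*object
	marking  bool
}

// shade turns a white object gray and queues it for scanning.
func (c *collector) shade(o *object) {
	if o != nil && o.color == white {
		o.color = gray
		c.grayList = append(c.grayList, o)
	}
}

// writePointer is the only way user code stores a pointer in this model.
// While marking is active, the barrier shades the new target.
func (c *collector) writePointer(src, dst *object) {
	if c.marking {
		c.shade(dst) // the write barrier
	}
	src.ref = dst
}

// drain finishes marking by scanning every remaining gray object.
func (c *collector) drain() {
	for len(c.grayList) > 0 {
		o := c.grayList[len(c.grayList)-1]
		c.grayList = c.grayList[:len(c.grayList)-1]
		c.shade(o.ref)
		o.color = black
	}
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"}
	b.ref = c

	gc := &collector{marking: true}
	a.color = black // a has already been fully scanned
	gc.shade(b)     // b is gray and still waiting to be scanned

	// User logic runs in the middle of marking and moves the pointer to c
	// from the gray object b to the black object a.
	gc.writePointer(a, c)
	gc.writePointer(b, nil)

	gc.drain()
	fmt.Println(c.color == black) // c survives because the barrier shaded it
}
```

Without the barrier, `c` would stay white: the collector never rescans the black object `a`, and `b` no longer points to `c`, so a live object would be swept.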
 
Original link: https://blog.csdn.net/chenguolinblog/article/details/90665034

Origin www.cnblogs.com/leadership/p/11598875.html