How Java's reference types work is a common interview topic. This article analyzes their implementation in depth.

Java has four reference types: strong reference, soft reference, weak reference, and phantom reference (sometimes translated as "virtual reference"). In fact there are a few others, such as FinalReference.

A strong reference is the everyday form Object a = new Object(); it has no corresponding Reference class in Java.

This article analyzes the implementation of soft, weak, and phantom references. All three extend the Reference class, and most of the logic lives in Reference as well.
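As a quick orientation, the sketch below creates one reference of each kind to the same object and records what get() returns while the object is still strongly reachable. The class and method names (ReferenceKindsDemo, inspect) are made up for this illustration.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class ReferenceKindsDemo {

    /** Reports what each reference's get() returns while the referent is strongly reachable. */
    static boolean[] inspect() {
        Object target = new Object();   // the strong reference
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        SoftReference<Object> soft = new SoftReference<>(target);
        WeakReference<Object> weak = new WeakReference<>(target);
        PhantomReference<Object> phantom = new PhantomReference<>(target, queue);

        // 'target' is still used in the comparisons, so nothing can be cleared before them.
        return new boolean[] {
            soft.get() == target,    // a soft reference hands back the referent
            weak.get() == target,    // so does a weak reference
            phantom.get() == null    // a phantom reference never exposes its referent
        };
    }

    public static void main(String[] args) {
        boolean[] r = inspect();
        System.out.println("soft.get() == target:  " + r[0]);
        System.out.println("weak.get() == target:  " + r[1]);
        System.out.println("phantom.get() == null: " + r[2]);
    }
}
```

All three checks hold as long as the strong reference is alive; the differences between the types only show up once the object becomes unreachable.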

Questions

Before the analysis, here are a few questions:

1. Most articles online say that soft references are reclaimed when memory is insufficient. How is "insufficient memory" defined?

2. Most articles online say that phantom references do not affect an object's life cycle and are mainly used to track when an object is reclaimed by the garbage collector. Is that really true?

3. Where does the JDK itself use phantom references?

Reference

Let's first look at several fields in Reference.java

public abstract class Reference<T> {
    // The referenced object
    private T referent;

    // The queue the reference is enqueued on; supplied by the user in the Reference constructor
    volatile ReferenceQueue<? super T> queue;

    // Once the reference has been added to the queue, this field points to the next
    // element in the queue, forming a linked list
    volatile Reference next;

    // During GC the JVM maintains a linked list called the DiscoveredList that holds
    // Reference objects; discovered points to the next element of that list and is set by the JVM
    transient private Reference<T> discovered;

    // Lock object used for thread synchronization
    static private class Lock { }
    private static Lock lock = new Lock();

    // Reference objects waiting to be added to a queue; set by the JVM during GC. A Java-level
    // thread (ReferenceHandler) keeps taking elements from pending and adding them to their queue
    private static Reference<Object> pending = null;
}

The life cycle of a Reference object is as follows:

[Figure: life cycle of a Reference object]

It is mainly divided into two parts: Native layer and Java layer.

During GC, the native layer adds the Reference objects that need to be reclaimed to the DiscoveredList (the code is in the process_discovered_references method in referenceProcessor.cpp), and then moves the elements of the DiscoveredList to the PendingList (the code is in the enqueue_discovered_ref_helper method in referenceProcessor.cpp). The head of the PendingList is the pending field in the Reference class.
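The list structure described above can be modeled in a few lines of plain Java. This is a toy sketch, not JDK code: ToyRef, addToPending, takePending, and drain are hypothetical names, and the only thing modeled is a singly linked list threaded through a discovered field with a static pending head.

```java
import java.util.ArrayList;
import java.util.List;

public class PendingListSketch {

    /** Hypothetical stand-in for java.lang.ref.Reference; only the 'discovered' link is modeled. */
    static final class ToyRef {
        final String name;
        ToyRef discovered;   // next element of the list this reference currently sits on
        ToyRef(String name) { this.name = name; }
    }

    /** Stand-in for the static Reference.pending field: the head of the PendingList. */
    static ToyRef pending;

    /** Models the native side: push a reference onto the pending list. */
    static void addToPending(ToyRef ref) {
        ref.discovered = pending;
        pending = ref;
    }

    /** Models the Java side: unlink the head of the pending list and return it. */
    static ToyRef takePending() {
        ToyRef r = pending;
        if (r != null) {
            pending = r.discovered;
            r.discovered = null;
        }
        return r;
    }

    /** Unlinks every element and returns the names in the order they came off the list. */
    static List<String> drain() {
        List<String> names = new ArrayList<>();
        for (ToyRef r = takePending(); r != null; r = takePending()) {
            names.add(r.name);
        }
        return names;
    }

    public static void main(String[] args) {
        addToPending(new ToyRef("a"));
        addToPending(new ToyRef("b"));
        addToPending(new ToyRef("c"));
        System.out.println(drain());   // prints [c, b, a]
    }
}
```

Because elements are pushed at the head, the list is drained in last-in, first-out order. The real JVM builds the list natively; only the linking through discovered is the same idea.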

Now look at the Java-layer code:

private static class ReferenceHandler extends Thread {
    public void run() {
        while (true) {
            tryHandlePending(true);
        }
    }
}

static boolean tryHandlePending(boolean waitForNotify) {
    Reference<Object> r;
    Cleaner c;
    try {
        synchronized (lock) {
            if (pending != null) {
                r = pending;
                // If it is a Cleaner, remember it for the special handling below
                c = r instanceof Cleaner ? (Cleaner) r : null;
                // Advance pending to the next object in the PendingList
                pending = r.discovered;
                r.discovered = null;
            } else {
                // If pending is null, wait. The JVM calls notify when an object is added to the PendingList
                if (waitForNotify) {
                    lock.wait();
                }
                // retry if waited
                return waitForNotify;
            }
        }
    } catch (OutOfMemoryError x) {
        Thread.yield();
        // retry
        return true;
    } catch (InterruptedException x) {
        // retry
        return true;
    }

    // If it is a Cleaner, call its clean method to release the resource
    if (c != null) {
        c.clean();
        return true;
    }
    // Add the Reference to its ReferenceQueue; developers learn that the object was
    // reclaimed by polling elements from the ReferenceQueue
    ReferenceQueue<? super Object> q = r.queue;
    if (q != ReferenceQueue.NULL) q.enqueue(r);
    return true;
}

The flow is fairly simple: the thread keeps taking elements off the PendingList and adding them to their ReferenceQueue. Developers can learn that an object has been reclaimed by polling elements from the ReferenceQueue.
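From the application side, the queue is consumed with poll() (non-blocking) or remove() (blocking). The sketch below shows the polling half. To stay deterministic it calls the public Reference.enqueue() method by hand instead of waiting for a GC cycle and the ReferenceHandler thread; that shortcut, and the names QueuePollDemo and enqueueAndPoll, are assumptions made for this illustration.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class QueuePollDemo {

    /** Held in a static field so it stays strongly reachable and the GC never enqueues the reference itself. */
    static final Object PAYLOAD = new Object();

    /** Returns true if the reference we enqueued by hand is the one poll() hands back. */
    static boolean enqueueAndPoll() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> ref = new WeakReference<>(PAYLOAD, queue);

        // Nothing has been enqueued yet, so poll() returns null immediately.
        if (queue.poll() != null) {
            return false;
        }

        // Demo shortcut: enqueue manually instead of waiting for the GC to do it.
        boolean enqueued = ref.enqueue();

        Reference<?> polled = queue.poll();
        // The queue hands back the very same Reference object, and then it is empty again.
        return enqueued && polled == ref && queue.poll() == null;
    }

    public static void main(String[] args) {
        System.out.println("polled the enqueued reference: " + enqueueAndPoll());
    }
}
```

Note that the queue delivers the Reference object, not the referent, so any information needed for follow-up work has to be stored on the Reference (or looked up from it).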

Note that objects of type Cleaner (a subclass of PhantomReference) get extra handling: when the object a Cleaner refers to is reclaimed, its clean method is called to release the associated resource. DirectByteBuffer, which holds off-heap memory, uses a Cleaner to free that memory; this is the typical use of phantom references in the JDK.
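The Cleaner handled in tryHandlePending is an internal JDK class. Application code can get a similar effect with the public java.lang.ref.Cleaner API (available since Java 9), which is a different class from the one in the listing. The sketch below uses it for a hypothetical NativeBlock resource; the counter stands in for actually freeing native memory, and clean() is called explicitly so the result does not depend on GC timing.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanerDemo {

    /** Counts how many times the (hypothetical) off-heap block was released. */
    static final AtomicInteger RELEASED = new AtomicInteger();

    /** The cleanup action must not reference the owner, or the owner could never become unreachable. */
    static final class ReleaseAction implements Runnable {
        @Override
        public void run() {
            RELEASED.incrementAndGet();   // stand-in for freeing native memory
        }
    }

    /** Hypothetical resource owner, playing the role DirectByteBuffer plays in the JDK. */
    static final class NativeBlock implements AutoCloseable {
        private static final Cleaner CLEANER = Cleaner.create();
        private final Cleaner.Cleanable cleanable;

        NativeBlock() {
            this.cleanable = CLEANER.register(this, new ReleaseAction());
        }

        @Override
        public void close() {
            cleanable.clean();   // runs the action at most once
        }
    }

    /** Closes a block twice and returns how many releases that caused. */
    static int releaseTwice() {
        int before = RELEASED.get();
        NativeBlock block = new NativeBlock();
        block.close();
        block.close();           // second call is a no-op
        return RELEASED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("release count: " + releaseTwice());
    }
}
```

If close() is never called, the same action runs after the NativeBlock becomes phantom reachable, which is the safety net the article describes for DirectByteBuffer.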

After reading the implementation of Reference, let's take a look at the differences in several implementation classes.

SoftReference

public class SoftReference<T> extends Reference<T> {

    static private long clock;

    private long timestamp;

public SoftReference(T referent) {
    super(referent);
    this.timestamp = clock;
}

public SoftReference(T referent, ReferenceQueue<? super T> q) {
    super(referent, q);
    this.timestamp = clock;
}

public T get() {
    T o = super.get();
    if (o != null && this.timestamp != clock)
        this.timestamp = clock;
    return o;
}

}

The implementation of SoftReference is simple: it adds two fields, clock and timestamp. clock is a static field that the JVM sets to the current time at every GC. timestamp is set to clock whenever get is called, provided the referent has not been collected and the two values differ.
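The interplay of the two fields is easier to see in a small model. ToySoftRef below is a hypothetical class that only copies the clock/timestamp bookkeeping; simulateGcAt stands in for the JVM updating the static clock, and the millisecond values are made up.

```java
public class SoftClockSketch {

    /** Hypothetical model of SoftReference's two extra fields; not the JDK class. */
    static final class ToySoftRef<T> {
        static long clock;          // advanced by the "GC" in this model
        private long timestamp;     // clock value seen by the most recent get()
        private final T referent;

        ToySoftRef(T referent) {
            this.referent = referent;
            this.timestamp = clock;
        }

        T get() {
            if (referent != null && timestamp != clock) {
                timestamp = clock;
            }
            return referent;
        }

        /** Difference between the GC clock and the clock value at the last read. */
        long idleMillis() {
            return clock - timestamp;
        }
    }

    /** Stand-in for the JVM setting SoftReference.clock during a collection. */
    static void simulateGcAt(long nowMillis) {
        ToySoftRef.clock = nowMillis;
    }

    public static void main(String[] args) {
        simulateGcAt(1_000);
        ToySoftRef<String> ref = new ToySoftRef<>("cached value");

        simulateGcAt(4_000);
        System.out.println(ref.idleMillis());   // prints 3000: not read since the GC at t=1000

        ref.get();
        System.out.println(ref.idleMillis());   // prints 0: get() copied the current clock
    }
}
```

Because timestamp only ever takes values of clock, the idle time is measured in GC-clock steps, not in wall-clock time since the last get().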

**What is the role of these two fields?** And what do they have to do with soft references being reclaimed when memory runs low?

To answer that we have to look at the JVM source code, because the decision whether an object should be reclaimed is made inside the GC.

size_t
ReferenceProcessor::process_discovered_reflist(
DiscoveredList refs_lists[],
ReferencePolicy* policy,
bool clear_referent,
BoolObjectClosure* is_alive,
OopClosure* keep_alive,
VoidClosure* complete_gc,
AbstractRefProcTaskExecutor* task_executor)
{
  // Remember the DiscoveredList mentioned above? refs_lists is the DiscoveredList.
  // Processing the DiscoveredList is split into several phases; SoftReference is handled in phase 1.
  ...
  for (uint i = 0; i < _max_num_q; i++) {
    process_phase1(refs_lists[i], policy, is_alive, keep_alive, complete_gc);
  }
}

// The main purpose of this phase is to remove SoftReferences from refs_list while memory is still sufficient.
void
ReferenceProcessor::process_phase1(DiscoveredList&    refs_list,
                                   ReferencePolicy*   policy,
                                   BoolObjectClosure* is_alive,
                                   OopClosure*        keep_alive,
                                   VoidClosure*       complete_gc) {

  DiscoveredListIterator iter(refs_list, keep_alive, is_alive);
  // Decide which softly reachable refs should be kept alive.
  while (iter.has_next()) {
    iter.load_ptrs(DEBUG_ONLY(!discovery_is_atomic() /* allow_null_referent */));
    // Determine whether the referenced object is still alive
    bool referent_is_dead = (iter.referent() != NULL) && !iter.is_referent_alive();
    // If the referenced object is no longer alive, ask the ReferencePolicy
    // whether it should be reclaimed in this GC
    if (referent_is_dead &&
        !policy->should_clear_reference(iter.obj(), _soft_ref_timestamp_clock)) {
      if (TraceReferenceGC) {
        gclog_or_tty->print_cr("Dropping reference (" INTPTR_FORMAT ": %s" ") by policy",
                               (void *)iter.obj(), iter.obj()->klass()->internal_name());
      }
      // Remove Reference object from list
      iter.remove();
      // Make the Reference object active again
      iter.make_active();
      // keep the referent around
      iter.make_referent_alive();
      iter.move_to_next();
    } else {
      iter.next();
    }
  }

}

refs_lists holds the references of one type (phantom, soft, weak, and so on) discovered by this GC. process_discovered_reflist removes from refs_lists the references that do not need to be reclaimed; whatever remains at the end does need to be reclaimed, and the head of the remaining list is eventually assigned to the Reference.java#pending field mentioned above.

There are 4 implementations of ReferencePolicy: NeverClearPolicy, AlwaysClearPolicy, LRUCurrentHeapPolicy, LRUMaxHeapPolicy.

NeverClearPolicy always returns false, meaning the SoftReference is never reclaimed; the JVM does not actually use this class. AlwaysClearPolicy always returns true. The setup_policy method in referenceProcessor.hpp can switch the policy to AlwaysClearPolicy; when exactly AlwaysClearPolicy is used is left for interested readers to investigate.

The should_clear_reference methods of LRUCurrentHeapPolicy and LRUMaxHeapPolicy are exactly the same:

bool LRUMaxHeapPolicy::should_clear_reference(oop p,
                                             jlong timestamp_clock) {
  jlong interval = timestamp_clock - java_lang_ref_SoftReference::timestamp(p);
  assert(interval >= 0, "Sanity check");

  // The interval will be zero if the ref was accessed since the last scavenge/gc.
  if (interval <= _max_interval) {
    return false;
  }

  return true;
}

timestamp_clock is the static field clock of SoftReference, and java_lang_ref_SoftReference::timestamp(p) corresponds to the field timestamp. If SoftReference#get has been called since the last GC, interval is 0; otherwise it is the time spanned by the GCs that have happened since the last call.
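Written in Java, the check above comes down to a single comparison. The sketch below is a direct rendering of that comparison with made-up millisecond values; shouldClear and LruPolicySketch are illustrative names, and the 5000 ms threshold is not a JVM default.

```java
public class LruPolicySketch {

    /** Java rendering of the should_clear_reference check; all values in milliseconds. */
    static boolean shouldClear(long clock, long timestamp, long maxIntervalMillis) {
        long interval = clock - timestamp;   // 0 if get() ran since the last GC
        return interval > maxIntervalMillis;
    }

    public static void main(String[] args) {
        long maxInterval = 5_000;   // illustrative threshold only
        System.out.println(shouldClear(10_000, 10_000, maxInterval));  // prints false
        System.out.println(shouldClear(10_000, 6_000, maxInterval));   // prints false
        System.out.println(shouldClear(10_000, 4_000, maxInterval));   // prints true
    }
}
```

A soft reference that was read recently (small interval) survives; one that has sat unread for longer than the threshold is cleared.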

_max_interval represents a critical value, and its value is different between LRUCurrentHeapPolicy and LRUMaxHeapPolicy.

void LRUCurrentHeapPolicy::setup() {
  _max_interval = (Universe::get_heap_free_at_last_gc() / M) * SoftRefLRUPolicyMSPerMB;
  assert(_max_interval >= 0, "Sanity check");
}

void LRUMaxHeapPolicy::setup() {
  size_t max_heap = MaxHeapSize;
  max_heap -= Universe::get_heap_used_at_last_gc();
  max_heap /= M;

  _max_interval = max_heap * SoftRefLRUPolicyMSPerMB;
  assert(_max_interval >= 0, "Sanity check");
}

Among them, SoftRefLRUPolicyMSPerMB defaults to 1000. The calculation method of the former is related to the available heap size after the last GC, and the calculation method of the latter is related to (heap size - heap usage size at the last GC).
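To get a feel for the magnitudes, the two setup() formulas can be evaluated with sample numbers. The heap figures below are invented, and treating "free heap" as committed size minus used size is an assumption of this sketch; only the arithmetic follows the listings above.

```java
public class MaxIntervalSketch {

    static final long MB = 1024L * 1024L;

    /** LRUCurrentHeapPolicy: based on the free heap after the last GC. */
    static long currentHeapMaxInterval(long freeBytesAtLastGc, long msPerMb) {
        return (freeBytesAtLastGc / MB) * msPerMb;
    }

    /** LRUMaxHeapPolicy: based on the maximum heap size minus the heap in use at the last GC. */
    static long maxHeapMaxInterval(long maxHeapBytes, long usedBytesAtLastGc, long msPerMb) {
        return ((maxHeapBytes - usedBytesAtLastGc) / MB) * msPerMb;
    }

    public static void main(String[] args) {
        long msPerMb = 1000;   // the SoftRefLRUPolicyMSPerMB default quoted in the text
        // Made-up numbers: 4 GB max heap, 1 GB committed, 600 MB used after the last GC.
        long maxHeap = 4096 * MB;
        long committed = 1024 * MB;
        long used = 600 * MB;

        // Assumption: free heap = committed - used.
        System.out.println(currentHeapMaxInterval(committed - used, msPerMb)); // prints 424000
        System.out.println(maxHeapMaxInterval(maxHeap, used, msPerMb));        // prints 3496000
    }
}
```

With these numbers a soft reference survives about 424 seconds without a get() under the current-heap formula and about 3496 seconds under the max-heap formula, so the second is far more lenient when the heap has room to grow.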

Now it is clear when a SoftReference is reclaimed: it depends on the policy in use (HotSpot builds that include the server compiler default to LRUMaxHeapPolicy, other builds to LRUCurrentHeapPolicy), on the available heap size, and on when get was last called on the SoftReference.

WeakReference

public class WeakReference<T> extends Reference<T> {

public WeakReference(T referent) {
    super(referent);
}

public WeakReference(T referent, ReferenceQueue<? super T> q) {
    super(referent, q);
}

}

At the Java level, WeakReference simply extends Reference without changing anything. So when is the referent field set to null? To find out, look again at the process_discovered_reflist method mentioned above:

size_t
ReferenceProcessor::process_discovered_reflist(
DiscoveredList refs_lists[],
ReferencePolicy* policy,
bool clear_referent,
BoolObjectClosure* is_alive,
OopClosure* keep_alive,
VoidClosure* complete_gc,
AbstractRefProcTaskExecutor* task_executor)
{

  // Phase 1: remove from refs_lists every soft reference whose referent is not alive
  // but must not be reclaimed yet (policy is non-NULL only when refs_lists holds soft references)
  if (policy != NULL) {
    if (mt_processing) {
      RefProcPhase1Task phase1(*this, refs_lists, policy, true /* marks_oops_alive */);
      task_executor->execute(phase1);
    } else {
      for (uint i = 0; i < _max_num_q; i++) {
        process_phase1(refs_lists[i], policy,
                       is_alive, keep_alive, complete_gc);
      }
    }
  } else { // policy == NULL
    assert(refs_lists != _discoveredSoftRefs,
           "Policy must be specified for soft references.");
  }

  // Phase 2:
  // Remove all references whose referents are still alive
  if (mt_processing) {
    RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() /* marks_oops_alive */);
    task_executor->execute(phase2);
  } else {
    for (uint i = 0; i < _max_num_q; i++) {
      process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc);
    }
  }

  // Phase 3:
  // Decide whether to reclaim dead referents according to the value of clear_referent
  if (mt_processing) {
    RefProcPhase3Task phase3(*this, refs_lists, clear_referent, true /* marks_oops_alive */);
    task_executor->execute(phase3);
  } else {
    for (uint i = 0; i < _max_num_q; i++) {
      process_phase3(refs_lists[i], clear_referent,
                     is_alive, keep_alive, complete_gc);
    }
  }

  return total_list_count;
}

void
ReferenceProcessor::process_phase3(DiscoveredList&    refs_list,
                                   bool               clear_referent,
                                   BoolObjectClosure* is_alive,
                                   OopClosure*        keep_alive,
                                   VoidClosure*       complete_gc) {
  ResourceMark rm;
  DiscoveredListIterator iter(refs_list, keep_alive, is_alive);
  while (iter.has_next()) {
    iter.update_discovered();
    iter.load_ptrs(DEBUG_ONLY(false /* allow_null_referent */));
    if (clear_referent) {
      // NULL out referent pointer
      // Set the Reference's referent field to null; the referent is then reclaimed by the GC
      iter.clear_referent();
    } else {
      // keep the referent around
      // Mark the referenced object as alive; it will not be reclaimed in this GC
      iter.make_referent_alive();
    }
    // Advance to the next Reference on the list
    iter.next();
  }

}

For weak references and every other reference type alike, the referent field is set to null in process_phase3, and whether that happens is decided by clear_referent. The value of clear_referent in turn depends on the reference type.
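The three phases can be condensed into a toy model that makes the role of the policy and of clear_referent visible. Everything in the sketch is hypothetical (ToyRef, its boolean flags, the process method); the flags simply stand in for the GC's liveness check and the ReferencePolicy decision.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RefPhasesSketch {

    /** Hypothetical reference record; the flags stand in for decisions the real GC makes. */
    static final class ToyRef {
        final String name;
        Object referent = new Object();
        final boolean referentAlive;   // result of the GC's liveness check
        final boolean keptByPolicy;    // policy says "do not clear yet" (soft references only)

        ToyRef(String name, boolean referentAlive, boolean keptByPolicy) {
            this.name = name;
            this.referentAlive = referentAlive;
            this.keptByPolicy = keptByPolicy;
        }
    }

    /** Returns the references left on the list, i.e. the ones that would go to the pending list. */
    static List<ToyRef> process(List<ToyRef> discovered, boolean hasPolicy, boolean clearReferent) {
        List<ToyRef> refs = new ArrayList<>(discovered);

        // Phase 1 (soft references only): drop refs whose dead referent the policy keeps.
        if (hasPolicy) {
            refs.removeIf(r -> !r.referentAlive && r.keptByPolicy);
        }
        // Phase 2: drop refs whose referent is still alive.
        refs.removeIf(r -> r.referentAlive);
        // Phase 3: clear the referent or keep it, depending on clearReferent.
        for (ToyRef r : refs) {
            if (clearReferent) {
                r.referent = null;
            }
        }
        return refs;
    }

    public static void main(String[] args) {
        List<ToyRef> soft = Arrays.asList(
            new ToyRef("s1", true, false),    // referent alive: removed in phase 2
            new ToyRef("s2", false, true),    // dead but kept by policy: removed in phase 1
            new ToyRef("s3", false, false));  // dead and not kept: cleared in phase 3
        for (ToyRef r : process(soft, true, true)) {
            System.out.println(r.name + " referent cleared: " + (r.referent == null));
        }
    }
}
```

Running main prints a single line for s3 with its referent cleared. Calling process with clearReferent set to false leaves the referent in place, which is the case the text discusses next for final and phantom references.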

ReferenceProcessorStats ReferenceProcessor::process_discovered_references(
  BoolObjectClosure*           is_alive,
  OopClosure*                  keep_alive,
  VoidClosure*                 complete_gc,
  AbstractRefProcTaskExecutor* task_executor,
  GCTimer*                     gc_timer) {
  NOT_PRODUCT(verify_ok_to_handle_reflists());

  // The third argument of process_discovered_reflist is clear_referent
  // Soft references
  size_t soft_count = 0;
  {
    GCTraceTime tt("SoftReference", trace_time, false, gc_timer);
    soft_count =
      process_discovered_reflist(_discoveredSoftRefs, _current_soft_ref_policy, true,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }

  update_soft_ref_master_clock();

  // Weak references
  size_t weak_count = 0;
  {
    GCTraceTime tt("WeakReference", trace_time, false, gc_timer);
    weak_count =
      process_discovered_reflist(_discoveredWeakRefs, NULL, true,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }

  // Final references
  size_t final_count = 0;
  {
    GCTraceTime tt("FinalReference", trace_time, false, gc_timer);
    final_count =
      process_discovered_reflist(_discoveredFinalRefs, NULL, false,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }

  // Phantom references
  size_t phantom_count = 0;
  {
    GCTraceTime tt("PhantomReference", trace_time, false, gc_timer);
    phantom_count =
      process_discovered_reflist(_discoveredPhantomRefs, NULL, false,
                                 is_alive, keep_alive, complete_gc, task_executor);
  }
}

As you can see, for soft and weak references clear_referent is passed as true, which matches our expectations: once the object is unreachable, the referent field is set to null and the object is then reclaimed. (For soft references, if memory is sufficient, the relevant references are already removed from refs_list in phase 1, so phase 3 never sees them.)

For final and phantom references, however, clear_referent is passed as false. This means that, without additional handling, an object referenced through one of these two types is not reclaimed as long as the Reference object itself is still alive. Final references are tied to whether the object overrides the finalize method, which is outside the scope of this article. Let's look at phantom references next.

PhantomReference

public class PhantomReference<T> extends Reference<T> {

    public T get() {
        return null;
    }

    public PhantomReference(T referent, ReferenceQueue<? super T> q) {
        super(referent, q);
    }

}

You can see that the get method of a phantom reference always returns null. Let's look at a demo:

    public static void demo() throws InterruptedException {
        Object obj = new Object();
        ReferenceQueue<Object> refQueue = new ReferenceQueue<>();
        PhantomReference<Object> phanRef = new PhantomReference<>(obj, refQueue);

        Object objg = phanRef.get();
        // This prints null
        System.out.println(objg);
        // Turn obj into garbage
        obj = null;
        System.gc();
        Thread.sleep(3000);
        // After the GC, phanRef is added to refQueue
        Reference<? extends Object> phanRefP = refQueue.remove();
        // This prints true
        System.out.println(phanRefP == phanRef);
    }

As the code above shows, a phantom reference delivers a "notification" when the object it points to becomes unreachable (in fact, every class that extends Reference can do this). Note that after the GC completes, phanRef.referent still points to the Object created earlier, which means that the Object has not been reclaimed! This is the behavior of the sources quoted here, which match JDK 8; since Java 9, phantom references are cleared automatically when they are enqueued.

The reason was given at the end of the previous section: for final and phantom references clear_referent is passed as false, so without additional handling the objects they refer to are not reclaimed by the GC.

For a phantom reference, after obtaining the reference object from refQueue.remove(), you can call its clear method to break the link between the reference and the object, so that the object can be reclaimed at the next GC.
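A common way to apply this is to subclass PhantomReference so that each reference carries what is needed for cleanup, and to call clear() when the reference comes off the queue. The sketch below follows that pattern; TrackedRef, track, and drain are hypothetical names, and the reference is enqueued by hand with Reference.enqueue() so the example does not depend on GC timing.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

public class PhantomCleanupDemo {

    /** Hypothetical phantom reference that remembers which resource to release. */
    static final class TrackedRef extends PhantomReference<Object> {
        final String resourceName;

        TrackedRef(Object referent, ReferenceQueue<Object> queue, String resourceName) {
            super(referent, queue);
            this.resourceName = resourceName;
        }
    }

    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    /** Keeps the TrackedRef objects themselves reachable until they have been processed. */
    static final Set<TrackedRef> LIVE = new HashSet<>();

    static TrackedRef track(Object owner, String resourceName) {
        TrackedRef ref = new TrackedRef(owner, QUEUE, resourceName);
        LIVE.add(ref);
        return ref;
    }

    /** Drains the queue: releases each resource, then clears and forgets the reference. */
    static int drain() {
        int released = 0;
        Reference<?> r;
        while ((r = QUEUE.poll()) != null) {
            TrackedRef ref = (TrackedRef) r;
            System.out.println("releasing " + ref.resourceName);
            ref.clear();        // break the link to the referent
            LIVE.remove(ref);   // let the reference object itself be collected
            released++;
        }
        return released;
    }

    public static void main(String[] args) {
        Object owner = new Object();
        TrackedRef ref = track(owner, "buffer-1");
        // Demo shortcut: enqueue by hand instead of waiting for a GC cycle.
        ref.enqueue();
        System.out.println("released: " + drain());
    }
}
```

Removing the reference from LIVE matters as much as clear(): a phantom reference that stays reachable and uncleared is exactly the situation in which the referent is kept alive.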

End

In response to the questions raised at the beginning of the article, after reading the analysis, we have been able to give answers:

1. Articles online often say that soft references are only reclaimed when memory is insufficient. How is "insufficient memory" defined?

Soft references are reclaimed when memory is insufficient, and "insufficient" depends on how long ago get was last called on the reference and on how much heap is currently available. The formulas are given above.

2. Articles online say that a phantom reference exists in name only: unlike other references, it does not affect the object's life cycle and is mainly used to track when the object is reclaimed by the garbage collector. Is that really true?

Strictly speaking, phantom references do affect the life cycle of objects. If nothing is done, the referent is never reclaimed as long as the phantom reference itself is not reclaimed. So in general, after obtaining a PhantomReference from the ReferenceQueue, if the PhantomReference itself will not be reclaimed (for example, because it is referenced by other objects reachable from a GC root), you need to call clear to break the link between the PhantomReference and its referent.

3. Where does the JDK itself use phantom references?

DirectByteBuffer uses Cleaner.java, a subclass of PhantomReference, to reclaim off-heap memory. I will write a separate article on the ins and outs of off-heap memory.

Origin blog.csdn.net/m0_54828003/article/details/127197202