Deep Thoughts on the hashCode Method in Java

Foreword

I have recently been learning Go. Go has pointers: a pointer variable holds the memory address of a value. Anyone who has learned C will know the concept. Go's syntax is similar to C's, and it can be called a C-family language, so it is natural for Go to have pointers. Placing the address-of operator & in front of a variable yields that variable's memory address.

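package main

import "fmt"

func main() {
   var a int = 20 /* declare an ordinary variable */
   var ip *int    /* declare a pointer variable */

   ip = &a /* store the address of a in the pointer variable */
   fmt.Printf("Address of variable a: %x\n", &a)

   /* the address stored in the pointer variable */
   fmt.Printf("Address stored in ip: %x\n", ip)
   /* access the value through the pointer */
   fmt.Printf("Value of *ip: %d\n", *ip)
}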

Because my main development language is Java, I started to wonder: Java has no pointers, so how can we get the memory address of a variable in Java?

If we could get a variable's memory address, we could tell clearly whether two references point to the same object: if the addresses are equal they are the same object, and otherwise they are different objects.

Many people say that the hashCode method returns the memory address of the object. I even found a statement in Chapter 5 of "Core Java, Volume I" saying that the hashCode value is the object's memory address.

But is the value returned by hashCode really the memory address? Before answering this question, let's review some basics.

== and equals

In Java, two objects are compared mainly with the == operator, which compares their addresses in memory, that is, reference identity. The Object class is the default superclass of all classes. If a class does not override Object's equals method, calling equals also tells us whether two references point to the same object, because it is implemented internally with ==.

//Indicates whether some other object is "equal to" this one.
public boolean equals(Object obj) {
    return (this == obj);
}

Tip: a side note on a common question

When learning Java we are told that Java supports only single inheritance. If every class inherits from Object, why can a class still extend another class?

This is a matter of direct versus indirect inheritance. When a class is declared without the extends keyword naming a parent explicitly, it inherits directly from Object: A -> Object. When a class is declared with extends naming a parent explicitly, it inherits from Object indirectly: A -> B -> Object.
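
The two inheritance chains can be observed at runtime. The following is a minimal sketch; the class names InheritanceDemo, Animal, and Dog are hypothetical and exist only for this illustration.

```java
class InheritanceDemo {
    // Hypothetical classes, used only for this illustration.
    static class Animal { }               // no extends clause: inherits Object directly
    static class Dog extends Animal { }   // explicit parent: inherits Object indirectly

    // Walks getSuperclass() to build a chain such as "Dog -> Animal -> Object".
    static String chain(Class<?> type) {
        StringBuilder sb = new StringBuilder(type.getSimpleName());
        for (Class<?> c = type.getSuperclass(); c != null; c = c.getSuperclass()) {
            sb.append(" -> ").append(c.getSimpleName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(chain(Animal.class)); // Animal -> Object
        System.out.println(chain(Dog.class));    // Dog -> Animal -> Object
    }
}
```

Every chain ends at Object, so single inheritance and "everything is an Object" do not conflict.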

So far, "the same" has meant that the two references being compared point to the same object, that is, their addresses in memory are equal. Sometimes, however, we need to compare whether the contents of two objects are the same: the class has its own notion of "logical equality", and we do not care whether the references point to the same object.

For example, compare String a = "Hello" and String b = new String("Hello"). "The same" can mean two things here: are a and b the same object (the same memory address), or is their content the same? How do we distinguish the two?

Using == compares whether they are the same object in memory. The parent class of String is Object, whose default equals method also compares memory addresses, so String has to override equals, as its source code does:

public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}

So a == b tests whether a and b are the same object, while a.equals(b) compares whether the contents of a and b are the same. This should be easy to understand.
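
As a small illustration of the difference, the sketch below compares the two strings from the example both ways. The class and method names (StringCompareDemo, sameObject, sameContent) are hypothetical helpers, not part of the JDK.

```java
class StringCompareDemo {
    // Reference comparison: true only if both variables point to the same object.
    static boolean sameObject(Object x, Object y) {
        return x == y;
    }

    // Content comparison: delegates to the class's own equals method.
    static boolean sameContent(Object x, Object y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        String a = "Hello";              // literal, taken from the string pool
        String b = new String("Hello");  // new always creates a distinct object

        System.out.println(sameObject(a, b));          // false
        System.out.println(sameContent(a, b));         // true
        System.out.println(sameObject(a, b.intern())); // true: intern() returns the pooled instance
    }
}
```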

String is not the only JDK class that overrides equals; the wrapper types Integer, Long, Double, and Float override it as well. So when business code uses Long or Integer parameters and you want to compare them for equality, remember to use equals instead of ==.

Using == can lead to unexpected pitfalls, because several wrapper types keep an internal cache, for example IntegerCache and LongCache. Values within a certain range (-128 to 127 by default) are taken directly from the cache instead of creating a new object, so == happens to work for small values and fails for larger ones.

If you do want to use ==, convert the wrapper types to primitive types first and then compare, since == on primitives compares values. Be aware that unboxing a null reference throws a NullPointerException (NPE).
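
The pitfall can be reproduced in a few lines. This is a sketch, assuming a JVM with the default cache range; WrapperCompareDemo and sameValue are hypothetical names. The comparison of the two large values with == is typically false, but that outcome depends on the cache configuration, which is exactly why it should not be relied on.

```java
import java.util.Objects;

class WrapperCompareDemo {
    // Null-safe comparison of two wrapper values by content.
    static boolean sameValue(Integer x, Integer y) {
        return Objects.equals(x, y);
    }

    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127;   // inside the guaranteed cache range -128..127
        Integer big1 = 1000, big2 = 1000;     // outside it on a default JVM

        System.out.println(small1 == small2);                   // true: same cached object
        System.out.println(big1 == big2);                       // usually false: compares references
        System.out.println(big1.equals(big2));                  // true: compares values
        System.out.println(big1.intValue() == big2.intValue()); // true: primitives compare by value

        Integer missing = null;
        System.out.println(sameValue(missing, big1));           // false, and no NPE
        try {
            int unboxed = missing;                              // auto-unboxing a null reference
            System.out.println(unboxed);
        } catch (NullPointerException e) {
            System.out.println("NPE when unboxing null");
        }
    }
}
```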

hashCode in Object

The equals method compares whether two objects have equal content, so it can be used to find out whether a collection contains a given object. The usual approach is to take each element of the collection in turn and compare it with the target using equals. When an element equal to the target is found, the search stops and reports success; otherwise it reports failure.

However, this one-by-one comparison is inefficient, and its time complexity grows linearly with the size of the collection. Can we instead encode each object so that it has a code value, and use that value to group objects into different regions? Then, to look up an object, we first use its code to determine which region it would be stored in, and compare it with equals only against the objects in that region to learn whether it exists in the collection.

This reduces the number of comparisons and therefore the lookup time.
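
The grouping idea can be sketched as a toy bucketed set. This is a deliberately simplified illustration and not how java.util.HashSet is actually implemented; BucketSet and all of its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// A deliberately simplified bucketed set; BucketSet and its methods are hypothetical names.
class BucketSet<T> {
    private final List<List<T>> buckets = new ArrayList<>();

    BucketSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: the code value selects the region (bucket) for an object.
    private List<T> bucketFor(Object o) {
        return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
    }

    void add(T item) {
        if (!contains(item)) {
            bucketFor(item).add(item);
        }
    }

    // Step 2: equals is called only on the elements of that one bucket.
    boolean contains(Object o) {
        for (T candidate : bucketFor(o)) {
            if (candidate.equals(o)) {
                return true;
            }
        }
        return false;
    }

    // Upper bound on the number of equals calls needed to look up o.
    int bucketSizeFor(Object o) {
        return bucketFor(o).size();
    }

    // Convenience factory: a set holding the integers 0..count-1.
    static BucketSet<Integer> ofRange(int count, int bucketCount) {
        BucketSet<Integer> set = new BucketSet<>(bucketCount);
        for (int i = 0; i < count; i++) {
            set.add(i);
        }
        return set;
    }

    public static void main(String[] args) {
        BucketSet<Integer> set = ofRange(160, 16);
        System.out.println(set.contains(42));       // true
        System.out.println(set.contains(1000));     // false
        System.out.println(set.bucketSizeFor(42));  // 10, instead of scanning all 160 elements
    }
}
```

With 160 elements spread over 16 buckets, a lookup needs at most 10 equals calls rather than up to 160.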

In Java this encoding is hashCode. The Object class defines a default implementation, which is a native method that returns an int.

/**
 * Returns a hash code value for the object. This method is
 * supported for the benefit of hash tables such as those provided by
 * {@link java.util.HashMap}.
 * ...
 * As much as is reasonably practical, the hashCode method defined by
 * class {@code Object} does return distinct integers for distinct
 * objects. (This is typically implemented by converting the internal
 * address of the object into an integer, but this implementation
 * technique is not required by the
 * Java™ programming language.)
 *
 * @return  a hash code value for this object.
 * @see     java.lang.Object#equals(java.lang.Object)
 * @see     java.lang.System#identityHashCode
 */
public native int hashCode();

From the Javadoc we learn that hashCode returns a hash code value for the object, for the benefit of hash tables such as HashMap. As far as is reasonably practical, the hashCode method defined by Object returns distinct integers for distinct objects. The confusing part is the sentence "This is typically implemented by converting the internal address of the object into an integer", which says that the typical implementation converts the object's internal address into an integer.

If you are not content to simply assume that the return value is the object's memory address, we can look at the implementation. Because hashCode is a native method, we cannot see how it works from the Java source. Native methods are not implemented in Java itself; to read them you need the complete JDK source code. The Oracle JDK distribution does not include it, but the corresponding C/C++ code can be found in OpenJDK or other open-source JREs. In the Object.c file of OpenJDK we can see that hashCode is mapped to JVM_IHashCode.

static JNINativeMethod methods[] = {
    {"hashCode",    "()I",                    (void *)&JVM_IHashCode},
    {"wait",        "(J)V",                   (void *)&JVM_MonitorWait},
    {"notify",      "()V",                    (void *)&JVM_MonitorNotify},
    {"notifyAll",   "()V",                    (void *)&JVM_MonitorNotifyAll},
    {"clone",       "()Ljava/lang/Object;",   (void *)&JVM_Clone},
};

JVM_IHashCode is defined in jvm.cpp as:

JVM_ENTRY(jint, JVM_IHashCode(JNIEnv* env, jobject handle))  
  JVMWrapper("JVM_IHashCode");  
  // as implemented in the classic virtual machine; return 0 if object is NULL  
  return handle == NULL ? 0 : ObjectSynchronizer::FastHashCode (THREAD, JNIHandles::resolve_non_null(handle)) ;  
JVM_END 

This is a ternary expression. The actual hashCode value is computed by ObjectSynchronizer::FastHashCode, which is implemented in synchronizer.cpp. The key fragment is excerpted here:

intptr_t ObjectSynchronizer::FastHashCode (Thread * Self, oop obj) {
  if (UseBiasedLocking) {
  
  ......
  
  // Inflate the monitor to set hash code
  monitor = ObjectSynchronizer::inflate(Self, obj);
  // Load displaced header and check it has hash code
  mark = monitor->header();
  assert (mark->is_neutral(), "invariant") ;
  hash = mark->hash();
  if (hash == 0) {
    hash = get_next_hash(Self, obj);
    temp = mark->copy_set_hash(hash); // merge hash code into header
    assert (temp->is_neutral(), "invariant") ;
    test = (markOop) Atomic::cmpxchg_ptr(temp, monitor, mark);
    if (test != mark) {
      // The only update to the header in the monitor (outside GC)
      // is install the hash code. If someone add new usage of
      // displaced header, please update this code
      hash = test->hash();
      assert (test->is_neutral(), "invariant") ;
      assert (hash != 0, "Trivial unexpected object/monitor header usage.");
    }
  }
  // We finally get the hash
  return hash;
}

From the fragment above we can see that the hash is actually computed by get_next_hash. Searching the same file for get_next_hash gives its key code:

static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random() ;
  } else
  if (hashCode == 1) {
     // This variation has the property of being stable (idempotent)
     // between STW operations.  This can be useful in some of the 1-0
     // synchronization schemes.
     intptr_t addrBits = cast_from_oop<intptr_t>(obj) >> 3 ;
     value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
     value = 1 ;            // for sensitivity testing
  } else
  if (hashCode == 3) {
     value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
     value = cast_from_oop<intptr_t>(obj) ;
  } else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}

From get_next_hash we can see that, counting from 0, there are six schemes for computing the hash value: a global random number, a function of the memory address, a constant, an increasing sequence, the memory address itself, and a thread-local xor-shift random number generator. The default is the last one, the random number generator. So hashCode can be related to the memory address, but it does not directly represent it; the details depend on the virtual machine version and its settings.
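
One observable consequence of FastHashCode storing the generated value in the object header is that the hash stays the same for the lifetime of the object, even if the garbage collector moves the object to another address. The following is a minimal sketch of that observation; IdentityHashStabilityDemo and stableAcrossGc are hypothetical names, and System.gc() is only a request, so the sketch does not prove that the object actually moved.

```java
class IdentityHashStabilityDemo {
    // Returns true if the identity hash of a fresh object is unchanged
    // after a garbage collection has been requested.
    static boolean stableAcrossGc() {
        Object o = new Object();
        int before = System.identityHashCode(o);

        // Allocate some garbage and request a collection; the collector may relocate o.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
        }
        System.gc();

        int after = System.identityHashCode(o);
        return before == after;
    }

    public static void main(String[] args) {
        System.out.println(stableAcrossGc()); // true
    }
}
```

A value that must stay constant while the address may change cannot simply be the current address.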

equals and hashCode

The Object class has both an equals and a hashCode method. The toString method of Object also prints the unsigned hexadecimal value of hashCode.

public String toString() {
    return getClass().getName() + "@" + Integer.toHexString(hashCode());
}
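
To see this format in practice, the sketch below rebuilds the default text by hand and compares it with what toString returns. ToStringDemo and describe are hypothetical names; the concrete hash value differs from run to run.

```java
class ToStringDemo {
    // Rebuilds the default Object.toString() text by hand.
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(o);            // java.lang.Object@ followed by hex digits (varies per run)
        System.out.println(describe(o));  // same text as the line above

        // toHexString treats the int as unsigned, so negative values have no minus sign.
        System.out.println(Integer.toHexString(-1)); // ffffffff
    }
}
```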

Because we need to compare object contents, we usually override equals. But when we override equals we also need to override hashCode. Have you ever wondered why?

Because failing to do so violates the general contract of hashCode, and the class will then not work properly with hash-based collections such as HashMap and HashSet.

The general contract, as described in the Javadoc of Object.hashCode, includes the following rules:

  • During an execution of the application, as long as the information used in equals comparisons on the object is not modified, calling hashCode on the same object multiple times must consistently return the same value.

  • If two objects are equal according to the equals method, calling hashCode on each of them must produce the same integer result.

  • If two objects are unequal according to the equals method, hashCode is not required to produce different results for them. However, producing distinct hash values for unequal objects may improve the performance of hash tables.
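
The three rules can be checked against String, which overrides both methods. This is a sketch only; HashContractDemo and equalImpliesSameHash are hypothetical names.

```java
class HashContractDemo {
    // Rule 2: if x equals y, their hash codes must match.
    static boolean equalImpliesSameHash(Object x, Object y) {
        return !x.equals(y) || x.hashCode() == y.hashCode();
    }

    public static void main(String[] args) {
        String s1 = new String("hello");
        String s2 = new String("hello");

        // Rule 1: repeated calls on an unmodified object return the same value.
        System.out.println(s1.hashCode() == s1.hashCode());          // true

        // Rule 2: s1 and s2 are equal, so their hash codes match.
        System.out.println(equalImpliesSameHash(s1, s2));            // true

        // Rule 3: unequal objects may still share a hash code.
        System.out.println("Aa".equals("BB"));                       // false
        System.out.println("Aa".hashCode() + " " + "BB".hashCode()); // 2112 2112
    }
}
```

The "Aa" and "BB" pair shows why a hash table still needs equals after the hash has selected a bucket.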

In theory, overriding equals without overriding hashCode violates the second rule above: equal objects must have equal hash codes.

But a rule is only a tacit agreement. What if we go our own way and override equals without overriding hashCode? What would the consequences be?

We define a Student class and override equals but not hashCode. Calling hashCode on a Student then invokes the default implementation in the superclass Object, which returns an int derived from a random number.

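import java.util.Objects;

public class Student {

    private String name;

    private String gender;

    public Student(String name, String gender) {
        this.name = name;
        this.gender = gender;
    }

    // Setters omitted; the getters are shown because equals uses them
    public String getName() {
        return name;
    }

    public String getGender() {
        return gender;
    }

    // Note: hashCode is deliberately NOT overridden here
    @Override
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof Student) {
            Student anotherStudent = (Student) anObject;
            // Compare contents with equals (not ==) and require both fields to match
            return Objects.equals(this.getName(), anotherStudent.getName())
                    && Objects.equals(this.getGender(), anotherStudent.getGender());
        }
        return false;
    }
}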

We create two objects with the same property values and test the result:

public static void main(String[] args) {

    Student student1 = new Student("小明", "male");
    Student student2 = new Student("小明", "male");

    System.out.println("equals result: " + student1.equals(student2));
    System.out.println("hash of object 1: " + student1.hashCode() + ", hash of object 2: " + student2.hashCode());
}

Result:

equals result: true
hash of object 1: 1058025095, hash of object 2: 665576141

We overrode equals to compare objects by the content of their name and gender properties, but because the hashCode being called is still the one from Object, the two printed integers are not equal.

Now suppose we store such an object as a key in a HashMap. Readers familiar with HashMap know that it is built from an array plus linked lists. Because the two hash codes differ, the lookup normally lands at a different array index, and HashMap also compares hash values before it calls equals, so querying by the second key returns null.

public static void main(String[] args) {

    Student student1 = new Student("小明", "male");
    Student student2 = new Student("小明", "male");
    
    HashMap<Student, String> hashMap = new HashMap<>();
    hashMap.put(student1, "小明");

    String value = hashMap.get(student2);
    System.out.println(value); 
}

Output

null

We are certainly not satisfied with this result. Although student1 and student2 have different memory addresses, they have the same logical content, and we believe they should be treated as the same key.

If this is still hard to grasp, mentally replace the Student class with String. We often use String as the key type of a HashMap. Imagine that String overrode only equals and not hashCode: after putting a value under the key new String("s"), a later get with new String("s") would return null, which would be hard for anyone to accept.

So, both by convention and in practice, whenever we override equals we should always override hashCode as well. Please keep this in mind.
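
One possible fix is sketched here: a variant of the Student class that derives hashCode from exactly the fields equals compares, using Objects.hash from the JDK. The wrapper class StudentHashDemo and the lookup method are hypothetical names, and the key "Xiao Ming" is just sample data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class StudentHashDemo {
    // A variant of Student with hashCode overridden alongside equals.
    static final class Student {
        private final String name;
        private final String gender;

        Student(String name, String gender) {
            this.name = name;
            this.gender = gender;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Student)) return false;
            Student that = (Student) other;
            return Objects.equals(name, that.name) && Objects.equals(gender, that.gender);
        }

        // Built from exactly the fields that equals compares,
        // so equal students always get equal hash codes.
        @Override
        public int hashCode() {
            return Objects.hash(name, gender);
        }
    }

    // Stores a value under one key and reads it back with an equal but distinct key.
    static String lookup() {
        Map<Student, String> map = new HashMap<>();
        map.put(new Student("Xiao Ming", "male"), "Xiao Ming");
        return map.get(new Student("Xiao Ming", "male"));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // Xiao Ming
    }
}
```

With both methods overridden consistently, the second key finds the entry stored under the first.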

Even when hashCode has been overridden, we can still obtain the hash code that Object would have produced by calling System.identityHashCode(Object x). It returns the same value as the default Object.hashCode, whether or not the object's class overrides hashCode.

public static native int identityHashCode(Object x);
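
A short sketch of the difference, using String as a class that overrides hashCode. IdentityHashDemo and usesDefaultHash are hypothetical names; the two identity hashes printed for s1 and s2 vary per run and are almost always different, though that is not guaranteed.

```java
class IdentityHashDemo {
    // True when the object's hashCode() matches its identity hash,
    // which is always the case for classes that do not override hashCode.
    static boolean usesDefaultHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        String s1 = new String("s");
        String s2 = new String("s");

        // String overrides hashCode, so equal content gives equal values.
        System.out.println(s1.hashCode() == s2.hashCode());  // true

        // identityHashCode ignores the override and is tied to each object.
        System.out.println(System.identityHashCode(s1));
        System.out.println(System.identityHashCode(s2));

        // A plain Object does not override hashCode, so both calls agree.
        System.out.println(usesDefaultHash(new Object()));   // true

        // The identity hash of the null reference is defined to be zero.
        System.out.println(System.identityHashCode(null));   // 0
    }
}
```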

Summary

If hashCode is not the memory address, how can we obtain a memory address in Java? After looking around, I found no direct way to do it.

Then I thought that perhaps the designers of Java felt there was no need for direct access to memory addresses. Java is a high-level language, more abstract than machine language, assembly, or C, and it hides that complexity; after all, the JVM is itself built on top of C and C++. In addition, because of automatic garbage collection and generational object aging, the address of a Java object can change during its lifetime, so obtaining the actual memory address has little meaning.

Of course, the above is only my own view. If you have a different view or opinion, feel free to leave a comment and we can discuss it together.



Origin juejin.im/post/5d50c1826fb9a06b1777a795