Overriding the hashCode method correctly

The contract for overriding hashCode

Every class that overrides the equals method must also override the hashCode method.

If hashCode is not overridden, the class will not work properly with hash-based collections such as HashMap, HashSet, and Hashtable. Conversely, once hashCode is implemented correctly, instances of the class can be used as keys in hash-based collections. The contract for hashCode is as follows:

  • During a single execution of an application, the hashCode method must consistently return the same integer when called multiple times on the same object, as long as no information used by the object's equals comparison has been modified. The integer does not have to stay the same from one execution of the application to the next;
  • If two objects are equal according to the equals method, then calling the hashCode method on each of them must produce the same integer result;
  • If two objects are not equal according to the equals method, calling the hashCode method on each of them is not required to produce different results. However, programmers should be aware that producing distinct integer results for unequal objects may improve hash table performance.
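The three rules can be observed on java.lang.String, whose hashCode is specified by the JDK. The following is only a small illustrative sketch; the class name HashContractDemo is made up for this example.

```java
// Hypothetical demo class: illustrates the three hashCode rules using String.
final class HashContractDemo {

    public static void main(String[] args) {
        // Rule 1: repeated calls on an unmodified object return the same value.
        String s = "hello";
        System.out.println(s.hashCode() == s.hashCode());        // true

        // Rule 2: objects that are equal must have equal hash codes.
        String a = new String("Jenny");
        String b = new String("Jenny");
        System.out.println(a != b && a.equals(b));               // true: distinct instances, logically equal
        System.out.println(a.hashCode() == b.hashCode());        // true

        // Rule 3: unequal objects MAY share a hash code (a collision).
        System.out.println("Aa".equals("BB"));                   // false
        System.out.println("Aa".hashCode() == "BB".hashCode());  // true: both are 2112
    }
}
```

The last pair shows that a collision is legal: it costs performance in a hash table, but it does not break the contract.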

Equal objects must have equal hash codes.

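Two distinct instances may be logically equal according to the equals method, but to the hashCode method inherited from Object they are just two unrelated objects, so it typically returns two seemingly random, different integers instead of the same value required by the contract. Consider the following PhoneNumber class, which fails when used as a key in a HashMap: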

package test.ch02;

public class PhoneNumber {

    private final int areaCode;
    private final int prefix;
    private final int lineNumber;

    public PhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = areaCode;
        this.prefix = prefix;
        this.lineNumber = lineNumber;
    }

    public int getAreaCode() {
        return areaCode;
    }

    public int getPrefix() {
        return prefix;
    }

    public int getLineNumber() {
        return lineNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof PhoneNumber)) {
            return false;
        }
        PhoneNumber pn = (PhoneNumber) o;
        return pn.lineNumber == lineNumber && pn.prefix == prefix && pn.areaCode == areaCode;
    }

}

package test.ch02;

import java.util.HashMap;
import java.util.Map;

public class Test {

    public static void main(String[] args) {
        Map<PhoneNumber, String> m = new HashMap<>();
        m.put(new PhoneNumber(707, 867, 5309), "Jenny");
        String s = m.get(new PhoneNumber(707, 867, 5309));
        System.out.println(s); // null
    }

}

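Since PhoneNumber does not override the hashCode method, the two equal instances have unequal hash codes. The put method stores the entry in one hash bucket, while the get method most likely looks in a different one, so the lookup returns null.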

Fixing this is as simple as providing an appropriate hashCode for PhoneNumber.

A canonical way to override the hashCode method

A good hashCode method tends to "produce unequal hash codes for unequal objects".

Ideally, a hash function should distribute any reasonable collection of unequal instances evenly over all possible hash values. The following recipe gives a fair approximation:

  1. Store a non-zero constant value, such as 17, in an int variable named result;
  2. For each key field f in the object (that is, each field compared by the equals method), complete the following steps:
     a. Compute an int hash code c for the field:
        • If the field is boolean, compute (f ? 1 : 0);
        • If the field is byte, char, short, or int, compute (int) f;
        • If the field is long, compute (int) (f ^ (f >>> 32));
        • If the field is float, compute Float.floatToIntBits(f);
        • If the field is double, compute Double.doubleToLongBits(f), then hash the resulting long as described for long fields;
        • If the field is an object reference and the class's equals method compares the field by calling equals recursively, call hashCode recursively on the field; if the field is null, use 0;
        • If the field is an array, treat each element as a separate field, or use the Arrays.hashCode method;
     b. Combine the hash code c into result: result = 31 * result + c;
  3. Return result.
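The PhoneNumber class only has int fields, so it exercises just one branch of the recipe. As a rough sketch of the other branches, the hypothetical Reading class below (its name and fields are invented for illustration) has a boolean, a long, a double, a nullable object reference, and an array. It assumes that a null reference contributes 0 to the hash.

```java
import java.util.Arrays;

// Hypothetical value class used only to illustrate the recipe on mixed field types.
final class Reading {
    private final boolean valid;
    private final long timestamp;
    private final double value;
    private final String label;   // may be null
    private final int[] samples;

    Reading(boolean valid, long timestamp, double value, String label, int[] samples) {
        this.valid = valid;
        this.timestamp = timestamp;
        this.value = value;
        this.label = label;
        this.samples = samples.clone(); // defensive copy keeps the object immutable
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Reading)) {
            return false;
        }
        Reading r = (Reading) o;
        return valid == r.valid
                && timestamp == r.timestamp
                && Double.compare(value, r.value) == 0
                && (label == null ? r.label == null : label.equals(r.label))
                && Arrays.equals(samples, r.samples);
    }

    @Override
    public int hashCode() {
        int result = 17;                                               // step 1
        result = 31 * result + (valid ? 1 : 0);                        // boolean
        result = 31 * result + (int) (timestamp ^ (timestamp >>> 32)); // long
        long bits = Double.doubleToLongBits(value);                    // double -> long
        result = 31 * result + (int) (bits ^ (bits >>> 32));           // then hashed as a long
        result = 31 * result + (label == null ? 0 : label.hashCode()); // object reference (assumed: null -> 0)
        result = 31 * result + Arrays.hashCode(samples);               // array
        return result;
    }

    public static void main(String[] args) {
        Reading a = new Reading(true, 1_700_000_000_000L, 21.5, "lab", new int[] {1, 2, 3});
        Reading b = new Reading(true, 1_700_000_000_000L, 21.5, "lab", new int[] {1, 2, 3});
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

Every field that equals compares also feeds into hashCode, and nothing else does, which is what keeps the two methods consistent.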

Applying the recipe to PhoneNumber gives the following hashCode:

package test.ch02;

public class PhoneNumber {

    private final int areaCode;
    private final int prefix;
    private final int lineNumber;

    public PhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = areaCode;
        this.prefix = prefix;
        this.lineNumber = lineNumber;
    }

    public int getAreaCode() {
        return areaCode;
    }

    public int getPrefix() {
        return prefix;
    }

    public int getLineNumber() {
        return lineNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof PhoneNumber)) {
            return false;
        }
        PhoneNumber pn = (PhoneNumber) o;
        return pn.lineNumber == lineNumber && pn.prefix == prefix && pn.areaCode == areaCode;
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + areaCode;
        result = 31 * result + prefix;
        result = 31 * result + lineNumber;
        return result;
    }

}

package test.ch02;

import java.util.HashMap;
import java.util.Map;

public class Test {

    public static void main(String[] args) {
        Map<PhoneNumber, String> m = new HashMap<>();
        m.put(new PhoneNumber(707, 867, 5309), "Jenny");
        String s = m.get(new PhoneNumber(707, 867, 5309));
        System.out.println(s); // Jenny
    }

}

Optimizing the hashCode method

  • Redundant fields can be excluded from the hash code calculation, that is, fields whose values can be derived from other fields that are already included. Fields that are not used by the equals method must be excluded, otherwise equal objects could end up with different hash codes;
  • 31 was chosen as the multiplier because it is an odd prime. It also has the nice property that the multiplication can be replaced by a shift and a subtraction, which may perform better: 31 * i == (i << 5) - i. Modern VMs can do this optimization automatically;
  • If a class is immutable and computing the hash code is expensive, consider caching the hash code inside the object, either by computing it when the instance is created or by lazily initializing it on the first call to hashCode;
  • Do not attempt to improve performance by excluding significant parts of an object from the hash code calculation. The hash function may run faster, but the resulting collisions can make hash tables unusably slow.
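As a minimal sketch of the lazy caching idea, the hypothetical CachedPhoneNumber class below (a renamed variant of PhoneNumber, invented for this example) computes the hash code on the first call and reuses it afterwards. It assumes that 0 can serve as the "not computed yet" marker. For three int fields the caching is not really worth it; the class only shows the pattern.

```java
// Hypothetical variant of PhoneNumber that lazily caches its hash code.
final class CachedPhoneNumber {
    private final int areaCode;
    private final int prefix;
    private final int lineNumber;

    // Cached hash code; 0 means "not computed yet" (an assumption of this sketch).
    private int hashCode;

    CachedPhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = areaCode;
        this.prefix = prefix;
        this.lineNumber = lineNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CachedPhoneNumber)) {
            return false;
        }
        CachedPhoneNumber pn = (CachedPhoneNumber) o;
        return pn.lineNumber == lineNumber && pn.prefix == prefix && pn.areaCode == areaCode;
    }

    @Override
    public int hashCode() {
        int result = hashCode;      // read the cached value once
        if (result == 0) {          // first call, or the real hash happens to be 0
            result = 17;
            result = 31 * result + areaCode;
            result = 31 * result + prefix;
            result = 31 * result + lineNumber;
            hashCode = result;      // the fields are final, so recomputing always gives the same value
        }
        return result;
    }

    public static void main(String[] args) {
        CachedPhoneNumber a = new CachedPhoneNumber(707, 867, 5309);
        CachedPhoneNumber b = new CachedPhoneNumber(707, 867, 5309);
        System.out.println(a.hashCode() == a.hashCode()); // true: second call uses the cache
        System.out.println(a.hashCode() == b.hashCode()); // true: equal objects, equal hash codes

        int i = 12345;
        System.out.println(31 * i == (i << 5) - i);       // true: the shift-and-subtract identity
    }
}
```

Caching is only safe here because the class is immutable: if a field used by equals could change after the hash code was cached, the cached value would go stale and violate the contract.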
