重写equals为何总要重写hashCode

Object中的定义

/**

     * Returns a hash code value for the object. This method is

     * supported for the benefit of hash tables such as those provided by

     * {@link java.util.HashMap}.

     * <p>

     * The general contract of {@code hashCode} is:

     *

     * 每当在Java应用程序的执行过程中多次调用同一个对象时，

     * @code hashCode方法必须始终返回相同的整数，

     * 前提是在@code中使用的信息没有被修改。

     * 这个整数不需要保持一致，从一个应用程序的执行到同一个应用程序的另一个执行。  

     * <li>Whenever it is invoked on the same object more than once during

     *     an execution of a Java application, the {@code hashCode} method

     *     must consistently return the same integer, provided no information

     *     used in {@code equals} comparisons on the object is modified.

     *     This integer need not remain consistent from one execution of an

     *     application to another execution of the same application.

     *

     * 如果根据@code equals（Object）方法，两个对象是相等的，

     * 那么在这两个对象上调用@code hashCode方法必须产生相同的整数结果。

     * <li>If two objects are equal according to the {@code equals(Object)}

     *     method, then calling the {@code hashCode} method on each of

     *     the two objects must produce the same integer result.

     *

     * 如果两个对象是不相等的，这是不需要的

     * 根据@link java.lang.object equals（java.lang.object）方法，

     * 然后在每一个上调用@code hashCode方法,两个对象必须产生不同的整数结果。

     * 然而,程序员应该意识到产生不同的整数结果对于不相等的对象，可以提高散列表的性能。

     * <li>It is <em>not</em> required that if two objects are unequal

     *     according to the {@link java.lang.Object#equals(java.lang.Object)}

     *     method, then calling the {@code hashCode} method on each of the

     *     two objects must produce distinct integer results.  However, the

     *     programmer should be aware that producing distinct integer results

     *     for unequal objects may improve the performance of hash tables.

     * </ul>

     * <p>

     * As much as is reasonably practical, the hashCode method defined by

     * class {@code Object} does return distinct integers for distinct

     * objects. (This is typically implemented by converting the internal

     * address of the object into an integer, but this implementation

     * technique is not required by the

     * Java™ programming language.)

     *

     * @return  a hash code value for this object.

     * @see     java.lang.Object#equals(java.lang.Object)

     * @see     java.lang.System#identityHashCode

     */

    public native int hashCode();

 

    /**

     * Indicates whether some other object is "equal to" this one.

     * <p>

     * The {@code equals} method implements an equivalence relation

     * on non-null object references:

     * <ul>

     * <li>It is <i>reflexive</i>: for any non-null reference value

     *     {@code x}, {@code x.equals(x)} should return

     *     {@code true}.

     * <li>It is <i>symmetric</i>: for any non-null reference values

     *     {@code x} and {@code y}, {@code x.equals(y)}

     *     should return {@code true} if and only if

     *     {@code y.equals(x)} returns {@code true}.

     * <li>It is <i>transitive</i>: for any non-null reference values

     *     {@code x}, {@code y}, and {@code z}, if

     *     {@code x.equals(y)} returns {@code true} and

     *     {@code y.equals(z)} returns {@code true}, then

     *     {@code x.equals(z)} should return {@code true}.

     * <li>It is <i>consistent</i>: for any non-null reference values

     *     {@code x} and {@code y}, multiple invocations of

     *     {@code x.equals(y)} consistently return {@code true}

     *     or consistently return {@code false}, provided no

     *     information used in {@code equals} comparisons on the

     *     objects is modified.

     * <li>For any non-null reference value {@code x},

     *     {@code x.equals(null)} should return {@code false}.

     * </ul>

     * <p>

     * The {@code equals} method for class {@code Object} implements

     * the most discriminating possible equivalence relation on objects;

     * that is, for any non-null reference values {@code x} and

     * {@code y}, this method returns {@code true} if and only

     * if {@code x} and {@code y} refer to the same object

     * ({@code x == y} has the value {@code true}).

     *

     * 请注意，每当这个方法被覆盖时，通常都需要覆盖@code hashCode方法，

     * 以便维护@code hashCode方法的一般契约，

     * 该方法规定相等对象必须具有相等的散列码。

     * Note that it is generally necessary to override the {@code hashCode}

     * method whenever this method is overridden, so as to maintain the

     * general contract for the {@code hashCode} method, which states

     * that equal objects must have equal hash codes.

     *

     * @param   obj   the reference object with which to compare.

     * @return  {@code true} if this object is the same as the obj

     *          argument; {@code false} otherwise.

     * @see     #hashCode()

     * @see     java.util.HashMap

     */

    public boolean equals(Object obj) {

        return (this == obj);

}

重写equals一定要重写hashCode

由Object。hashCode的通用约定，我们可以直观地看出：覆盖equals必须覆盖hashCode。

如果没有覆盖hashCode会违反Oject.hashCode通用约定的第二条：相等的对象必须具有相等的散列码，导致该类无法结合所有基于散列的集合（HashMap、HashSet和Hashtable）一起正常运作。同时，不相等的对象产生截然不同的hash结果，有利于提高散列表的性能。

equals与hashCode都用鉴别是否是同一对象，前者比较行为比较为全面复杂（意味着效率低），后者只要用哈希值比较就行（效率高）。但是利用hashCode比较不是绝对可靠的。所以为提高效率：一般先利用hashCode去比较，如果不一样，那么它们就不相等；否则，再用equals比较。这样会大大提高比较的效率。

注：不同对象的hashCode一定不同吗？不一定（这可能是因为：生成哈希值的算法可能存在问题或哈希值的长度限制）

举个例子，

public final class PhoneNumber {
	private final short areaCode;
	private final short prefix;
	private final short lineNumber;
	
	public PhoneNumber(int areaCode, int prefix, int lineNumber) {
		rangeCheck(areaCode,999,"area code");
		rangeCheck(prefix,999,"prefix");
		rangeCheck(lineNumber,9999,"line number");
		
		this.areaCode = (short)areaCode;
		this.prefix = (short)prefix;
		this.lineNumber = (short)lineNumber;
	}
	
	private static void rangeCheck(int arg,int max,String name){
		if(arg<0||arg>max)
		{
			throw new IllegalArgumentException(name+":"+arg);
		}
	}
	
	/**
	 * 这个hashCode是合法的，它确保了相等的对象具有同样的散列码
	 * 同时，这种定义极为恶劣，因为它使每个对象都具有相同的散列码。
	 * 这样会使每个对象都被映射到同一个散列桶中，使散列表退化为链表（本该线性时间运行变为了平方时间运行）
	 * 注：此处这样写仅为方便地演示具体事例。
	 */
//	@Override
//	public int hashCode()
//	{
//		return 42;
//	}
	
	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if(!(obj instanceof PhoneNumber))
			return false;
		PhoneNumber pn=(PhoneNumber)obj;
		return pn.lineNumber==lineNumber 
				&&pn.prefix==prefix
				&&pn.areaCode==areaCode;		
	}
	
}

import java.util.HashMap;
import java.util.Map;

public class Test {
	public static void main(String[] args) {
		Map<PhoneNumber, String> m=new HashMap<PhoneNumber, String>();
		
		m.put(new PhoneNumber(707, 867,5309),"Jenny");
		System.out.println(m.get(new PhoneNumber(707, 867,5309)));
	}

}

当我们期待通过m.get(new PhoneNumber(707, 867,5309))返回Jenny的时候，实际输出结果为null。为什么？因为没有重写hashCode方法，导致相等的实例的散列码不同。put方法将电话号码对象存放在一个散列桶中，get方法却在另一个散列桶中查找这个电话号码。（即使这两个正好放在一个桶中，get方法也会返回null。因为HashMap有一项优化，可以将每个项相关联的散列码缓存起来，如果散列码不匹配，也不必检验对象的等同性）

如何修正这个问题？只需在PhoneNumber 类中重写一个合适的HashCode。

我们将PhoneNumber 中的HashCode取消注释，再运行效果就出来了。

参考文献《Effective Java》

重写equals为何总要重写hashCode

猜你喜欢