Making your class compatible with Java hash maps: overriding hashCode() and equals()
If you've read this site's hash function guidelines,
or if you have prior knowledge of hashing, then you may have an idea of how to write the hash
function itself. On this page we'll discuss the nuts and bolts you need to actually plug your
hash function into a Java class and therefore use instances of that class as a hash map key.
(Note that we concentrate on HashMaps here, but the points we discuss
generally hold for related classes such as ConcurrentHashMap and HashSet.)
The basics: override hashCode() and equals()
Put very simply, there are two methods that a class needs to override to make
objects of that class work as hash map keys:
public int hashCode();
public boolean equals(Object o);
As you might expect, the hashCode() method is where we put our hash function.
Hash functions such as those mentioned in our
hash function guidelines can generally
be slotted in. Note that HashMap will not do any extra caching of the hash code.
So if calculating the hash is relatively expensive (as in the case of String)
it may be worth explicitly caching the hash code.
The equals() method
The equals() method must return true if the fields of the
current object equal those of the object passed in, else return false. By "equal", we
generally mean that primitive fields match via the == operator, and
objects are either both null or both non-null and match via the equals() method.
Note two important constraints on equals():
- if x.equals(y) returns true, then the hash codes
of x and y must be identical;
- it must be reflexive and transitive: that is, x.equals(y)
must return the same value as y.equals(x), and if x.equals(y) and y.equals(z), then x.equals(z) must also be true (see below for what this actually means in real terms!).
The first of these is generally common sense given that the purpose of a hash function
is to "narrow down a search" which will ultimately be performed using the
equals()
to perform the final check for a match. The second is more or less common sense, but it does
mean, for example, that we can't allow a
null reference to equal an "empty" object.
It also means, strictly speaking, that a
subclass cannot add new variable
comparisons to the equals() method2.
Example
Now let's see an example. We'll look at a simple class that encapsulates a pair of screen
coordinates. We assume that individually, the X and Y coordinates are essentially random,
but that the maximum coordinate in each case will be in the order of a couple of thousand
(in other words will have about 10 or 11 bits of randomness).
So to make the hash code, we pick a number that is roughly halfway between these
bits, then find a prime (or at worst odd) number that is close to 211.
Our old favourite of 31 (=25-1) will do us fine.
The equals() method is trivial, but note the convention of returning
false if the object passed in isn't of a compatible type.
public class CoordinatePair {
private int x, y;
public int hashCode() {
return (x * 31) ^ y;
}
public boolean equals(Object o) {
if (o instanceof CoordinatePair) {
CoordinatePair other = (CoordinatePair) o;
return (x == other.x && y == other.y);
}
return false;
}
}
1. I think this
convention predates Java 5 generics: arguably, we really don't expect a case where
equals() will be called against an object of an incompatible type and
should just apply the case an rely on the resulting runtime exception if the cast fails.
2. The problem with subclassing can be explained with a quick example. Suppose we extend Rectangle (whose equals() method effectively compares its co-ordinates and dimensions, although via the Rectangle2D base class) to make a class called ColouredRectangle,
whose equals() method returns true if and only if the colours of the rectangles are identical. Now we have the problem that a plain Rectangle, if passed a ColouredRectangle to its equals() method, would return true provided
the co-ordinates and dimensions were the same, discounting the colour; whereas the other way
round, ColouredRectangle.equals() would always return false (because it wasn't comparing against another ColouredRectangle).
If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.