Why Equality Is Hard To Get Right

Wednesday, February 7, 2007

Remember the blog entry that suggested Java got equality wrong because the expression ((Long)0L).equals(0) returns false? The one that generated hundreds of comments and split people into a "Java sucks" camp and an "equals wasn't designed for numbers" camp? As far as I remember, nobody mentioned the obvious: equality is hard.
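
For anyone who missed that discussion, here is a minimal sketch of what the expression does. The class and method names are made up for illustration. The point is that the literal 0 is an int, so it autoboxes to an Integer, and Long.equals only returns true for another Long holding the same value.

```java
class LongEquals {
    // The int literal 0 autoboxes to Integer, so Long.equals sees a different type.
    static boolean equalsIntZero() {
        return ((Long) 0L).equals(0);
    }

    // The long literal 0L autoboxes to Long, so the values are compared.
    static boolean equalsLongZero() {
        return ((Long) 0L).equals(0L);
    }

    // With ==, the Long is unboxed and the comparison is numeric.
    static boolean numericCompare() {
        Long boxed = 0L;
        return boxed == 0;
    }

    public static void main(String[] args) {
        System.out.println(equalsIntZero());   // false
        System.out.println(equalsLongZero());  // true
        System.out.println(numericCompare());  // true
    }
}
```

So the same two zeros are equal or not depending on which question you ask.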

Intuitively, equality should be as simple as the equals sign. In practice, there are many different definitions of equality, each useful under different circumstances. Common Lisp has four (!) operators for equality - eq, eql, equal, and equalp. Ruby has three - equal?, eql?, and ==. C++ used to have two (== could mean two different things in different contexts), but that wasn't enough, so the standards committee introduced a concept of equivalence to the STL. Not surprisingly, Java has two ways to test equality, the == operator and the equals method (not counting the whole Comparator business), and they behave in subtly different ways.
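
A small sketch of the Java case, using strings. The class and helper names are hypothetical. On object references, == asks whether two references point at the same object, while equals asks whatever the class decided it should ask, which for String is whether the characters match.

```java
class IdentityVsEquality {
    // Reference identity: true only if both names refer to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Value equality as defined by String.equals: same character sequence.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // new String(...) always creates a distinct object.
        String first = new String("audi");
        String second = new String("audi");

        System.out.println(sameReference(first, second)); // false
        System.out.println(sameContents(first, second));  // true
        System.out.println(sameReference(first, first));  // true
    }
}
```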

What does it mean to ask if two things are equal? Are two functions that do the same computation with completely different time and space costs equal? Are two arithmetically equal numbers stored in memory in completely different ways equal? Are two instances of different classes, defined in two different libraries but designed to represent the same thing (say, a 2006 Audi A4), equal? It gets better. Are me now and me as a four-year-old child twenty years ago the same person? How about me ten years ago? Ten seconds ago?

The answer is the same in each case - yes and no. It depends on what you're doing.

Different languages tackle the equality problem differently. Most provide a number of built-in equality operators that behave in different ways and cover most useful cases. Others, like Haskell, provide one equality operator that tries to behave in a mathematically sound manner. Unfortunately, that means that Haskell doesn't provide useful lower-level comparisons. For example, Haskell doesn't have a built-in capability to compare two functions based on their address in memory.

I learned about the complex nature of equality years ago when I ran into the following C++ expression that, for a certain type of object, occasionally evaluated to true: !(a < b) && !(b < a) && a != b. At the time it led me to a number of mathematics and philosophy (yes, Plato wrote about equality!) books and courses that taught me one thing I should have figured out myself in the first place - that the notion of equality is a very difficult problem.
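
The same situation is easy to reproduce in Java. This is a rough analog of the C++ expression rather than the original code: compareTo stands in for <, and equals stands in for ==. The class and method names are invented for the example. BigDecimal is a convenient specimen because its compareTo ignores scale while its equals does not.

```java
import java.math.BigDecimal;

class EquivalentButNotEqual {
    // Java rendering of !(a < b) && !(b < a) && a != b.
    static <T extends Comparable<T>> boolean equivalentButNotEqual(T a, T b) {
        boolean neitherIsLess = !(a.compareTo(b) < 0) && !(b.compareTo(a) < 0);
        return neitherIsLess && !a.equals(b);
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");

        // Neither is less than the other, yet equals says they differ (scale 1 vs 2).
        System.out.println(equivalentButNotEqual(a, b)); // true

        // For Integer, ordering and equals agree, so the expression is false.
        System.out.println(equivalentButNotEqual(1, 1)); // false
    }
}
```

Ordering gives you equivalence, equals gives you equality, and nothing forces the two to agree.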
