HASH-TABLE-KEY-MODIFICATION:TEST argument to MAKE-HASH-TABLE specifies the `equivalence test' for the new hash table.
Define that an object is `visibly modified' with regard to an equivalence test if there exists some set of objects (or potential objects) which are equivalent to the object before the modification but are no longer equivalent afterwards.
If an object is used as a key in a hash table and is then visibly modified with regard to the equivalence test of the hash table, then using the object as a key in further operations on the hash table has unspecified consequences. Moreover, this applies for other objects which either are or were equivalent to the key object. Further, undoing the modification does not remove this effect.
Following are specifications of the modifications which are visible to the equivalence tests which must be supported by hash tables. The modifications are described in terms of modification of components, and are defined recursively. Visible modifications of components of the object are visible modifications of the object.
1. EQ and EQL Common Lisp does not specify any method for visibly modifying any data type with regard to either of these equivalence tests.
2. EQUAL As a consequence of the behavior for EQUAL specified by Issue EQUAL-STRUCTURE, the specification for this case reverts to the previous description for EQ and EQL for many data types. The exceptions are
a. CONS The car and cdr.
b. BIT-VECTOR Elements of the vector, limited by the fill-pointer if present. The length of the vector, if adjustable or a fill-pointer is present.
c. STRING Same as for BIT-VECTOR.
d. PATHNAME Common Lisp does not specify any method for visibly modifying a PATHNAME with regard to EQUAL.
3. EQUALP As a consequence of the behavior for EQUALP specified by Issue EQUAL-STRUCTURE, the specification for this case reverts to the previous description for EQUAL for many data types. The exceptions are
a. NUMBER Common Lisp does not specify any method for visibly modifying a NUMBER with regard to EQUALP.
b. CHARACTER Common Lisp does not specify any method for visibly modifying a CHARACTER with regard to EQUALP.
c. STRUCTURE The values of the slots.
d. VECTOR Elements of the vector, limited by the fill-pointer if present. The length of the vector, if adjustable or a fill-pointer is present.
e. ARRAY Elements of the array, limited by the fill-pointer if present. The value of the fill-pointer, if present. The dimensions, if adjustable.
f. HASH-TABLE The count of entries in the table. The keys. Note that the visibility of modifications to the keys depends on the equivalence test of the hash table, not on the specification of EQUALP. The values associated with the keys.
Implementations which extend the language by providing additional mutator functions (or additional behavior for existing mutator functions) must document how the use of these extensions interacts with equivalence tests and hash table searches.
Implementations which extend the language by defining additional acceptable equivalence tests for hash tables (allowing additional values for the :TEST argument to MAKE-HASH-TABLE) must document the visible components of these tests.
(setf ht (make-hash-table :test #'eq)) (setf a (cons 1 2)) (setf (gethash a ht) 'win) (setf (cdr a) 3) (gethash a ht 'lose) => win t
The same example with :test #'equal in the first line would be an error.
The following example is not an error, because EQUAL does not examine the components of general vectors:
(setf ht (make-hash-table :test #'equal)) (setf a (vector 1 2)) (setf (gethash a ht) 'win) (setf (aref a 1) 3) (gethash a ht 'lose) => win t
The following example is not an error, because EQUALP is limited by the fill-pointer when examining the elements of a vector:
(setf ht (make-hash-table :test #'equalp)) (setf a (make-array 3 :fill-pointer 2 :initial-contents '(1 2 3))) (setf (gethash a ht) 'win) (setf (aref a 2) 4) (gethash a ht 'lose) => win t
The following example is an error, because EQUALP may examine the dimensions of arrays:
(setf ht (make-hash-table :test #'equalp)) (setf a (make-array '(2 3) :adjustable t)) (setf (gethash a ht) 'win) (setf a (adjust-array a '(3 2))) (gethash a ht 'lose) => `unspecified'
The following example is not an error, because EQUALP is case insensitive:
(setf ht (make-hash-table :test #'equalp)) (setf a "ABC") (setf (gethash a ht) 'win) (setf (char a 0) #\a) (gethash a ht 'lose) => win t
The same example with :test #'equal in the first line would be an error.
EQ and EQL hash tables use the identity of the object as the key, while EQUAL and EQUALP hash tables use the structure of the object as the key. Component modification changes the structure of an object, while the identity of an object cannot be changed in Common Lisp.
Specifying `unspecified consequences' provides implementation freedom, while still providing guidance to users.
This is a generalization of the warning on p.168 of CLtL.
Note that this proposal implies that it is invalid to use an overly general hash function, such as SXHASH, as the hash function for EQ or EQL hash tables. The value of SXHASH can be affected by component modifications, and this is likely to cause hash table entries to become inaccessible. The general rule for this is that a hash function must not depend on the value of some property of an object if modification of that property is not visible to the associated equivalence function.
The documentation requirements for extensions seems like the minimal necessary thing, assuming we want to talk about extensions at all.
Some implementations which provide any of the described extensions may not conform to the documentation requirement.
Version 2 Barrett: The two paragraphs about language extensions could be stricken without really affecting the proposal.
This proposal does not deal with any of the performance issues addressed by Issue HASH-TABLE-STABILITY. Rather, it provides guidance to users and implementors as to the requirements for conforming programs.
`Unspecified consequences' could be changed to 'undefined consequences', though it seems unlikely that this is actually a 'crash and burn' situation. (The program which was depending on the result of the operation may exhibit undefined consequences as a result of the hash-table operation's having unspecified consequences, but that's not the same thing.)
Moon: Re: Define that an object is 'visibly modified' with regard to an equivalence test if there exists some class of objects which were equivalent to the object before the modification but are no longer equivalent afterwards. Except for ... the use of "class" to mean "set" rather than the usual CLOS meaning, it looks okay. Actually where it says "some class of objects", a single object would be sufficient. X is visibly modified if there exists any Y such that (funcall test X Y) produces a different result, provided (this is missing from Kim's writeup) Y has not been modified [whatever that means exactly].
Pitman and Barrett support this proposal (v3).