[ Team LiB ] |
7.5 Generating Hash CodeAll object instances can provide a signed 32-bit integer hash of their contents via the GetHashCode( ) method on System.Object. Good hashes can have a dramatic effect on Hashtable speed (they are used to determine which bucket to add entries to in the hashtable), and can also provide a low-fidelity (but possibly more efficient) equivalence test. Using GetHashCode in this way is demonstrated in the following examples: void Enroll(Student s, CourseList cl) { hashtable.Add(s, cl); // GHC called on key (s) } bool FastCompare(Student s1, Student s2) { // Use GHC to test for possible equivalence if (s1.GetHashCode( ) != s2.GetHashCode( )) return false; // Use Equals to test for definite equivalence return s1.Equals(s2); } The default implementation of GetHashCode( ) on System.Object returns a semi-unique member #, while the implementation of GetHashCode( ) on System.ValueType merely returns the hash of the first field in the value type. Although these defaults work in a lot of cases, there are sometimes performance benefits gained from implementing GetHashCode( ) on your own type. Additionally, if a type overrides the Equals( ) method, it is required to override the GetHashCode( ) method, which means that many framework types override GetHashCode( ), as shown here: void DumpHashes(object o, int i, Version v) { Console.WriteLine(o.GetHashCode( )); // object index Console.WriteLine(i.GetHashCode( )); // integer value Console.WriteLine(v.GetHashCode( )); // hash of fields } 7.5.1 The Object.GetHashCode MethodThe System.GetHashCode method is declared as follows: public virtual int GetHashCode( ); It is important to understand the general contract for GetHashCode( ), which looks like this:
The idea is that good implementations use all 32 bits to provide a good even distribution and ideally preserve the significance of field ordering (to ensure that Point(10,20) hashes differently to Point(20,10)). Preserving field ordering is traditionally done by multiplying the hash for each field by some odd prime constant (37 is a popular choice in the Java world, in which a similar construct exists). Additionally, if you have not derived directly from System.Object, consider combining the hash of your base type with the hash of your contained members. Lastly, remember that contained aggregate data structures (such as Arrays) may not hash their contents correctly, and therefore may need to be hashed by hand. The following is an example implementing some of these rules, in order to provide a good hash distribution that includes all the type's data in the hash calculation. public sealed class Data { public readonly short x, y; public readonly Color c; ArrayList al = new ArrayList( ); public override int GetHashCode( ) { int hc = 1; // base.GetHashCode if base!=object hc = 37*hc + x<<16|(ushort)x; hc = 37*hc + y.GetHashCode( ); hc = 37*hc + (c= =null ? 0 : c.GetHashCode( )); foreach (object o in al) hc = 37*hc + o.GetHashCode( ); return hc; } } |
[ Team LiB ] |