valid types for Keys (Hashtable, Dictionary)

**Joseph Bergevin** · Dec 9 '05, 05:45 PM

RE: valid types for Keys (Hashtable, Dictionary)

Ok. I just put the ints into a struct and used that as a key. Problem solved.
Why didn't I think of that earlier?

**Bruce Wood** · Dec 9 '05, 06:05 PM

Re: valid types for Keys (Hashtable, Dictionary)

Depending upon what you are doing, that may have been the wrong
solution. structs have very specific applications in .NET, and using
them without due consideration can lead to grief.

The reason that your int[]s didn't generate unique hash keys is that
Array uses the default GetHashCode() method inherited from Object,
which is documented as being no good for hashing. From the
documentation:

"The default implementation of GetHashCode does not guarantee
uniqueness or consistency; therefore, it must not be used as a unique
object identifier for hashing purposes."

The point is that there was no way for the .NET team to come up with a
one-size-fits all hash function, so they provided merely the signature
and left it up to individual classes to decide how to generate hash
keys.

So, the solution to your original problem is to create your own _class_
(probably not struct) to hold whatever it is you need to hold. That
class should then override GetHashCode() (and Equals()... they go hand
in hand) so that it can be used as a hash key.

**Joseph Bergevin** · Dec 9 '05, 08:55 PM

Re: valid types for Keys (Hashtable, Dictionary)

> Depending upon what you are doing, that may have been the wrong[color=blue]
> solution. structs have very specific applications in .NET, and using
> them without due consideration can lead to grief.[/color]

So what is the specific utility of structs? I assumed that they were chiefly
a handy way to group value types, like Point and Vector3 do. It seemed
wasteful to me to make a class for three ints. I just assumed that object
creation is slower, but I guess that everything in .NET is an object anyway.
[color=blue]
> "The default implementation of GetHashCode does not guarantee
> uniqueness or consistency; therefore, it must not be used as a unique
> object identifier for hashing purposes."[/color]

Aha. I never came across that in my MS Press C# book. Most of the things
I've read about Hashtables stop short of explaining the nuts and bolts.
They're just some magical search tree to me.
[color=blue]
> So, the solution to your original problem is to create your own _class_
> (probably not struct) to hold whatever it is you need to hold. That
> class should then override GetHashCode() (and Equals()... they go hand
> in hand) so that it can be used as a hash key.[/color]

In my particular situation, I'm primarily interested in being able to look
up a given object with a specific 3 int location, without having to create
throwaway object intermediaries for comparisons.
Perhaps for my purposes, since I know that none of the ints will be larger
than, say, X, I could create a unique number from my int[a,b,c] that is
a + (b*(X+1)) + (c*(X+1)*(X+1))
and use this in the overrides you mentioned.
I wish I knew exactly how .NET generated the default hash codes.

Thanks for the feedback.

**Bruce Wood** · Dec 9 '05, 09:36 PM

Re: valid types for Keys (Hashtable, Dictionary)

Joseph Bergevin wrote:[color=blue][color=green]
> > Depending upon what you are doing, that may have been the wrong
> > solution. structs have very specific applications in .NET, and using
> > them without due consideration can lead to grief.[/color]
>
> So what is the specific utility of structs? I assumed that they were chiefly
> a handy way to group value types, like Point and Vector3 do. It seemed
> wasteful to me to make a class for three ints. I just assumed that object
> creation is slower, but I guess that everything in .NET is an object anyway.[/color]

When you declare a struct, you create a _value_ type. That means that
it acts like an int or a double. Whenever you assign it:

MyValudType b = new MyValueType(... );
MyValueType a = b;

it is copied, just like an int or a double. Whenever you put it in a
collection (such as an ArrayList or a Hashtable) it is boxed. In
particular, if you make your value type mutable (not recommended) then
you have to be very, very careful that you understand the exact
semantics so that when you change the value you know that it changes
and you know exactly where it changes and where it doesn't. (Using the
example above, if you were allowed to say a.MyProperty = 5 then b would
not change, because the contents of b were copied into a on
assignment.)

If in your case your three ints are spacial coordinates (x, y, z) and
they were immutable (that is, you couldn't say coord.X = 5), then a
struct would probably be the right way to go. If you make them mutable
(coord.X = 5 is legal) then you could use a struct, but you would have
to be very careful. (Search this newsgroup to see the kind of confusion
that the behaviour of System.Drawing. Point causes.)

The bottom line is that in .NET structs come with important semantic
considerations: value semantics versus reference semantics. As such,
they shouldn't be used in an offhand manner as a sort of "classes lite"
because they introduce major changes in behaviour.
[color=blue]
> In my particular situation, I'm primarily interested in being able to look
> up a given object with a specific 3 int location, without having to create
> throwaway object intermediaries for comparisons.[/color]

Well, keep in mind that the only thing you really save by using structs
is memory management costs of allocating space for an object and then
garbage collecting it later. That is, unless your structs are being
boxed (as they would be if you used them in a Hashtable either as keys
or values). In that case you don't save anything.

Microsoft made System.Drawing. Point a struct because (IMHO) they do a
lot of mathematics with points, and didn't want to create a gazillion
Point objects as a result of intermediate calculation results and then
have to garbage collect them all later. If it's just a question of the
Hashtable, then I would say that it's a non-issue: your decision as to
whether to use struct or class should rest with higher-level conceptual
issues, not with fine points of efficiency.
[color=blue]
> Perhaps for my purposes, since I know that none of the ints will be larger
> than, say, X, I could create a unique number from my int[a,b,c] that is
> a + (b*(X+1)) + (c*(X+1)*(X+1))
> and use this in the overrides you mentioned.[/color]

Yes... that would probably be a good strategy, so long as X never
changes during a particular run of your program. Another way is to
leverage off of the existing GetHashCode method in int:

public override int GetHashCode()
{
return a.GetHashCode() + b.GetHashCode() + c.GetHashCode() ;
}

or even

public override int GetHashCode()
{
return a + b + c;
}

Remember... you're not after a unique value, just a value that will
provide a nice, even distribution across all ints.
[color=blue]
> I wish I knew exactly how .NET generated the default hash codes.[/color]

Well, the default for Object probably just returns zero or something,
or maybe a fixed value for each object type. Each of the Framework
classes that overrides GetHashCode has its own algorithm. I tend to
assume that those are good, and leverage off of them when I can. For
example, if my class needs to hash on two key string fields, I just say

return string1.GetHash Code() + string2.GetHash Code();

assuming that Microsoft pays math wizards big bucks to come up with
good hash functions for strings. :-)

valid types for Keys (Hashtable, Dictionary)

valid types for Keys (Hashtable, Dictionary)

Comment

Comment

Comment

Comment