std.hash design

Johannes Pfau nospam at example.com
Fri Jun 22 06:21:28 PDT 2012


On Fri, 22 Jun 2012 12:03:27 +0100,
"Regan Heath" <regan at netmail.co.nz> wrote:

> 
> It might help (or it might not) to have a glance at the "design" of
> the hashing routines in Tango:
> http://www.dsource.org/projects/tango/docs/current/
> (see tango.util.digest etc)
> 
> I contributed some of the initial code for these, though it has
> since evolved a lot.  I started with structs, mirroring the phobos
> MD5 code but used all sorts of unnecessary mixins to get the code
> reuse I wanted.  The result was ugly :p
> 
> Later someone contacted me about it, and wanted a class based
> approach so I did some refactoring and the result was much cleaner.
> I'm not trying to say that a struct approach cannot be clean, just
> that I did a bad job of it initially, and also structs don't lend
> themselves to the factory pattern though which is a nice way to use
> hashing.

I had a short look at Piotr Szturmaj's SHA implementations, and it
seems this kind of code would benefit a lot from inheritance. I
understand that it was probably impossible to do this with structs in
D1, but don't you think 'alias this' could work in D2? This wouldn't
solve the problem with the factory pattern, but that can be solved by
providing wrapper classes.
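
To make the idea concrete, here is a rough sketch of what I mean. All
names in it (LengthCounter, ToyDigest, Digest, WrapperDigest,
createDigest) are made up for illustration and are not part of Phobos
or Tango, and the "digest" is a toy checksum, not a real hash. It only
shows a struct reusing another struct's members through 'alias this',
and a class wrapper that makes the struct usable from a factory:

```d
import std.stdio : writeln;
import std.string : representation;

// Hypothetical shared piece: bookkeeping that several digests could reuse.
struct LengthCounter
{
    ulong totalBytes;

    void count(const(ubyte)[] data)
    {
        totalBytes += data.length;
    }
}

// Hypothetical toy digest (a simple multiplicative checksum, NOT a real hash).
struct ToyDigest
{
    LengthCounter counter;
    alias counter this; // LengthCounter's members are reachable on ToyDigest

    uint state = 17;

    void put(const(ubyte)[] data)
    {
        this.count(data); // forwarded to counter.count via alias this
        foreach (b; data)
            state = state * 31 + b;
    }

    ubyte[4] finish()
    {
        ubyte[4] result;
        foreach (i; 0 .. 4)
            result[i] = cast(ubyte)(state >> (8 * (3 - i))); // big-endian bytes
        state = 17; // reset so the struct can be reused
        return result;
    }
}

// Hypothetical OOP interface for code that wants runtime polymorphism.
interface Digest
{
    void put(const(ubyte)[] data);
    ubyte[] finish();
}

// Generic class wrapper around any struct offering put() and finish().
class WrapperDigest(T) : Digest
{
    private T impl;

    void put(const(ubyte)[] data)
    {
        impl.put(data);
    }

    ubyte[] finish()
    {
        auto result = impl.finish();
        return result[].dup; // copy the fixed-size result to the heap
    }
}

// Factory: select an implementation by name at runtime.
Digest createDigest(string name)
{
    switch (name)
    {
        case "toy":
            return new WrapperDigest!ToyDigest;
        default:
            throw new Exception("unknown digest: " ~ name);
    }
}

void main()
{
    ToyDigest d;
    d.put("abc".representation);
    assert(d.totalBytes == 3); // member of LengthCounter, seen through alias this

    Digest viaFactory = createDigest("toy");
    viaFactory.put("abc".representation);
    auto expected = d.finish();
    assert(viaFactory.finish() == expected[]); // struct and wrapper agree

    writeln("bytes hashed: ", d.totalBytes);
}
```

The struct stays usable on its own without allocation, and only code
that needs the factory pays for the class wrapper.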

> 
> As Dmitry has said, we can likely get the best of both worlds with
> classes wrapping structs or similar.

Yep, although wrapping structs in classes doesn't help with code reuse
between the structs themselves. But alias this should hopefully work
for that.

> 
> > toString doesn't make sense on a hash, as finish() has to be called
> > before a string can be generated. So a helper function could be
> > useful.
> 
> toString() could output the intermediate/internal state at the time
> of the call, which if called after "finish" would be the hash
> result.  I can't recall if this has any specific usefulness, tho I
> have a nagging/niggling itch which says I did use this intermediate
> result for something at some stage.
> 
> It might be useful to have toString on a hash so that we can pass a  
> completed hash object around and repeatedly obtain the string  
> representation vs obtaining it once on "finish" and passing the
> string around.  However, that said, it's probably more secure to
> destroy and scrub the memory used by the hash object ASAP and only
> retain the resulting string or ubyte[] result.
> 
> I think I've talked myself round in a circle.. I think if we have a
> way to obtain the current state as ubyte[] that would satisfy the
> niggle I have. Having a separate routine for turning a ubyte[] into a
> hex string is probably better than attaching toString to a hash
> object.

We could also provide a finishString function or something like that.
But toString returning an intermediate state would be confusing.
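
Something along these lines is what I have in mind. This is only a
sketch: bytesToHex and XorDigest are hypothetical names invented for
the example, finishString is just the helper suggested above, and the
XOR "digest" exists only so the code has something to run on:

```d
import std.stdio : writeln;
import std.string : representation;

// Hypothetical free function: turn raw bytes into a lowercase hex string.
string bytesToHex(const(ubyte)[] bytes)
{
    static immutable hexDigits = "0123456789abcdef";
    auto result = new char[bytes.length * 2];
    foreach (i, b; bytes)
    {
        result[2 * i] = hexDigits[b >> 4];       // high nibble
        result[2 * i + 1] = hexDigits[b & 0x0F]; // low nibble
    }
    return result.idup;
}

// Hypothetical helper: finish the digest and return the result as hex.
// Works for any type whose finish() returns a static or dynamic ubyte array.
string finishString(D)(ref D digest)
{
    auto raw = digest.finish();
    return bytesToHex(raw[]);
}

// Toy digest for demonstration only (XOR folding, NOT a real hash).
struct XorDigest
{
    ubyte[2] state;

    void put(const(ubyte)[] data)
    {
        foreach (i, b; data)
            state[i % 2] ^= b;
    }

    ubyte[2] finish()
    {
        auto result = state;
        state[] = 0; // reset the internal state after finishing
        return result;
    }
}

void main()
{
    ubyte[] sample = [0x00, 0xAB, 0xFF];
    assert(bytesToHex(sample) == "00abff");

    XorDigest d;
    d.put("abc".representation);
    // 'a' ^ 'c' = 0x02 and 'b' = 0x62
    assert(finishString(d) == "0262");

    writeln(bytesToHex(sample));
}
```

Keeping the hex conversion as a free function means the digest object
never has to hold a string, and the same helper works on any ubyte[].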

Tango doesn't seem to offer a way to peek at the current state. But if
it's really useful, it could be added.

BTW: Do you know why digestSize is a function in Tango? Are there
digests that produce variable-length hashes?


