New hash API: Update
Alex Rønne Petersen
alex at lycus.org
Sun Jun 24 12:40:53 PDT 2012
On 24-06-2012 17:23, Johannes Pfau wrote:
> I'm mostly finished with my hash API proposal. I also ported the
> existing crc, md5 and the proposed sha1 hash to this new API.
Thanks.
>
> I changed the namespace to std.util.digest. Andrei once said he thinks
> std.digest/std.hash is a too narrow package and someone else said
> putting crc into std.crypto.digest is ridiculous. So I did what tango
> and other libraries do and created a std.util module.
I am strongly against this. Java's 'java.util' package should be a fair
indicator of why; text processing, collections, date/time, ...
>
> I think std.uuid would also fit well into std.util so it'd become
> std.util.uuid.
I really really don't like where that is going. A package with a name
like 'util' basically means /nothing/. It's as good as having this in
the 'std' package in the first place. I understand that you want to
group things into proper packages (I do this heavily in my projects),
but we need a better name than 'util'. I really don't think there's
anything wrong with just plain 'std.hash', especially since we're likely
to add more algorithms over time.
>
> Here's the documentation:
> http://dl.dropbox.com/u/24218791/d/phobos/std_util_digest_digest.html
> http://dl.dropbox.com/u/24218791/d/phobos/std_util_digest_crc.html
> http://dl.dropbox.com/u/24218791/d/phobos/std_util_digest_md5.html
> http://dl.dropbox.com/u/24218791/d/phobos/std_util_digest_sha.html
These modules aren't quite consistent: There exist many versions of
SHA-1 and so the module is named 'sha'. Makes sense. There are several
MD algorithms, but the module is named 'md5'?
I think we should either have a module per specific algorithm or modules
with neutral names like 'md'.
>
> And here's a pull request for the code:
> https://github.com/D-Programming-Language/phobos/pull/646
>
> Github branch:
> https://github.com/jpf91/phobos/tree/newHash
>
>
>
> There are still some open questions:
>
> OOP interface: Digest.finish()
>
> This can only throw if the supplied buffer is too small. Make this
> nothrow & throw an Error on too small buffer? Or check buffer only in
> debug mode using asserts or preconditions?
Error. It's a logic error to pass in a too small buffer.
Also, most (if not all) Digest methods really should be pure nothrow.
Same for some free functions (I think it's good practice to mark
template functions as pure nothrow explicitly if you want to guarantee
this).
I agree with the overall interface. I think it's going in the right
direction.
>
>
>
> CRC32:
> The current implementation doesn't seem to be compliant
> to the 'common' CRC-32-IEEE 802.3 form, at least it doesn't pass these
> test vectors:
> http://www.febooti.com/products/filetweak/members/hash-and-crc/test-vectors/
> http://www.lammertbies.nl/comm/info/crc-calculation.html
> http://rosettacode.org/wiki/CRC-32
>
> Turns out we'd need to invert the value and make sure to use LSB first
> order. I made those changes, so now we produce the same output as the
> rest of the world, but the new std.util.digest.crc will produce
> different values than the old std.crc now. Is this OK?
Yes, correctness over *.
>
> I'm also not happy about the crc license:
> provided that the above copyright notice appear in all copies and
> * that both that copyright notice and this permission notice appear
> * in supporting documentation.
Yes, this annoyed me too.
>
>
> I'll put all my modifications under boost license. The only thing left
> of the original code is the crc32_table and the implementation of the
> put function.
>
> The table and the one line of code is also available as public domain
> code here:
> http://www.csbruce.com/~csbruce/software/crc32.c
>
> So I think it should be possible to change the license to boost?
>
>
Yes IMO. Anyone could have gone and implemented it from that C file.
--
Alex Rønne Petersen
alex at lycus.org
http://lycus.org
More information about the Digitalmars-d
mailing list