maketrans and translate

Jonathan M Davis jmdavisProg at gmx.com
Mon Feb 13 12:48:27 PST 2012


On Monday, February 13, 2012 14:08:16 bearophile wrote:
> Jonathan M Davis:
> > Do you have data to back up that there is a significant speed difference?
> 
> I have just written a small benchmark; it contains both the URL to the test
> data and the timings I am seeing on a slow PC. If your PC is faster feel
> free to use it on more data (creating a larger input file on disk,
> concatenating many copies of the same test data). If you see bugs in the
> benchmark please tell me.
> 
> version3 uses the old translate. version4 uses the new translate.
> version1InPlace is a reference point, it works in-place. version2InPlace is
> similar, uses a linear scan for the chars to remove, and it seems a bit
> slower than version1InPlace (probably because the small deltab array fits
> well in the L1 cache).
> 
> If both the benchmark and the timings are correct, the old translate seems
> almost 5 times faster than the new one (and this is not strange, the new
> translate performs two associative array lookups for each char of the input
> string).

I'll look into the best way to handle it. Those functions aren't being 
deprecated this release regardless.
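
To make the cost difference under discussion concrete, here is a minimal
sketch of the two lookup strategies. It is not the Phobos implementation:
makeTable, translateWithTable, and translateWithAA are hypothetical names
chosen for illustration, and the associative-array version does a single
lookup per character rather than the two mentioned in the quoted message.

```d
import std.stdio : writeln;

// Hypothetical helper: builds a 256-entry table that maps every byte to
// itself, then overrides each byte in `from` with the matching byte in `to`.
char[256] makeTable(const(char)[] from, const(char)[] to)
{
    assert(from.length == to.length);
    char[256] table;
    foreach (i, ref c; table)
        c = cast(char) i;
    foreach (i, c; from)
        table[c] = to[i];
    return table;
}

// Table-based translation (ASCII-only input assumed):
// one array index per input byte, no hashing.
char[] translateWithTable(const(char)[] input, const char[256] table)
{
    auto result = new char[input.length];
    foreach (i, c; input)
        result[i] = table[c];
    return result;
}

// Associative-array translation: the input is decoded to code points and
// each one costs a hash lookup. Unmapped characters are copied unchanged.
dchar[] translateWithAA(const(char)[] input, const dchar[dchar] map)
{
    dchar[] result;
    foreach (dchar c; input)
    {
        if (auto p = c in map)
            result ~= *p;
        else
            result ~= c;
    }
    return result;
}

void main()
{
    // Complement a DNA sequence with both approaches.
    auto table = makeTable("ATCG", "TAGC");
    writeln(translateWithTable("GATTACA", table));

    dchar[dchar] map;
    map['A'] = 'T';
    map['T'] = 'A';
    map['C'] = 'G';
    map['G'] = 'C';
    writeln(translateWithAA("GATTACA", map));
}
```

The table version only works when every input code unit fits in the table,
which is why it is tied to ASCII; the associative-array version handles any
code point at the price of hashing each one.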

> > We're trying to not have any ASCII-only string stuff.
> 
> There is a large amount of data around that is essentially text, but it's
> ASCII, not Unicode: biological data, text data coming out of
> instruments, etc. Handling it as binary data is not good.

In most cases, it should make no difference. You could use 1, 2, 3, and 4 as 
easily as you could use A, T, C, and G. Most algorithms don't care one whit 
whether you're dealing with characters or numbers. The only problem that I see 
is cases where there's an algorithm which would normally only be done with 
characters and not numbers and which therefore is in std.string and not 
std.array or std.algorithm - and translate falls into that category.
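
As a rough illustration of the point that generic algorithms work the same
on bytes as on characters, the sketch below counts elements with
std.algorithm's count, once on the raw bytes of an ASCII string and once on
small integer codes. countBase and countCode are hypothetical helper names,
not Phobos functions.

```d
import std.algorithm : count;
import std.string : representation;

// Counts occurrences of `base`, viewing the ASCII string as raw bytes
// so that no UTF-8 decoding takes place.
size_t countBase(string sequence, char base)
{
    immutable(ubyte)[] raw = sequence.representation;
    return raw.count(cast(ubyte) base);
}

// The same algorithm applied to numeric codes (for example 1, 2, 3, 4
// standing in for A, T, C, G).
size_t countCode(const(ubyte)[] codes, ubyte code)
{
    return codes.count(code);
}

void main()
{
    assert(countBase("GATTACA", 'A') == 3);
    assert(countCode([1, 2, 3, 4, 1], 1) == 2);
}
```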

Regardless, in general, Phobos is not going to have ASCII-specific string 
processing.

- Jonathan M Davis

