Dev.to daily challenge Duplicate Encoder

Vladimir Panteleev thecybershadow.lists at gmail.com
Wed Jun 24 19:13:39 UTC 2020


On Wednesday, 24 June 2020 at 17:21:34 UTC, Jesse Phillips wrote:
> 2. 
> https://github.com/JesseKPhillips/devto-challenge259-dupencoder

In `duplicateEncode_go`, the code is inconsistent in whether it 
wants to process the string by code unit or code point. 
`occurences` [sic] is declared as `int[dchar]`, but later it uses 
a standard (`char`-wise) `foreach` loop, and the `std.ascii` 
variant of `toLower`.

Replacing the type with `int[256]` results in roughly a 2x 
speedup.
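
For illustration, a purely code-unit-wise version could look roughly 
like the sketch below. This is not the code from the repository: the 
name `duplicateEncodeCounts` is made up, and `ONCE` / `MANY` are 
assumed to be `'('` and `')'` as in the usual statement of the 
challenge.

```d
import std.ascii : toLower;
import std.exception : assumeUnique;

// Assumed stand-ins for the repository's constants (hypothetical values).
enum char ONCE = '(';
enum char MANY = ')';

// Hypothetical two-pass variant that works on code units throughout.
string duplicateEncodeCounts(string str)
{
    // One counter per possible code unit; static arrays are zero-initialized.
    int[256] occurrences;

    // First pass: count every code unit, ignoring ASCII case.
    foreach (char c; str)
        occurrences[c.toLower]++;

    // Second pass: emit MANY for repeated code units, ONCE otherwise.
    auto result = new char[str.length];
    foreach (i, char c; str)
        result[i] = occurrences[c.toLower] > 1 ? MANY : ONCE;

    // `result` has no other references, so it can be treated as immutable.
    return result.assumeUnique;
}
```

The fixed-size table replaces the hash lookups of the associative 
array with plain array indexing, which is presumably where most of 
the speedup comes from.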

I think it might be possible to keep the state in an AVX (256-bit) 
register and avoid going to RAM at all.

I also don't understand the choices that led to the 
`duplicateEncode_pointer` implementation. This version is much 
faster:

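string duplicateEncode_pointer(string str) {
     import std.ascii : toLower;
     import std.conv : to;

     // ONCE and MANY come from the surrounding module.
     auto result = str.dup;
     // For each lowercased code unit: where in `result` it was first seen.
     char*[256] locMap;

     foreach(ref c; result)
     {
         auto p = &locMap[c.toLower()];
         if (*p)
             **p = c = MANY; // seen before: mark first occurrence and this one
         else
         {
             c = ONCE;
             *p = &c; // remember the first occurrence
         }
     }

     return result.to!string;
}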


