kill the commas! (phobos code cleanup)

via Digitalmars-d digitalmars-d at puremagic.com
Sun Sep 7 03:45:22 PDT 2014


On Sunday, 7 September 2014 at 10:29:41 UTC, ketmar via 
Digitalmars-d wrote:
> index nth symbol! ucs-4 (aka dchar/dstring) is ok though.

For western text strings UTF-8 is much better due to cache
efficiency. You can speed up indexing using SSE or dedicated
data structures.
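
To make the size argument concrete, here is a rough sketch in Python
(not D, and not SSE-accelerated). The sample text and the helper name
nth_code_point are my own illustrative assumptions. It compares the
byte footprint of the same mostly-ASCII text in UTF-8 and UCS-4, and
indexes the nth symbol of UTF-8 data with a plain linear scan, assuming
the input is valid UTF-8.

```python
def nth_code_point(data: bytes, n: int) -> str:
    """Return the n-th (0-based) code point of valid UTF-8 data.

    Hypothetical helper: a plain linear scan, no SIMD, no side index.
    """
    if n < 0:
        raise IndexError("code point index out of range")
    count = -1
    start = None
    for i, b in enumerate(data):
        # Continuation bytes look like 10xxxxxx; anything else starts
        # a new code point.
        if (b & 0xC0) != 0x80:
            count += 1
            if count == n:
                start = i
            elif count == n + 1:
                return data[start:i].decode("utf-8")
    if start is not None:
        # The requested code point is the last one in the buffer.
        return data[start:].decode("utf-8")
    raise IndexError("code point index out of range")


# Mostly-ASCII sample with two non-ASCII letters (assumed example).
text = "Sm\u00f6rg\u00e5sbord for everyone"
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")  # 4 bytes per code point, no BOM

print(len(utf8), len(utf32))
print(nth_code_point(utf8, 2))
```

The scan is O(n) per lookup, which is the cost being complained about
in the quoted message; a dedicated side index (say, a byte offset
stored for every k-th code point) would trade a little memory for
faster lookups.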

The point of having unique immutable strings is that they compare
by reference only and that you can have auxiliary data structures
that classify them if needed.
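
A minimal sketch of that idea, in Python for brevity. The InternPool
class and its tag/tag_of methods are hypothetical names invented for
this illustration, not an existing library API. Each distinct string
value is stored once, so equality becomes an identity check, and a
side table keyed by identity can classify the strings.

```python
class InternPool:
    """Hypothetical pool that keeps one canonical object per string value."""

    def __init__(self):
        self._pool = {}   # value -> canonical string object
        self._tags = {}   # id(canonical object) -> classification label

    def intern(self, s: str) -> str:
        # Return the stored object if the value is known, else store s.
        return self._pool.setdefault(s, s)

    def tag(self, s: str, label: str) -> None:
        # Auxiliary classification, keyed by identity of the canonical
        # object. The pool keeps that object alive, so the id is stable.
        self._tags[id(self.intern(s))] = label

    def tag_of(self, canonical: str):
        # Expects a string previously returned by intern().
        return self._tags.get(id(canonical))


pool = InternPool()
# Build the same value twice at run time, then intern both.
a = pool.intern("".join(["pho", "bos"]))
b = pool.intern("".join(["ph", "obos"]))

pool.tag(a, "library")
print(a is b, pool.tag_of(b))
```

Once everything goes through the pool, comparing two strings never has
to look at their contents.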

I think the D approach to strings is unpleasant. You should not 
have slices of strings, only slices of ubyte arrays.

If you want real speedups for streams of symbols you have to move
into the landscape of Huffman encoding, tries, dedicated
data structures…
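
As one example from that landscape, here is a hedged sketch of
building a Huffman code table in Python. The function name
huffman_code and the sample input are assumptions for illustration
only; a real implementation would pack bits rather than build "0"/"1"
strings.

```python
import heapq
from collections import Counter


def huffman_code(text: str) -> dict:
    """Return a dict mapping each symbol in text to a bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


sample = "aaaaaaaabbbc"  # skewed symbol frequencies (assumed example)
codes = huffman_code(sample)
encoded_bits = sum(len(codes[ch]) for ch in sample)
print(codes, encoded_bits, 8 * len(sample))
```

Frequent symbols get short codes, so a skewed stream shrinks well
below 8 bits per symbol.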

Having uniform string support in libraries (i.e. only supporting
UTF-8) is a clear advantage IMO, one that will allow for APIs that
are SSE-backed and performant.

