kill the commas! (phobos code cleanup)
via Digitalmars-d
digitalmars-d at puremagic.com
Sun Sep 7 03:45:22 PDT 2014
On Sunday, 7 September 2014 at 10:29:41 UTC, ketmar via
Digitalmars-d wrote:
> index nth symbol! ucs-4 (aka dchar/dstring) is ok though.
For Western text, UTF-8 strings are much better due to cache
efficiency. You can speed up indexing using SSE or dedicated
data structures.
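
A rough illustration of that trade-off, sketched in Python rather
than D. The helper name `utf8_index` and the sample text are made up
for this example, and the scan assumes valid UTF-8 input. It shows
that UTF-8 is compact for mostly-ASCII text, while finding the nth
code point needs a linear scan unless you add an index on top:

```python
def utf8_index(data: bytes, n: int) -> str:
    """Return the n-th code point (0-based) of valid UTF-8 data.

    Hypothetical helper: walks the bytes linearly, treating every
    byte that is not a continuation byte (10xxxxxx) as the start
    of a new code point.
    """
    seen = 0
    i = 0
    while i < len(data):
        # Find the end of the code point that starts at i.
        j = i + 1
        while j < len(data) and (data[j] & 0xC0) == 0x80:
            j += 1
        if seen == n:
            return data[i:j].decode("utf-8")
        seen += 1
        i = j
    raise IndexError("code point index out of range")


text = "na\u00efve caf\u00e9"       # 10 code points, 2 of them non-ASCII
utf8 = text.encode("utf-8")         # 12 bytes
ucs4 = text.encode("utf-32-le")     # 40 bytes, 4 per code point

third = utf8_index(utf8, 2)         # linear scan in UTF-8
```

With UCS-4 the same lookup is a constant-time offset (`4 * n`), but
the buffer here is more than three times larger, which is what
hurts cache behaviour on Western text.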
The point of having unique immutable strings is that they compare
by reference only and that you can have auxiliary data structures
that classify them if needed.
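
A minimal sketch of that idea (string interning), again in Python
and not taken from any D library. `InternPool`, its methods and the
"kind" labels are hypothetical names for this example. Each distinct
byte sequence gets one canonical immutable object, so equality
becomes an identity check, and a side table can classify the
canonical objects:

```python
class InternPool:
    """Hypothetical pool holding one canonical copy per distinct string."""

    def __init__(self):
        self._canonical = {}   # content -> canonical bytes object
        self._kind = {}        # id(canonical object) -> classification

    def intern(self, data: bytes) -> bytes:
        # Return the stored object if the content is already known,
        # otherwise remember this object as the canonical one.
        return self._canonical.setdefault(data, data)

    def classify(self, s: bytes, kind: str) -> None:
        # Auxiliary structure keyed by identity, not by content.
        # Safe here because the pool keeps every canonical object alive.
        self._kind[id(self.intern(s))] = kind

    def kind_of(self, s: bytes):
        return self._kind.get(id(s))


pool = InternPool()
a = pool.intern(b"phobos")
raw = bytes(bytearray(b"phobos"))   # equal content, separate object
b = pool.intern(raw)

same = a is b                       # reference comparison only
pool.classify(a, "library")
label = pool.kind_of(b)
```

The cost is paid once at interning time (a hash and a lookup);
after that, comparisons never touch the string contents.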
I think the D approach to strings is unpleasant. You should not
have slices of strings, only slices of ubyte arrays.
If you want real speedups for streams of symbols you have to move
into the landscape of Huffman encoding, tries, dedicated
data structures…
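
To make one of those concrete, here is a small byte-level trie,
sketched in Python under the assumption that keys are plain UTF-8
byte strings. `ByteTrie` and its methods are illustrative names, not
an existing API. Lookup cost depends on the key length, not on how
many keys are stored:

```python
class ByteTrie:
    """Hypothetical trie keyed on raw bytes (for example UTF-8 text)."""

    def __init__(self):
        self.children = {}      # byte value -> child node
        self.terminal = False   # True if a stored key ends at this node

    def insert(self, key: bytes) -> None:
        node = self
        for byte in key:
            child = node.children.get(byte)
            if child is None:
                child = ByteTrie()
                node.children[byte] = child
            node = child
        node.terminal = True

    def contains(self, key: bytes) -> bool:
        node = self
        for byte in key:
            node = node.children.get(byte)
            if node is None:
                return False
        return node.terminal


trie = ByteTrie()
for word in (b"std", b"stdio", b"string"):
    trie.insert(word)

found = trie.contains(b"stdio")
prefix_only = trie.contains(b"st")   # a prefix, but not a stored key
```

Because it works on bytes, the structure never needs to decode the
text into code points.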
Having uniform string support in libraries (i.e. only supporting
UTF-8) is a clear advantage IMO, since it allows for APIs that
are SSE-backed and performant.