kill the commas! (phobos code cleanup)

ketmar via Digitalmars-d digitalmars-d at puremagic.com
Sun Sep 7 04:30:50 PDT 2014


On Sun, 07 Sep 2014 10:45:22 +0000
via Digitalmars-d <digitalmars-d at puremagic.com> wrote:

> For western text strings utf-8 is much better due to cache 
> efficiency. You can speed it up using SSE or dedicated 
> datastructures.
that's what i call efficiency! using SIMD for string indexing!

> The point of having unique immutable strings is that they compare 
> by reference only and that you can have auxiliary datastructures 
> that classify them if needed.
and this will fail with a compacting gc. heh.

> I think the D approach to strings is unpleasant. You should not 
> have slices of strings, only slices of ubyte arrays.
oh, no, thanks. casting strings back and forth for slicing is not fun.
and writing parsers using string slicing is fun.

> If you want real speedups for streams of symbols you have to move 
> into the landscape of huffman-encoding, tries, dedicated 
> datastructures…
or just ditch utf-8 and use ucs-4. this will speed up the most
frequent string operations: correct indexing and slicing.
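
a minimal sketch of the difference, with "correct" taken to mean "by code
point" (that reading is an assumption). `nthCodePoint` is a hypothetical
helper written for this example, not something from phobos; `toUTF32` is
the existing `std.utf` function.

```d
import std.stdio : writeln;
import std.utf : toUTF32;

// hypothetical helper, not from phobos: the n-th code point of a utf-8 string.
// utf-8 is variable-width, so this has to decode from the start: O(n).
dchar nthCodePoint(string s, size_t n)
{
    size_t i = 0;
    foreach (dchar ch; s)   // foreach with a dchar variable decodes utf-8
    {
        if (i++ == n) return ch;
    }
    assert(0, "index out of range");
}

void main()
{
    string  s = "na\u00EFve";   // U+00EF takes two utf-8 code units
    dstring d = s.toUTF32;      // one 32-bit code unit per code point

    writeln(s.length);          // number of utf-8 code units
    writeln(d.length);          // number of code points
    writeln(nthCodePoint(s, 2) == d[2]); // same code point, but d[2] is O(1)
    writeln(d[1 .. 3]);         // slicing a dstring never cuts a code point in half
}
```

the price is memory: every code point in a `dstring` takes four bytes,
ascii included.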

> Having uniform string support in libraries (i.e. only supporting 
> utf-8) is a clear advantage IMO, that will allow for APIs that 
> are SSE backed and performant.
utf-8 was not invented as an encoding for internal string representation.
it's merely for data interchange. i myself believe that a language should
not do any encoding/decoding on a given string without being explicitly asked.
i.e. `foreach (dchar ch; s)` must be the same as `foreach (char ch; s)`
when s is `string`. for any decoding i must use `foreach (ch; s.byUtf8Char)`.
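
for reference, a small sketch of what D does today, where the type of the
loop variable silently decides whether decoding happens. `byUtf8Char` above
is a proposed name and does not exist in phobos; the explicit spellings used
below are `std.utf.byCodeUnit` and `std.utf.byDchar`, assuming a phobos
recent enough to have them. the two `count*` functions are made up for this
example.

```d
import std.range : walkLength;
import std.utf : byCodeUnit, byDchar;

// illustrative helpers: count loop iterations for each loop variable type
size_t countWithChar(string s)
{
    size_t n;
    foreach (char ch; s) ++n;    // no decoding: one step per utf-8 code unit
    return n;
}

size_t countWithDchar(string s)
{
    size_t n;
    foreach (dchar ch; s) ++n;   // implicit decoding: one step per code point
    return n;
}

void main()
{
    string s = "na\u00EFve";     // U+00EF is two code units in utf-8

    assert(countWithChar(s) == 6);
    assert(countWithDchar(s) == 5);

    // the same two iterations, requested explicitly
    assert(s.byCodeUnit.walkLength == 6);
    assert(s.byDchar.walkLength == 5);
}
```

under the proposal both `foreach` loops would take six steps, and only the
explicit range would decode.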

the whole "let's use utf-8 as internal string representation" was a
mistake. and i'm not talking about D here.