kill the commas! (phobos code cleanup)
ketmar via Digitalmars-d
digitalmars-d at puremagic.com
Sun Sep 7 04:30:50 PDT 2014
On Sun, 07 Sep 2014 10:45:22 +0000
via Digitalmars-d <digitalmars-d at puremagic.com> wrote:
> For western text strings utf-8 is much better due to cache
> efficiency. You can speed it up using SSE or dedicated
> datastructures.
that's what i call efficiency! using SIMD for string indexing!
> The point of having unique immutable strings is that they compare
> by reference only and that you can have auxiliary datastructures
> that classify them if needed.
and this will fail with compacting gc. heh.
> I think the D approach to strings is unpleasant. You should not
> have slices of strings, only slices of ubyte arrays.
oh, no, thanks. casting strings back and forth for slicing is not fun.
and writing parsers using string slicing is fun.
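
to illustrate the slicing point, here is a minimal sketch of a parser step that works purely on string slices. `splitKeyValue` is a hypothetical helper made up for this example; it assumes the separator is the ASCII character '=', so cutting at its byte offset never lands inside a multi-byte utf-8 sequence.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical helper: split "key=value" into two slices of the input.
// Nothing is copied; both results point into the memory of `line`.
string[2] splitKeyValue(string line)
{
    immutable pos = line.indexOf('=');
    if (pos < 0)
        return [line, ""];          // no separator: whole line is the key
    return [line[0 .. pos], line[pos + 1 .. $]];
}

void main()
{
    string line = "name=ketmar";
    auto kv = splitKeyValue(line);
    writeln(kv[0], " -> ", kv[1]);
    // The key slice starts at the same address as the original string.
    writeln(kv[0].ptr is line.ptr);
}
```

no casts to and from ubyte arrays are needed, and the slices stay typed as `string`.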
> If you want real speedups for streams of symbols you have to move
> into the landscape of huffman-encoding, tries, dedicated
> datastructures…
or just ditch utf-8 and use ucs-4. this will speed up the most
frequent string operations: correct indexing and slicing.
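
a small sketch of the difference, assuming D's `dstring` (utf-32) stands in for ucs-4 here. the sample word is arbitrary; the point is only that indexing a `string` addresses utf-8 code units, while indexing a `dstring` addresses code points directly.

```d
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    string  s = "héllo";   // UTF-8: 'é' occupies two code units
    dstring d = "héllo"d;  // UTF-32: one code unit per code point

    writeln(s.length);      // number of UTF-8 code units
    writeln(s.walkLength);  // number of code points, found by decoding
    writeln(d.length);      // number of code points, no decoding needed
    writeln(d[1]);          // constant-time access to the second code point
    writeln(d[1 .. 3]);     // slicing by code point index
}
```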
> Having uniform string support in libraries (i.e. only supporting
> utf-8) is a clear advantage IMO, that will allow for APIs that
> are SSE backed and performant.
utf-8 was not invented as an encoding for internal string representation.
it's merely for data interchange. i myself believe that a language should
not do any encoding/decoding on a given string without being asked explicitly.
i.e. `foreach (dchar ch; s)` must be the same as `foreach (char ch; s)`
when s is `string`. for any decoding i must use `foreach (ch; s.byUtf8Char)`.
the whole "let's use utf-8 as internal string representation" was a
mistake. and i'm not talking about D here.
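
for reference, a sketch of how D behaves today, which is the behaviour being objected to above. `countAs` is a hypothetical helper written for this example. `byUtf8Char` from the text is a proposed name, not an existing function; the closest existing explicit spellings i am aware of are `std.utf.byCodeUnit` and `std.utf.byDchar`, used below.

```d
import std.range : walkLength;
import std.stdio : writeln;
import std.utf : byCodeUnit, byDchar;

// Hypothetical helper: count loop iterations when the loop variable
// is declared with type T.
size_t countAs(T)(string s)
{
    size_t n;
    foreach (T ch; s)
        ++n;
    return n;
}

void main()
{
    string s = "héllo";
    writeln(countAs!char(s));          // iterates UTF-8 code units
    writeln(countAs!dchar(s));         // compiler decodes implicitly
    writeln(s.byCodeUnit.walkLength);  // explicit: no decoding
    writeln(s.byDchar.walkLength);     // explicit: decoding asked for by name
}
```

the two `foreach` forms give different counts for the same `string`, because declaring the loop variable as `dchar` silently turns on decoding. under the proposal above both would give the same count, and only the explicit range would decode.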