Creeping Bloat in Phobos

monarch_dodra via Digitalmars-d digitalmars-d at puremagic.com
Mon Sep 29 07:33:12 PDT 2014


On Sunday, 28 September 2014 at 23:06:28 UTC, Walter Bright wrote:
> It's very hard to disable the autodecode when it is not needed, 
> though the new .byCodeUnit has made that much easier.

One issue with this though is that "byCodeUnit" does not actually 
return an array. As such, by using "byCodeUnit", you are just as 
likely to *hurt* performance as to improve it, for algorithms that 
are optimized for strings.

For example, which would be faster:
"hello world".find(' '); //(1)
"hello world".byCodeUnit.find(' '); //(2)

Currently, (1) is faster :/
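
To make the difference concrete, here is a minimal sketch of the two calls. It only shows that (2) hands the algorithm a wrapper range rather than a built-in array; it does not measure anything. The helper names findInString and findInCodeUnits are made up for illustration.

```d
import std.algorithm : equal, find;
import std.utf : byCodeUnit;

// (1) find on a plain string: the result is still a string (a slice of s).
// Hypothetical helper name, for illustration only.
string findInString(string s, char c)
{
    return s.find(c);
}

// (2) find on the byCodeUnit wrapper: the result is a range of char
// code units, not a built-in array.
// Hypothetical helper name, for illustration only.
auto findInCodeUnits(string s, char c)
{
    return s.byCodeUnit.find(c);
}

void main()
{
    auto r1 = findInString("hello world", ' ');
    auto r2 = findInCodeUnits("hello world", ' ');

    // The two results have different static types.
    static assert(is(typeof(r1) == string));
    static assert(!is(typeof(r2) == string));

    // Both contain the same code units.
    assert(r1 == " world");
    assert(r2.equal(" world".byCodeUnit));
}
```

Since r2 is not an array, any overload of an algorithm that is specialized for arrays or strings will not be selected for it.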

This is a good argument, though, for using ubyte[] or 
std.encoding.AsciiString instead.
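
A rough sketch of the ubyte[] route, assuming the needle is a single ASCII character: std.string.representation views the string as immutable(ubyte)[] without copying, so find works on a real array and the result can be cast back to a string. The helper name findByte is made up for illustration.

```d
import std.algorithm : find;
import std.string : representation;

// Hypothetical helper: search the raw bytes of s for an ASCII character c.
// Assumes c is ASCII, so a match can never land inside a multi-byte sequence.
string findByte(string s, char c)
{
    // View the string as immutable(ubyte)[]; no copy and no decoding.
    immutable(ubyte)[] bytes = s.representation;

    // The range here is a built-in array, and so is the result.
    immutable(ubyte)[] hit = bytes.find(cast(ubyte) c);

    // Cast back to string; hit is a slice of the original memory.
    return cast(string) hit;
}

void main()
{
    assert(findByte("hello world", ' ') == " world");
    assert(findByte("abc", 'x') == "");
}
```

The downside is that the type no longer says the bytes are UTF-8, and the cast back to string is on the caller.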

What I think we (maybe) need though is std.encoding.UTF8Array, 
which explicitly means: This is a range containing UTF8 
characters. I don't want decoding. It's an array you may memchr 
or slice.

