The Case Against Autodecode
Andrei Alexandrescu via Digitalmars-d
digitalmars-d at puremagic.com
Tue May 31 10:06:16 PDT 2016
On 05/31/2016 12:54 PM, Jonathan M Davis via Digitalmars-d wrote:
> Equality does not require decoding. Similarly, functions like find don't
> either. Something like filter generally would, but it's also not
> particularly normal to filter a string on a by-character basis. You'd
> probably want to get to at least the word level in that case.
It's nice that the stdlib takes care of that.
> To make matters worse, functions like find or splitter are frequently used
> to look for ASCII delimiters, even when the strings themselves contain
> Unicode characters. So, even if decoding were necessary when looking for a
> Unicode character, it's utterly wasteful when the character you're looking
> for is ASCII.
Good idea. We could overload functions such as find on char, wchar, and
dchar. Jonathan, could you look into a PR to do that?
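
For illustration only, here is a minimal sketch of searching for an ASCII delimiter on the code-unit level. It is not the overload proposed above: it just wraps the haystack in std.utf.byCodeUnit so that find sees a range of char instead of decoded dchar. The helper name findAscii is hypothetical and not part of Phobos.

```d
import std.algorithm.searching : find;
import std.utf : byCodeUnit;

// Hypothetical helper (not in Phobos): returns the tail of `haystack`
// starting at the first occurrence of the ASCII character `delim`,
// or an empty string if `delim` does not occur.
string findAscii(string haystack, char delim)
{
    // An ASCII code unit never appears inside a multi-byte UTF-8
    // sequence, so comparing raw code units is safe here.
    assert(delim < 0x80, "this sketch only handles ASCII needles");

    // byCodeUnit exposes the string as a range of char, so no
    // decoding to dchar takes place during the search.
    auto rest = haystack.byCodeUnit.find(delim);

    // Map the remaining code-unit count back onto the original string.
    return haystack[$ - rest.length .. $];
}

void main()
{
    string line = "na\u00EFve,caf\u00E9";

    // Same result as the ordinary range-based find on the string.
    assert(findAscii(line, ',') == line.find(','));
    assert(findAscii(line, ',') == ",caf\u00E9");
    assert(findAscii("abc", ',') == "");
}
```

The result is sliced out of the original string, so the caller gets back an ordinary string rather than the byCodeUnit wrapper.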
> But searching generally does not require decoding so long as
> the same character is always encoded the same way.
Yah, a good rule of thumb is to get the same (consistent, heh) results
for a given string (including a given normalization) regardless of the
encoding used. So e.g. it's nice that walkLength returns the same number
for a string whether it's UTF-8, UTF-16, or UTF-32.
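
A small sketch of that point, assuming the autodecoding behavior under discussion, where walkLength on a string counts code points. The sample text is made up, and the non-ASCII characters are written as escapes so the normalization form is explicit.

```d
import std.range : walkLength;

void main()
{
    // The same text (precomposed form) in the three encodings.
    string  s8  = "h\u00E9llo w\u00F6rld";
    wstring s16 = "h\u00E9llo w\u00F6rld"w;
    dstring s32 = "h\u00E9llo w\u00F6rld"d;

    // walkLength counts code points, so the encoding does not matter.
    assert(s8.walkLength  == 11);
    assert(s16.walkLength == 11);
    assert(s32.walkLength == 11);

    // .length counts code units, so it does depend on the encoding:
    // the two non-ASCII characters take two bytes each in UTF-8.
    assert(s8.length  == 13);
    assert(s16.length == 11);
    assert(s32.length == 11);

    // The "given normalization" caveat: a decomposed e + combining
    // acute accent is two code points, whatever the encoding.
    assert("e\u0301".walkLength  == 2);
    assert("e\u0301"w.walkLength == 2);
    assert("e\u0301"d.walkLength == 2);
    assert("\u00E9".walkLength   == 1);
}
```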
Andrei