Proposal for fixing dchar ranges

Mon Mar 10 14:51:37 PDT 2014

On Monday, 10 March 2014 at 18:13:14 UTC, Steven Schveighoffer 
wrote:
> Indexing is rarely a feature one needs or should use, 
> especially with encoded strings.

If I were writing something like a chat or terminal window, I 
would want to be able to jump to chunks of text based on some 
sort of buffer length, then search for the actual character 
boundaries. Similarly, if I were indexing text, I wouldn't care 
what the underlying data is, only whether a particular set of n 
bytes has been seen together somewhere in a document. In the 
latter case, I don't need to interpret the data as text while 
indexing. But once I perform an actual search and want to jump 
the user to that line in the file, it would be good to be able 
to take a byte offset that I had stored in the index and convert 
it to a textual position.
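
A rough sketch of the byte-offset case, assuming the buffer holds 
valid UTF-8. The helper names (isContinuation, snapToCodePoint, 
lineOf) are made up for illustration and are not Phobos functions:

```d
/// True if `b` is a UTF-8 continuation byte (bit pattern 10xxxxxx).
bool isContinuation(char b) pure nothrow @nogc @safe
{
    return (b & 0xC0) == 0x80;
}

/// Move an arbitrary byte offset back to the first byte of the
/// code point that contains it. Offsets past the end clamp to
/// buf.length.
size_t snapToCodePoint(const(char)[] buf, size_t offset)
    pure nothrow @nogc @safe
{
    if (offset >= buf.length)
        return buf.length;
    while (offset > 0 && isContinuation(buf[offset]))
        --offset;
    return offset;
}

/// 1-based line number of the byte at `offset`, found by counting
/// the '\n' code units that precede it. No decoding is needed,
/// because '\n' never appears inside a multi-byte sequence.
size_t lineOf(const(char)[] buf, size_t offset)
    pure nothrow @nogc @safe
{
    if (offset > buf.length)
        offset = buf.length;
    size_t line = 1;
    foreach (b; buf[0 .. offset]) // iterates code units, not code points
        if (b == '\n')
            ++line;
    return line;
}

unittest
{
    // U+00E9 and U+00F6 each take two bytes in UTF-8.
    string text = "h\u00E9llo\nw\u00F6rld\n";
    assert(snapToCodePoint(text, 2) == 1); // middle of U+00E9
    assert(snapToCodePoint(text, 9) == 8); // middle of U+00F6
    assert(lineOf(text, 9) == 2);
}
```

The index only ever stores raw byte offsets. Text interpretation 
happens at the last moment, when a position has to be shown to 
the user.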

I do think that D should have something like

alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;

These would sit on top of an underlying immutable(xchar)[] 
buffer, providing variants of things like foreach and length 
based on code-point or grapheme boundaries. But I don't think 
there's any value in reinterpreting "string". Not being a struct 
or an object, it doesn't have the extensibility to be useful for 
all the variations of access that working with Unicode and the 
underlying bytes warrants.
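
To make the idea concrete, here is a minimal sketch of what such 
a wrapper might look like. UTF is the hypothetical template from 
the aliases above, and the member names (codeUnits, codePoints, 
graphemes, byCodePoint) are my own assumptions, not an existing 
Phobos API. It leans on std.utf.byDchar, std.uni.byGrapheme and 
walkLength, which do exist in current Phobos:

```d
import std.range.primitives : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

/// Hypothetical wrapper over a buffer of code units.
struct UTF(C)
    if (is(C == char) || is(C == wchar) || is(C == dchar))
{
    immutable(C)[] data; // the raw code units stay accessible

    /// Length in code units (O(1), same as data.length).
    size_t codeUnits() const { return data.length; }

    /// Length in code points (decodes the buffer).
    size_t codePoints() const { return data.byDchar.walkLength; }

    /// Length in grapheme clusters (decodes and segments).
    size_t graphemes() const { return data.byGrapheme.walkLength; }

    /// Range of dchar for foreach over code points.
    auto byCodePoint() const { return data.byDchar; }
}

alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;

unittest
{
    // 'e' followed by U+0308 (combining diaeresis) is one grapheme
    // but two code points; U+0308 takes two bytes in UTF-8.
    auto s = String8("noe\u0308l");
    assert(s.codeUnits == 6);
    assert(s.codePoints == 5);
    assert(s.graphemes == 4);
}
```

Each notion of length gets its own name, so the caller has to say 
which one is meant, and the underlying buffer remains available 
for byte-level work.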