Proposal for fixing dchar ranges
Chris Williams
yoreanon-chrisw at yahoo.co.jp
Mon Mar 10 14:51:37 PDT 2014
On Monday, 10 March 2014 at 18:13:14 UTC, Steven Schveighoffer
wrote:
> Indexing is rarely a feature one needs or should use,
> especially with encoded strings.
If I were writing something like a chat or terminal window, I
would want to be able to jump to chunks of text based on some
sort of buffer length, then search for actual character
boundaries. Similarly, if I were indexing text, I wouldn't care
what the underlying data is, just whether any particular set of
n bytes has been seen together somewhere in a document. For the
latter case, I don't need to be able to interpret the data as
text while indexing, but once I perform an actual search and want
to jump the user to that line in the file, being able to take a
byte offset that I had stored in the index and convert that to a
textual position would be good.
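As a rough sketch of that last step (assuming a UTF-8 buffer; the
names alignToCodePoint, toTextPos and TextPos are made up for
illustration and are not part of Phobos), a stored byte offset can
be backed up to the nearest code-point boundary and then turned
into a line/column pair by scanning code units only:

```d
/// 1-based line and column; the column counts code points, not graphemes.
struct TextPos
{
    size_t line;
    size_t column;
}

/// Back up from an arbitrary byte offset to the first byte of the
/// code point that contains it. Assumes `buf` holds valid UTF-8.
size_t alignToCodePoint(const(char)[] buf, size_t offset)
{
    if (offset >= buf.length)
        return buf.length;
    // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    while (offset > 0 && (buf[offset] & 0xC0) == 0x80)
        --offset;
    return offset;
}

/// Convert a byte offset into a line/column position.
TextPos toTextPos(const(char)[] buf, size_t offset)
{
    offset = alignToCodePoint(buf, offset);
    auto pos = TextPos(1, 1);
    // Iterating as `char` walks code units, so nothing is decoded.
    foreach (char c; buf[0 .. offset])
    {
        if (c == '\n')
        {
            ++pos.line;
            pos.column = 1;
        }
        else if ((c & 0xC0) != 0x80) // count only lead bytes
        {
            ++pos.column;
        }
    }
    return pos;
}

void main()
{
    string text = "h\u00E9llo\nw\u00F6rld";
    // Byte 9 is the second byte of the two-byte U+00F6.
    assert(alignToCodePoint(text, 9) == 8);
    assert(toTextPos(text, 9) == TextPos(2, 2));
}
```

The index itself never has to decode anything; only the final
jump to a position looks at code-point boundaries.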
I do think that D should have something like:
alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;
And that those sit on top of an underlying immutable(xchar)[]
buffer, providing variants of things like foreach and length
based on code-point or grapheme boundaries. But I don't think
there's any value in reinterpreting "string". Not being a struct
or an object, it doesn't have the extensibility to be useful for
all the variations of access that working with Unicode and the
underlying bytes warrants.
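To make the idea concrete, here is a minimal sketch of what such a
wrapper might look like. Only the name UTF and the three aliases
come from the proposal above; every member name is an assumption
for illustration, and the iteration simply leans on std.utf.byDchar
and std.uni.byGrapheme from Phobos:

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

/// Hypothetical wrapper over an immutable code-unit buffer.
struct UTF(Char)
{
    immutable(Char)[] data; // the underlying code units

    /// Length in code units (bytes for char); O(1).
    size_t codeUnitLength() { return data.length; }

    /// Length in code points; O(n), decodes the buffer.
    size_t codePointLength() { return data.byDchar.walkLength; }

    /// Length in grapheme clusters; O(n).
    size_t graphemeLength() { return data.byGrapheme.walkLength; }

    /// Ranges for foreach at each level of interpretation.
    auto codePoints() { return data.byDchar; }
    auto graphemes() { return data.byGrapheme; }
}

alias String8 = UTF!char;
alias String16 = UTF!wchar;
alias String32 = UTF!dchar;

void main()
{
    // "noel" with a combining diaeresis (U+0308) after the 'e'.
    auto s8 = String8("noe\u0308l");
    assert(s8.codeUnitLength == 6);  // U+0308 takes two bytes in UTF-8
    assert(s8.codePointLength == 5);
    assert(s8.graphemeLength == 4);  // 'e' + U+0308 form one grapheme

    auto s16 = String16("noe\u0308l"w);
    assert(s16.codeUnitLength == 5);
    assert(s16.graphemeLength == 4);

    size_t n;
    foreach (dchar c; s8.codePoints)
        ++n;
    assert(n == 5);
}
```

Because the wrapper is a struct, each kind of access gets its own
explicitly named member, which a bare string alias cannot offer.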
More information about the Digitalmars-d
mailing list