std.d.lexer requirements

Thu Aug 2 00:43:19 PDT 2012

On Thursday, August 02, 2012 00:29:09 Walter Bright wrote:
> > If we want to be able to operate on ranges of UTF-8 or UTF-16, we need to
> > add a concept of variably-length encoded ranges so that it's possible to
> > treat them as both their encoding and whatever they represent (e.g. code
> > point or grapheme in the case of ranges of code units).
> 
> No, this is not necessary.

It is for ranges in general. In the general case, a range of UTF-8 or UTF-16
code units makes no sense whatsoever. Having range-based functions which
understand the encodings and optimize accordingly can be very beneficial. That
already happens with strings, but it can't happen with general ranges unless we
add the concept of a variable-length encoded range, in the same way that we have
the concepts of forward range and random access range. But to actually treat
UTF-8 or UTF-16 code units as the elements of a range just wouldn't work.
Range-based functions operate on elements, and doing stuff like filter or map or
reduce on individual code units doesn't make any sense at all.
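
To illustrate the point about filter, here is a minimal sketch rather than
anything proposed in this thread. The helper names dropCodePoint, dropCodeUnit
and isValidUtf8 are hypothetical, and the sketch assumes a Phobos in which
iterating a string as a range decodes it to dchar:

```d
import std.algorithm : filter;
import std.array : array;
import std.string : representation;
import std.utf : UTFException, validate;

// Hypothetical helper: filter by code point. Iterating a string as a
// range yields decoded dchars, so whole characters are kept or dropped.
dchar[] dropCodePoint(string s, dchar unwanted)
{
    return s.filter!(c => c != unwanted).array;
}

// Hypothetical helper: filter by raw UTF-8 code unit. representation
// exposes the string as bytes, so a multi-byte sequence can be split.
const(ubyte)[] dropCodeUnit(string s, ubyte unwanted)
{
    return s.representation.filter!(b => b != unwanted).array;
}

// Hypothetical helper: report whether the bytes form valid UTF-8.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // U+00E9 is encoded in UTF-8 as two code units: 0xC3 0xA9.
    string s = "h\u00E9llo";
    assert(s.length == 6); // length counts code units, not characters

    // Filtering code points removes the character cleanly.
    assert(dropCodePoint(s, '\u00E9') == "hllo"d);

    // Filtering code units removes only the trailing byte 0xA9 and
    // leaves a dangling lead byte 0xC3, which is not valid UTF-8.
    assert(!isValidUtf8(dropCodeUnit(s, 0xA9)));
}
```

The element-wise operation is only meaningful on the decoded view. On the code
unit view, the same filter silently corrupts the encoding.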

- Jonathan M Davis