Proposal for fixing dchar ranges

Steven Schveighoffer schveiguy at yahoo.com
Mon Mar 10 11:13:25 PDT 2014


On Mon, 10 Mar 2014 13:59:53 -0400, John Colvin  
<john.loughran.colvin at gmail.com> wrote:

> On Monday, 10 March 2014 at 13:35:33 UTC, Steven Schveighoffer wrote:
>> I proposed this inside the long "major performance problem with
>> std.array.front" thread, and I also proposed it before, a long time ago.
>>
>> But it seems to be getting no attention buried in that thread, not even
>> negative attention :)
>>
>> An idea to fix all the problems I see with char[] being treated
>> specially by Phobos: introduce an actual string type, with char[] as
>> backing, that is a dchar range and that actually dictates the rules we
>> want. Then, make the compiler use this type for literals.
>>
>> e.g.:
>>
>> struct string {
>>    immutable(char)[] representation;
>>    this(immutable(char)[] data) { representation = data; }
>>    ... // dchar range primitives
>> }
>>
>> Then, a char[] array is simply an array of char.
>>
>> points:
>>
>> 1. No more issues with foreach(c; "cassé"), it iterates via dchar
>> 2. No more issues with "cassé"[4], it is a static compiler error.
>> 3. No more awkward ASCII manipulation using ubyte[].
>> 4. No more phobos schizophrenia saying char[] is not an array.
>> 5. No more special casing char[] array templates to fool the compiler.
>> 6. Any other special rules we come up with can be dictated by the  
>> library, and not ignored by the compiler.
>>
>> Note, std.algorithm.copy(string1, mutablestring) will still
>> decode/encode, but it's more explicit. It's EXPLICITLY a dchar range.
>> Using std.algorithm.copy(string1.representation,
>> mutablestring.representation) will avoid the issues.
>>
>> I imagine only code that is currently UTF ignorant will break, and that  
>> code is easily 'fixed' by adding the 'representation' qualifier.
>>
>> -Steve
>
> I know warnings are disliked, but couldn't we make the slicing and  
> indexing work as currently but issue a warning*? It's not ideal but it  
> does mean we get backwards compatibility.

As I mentioned elsewhere (but repeating here for viewers), I was not  
planning on disabling slicing.
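
To make that concrete, here is a rough sketch of what such a type could look like. It is only an illustration under my own assumptions: the name String (capitalized so it does not clash with the existing string alias), the choice to slice by code-unit offsets, and the use of std.utf.decode and std.utf.stride for the range primitives are not fixed by the proposal. Slicing is provided, and indexing fails to compile simply because no opIndex is defined.

```d
import std.utf : decode, stride;

// Hypothetical wrapper type; the name "String" is an assumption for this sketch.
struct String
{
    immutable(char)[] representation;

    this(immutable(char)[] data) { representation = data; }

    // dchar range primitives: iteration always yields whole code points.
    @property bool empty() { return representation.length == 0; }

    @property dchar front()
    {
        size_t i = 0;
        return decode(representation, i); // decodes the first code point
    }

    void popFront()
    {
        // stride gives the number of code units in the first code point
        representation = representation[stride(representation, 0) .. $];
    }

    @property String save() { return this; }

    // Slicing stays available (here by code-unit offsets, an assumption).
    String opSlice(size_t lo, size_t hi)
    {
        return String(representation[lo .. hi]);
    }

    // No opIndex is defined, so s[4] is a compile-time error.
}

// Run with: dmd -unittest -main -run thisfile.d
unittest
{
    auto s = String("cass\u00E9");

    size_t points;
    foreach (dchar c; s) ++points;
    assert(points == 5);                      // iterates by code point
    assert(s.representation.length == 6);     // raw code units still reachable
    assert(s[0 .. 4].representation == "cass");
    static assert(!__traits(compiles, s[4])); // indexing is rejected
}
```

Code that really wants code units goes through .representation, which is an ordinary array and can be indexed as usual.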

Indexing is rarely a feature one needs or should use, especially with  
encoded strings.
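
For anyone who has not run into this, a small example of the current behaviour with a plain string. The helper names countUnits and countPoints are made up for the illustration, and the literal is spelled with \u00E9 so that the é is certainly the single precomposed code point.

```d
// Hypothetical helpers, only for this illustration.

// foreach without an explicit type iterates UTF-8 code units (char).
size_t countUnits(string s)
{
    size_t n;
    foreach (c; s) ++n;
    return n;
}

// foreach with dchar makes the compiler decode code points.
size_t countPoints(string s)
{
    size_t n;
    foreach (dchar c; s) ++n;
    return n;
}

// Run with: dmd -unittest -main -run thisfile.d
unittest
{
    string s = "cass\u00E9";       // é takes two UTF-8 code units

    assert(s.length == 6);         // length counts code units
    assert(countUnits(s) == 6);
    assert(countPoints(s) == 5);

    assert(s[4] == 0xC3);          // index 4 is only the first byte of é
    assert(s[5] == 0xA9);          // index 5 is the second byte

    assert(s[0 .. 4] == "cass");   // slicing on a code point boundary is fine
}
```

So s[4] compiles today but hands back half of a character, while a slice that ends on a code point boundary is still valid UTF-8.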

-Steve


More information about the Digitalmars-d mailing list