DustMite, a D test case minimization tool

Sun May 22 19:21:04 PDT 2011

On Sun, 22 May 2011 21:39:55 -0400, Vladimir Panteleev  
<vladimir at thecybershadow.net> wrote:
> On Mon, 23 May 2011 04:14:32 +0300, Robert Jacques <sandford at jhu.edu>  
> wrote:
>
>> On Sun, 22 May 2011 19:30:58 -0400, Vladimir Panteleev  
>> <vladimir at thecybershadow.net> wrote:
>>
>>> On Mon, 23 May 2011 02:15:49 +0300, Robert Jacques <sandford at jhu.edu>  
>>> wrote:
>>>
>>>>  As for performance, using appender is never slower than ~=, as it  
>>>> uses essentially the same code.
>>>
>>> I don't think using ~= when appending a string to a string will  
>>> validate the UTF. Will it?
>>>
>>
>> For string ~= string, appender calls string[] = string, which does a  
>> memcopy, iirc.
>
> Right, so my complexity rant was BS, but appender will still validate  
> UTF on every append, unlike ~=. Isn't that a bug?
>

Appender doesn't validate UTF when the character widths are the same.
For example,

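     import std.array : Appender;

     string test = "\&lt;" ~ "\&gt;" ~ "\&Alpha;" ~ "\&Beta;" ~ "\&Gamma;" ~
                   "\&spades;" ~ "\&diams;" ~ "\U0001D11E";
     Appender!string app;
     foreach(i; 0..test.length-1) {
         // test[i..$] may begin in the middle of a multi-byte sequence
         app.put(test[i..$]);
     }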

This runs fine, even though at times test[i..$] is an invalid string
(the slice can start in the middle of a multi-byte sequence), because
test and the appender are both of type string. However, if you change
app to an Appender!wstring, then decoding and re-encoding occur, and those
routines always validate. Hence, if app is an Appender!wstring, it will
throw a UTF validation error.
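
The two cases can be put side by side in a minimal sketch. This assumes a
reasonably recent Phobos, where a failed decode throws std.utf.UTFException;
appendSuffixesThrows is a hypothetical helper introduced only for this
illustration, and the shorter test string is likewise an assumption, not
the one from the example above.

```d
import std.array : Appender;
import std.utf : UTFException;

// Hypothetical helper: appends every suffix of `text` to an Appender!S
// and reports whether any append raised a UTFException.
bool appendSuffixesThrows(S)(string text)
{
    Appender!S app;
    foreach (i; 0 .. text.length)
    {
        try
        {
            // text[i .. $] may start on a UTF-8 continuation byte,
            // which makes the slice an invalid string.
            app.put(text[i .. $]);
        }
        catch (UTFException)
        {
            return true;
        }
    }
    return false;
}

void main()
{
    import std.stdio : writeln;

    // '<' is one byte, U+0391 is two bytes, U+1D11E is four bytes in UTF-8.
    string test = "<\u0391\U0001D11E";

    // Same character width: expected to be a plain copy with no validation.
    writeln("Appender!string throws:  ", appendSuffixesThrows!string(test));

    // Different character width: each slice is decoded, so validation runs.
    writeln("Appender!wstring throws: ", appendSuffixesThrows!wstring(test));
}
```

If the behaviour described above holds, the first line should report false
and the second true.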