Semantics of toString

Thu Nov 12 13:40:50 PST 2009

On Thu, 12 Nov 2009 16:19:39 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> Steven Schveighoffer wrote:
>> On Thu, 12 Nov 2009 14:40:12 -0500, Andrei Alexandrescu  
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> Steven Schveighoffer wrote:
>>>> On Thu, 12 Nov 2009 13:46:08 -0500, Andrei Alexandrescu  
>>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>
>>>>>> From my C++ book, it appears to only use virtual inheritance.  I  
>>>>>> don't know enough about virtual inheritance to know how that  
>>>>>> changes function calls.
>>>>>>
>>>>>> As far as virtual functions, only the destructor is virtual, so  
>>>>>> there is no issue there.
>>>>>
>>>>> You're right, but there is an issue because as far as I can recall  
>>>>> these functions' implementations do end up calling a virtual function  
>>>>> per char; that might be streambuf.overflow. I'm not keen on  
>>>>> investigating this any further, but I'd be grateful if you shared  
>>>>> any related knowledge.
>>>>
>>>> Yep, you are right.  It appears the reason they do this is so the  
>>>> conversion to the appropriate width can be done per character (and is  
>>>> a no-op for char).
>>>>
>>>>> At the end of the day, there seems to be violent agreement that we  
>>>>> don't want one virtual call per character or one delegate call per  
>>>>> character.
>>>>
>>>> After running my tests, it appears the virtual call vs. delegate is  
>>>> so negligible, and the virtual call vs. direct call is only slightly  
>>>> less negligible, I think the virtualness may not matter.  However, I  
>>>> think avoiding one *call* per character is a worthy goal.
>>>>
>>>> This doesn't mean I change my mind :)  I still think there is little  
>>>> benefit to having to conjure up an entire object just to convert  
>>>> something to a string vs. writing a simple inner function.
>>>>
>>>> One way to find out is to support only char[], and see who complains  
>>>> :)  It'd be much easier to go from supporting char[] to supporting  
>>>> all the widths than going from supporting all to just one.
>>>
>>> One problem I just realized is that, if we e.g. offer only put(in  
>>> char[]) or a delegate to that effect, we make it impossible to output  
>>> one character efficiently. The (&c)[0 .. 1] trick will not work in  
>>> safe mode. You'd have to allocate a one-element array dynamically.
>>
>> char[1] buf;
>> buf[0] = c;
>> put(buf);
>
> This would not compile in SafeD.

:O

Why not?  I would expect that using a local buffer would be the main way  
for converting non-string things to strings, or to avoid calling the  
delegate/vfunction lots of times.

i.e. if I want to output an integer i:

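if(i == 0) put("0");
else
{
   // assumes i > 0; a negative value would also need a sign
   char[20] buf;
   size_t idx = buf.length;      // one past the last free slot
   while(i != 0)
   {
     --idx;
     buf[idx] = cast(char)('0' + i % 10); // store the digit as ASCII
     i /= 10;
   }
   put(buf[idx..$]); // no compily in SafeD???
}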

Do I have to allocate a heap buffer in SafeD?

>>> Also, many OSs adopted UTF-16 as their standard format. It may be wise  
>>> to design for compatibility.
>>
>> So you want toString's to look like this?
>>
>> version(utf16isdefault)
>> {
>>   textobj.put("Array: "w);
>>   ...
>> }
>> else
>> {
>>   textobj.put("Array: ");
>>   ...
>> }
>>
>> -Steve
>
>
> I was just thinking of offering an interface that offers utf8 and utf16  
> and utf32.

Yes, and your explanation for this is that many OSes adopt UTF-16 as  
their standard format.  My expectation is that the outputter will convert  
to the required OS format anyways, regardless of what you pass it, so why  
should we write code to cater to what the OS wants?  I'd like to write  
string-handling code once and be done with it, not try to optimize my  
toString functions so that they use the "right" methods for the current  
OS.  I asserted that the only reason you want to use the functions other  
than the char[] version is in the case where your data is *stored* as  
wchar[] or dchar[].  Otherwise, it makes no sense to do the conversion  
because the outputter already does it for you.  So the question becomes,  
how often do you need to output data that's already in dchar[] or wchar[]  
format, and is it worth passing around a list of functions just in case  
you need that, or should you just call a conversion routine the few times  
you need it?
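
To make the "just call a conversion routine" option concrete, here is a rough sketch.  The Sink alias, the Name struct and the collect helper are made up for illustration, and this toString signature is only one possible shape, not a settled design.  The only library call is std.utf.toUTF8:

```d
import std.utf : toUTF8;

// Hypothetical sink type: the outputter accepts UTF-8 chunks only.
alias Sink = void delegate(const(char)[]);

// Hypothetical type whose data happens to be stored as UTF-16.
struct Name
{
    wstring stored;

    // Assumed toString shape: write chunks to the sink.
    void toString(scope Sink put) const
    {
        put("Name: ");        // ordinary literals need no conversion
        put(toUTF8(stored));  // convert only where the data is not char[]
    }
}

// Hypothetical helper that gathers everything the sink receives.
string collect(const Name n)
{
    string result;
    n.toString((const(char)[] chunk) { result ~= chunk.idup; });
    return result;
}

void main()
{
    assert(collect(Name("Steve"w)) == "Name: Steve");
}
```

The conversion (which allocates) happens only in the one place where the data is already wchar[]; every other toString never has to think about widths.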

Let's not forget that this is mainly for debugging...

-Steve