Semantics of toString

Steven Schveighoffer schveiguy at yahoo.com
Thu Nov 12 10:07:50 PST 2009


On Thu, 12 Nov 2009 11:46:48 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> Steven Schveighoffer wrote:
>> On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu  
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> Steven Schveighoffer wrote:
>>>> On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu  
>>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>
>>>>> I think the best option for toString is to take an output range and  
>>>>> write to it. (The sink is a simplified range.)
>>>>  Bad idea...
>>>>  A range only makes sense as a struct, not an interface/object.  I'll  
>>>> tell you why: performance.
>>>
>>> You are right. If range interfaces accommodate block transfers, this  
>>> problem may be addressed. I agree that one virtual call per character  
>>> output would be overkill. (I seem to recall it's one of the reasons  
>>> why C++'s iostreams are so inefficient.)
>>  IIRC, I don't think C++ iostreams use polymorphism
>
> Oh yes they do. (Did you even google?) Virtual multiple inheritance, the  
> works.
>
> http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/

From my C++ book, it appears to use only virtual inheritance.  I don't
know enough about virtual inheritance to know how that changes function
calls.

As far as virtual functions go, only the destructor is virtual, so there
is no issue there.

>>  void put(in char[] str)
>> {
>>   foreach(dchar dc; str)
>>   {
>>      put((&dc)[0..1]);
>>   }
>> }
>>  Note that you probably want to build a buffer of dchars instead of  
>> putting one at a time, but you get the idea.
>
> I don't get the idea. I'm seeing one virtual call per character.

You missed the note.  I didn't implement it, but you could easily use a
stack-allocated buffer to cache the conversions and pass multiple
converted code points per call.  But I don't think it's even worth
discussing, given my other points.
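
A minimal sketch of that buffered variant, for illustration only.  The
DcharSink interface, the putBuffered name, and the 64-element buffer size
are assumptions made for this example, not anything proposed in the
thread:

```d
import std.stdio : writeln;

// Hypothetical sink interface (an assumption for this sketch).
interface DcharSink
{
    void put(const(dchar)[] chunk);
}

// Decodes UTF-8 into a fixed-size stack buffer and forwards whole chunks,
// so the sink sees one virtual call per chunk, not one per code point.
void putBuffered(DcharSink sink, const(char)[] str)
{
    dchar[64] buf;           // static array on the stack, no allocation
    size_t n = 0;
    foreach (dchar dc; str)  // foreach decodes UTF-8 into code points
    {
        buf[n++] = dc;
        if (n == buf.length)
        {
            sink.put(buf[0 .. n]);
            n = 0;
        }
    }
    if (n > 0)
        sink.put(buf[0 .. n]); // flush the partial final chunk
}

// Sink used only to count calls and collect what was written.
class CountingSink : DcharSink
{
    size_t calls;
    dchar[] collected;

    void put(const(dchar)[] chunk)
    {
        ++calls;
        collected ~= chunk;
    }
}

void main()
{
    auto sink = new CountingSink;
    auto text = new char[100];
    text[] = 'a';
    putBuffered(sink, text);
    writeln(sink.calls, " calls for ", sink.collected.length, " code points");
}
```

With 100 ASCII characters and a 64-element buffer, the sink should be
called twice: once for the full buffer and once for the remaining 36.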

>> That being said, one other point that makes all this moot is --  
>> toString is for debugging, not for general purpose.  We don't need to  
>> support everything that is possible.  You should be able to say "hey,  
>> toString only accepts char[], deal."  Of course, you could substitute  
>> wchar[] or dchar[], but I think by far char[] is the most common (and  
>> is the default type for string literals).
>
> I was hoping we could elevate the usefulness of toString a bit.

Whatever kind of data the output stream gets, it's going to convert it to
the format it wants anyway (for stdout, I think that would be UTF-8).
The only benefit is when you have data stored in a different width that
you want to output.  Calling a conversion function in that case is, I
think, reasonable enough, and saves the output stream from having to
convert or otherwise deal with it.

In other words, I don't think needing anything other than UTF-8 output is
going to be a common case, so I don't think the cost of creating an
interface, making virtual calls, disallowing simple delegate passing,
etc. is worth the convenience *just in case* you have data stored as
wchar[] that you want to output.
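
A rough sketch of what "calling a conversion function" could look like
with a char-only delegate sink.  The Sink alias and the Label type are
hypothetical names invented for this example; std.utf.toUTF8 is the
existing Phobos conversion function:

```d
import std.utf : toUTF8;

// Hypothetical UTF-8-only sink type (an assumption for this sketch).
alias Sink = void delegate(const(char)[] buf);

struct Label
{
    wstring text; // data that happens to be stored as UTF-16

    // The type converts its own odd-width data; the sink stays char-only.
    void toString(Sink sink) const
    {
        sink("Label(");
        sink(toUTF8(text));
        sink(")");
    }
}

void main()
{
    char[] collected;
    // Any delegate with the right signature works as a sink.
    Sink sink = (const(char)[] buf) { collected ~= buf; };

    auto label = Label("wide"w);
    label.toString(sink);
    assert(collected == "Label(wide)");
}
```

The conversion cost is paid only by the types that actually store wchar
or dchar data; every other toString just passes char[] slices through.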

>> That's not to say there is no reason to have a TextOutputStream  
>> object.  Such a thing is perfectly usable for a toString which takes a  
>> char[] delegate sink, just pass &put.  In fact, there could be a  
>> default toString function in Object that does just that:
>>  class Object
>> {
>>    ...
>>    void toString(void delegate(in char[] buf) put, string fmt) const
>>    {}
>>    void toString(TextOutputStream tos, string fmt) const
>>    { toString(&tos.put, fmt); }
>> }
>
> I'd agree with the delegate idea if we established that UTF-8 is favored  
> compared to all other formats.

D seems to favor UTF-8 -- it is the default type for string literals.  I
don't think I've ever used dchar, and I usually only use wchar to talk to
Win32 functions when required.
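
The point about literals can be checked at compile time.  A small sketch
(nothing here is specific to the proposal; it only uses the built-in
string aliases and the w/d literal suffixes):

```d
// An unsuffixed literal is typed as string, i.e. immutable(char)[] (UTF-8).
static assert(is(typeof("hello") == string));
static assert(is(string == immutable(char)[]));

// The wider encodings need an explicit suffix to get that type directly.
static assert(is(typeof("hello"w) == wstring));
static assert(is(typeof("hello"d) == dstring));

void main() {}
```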

The question I'd ask is -- how common is the case where a version other
than char[] would be more convenient?

-Steve


