Semantics of toString
Steven Schveighoffer
schveiguy at yahoo.com
Thu Nov 12 10:07:50 PST 2009
On Thu, 12 Nov 2009 11:46:48 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> wrote:
> Steven Schveighoffer wrote:
>> On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> Steven Schveighoffer wrote:
>>>> On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu
>>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>
>>>>> I think the best option for toString is to take an output range and
>>>>> write to it. (The sink is a simplified range.)
>>>> Bad idea...
>>>> A range only makes sense as a struct, not an interface/object. I'll
>>>> tell you why: performance.
>>>
>>> You are right. If range interfaces accommodate block transfers, this
>>> problem may be addressed. I agree that one virtual call per character
>>> output would be overkill. (I seem to recall it's one of the reasons
>>> why C++'s iostreams are so inefficient.)
>> IIRC, I don't think C++ iostreams use polymorphism
>
> Oh yes they do. (Did you even google?) Virtual multiple inheritance, the
> works.
>
> http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/
From my C++ book, it appears to only use virtual inheritance. I don't
know enough about virtual inheritance to know how that changes function
calls.
As far as virtual functions, only the destructor is virtual, so there is
no issue there.
>> void put(in char[] str)
>> {
>>     foreach(dchar dc; str)
>>     {
>>         put((&dc)[0..1]);
>>     }
>> }
>> Note that you probably want to build a buffer of dchars instead of
>> putting one at a time, but you get the idea.
>
> I don't get the idea. I'm seeing one virtual call per character.
You missed the note. I didn't implement it, but you could easily
implement a stack-allocated buffer to cache the conversions, passing
multiple converted code-points at once. But I don't think it's even worth
discussing per my other points.
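
For reference, a minimal sketch of the buffered variant described in the note. The names DcharSink, CountingSink and putUtf8, and the buffer size of 64, are assumptions made for illustration only; they are not part of any existing library or of the proposal itself.

```d
// Hypothetical sink interface: one virtual call per block of code points.
interface DcharSink
{
    void put(in dchar[] str);
}

// Hypothetical sink that records what it receives and counts the calls.
class CountingSink : DcharSink
{
    dchar[] data;
    int calls;

    void put(in dchar[] str)
    {
        data ~= str;   // copies the block
        calls++;
    }
}

// Decodes UTF-8 input into a stack-allocated buffer and forwards the
// decoded code points in blocks instead of one at a time.
void putUtf8(DcharSink sink, in char[] str)
{
    dchar[64] buf;     // lives on the stack; the size is arbitrary
    size_t used = 0;

    foreach (dchar dc; str)   // foreach decodes UTF-8 into code points
    {
        buf[used++] = dc;
        if (used == buf.length)
        {
            sink.put(buf[0 .. used]);   // buffer full: flush one block
            used = 0;
        }
    }
    if (used > 0)
        sink.put(buf[0 .. used]);       // flush the remainder
}
```

With a 64-element buffer, a 100-character ASCII string costs two virtual calls rather than one hundred.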
>> That being said, one other point that makes all this moot is --
>> toString is for debugging, not for general purpose. We don't need to
>> support everything that is possible. You should be able to say "hey,
>> toString only accepts char[], deal." Of course, you could substitute
>> wchar[] or dchar[], but I think by far char[] is the most common (and
>> is the default type for string literals).
>
> I was hoping we could elevate the usefulness of toString a bit.
Whatever kind of data the output stream gets, it's going to convert it to
the format it wants anyway (for stdout, I think that would be UTF-8). The
only benefit is if you have data stored in a different width that you
want to output. Calling a conversion function in that case is, I think,
reasonable enough, and saves the output stream from having to convert or
deal with it.
In other words, I don't think it's going to be that common a case where
you need anything other than UTF-8 output, and therefore I don't think the
cost of creating an interface, making virtual calls, disallowing simple
delegate passing, etc. is worth the convenience *just in case* you have
data stored as wchar[] that you want to output.
>> That's not to say there is no reason to have a TextOutputStream
>> object. Such a thing is perfectly usable for a toString which takes a
>> char[] delegate sink, just pass &put. In fact, there could be a
>> default toString function in Object that does just that:
>> class Object
>> {
>>     ...
>>     void toString(void delegate(in char[] buf) put, string fmt) const
>>     {}
>>     void toString(TextOutputStream tos, string fmt) const
>>     { toString(&tos.put, fmt); }
>> }
>
> I'd agree with the delegate idea if we established that UTF-8 is favored
> compared to all other formats.
D seems to favor UTF-8 -- it is the default type for string literals. I
don't think I've ever used dchar, and I usually only use wchar to talk to
Win32 functions when required.
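
To illustrate what a char[]-only delegate sink looks like from the caller's side, here is a rough sketch. Point and render are hypothetical names, and the fmt parameter from the signature quoted above is left out for brevity; this is one possible shape, not a settled API.

```d
import std.conv : to;

// Hypothetical type that writes itself piecewise to a char[] sink.
struct Point
{
    int x, y;

    void toString(scope void delegate(in char[] buf) put) const
    {
        put("Point(");
        put(to!string(x));
        put(", ");
        put(to!string(y));
        put(")");
    }
}

// Hypothetical helper: collects the pieces into a single string.
string render(const Point p)
{
    char[] result;
    p.toString((in char[] chunk) { result ~= chunk; });
    return result.idup;
}
```

The caller decides where the text goes (an appender, a file, a socket), and no interface or wrapper object is needed to pass the sink.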
The question I'd ask is -- how common is the case where versions other
than char[] would be more convenient?
-Steve