Semantics of toString
Genghis Khan
genghis at outer.mn
Thu Nov 12 11:21:12 PST 2009
Andrei Alexandrescu Wrote:
> Steven Schveighoffer wrote:
> > On Thu, 12 Nov 2009 11:46:48 -0500, Andrei Alexandrescu
> > <SeeWebsiteForEmail at erdani.org> wrote:
> >
> >> Steven Schveighoffer wrote:
> >>> On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu
> >>> <SeeWebsiteForEmail at erdani.org> wrote:
> >>>
> >>>> Steven Schveighoffer wrote:
> >>>>> On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu
> >>>>> <SeeWebsiteForEmail at erdani.org> wrote:
> >>>>>
> >>>>>> I think the best option for toString is to take an output range
> >>>>>> and write to it. (The sink is a simplified range.)
> >>>>> Bad idea...
> >>>>> A range only makes sense as a struct, not an interface/object.
> >>>>> I'll tell you why: performance.
> >>>>
> >>>> You are right. If range interfaces accommodate block transfers, this
> >>>> problem may be addressed. I agree that one virtual call per
> >>>> character output would be overkill. (I seem to recall it's one of
> >>>> the reasons why C++'s iostreams are so inefficient.)
> >>> IIRC, I don't think C++ iostreams use polymorphism
> >>
> >> Oh yes they do. (Did you even google?) Virtual multiple inheritance,
> >> the works.
> >>
> >> http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/
> >>
> >
> > From my C++ book, it appears to only use virtual inheritance. I don't
> > know enough about virtual inheritance to know how that changes function
> > calls.
> >
> > As far as virtual functions, only the destructor is virtual, so there is
> > no issue there.
>
> You're right, but there is an issue because as far as I can recall these
> functions' implementations do end up calling a virtual function per char;
> that might be streambuf.overflow. I'm not keen on investigating this any
> further, but I'd be grateful if you shared any related knowledge. At the
> end of the day, there seems to be violent agreement that we don't want
> one virtual call per character or one delegate call per character.
>
> >>> void put(in char[] str)
> >>> {
> >>>     foreach(dchar dc; str)
> >>>     {
> >>>         put((&dc)[0..1]);
> >>>     }
> >>> }
> >>> Note that you probably want to build a buffer of dchars instead of
> >>> putting one at a time, but you get the idea.
> >>
> >> I don't get the idea. I'm seeing one virtual call per character.
> >
> > You missed the note. I didn't implement it, but you could easily
> > implement a stack-allocated buffer to cache the conversions, passing
> > multiple converted code-points at once. But I don't think it's even
> > worth discussing per my other points.
> >
> >>> That being said, one other point that makes all this moot is --
> >>> toString is for debugging, not for general purpose. We don't need to
> >>> support everything that is possible. You should be able to say "hey,
> >>> toString only accepts char[], deal." Of course, you could substitute
> >>> wchar[] or dchar[], but I think by far char[] is the most common (and
> >>> is the default type for string literals).
> >>
> >> I was hoping we could elevate the usefulness of toString a bit.
> >
> > Whatever kind of data the output stream gets, it's going to convert it
> > to the format it wants anyways (as for stdout, I think that would be
> > utf8), the only benefit is if you have data stored in a different width
> > that you wanted to output. Calling a conversion function in that case I
> > think is reasonable enough, and saves the output stream from having to
> > convert/deal with it.
> >
> > In other words, I don't think it's going to be that common a case where
> > you need anything other than utf8 output, so I don't think the cost of
> > creating an interface, making virtual calls, disallowing simple delegate
> > passing etc is worth the convenience *just in case* you have data stored
> > as wchar[] you want to output.
>
> I'm not sure.
>
> http://www.gnu.org/s/libc/manual/html_node/Streams-and-I18N.html#Streams-and-I18N
>
> GNU defines means to set and detect a utf-16 console, which dmd observes
> (grep std/ for fwide). But then I'm not sure how many are using that
> kind of stuff.
>
> >>> That's not to say there is no reason to have a TextOutputStream
> >>> object. Such a thing is perfectly usable for a toString which takes
> >>> a char[] delegate sink, just pass &put. In fact, there could be a
> >>> default toString function in Object that does just that:
> >>> class Object
> >>> {
> >>>     ...
> >>>     void toString(void delegate(in char[] buf) put, string fmt) const
> >>>     {}
> >>>     void toString(TextOutputStream tos, string fmt) const
> >>>     { toString(&tos.put, fmt); }
> >>> }
> >>
> >> I'd agree with the delegate idea if we established that UTF-8 is
> >> favored compared to all other formats.
> >
> > D seems to favor UTF8 -- it is the default type for string literals. I
> > don't think I've ever used dchar, and I usually only use wchar to talk
> > to Win32 functions when required.
> >
> > The question I'd ask is -- how common is it where the versions other
> > than char[] would be more convenient?
>
> I don't know. I think Asian-language users might give a salient answer.
Asian users have a salient answer.