Semantics of toString

Genghis Khan genghis at outer.mn
Thu Nov 12 11:21:12 PST 2009


Andrei Alexandrescu Wrote:

> Steven Schveighoffer wrote:
> > On Thu, 12 Nov 2009 11:46:48 -0500, Andrei Alexandrescu 
> > <SeeWebsiteForEmail at erdani.org> wrote:
> > 
> >> Steven Schveighoffer wrote:
> >>> On Thu, 12 Nov 2009 10:29:17 -0500, Andrei Alexandrescu 
> >>> <SeeWebsiteForEmail at erdani.org> wrote:
> >>>
> >>>> Steven Schveighoffer wrote:
> >>>>> On Tue, 10 Nov 2009 18:49:54 -0500, Andrei Alexandrescu 
> >>>>> <SeeWebsiteForEmail at erdani.org> wrote:
> >>>>>
> >>>>>> I think the best option for toString is to take an output range 
> >>>>>> and write to it. (The sink is a simplified range.)
> >>>>> Bad idea...
> >>>>> A range only makes sense as a struct, not an interface/object.
> >>>>> I'll tell you why: performance.
> >>>>
> >>>> You are right. If range interfaces accommodate block transfers, this 
> >>>> problem may be addressed. I agree that one virtual call per 
> >>>> character output would be overkill. (I seem to recall it's one of 
> >>>> the reasons why C++'s iostreams are so inefficient.)
> >>> IIRC, I don't think C++ iostreams use polymorphism.
> >>
> >> Oh yes they do. (Did you even google?) Virtual multiple inheritance, 
> >> the works.
> >>
> >> http://www.deitel.com/articles/cplusplus_tutorials/20060225/virtualBaseClass/ 
> >>
> > 
> > From my C++ book, it appears to only use virtual inheritance.  I don't 
> > know enough about virtual inheritance to know how that changes function 
> > calls.
> > 
> > As far as virtual functions, only the destructor is virtual, so there is 
> > no issue there.
> 
> You're right, but there is an issue because as far as I can recall these 
> functions' implementations do end up calling a virtual function per char; 
> that might be streambuf.overflow. I'm not keen on investigating this any 
> further, but I'd be grateful if you shared any related knowledge. At the 
> end of the day, there seems to be violent agreement that we don't want 
> one virtual call per character or one delegate call per character.
> 
> >>> void put(in char[] str)
> >>> {
> >>>   foreach(dchar dc; str)
> >>>   {
> >>>      put((&dc)[0..1]);
> >>>   }
> >>> }
> >>> Note that you probably want to build a buffer of dchars instead of 
> >>> putting one at a time, but you get the idea.
> >>
> >> I don't get the idea. I'm seeing one virtual call per character.
> > 
> > You missed the note.  I didn't implement it, but you could easily 
> > implement a stack-allocated buffer to cache the conversions, passing 
> > multiple converted code-points at once.  But I don't think it's even 
> > worth discussing per my other points.
> > 
> >>> That being said, one other point that makes all this moot is -- 
> >>> toString is for debugging, not for general purpose.  We don't need to 
> >>> support everything that is possible.  You should be able to say "hey, 
> >>> toString only accepts char[], deal."  Of course, you could substitute 
> >>> wchar[] or dchar[], but I think by far char[] is the most common (and 
> >>> is the default type for string literals).
> >>
> >> I was hoping we could elevate the usefulness of toString a bit.
> > 
> > Whatever kind of data the output stream gets, it's going to convert it 
> > to the format it wants anyways (as for stdout, I think that would be 
> > utf8), the only benefit is if you have data stored in a different width 
> > that you wanted to output.  Calling a conversion function in that case I 
> > think is reasonable enough, and saves the output stream from having to 
> > convert/deal with it.
> > 
> > In other words, I don't think it's going to be that common a case where 
> > you need anything other than utf8 output, and therefore I don't think 
> > the cost of creating an interface, making virtual calls, disallowing 
> > simple delegate passing etc is worth the convenience *just in case* you 
> > have data stored as wchar[] you want to output.
> 
> I'm not sure.
> 
> http://www.gnu.org/s/libc/manual/html_node/Streams-and-I18N.html#Streams-and-I18N
> 
> gnu defines means to set and detect a utf-16 console, which dmd observes 
> (grep std/ for fwide). But then I'm not sure how many are using that 
> kind of stuff.
> 
> >>> That's not to say there is no reason to have a TextOutputStream 
> >>> object.  Such a thing is perfectly usable for a toString which takes 
> >>> a char[] delegate sink, just pass &put.  In fact, there could be a 
> >>> default toString function in Object that does just that:
> >>> class Object
> >>> {
> >>>    ...
> >>>    void toString(void delegate(in char[] buf) put, string fmt) const
> >>>    {}
> >>>    void toString(TextOutputStream tos, string fmt) const
> >>>    { toString(&tos.put, fmt); }
> >>> }
> >>
> >> I'd agree with the delegate idea if we established that UTF-8 is 
> >> favored compared to all other formats.
> > 
> > D seems to favor UTF8 -- it is the default type for string literals.  I 
> > don't think I've ever used dchar, and I usually only use wchar to talk 
> > to Win32 functions when required.
> > 
> > The question I'd ask is -- how common is it where the versions other 
> > than char[] would be more convenient?
> 
> I don't know. I think Asian-language users might give a salient answer.

亞洲用戶有一個突出的答案 (Asian users have a salient answer.)




