stdio line-streaming revisited

Thu Mar 29 14:14:05 PDT 2007

Sean Kelly wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Sean Kelly wrote:
>>> Andrei Alexandrescu (See Website For Email) wrote:
>>>> Sean Kelly wrote:
>>>>> Andrei Alexandrescu (See Website For Email) wrote:
>>>>>> Bill Baxter wrote:
>>>>>>>
>>>>>>> I believe the current discussion is only about under-the-hood 
>>>>>>> implementation issues.  So I don't think you have to worry about 
>>>>>>> any D libraries exposing (good or bad) C/C++ design decisions to 
>>>>>>> users.  Tango is going to expose the same D interface no matter 
>>>>>>> how it's implemented under the hood.
>>>>>>
>>>>>> That's great, however the interface has a problem too: it does not 
>>>>>> produce atomic strings in multithreaded programs.
>>>>>
>>>>> Which interface?  And are you talking about input or output?
>>>>
>>>> Cout. The thing is that the design using the syntax Cout(a)(b) is 
>>>> conducive to two separate calls to the library. I recall this has 
>>>> been briefly discussed here.
>>>
>>> Just making sure we were on the same page.  In my experience, raw 
>>> output is typically used for logging--that's about the only case I 
>>> can think of where the random interleaving of messages is 
>>> acceptable.  Tango has a separate logging facility which performs 
>>> synchronization for exactly this purpose: tango.util.log.  Perhaps 
>>> this particular critique would be more appropriately applied to the 
>>> logging package?
>>
>> I don't think so. For example, I have a multithreaded program that 
>> starts processes on multiple machines and outputs "stamped" lines to 
>> the standard output - lines with a prefix clarifying which machine the 
>> line comes from, and at what time. That output is gzipped for 
>> efficiency and transported via ethernet to a hub where the lines are 
>> demultiplexed and put into multiple files etc.
>>
>> It is essential that lines are written atomically, but beyond that, 
>> the actual order does not matter.
> 
> This sounds to me like logging output, which is exactly what I 
> described.

Oh, I thought you meant logging as in auxiliary informational messages,
as opposed to the meat of the I/O. Just to clarify: the interleaved output
is the meat of the I/O in the case above.

> The logger could be attached to the console or a socket as 
> easily as a file. Why use Cout for this?

"Because it's there!" :o) All a bona fide programmer expects is to have
access to the three pre-opened standard streams and just use them. I'm
not sure how to make a logger output to stdout. The manual (after an
introduction that makes it pretty clear I'm already swimming upstream by
using the logger for something other than logging) says:

// send output of myLogger to stderr
myLogger.addAppender(new ConsoleAppender());

Later in the document there's the section "A Custom Appender Example",
which implements what I need - not in the library, but in userland. Caveat:
I don't know whether that example is properly thread-safe. If it is, it's
deadlock-prone, as it allows arbitrary code to be executed while files are
locked. If it isn't, it's incorrect. Doomed either way. The lure of just
using Phobos' and C's stdio is so much stronger :o).
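
For comparison, here is a minimal sketch of the C stdio route, assuming a
POSIX system, where each stdio call behaves as if it locks the stream for
its duration. The helper names, the "[host stamp] message" layout and the
1024-byte limit are hypothetical and chosen only for illustration. The
line is formatted into a private buffer first and then handed to the
library in a single call, so another thread's stdio call on the same
stream cannot land in the middle of it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format "[host stamp] msg\n" into buf.
   Returns the line length in bytes, or -1 if it did not fit. */
int format_stamped_line(char *buf, size_t cap, const char *host,
                        long stamp, const char *msg)
{
    assert(buf != NULL && host != NULL && msg != NULL);
    int n = snprintf(buf, cap, "[%s %ld] %s\n", host, stamp, msg);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Hypothetical helper: emit the whole line with one fwrite call.
   Returns 0 on success, -1 if the line was too long or the write failed. */
int write_stamped_line(FILE *out, const char *host, long stamp,
                       const char *msg)
{
    char buf[1024]; /* assumed upper bound on the line length */
    int n = format_stamped_line(buf, sizeof buf, host, stamp, msg);
    if (n < 0)
        return -1;
    return fwrite(buf, 1, (size_t)n, out) == (size_t)n ? 0 : -1;
}
```

Each thread would call something like write_stamped_line(stdout, host,
stamp, text). This covers only threads within one process sharing one
FILE; it says nothing about several processes writing to the same pipe.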

So what's the recommended use of Cout?

(a) If you do stdio, use Cout (and don't forget to flush Cout manually 
every time you plan to read from Cin).

(b) But not for multithreaded programs that do stdio. For those, use the
logger facility. If you want multithreaded output to stdout, copy the
code from http://www.dsource.org/projects/tango/wiki/ChapterLogging into
your program. Beware that the code might be incorrect; the manual
doesn't say. If it is correct, be careful with what you do inside
that code, because you could deadlock yourself.

(c) And not for programs linking with anything that uses C's stdio. For 
those, use Phobos.
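
To make the trade-off in (b) concrete, here is a hedged sketch of how C's
stdio on a POSIX system can group several calls into one atomic unit with
flockfile/funlockfile, which is roughly what a chained Cout(a)(b) would
need. The function name put_prefixed_line is hypothetical. The same caveat
applies as above: whatever runs between the lock and the unlock runs with
the stream locked, so it should be short and must not wait on another
thread that wants the same stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write prefix, body and a newline as three stdio
   calls that other threads cannot interleave with on this stream.
   Returns 0 on success, -1 on a write error. */
int put_prefixed_line(FILE *out, const char *prefix, const char *body)
{
    assert(out != NULL && prefix != NULL && body != NULL);

    flockfile(out);                 /* take the stream's lock */
    int ok = fputs(prefix, out) != EOF
          && fputs(body, out) != EOF
          && fputc('\n', out) != EOF;
    funlockfile(out);               /* release it on every path */

    return ok ? 0 : -1;
}
```

The stream lock is recursive for the owning thread, so the ordinary
fputs and fputc calls inside the locked region do not block themselves.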

Andrei