phobos / tango / ares

Sat Feb 10 19:27:50 PST 2007

Kevin Bealer wrote:
> Sean Kelly wrote:
> 
>> Kevin Bealer wrote:
>>
>>>
>>> Okay -- I'm really sorry if any of this seems to have a negative 
>>> tone. I hesitate to write this since I have a lot of respect for the 
>>> Tango design in general, but there are a couple of friction points 
>>> I've noticed.
>>>
>>> 1. writefln / format replacements
>>>
>>> Concerning standard output and string formatting, in phobos I can do 
>>> these operations:
>>>
>>>   writefln("%s %s %s", a, b, c);
>>>   format("%s %s %s", a, b, c);
>>>
>>> How do I do these in Tango?  The change to "{0} {1}" stuff is fine 
>>> with me, in fact I like it, but this syntax:
>>>
>>>   Stdout.formatln("{0} {1} {2}", a, b, c);
>>>   Format!(char).convert("{0} {1} {2}", a, b, c);
>>>
>>> Is awkward.  And these statements are used *all the time*.  In a 
>>> recent toy project I wrote, I used Stdout 15 times, compared to using 
>>> "foreach" only 8 times.  I also use the "format to string" idiom a 
>>> lot (oddly enough, not in that project), and it's even more awkward.
>>
>>
>> The conversion modules seem to have slightly spotty API documentation, 
>> but I think this will work for the common case:
>>
>> Formatter( "{0} {1} {2}", a, b, c );
> 
> 
> Okay, I didn't see this possibility, that actually looks like a decent 
> syntax; I withdraw the paragraphs in question, subject to the (zig zag) 
> example below. :)
> 
>> The Stdout design is the result of a lengthy discussion involving 
>> overload rules and expected behavior.  I believe two of the salient 
>> points were that the default case should be the simplest to execute, 
>> and that the .format method call provided a useful signifier that an 
>> explicit format was being supplied.  That said, I believe that the 
>> default output format can be called via:
>>
>> Stdout( a, b, c );
>>
>> or the "whisper" syntax:
>>
>> Stdout( a )( b )( c );
> 
> 
> Okay - there is a problem with new users who try to print strings with 
> "%" somewhere in the string -- this solves that problem, which is nice.
> 
>>> That's why I think phobos really did the "Right Thing" by keeping 
>>> those down to one token.  Second, the fact that the second one does 
>>> exactly what the first does but you need to build a template, etc, is 
>>> annoying.  I kept asking myself if I was doing the right thing 
>>> because it seemed like I was using too much syntax for this kind of 
>>> operation (I'm still not sure it's the best way to go -- is it?)
> 
> 
> So am I, but in D I often don't have to, maybe I'm getting spoiled.
> 
>> Do you consider the Formatter instance to be sufficient or would it be 
>> more useful to wrap this behavior in a free function?  I'll admit 
>> that, being from a C++ background I'm quite used to customizing the 
>> library behavior to suit my particular use style, but I can understand 
>> the desire for "out of the box" convenience.
> 
> 
> Hmmm.... given these two statements:
> 
> 1. char[] zig = Formatter("{0} {1}", "ciao", "bella");
> 2. char[] zag = Formatter("{0} {1}", "one", "two");
> 
> Questions:
> 
> A. If these are done sequentially, will zig be affected by the 
> processing of 'zag'?  (I.e. because of buffer sharing.)

no

> B. Will doing 1 and 2 from different threads affect zig or zag?

no

> 
> If the answers to A and B are both "NO", then I have no problem with using 
> Formatter.  I don't care about free function specifically (i.e. for 
> getting a pointer or something), I just want safety, efficiency and 
> clean syntax.
> 
> Documentation for Sprint suggests that both 1 and 2 are dangerous, I 
> don't know if Formatter is like Sprint in that regard.

The doc says that each instance of Sprint should not be shared. Each 
thread can happily create its own Sprint instance and use that. It's a 
nice solution when you're doing lots of fiddly formatting, or need to do 
some formatting for a logger, or whatnot. Once instantiated it doesn't 
hit the heap ... that's the only benefit. In fact, it's really just a 
thin wrapper around:

# Formatter.sprint (char[] output, char[] format, ...)

Another option for multiple threads is to synchronize on the Sprint 
object, but that's obviously somewhat less efficient.
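
To make the sharing hazard concrete, here is a minimal sketch of the 
general pattern. It is not Tango's code: it assumes current D with 
Phobos, using std.format.sformat as a stand-in for formatting into a 
caller-supplied buffer, and the names pairInto, pairFresh and scratch 
are hypothetical.

```d
import std.format : format, sformat;

// Formats into caller-supplied storage and returns a slice of that
// storage, so no heap allocation is needed for the result.
char[] pairInto(char[] buf, string a, string b)
{
    return sformat(buf, "%s %s", a, b);
}

// Allocates a fresh string on every call, so results never alias.
string pairFresh(string a, string b)
{
    return format("%s %s", a, b);
}

void main()
{
    // Hypothetical shared buffer, standing in for the storage a
    // reusable formatter instance would hold.
    char[64] scratch;

    // Both results are slices of the same buffer, so the second call
    // overwrites the characters the first slice refers to.
    char[] zig = pairInto(scratch[], "ciao", "bella");
    char[] zag = pairInto(scratch[], "one", "two");
    assert(zag == "one two");
    assert(zig != "ciao bella");

    // Fresh allocations stay independent of each other.
    assert(pairFresh("ciao", "bella") == "ciao bella");
    assert(pairFresh("one", "two") == "one two");
}
```

The same reasoning applies across threads: one buffer per thread (or 
a lock around the shared one) avoids the overwrite, at the cost noted 
above.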

> 
>>> 2. toString and toUtf8 (collisions)
>>>
>>> The change of the terminology is actually okay with me.
>>>
>>> But phobos has a way of using toString as both a method and a 
>>> top-level function name, all over the place.  This gets really clumsy 
>>> because you can never use the top level function names when writing a 
>>> class unless you fully qualify them.
>>>
>>> For example, std.cpuid.toString(), always has to be fully qualified 
>>> when called from a class, and seems nondescriptive anyway.  All the 
>>> std.conv.toString() functions are nice, but it's easy to accidentally 
>>> call the in-class toString() instead.
>>>
>>> For the utf8 <--> utf16 and similar, it's frustrating to have to do 
>>> this:
>>>
>>> dchar[] x32 = ...;
>>> char[] x8 = tango.text.convert.Utf.toUtf8(x32);
>>>
>>> But you have to fully qualify if you are writing code in any class or 
>>> struct.  If these were given another name, like makeUtf8, then these 
>>> collisions would not happen.
>>
>>
>> One aspect of the Mango design that has carried forward into Tango is 
>> that similar functions are typically intended to live in their own 
>> namespace for the sake of clarity.  Previously, most/all of the free 
>> functions were declared in structs simply to prevent collisions, but 
>> this had code bloat issues so the design was changed.  Now, users are 
>> encouraged to use the aliasing import to produce the same effect:
>>
>> import Utf = tango.text.convert.Utf;
>>
>> Utf.toUtf8( x32 );
>>
>> I'll admit it's not as convenient as simply importing and using the 
>> functions, but it does make the origin of every function call quite 
>> clear.  I personally avoid "using" in C++ for exactly this reason--if 
>> I'm using an external routine I want to know what library it's from by 
>> inspection.
>>
>>
>> Sean
> 
> 
> This is not earth-shaking to me, so the current way is not a big deal, 
> but what I want to avoid is what I think of as the Java naming effect, 
> where you need to do this:
> 
> System.out.print(foo);
> 
> ... to print something.  To me, the design of a programming language or 
> library is like a natural language.  In English we say "tin can" but we 
> always say "can" when there is no ambiguity.  You never say "I want to 
> buy a tin can of beans".  (I think that in the UK, they say "tin of 
> beans" instead, but it's the same idea.)

They do - and a lot of extra large tins are consumed :)

> 
> My view is for the common things to be simple and the complex things to 
> be as simple as possible.  The extra formality of spelling out the full 
> names of things is something that people find comfort in (*), but I 
> would as soon do without in D.

That's a tough call, as you note below. We were discussing options on 
this today, so we'll see what evolves.
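
For reference, the aliasing import mentioned earlier in the thread is 
one middle ground: the prefix stays short, and a class's own toString 
method no longer gets in the way. A minimal sketch, assuming current D 
and Phobos (std.conv.to stands in for the conversion functions 
discussed here; the Point class is hypothetical):

```d
import conv = std.conv; // renamed import: members are reached as conv.xxx

class Point
{
    int x, y;

    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Inside the class, an unqualified call to a function named
    // toString would find this method first. Going through the short
    // alias makes it explicit that the library conversion is meant.
    override string toString()
    {
        return "(" ~ conv.to!string(x) ~ ", " ~ conv.to!string(y) ~ ")";
    }
}

void main()
{
    auto p = new Point(3, 4);
    assert(p.toString() == "(3, 4)");
}
```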

> 
> (*) I think people find comfort in it because they have been abused by 
> other languages.  In C and C++ land, I agree --- if you do a '#define 
> binary 1' in an include file somewhere, you can kill an algorithm in 
> another file that is a dozen includes up the chain -- I found exactly 
> this definition in a file at my job, and it was an 'interesting' problem 
> to debug.  Working on large C and C++ projects breeds a kind of paranoia 
> about symbol tables that I can completely relate to.
> 
> Sometimes the combination of #include and #define is a lot like "come 
> from" in the way that it messes with the debugging process.
> 
> http://en.wikipedia.org/wiki/Come_from
> 
> But again, sorry if I'm being nitpicky.

Not at all!

Tango is in early Beta, and this is exactly what's needed to file off 
the rough edges. We may not implement *everything* that everyone 
suggests, but every bug-report and every little nit-pick is wholly 
welcomed; seriously :)

>
> Kevin
>