toString refactor in druntime

Manu via Digitalmars-d digitalmars-d at puremagic.com
Sat Nov 1 06:30:29 PDT 2014


On 31 October 2014 01:30, Steven Schveighoffer via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> On 10/28/14 7:06 PM, Manu via Digitalmars-d wrote:
>>
>> On 28 October 2014 22:51, Steven Schveighoffer via Digitalmars-d
>> <digitalmars-d at puremagic.com> wrote:
>>>
>>> On 10/27/14 8:01 PM, Manu via Digitalmars-d wrote:
>>>>
>>>>
>>>> On 28 October 2014 04:40, Benjamin Thaut via Digitalmars-d
>>>> <digitalmars-d at puremagic.com> wrote:
>>>>>
>>>>>
>>>>> On 27.10.2014 11:07, Daniel Murphy wrote:
>>>>>
>>>>>> "Benjamin Thaut"  wrote in message
>>>>>> news:m2kt16$2566$1 at digitalmars.com...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I'm planning on doing a pull request for druntime which rewrites
>>>>>>> every toString function within druntime to use the new sink
>>>>>>> signature. That way druntime would cause far fewer allocations which
>>>>>>> end up being garbage right away. Are there any objections against
>>>>>>> doing so? Any reasons why such a pull request would not get accepted?
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> How ugly is it going to be, since druntime can't use std.format?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> They wouldn't get any uglier than they already are, because the current
>>>>> toString functions within druntime also can't use std.format.
>>>>>
>>>>> An example would be the toString function of TypeInfo_StaticArray:
>>>>>
>>>>> override string toString() const
>>>>> {
>>>>>           SizeStringBuff tmpBuff = void;
>>>>>           return value.toString() ~ "[" ~
>>>>>               cast(string)len.sizeToTempString(tmpBuff) ~ "]";
>>>>> }
>>>>>
>>>>> Would be replaced by:
>>>>>
>>>>> override void toString(void delegate(const(char)[]) sink) const
>>>>> {
>>>>>           SizeStringBuff tmpBuff = void;
>>>>>           value.toString(sink);
>>>>>           sink("[");
>>>>>           sink(cast(string)len.sizeToTempString(tmpBuff));
>>>>>           sink("]");
>>>>> }
>>>>
>>>>
>>>>
>>>> The thing that really worries me about this sink API is that your code
>>>> here produces (at least) 4 calls to a delegate. That's a lot of
>>>> indirect function calling, which can be a severe performance hazard on
>>>> some systems.
>>>> We're trading out garbage for low-level performance hazards, which may
>>>> imply a reduction in portability.
>>>
>>>
>>>
>>> I think given the circumstances, we are better off. But when we find a
>>> platform that does perform worse, we can try and implement alternatives.
>>> I don't want to destroy performance on the platforms we *do* support, for
>>> the worry that some future platform isn't as friendly to this method.
>>
>>
>> Video games consoles are very real, and very now.
>> I suspect they may even represent the largest body of native code in
>> the world today.
>
>
> Sorry, I meant future *D supported* platforms, not future not-yet-existing
> platforms.

I'm not sure what you mean. I've used D on current and existing games
consoles. I personally think it's one of D's most promising markets...
if not for just a couple of remaining details.

Also, my suggestion will certainly perform better on all platforms.
There is no platform that can benefit from the existing proposal of an
indirect function call per write vs something that doesn't.
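
To make the cost concrete, here is a minimal sketch of the sink pattern
quoted above. `StaticArrayInfo` and `formatSize` are hypothetical stand-ins
invented for this illustration, not druntime code; the point is only to
count how many indirect calls one toString produces.

```d
import std.stdio : writeln;

// Writes the decimal digits of n into the tail of buf and returns that slice.
// Hypothetical helper; 20 chars is enough for any 64-bit value.
char[] formatSize(size_t n, char[] buf)
{
    size_t i = buf.length;
    do
    {
        buf[--i] = cast(char)('0' + n % 10);
        n /= 10;
    } while (n != 0);
    return buf[i .. $];
}

// Hypothetical stand-in for a druntime type with a sink-style toString.
struct StaticArrayInfo
{
    string elemName;
    size_t len;

    // Every fragment handed to the sink costs one indirect call.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        char[20] tmp = void;
        sink(elemName);
        sink("[");
        sink(formatSize(len, tmp[]));
        sink("]");
    }
}

void main()
{
    size_t calls;
    char[] result;   // GC array, used here only to observe the output
    auto info = StaticArrayInfo("int", 16);
    info.toString((const(char)[] s) { ++calls; result ~= s; });

    assert(result == "int[16]");
    assert(calls == 4);   // one delegate call per fragment
    writeln(result, " took ", calls, " delegate calls");
}
```

The counting delegate is only there to make the calls visible; the number
of calls is fixed by how the toString body is written, not by the caller.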

>> I don't know if 'alternatives' is the right phrase, since this
>> approach isn't implemented yet, and I wonder if a slightly different
>> API strategy exists which may not exhibit this problem.
>
>
> Well, the API already exists and is supported. The idea is to migrate the
> existing toString calls to the new API.

Really? Bummer... I haven't seen this API anywhere yet.
Seems a shame to make such a mistake with a brand new API. Too many
competing API patterns :/

>>> But an aggregate which relies on members to output themselves is going to
>>> have a tough time following this model. Only at the lowest levels can we
>>> enforce such a rule.
>>
>>
>> I understand this, which is the main reason I suggest to explore
>> something other than a delegate based interface.
>
>
> Before we start ripping apart our existing APIs, can we show that the
> performance is really going to be so bad? I know virtual calls have a bad
> reputation, but I hate to make these choices absent real data.

My career for a decade always seems to find its way back to fighting
virtual calls. (in proprietary codebases so I can't easily present
case studies)
But it's too late now I guess. I should have gotten in when someone
came up with the idea... I thought it was new.

> For instance, D's underlying i/o system uses FILE *, which is about as
> virtual as you can get. So are you avoiding a virtual call to use a buffer
> to then pass to a virtual call later?

I do a lot of string processing, but it never finds its way to a
FILE*. I don't write console based software.

>>> Another thing to think about is that the inliner can potentially get rid
>>> of the cost of delegate calls.
>>
>>
>> druntime is a binary lib. The inliner has no effect on this equation.
>
>
> It depends on the delegate and the item being output, whether the source is
> available to the compiler, and whether or not it's a virtual function. True,
> some cases will not be inlinable. But the "tweaks" we implement for platform
> X which does not do well with delegate calls, could be to make this more
> available.

I suspect the cases where the inliner can do something useful would be
in quite a significant minority (with respect to phobos and druntime
in particular). I haven't tried it, but I have a lifetime of
disassembling code of this sort, and I'm very familiar with the
optimisation patterns.

>>>> Ideally, I guess I'd prefer to see an overload which receives a slice
>>>> to write to instead and do away with the delegate call. Particularly
>>>> in druntime, where API and potential platform portability decisions
>>>> should be *super*conservative.
>>>
>>>
>>>
>>> This puts the burden on the caller to ensure enough space is allocated.
>>> Or you have to reenter the function to finish up the output. Neither of
>>> these seem like acceptable drawbacks.
>>
>>
>> Well that's why I open for discussion. I'm sure there's room for
>> creativity here.
>>
>> It doesn't seem that unreasonable to reenter the function to me
>> actually, I'd prefer a second static call in the rare event that a
>> buffer wasn't big enough, to many indirect calls in every single case.
>
>
> A reentrant function has to track the state of what has been output, which
> is horrific in my opinion.

How so? It doesn't seem that bad to me. We're talking about druntime
here, the single most used library in the whole ecosystem... that shit
should be tuned to the max. It doesn't matter how pretty the code is.

>> There's no way that reentry would be slower. It may be more
>> inconvenient, but I wonder if some API creativity could address
>> that...?
>
>
> The largest problem I see is, you may not know before you start generating
> strings whether it will fit in the buffer, and therefore, you may still end
> up eventually calling the sink.

Right. The API should be structured to make a virtual call _only_ in
the rare instance the buffer overflows. That is my suggestion.
You can be certain to supply a buffer that will not overflow in many/most cases.
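
As a rough illustration of that shape, the sketch below assumes a
hypothetical `BufferedSink` type (nothing like it is claimed to exist in
druntime). Writes are plain copies into a caller-supplied buffer, and the
delegate is invoked only when the buffer cannot hold the next piece. The
`Collector` helper and the `describe` function are likewise made up for the
demo, and `Collector` appends to a GC array purely to keep the example short.

```d
import std.stdio : writeln;

// Hypothetical helper type, invented for this sketch.
struct BufferedSink
{
    char[] buf;                             // caller-supplied storage
    size_t used;                            // number of chars of buf in use
    void delegate(const(char)[]) overflow;  // invoked only when buf is full

    // Common path: a plain copy into buf, no indirect call.
    void put(const(char)[] s)
    {
        if (s.length > buf.length - used)
        {
            flush();                        // rare path: drain what we have
            if (s.length > buf.length)
            {
                overflow(s);                // piece too big to ever buffer
                return;
            }
        }
        buf[used .. used + s.length] = s[];
        used += s.length;
    }

    // Hands any buffered text to the overflow handler and empties buf.
    void flush()
    {
        if (used != 0)
        {
            overflow(buf[0 .. used]);
            used = 0;
        }
    }
}

// Stand-in for a toString implementation written against BufferedSink.
void describe(ref BufferedSink sink)
{
    sink.put("int");
    sink.put("[");
    sink.put("16");
    sink.put("]");
}

// Collects overflow text. A struct method is used so that taking its
// address does not require a heap-allocated closure.
struct Collector
{
    char[] text;
    size_t calls;
    void take(const(char)[] s) { ++calls; text ~= s; }
}

void main()
{
    // Roomy buffer: the overflow handler is never called.
    char[32] big = void;
    Collector unused;
    auto roomy = BufferedSink(big[], 0, &unused.take);
    describe(roomy);
    assert(unused.calls == 0);
    assert(roomy.buf[0 .. roomy.used] == "int[16]");

    // Tiny buffer: text spills to the handler, still in order.
    char[4] small = void;
    Collector spill;
    auto tight = BufferedSink(small[], 0, &spill.take);
    describe(tight);
    tight.flush();                          // push out the buffered tail
    assert(spill.text == "int[16]");
    writeln(spill.text, " via ", spill.calls, " overflow calls");
}
```

The trade-off in this sketch is that the caller has to call `flush()` once
at the end if anything overflowed, which is the price of keeping the common
path free of indirect calls.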

> Note, you can always allocate a stack buffer, use an inner function as a
> delegate, and get the inliner to remove the indirect calls. Or use an
> alternative private mechanism to build the data.

We're talking about druntime specifically. It is a binary lib. The
inliner won't save you.
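
For reference, the stack-buffer-plus-inner-function pattern mentioned above
might look like the following sketch. `describe` is a hypothetical stand-in
for a library routine. Because the sink parameter is `scope`, passing the
nested function should not allocate a closure, but if `describe` lives in a
precompiled library, its four calls through `sink` remain indirect no matter
how the caller is compiled.

```d
import std.stdio : writeln;

// Stands in for a routine inside a precompiled library (hypothetical).
void describe(scope void delegate(const(char)[]) sink)
{
    sink("int");
    sink("[");
    sink("16");
    sink("]");
}

void main()
{
    char[64] buf = void;   // stack storage for the formatted text
    size_t used;

    // Nested function: taking its address yields a delegate whose context
    // is main's stack frame.
    void append(const(char)[] s)
    {
        assert(s.length <= buf.length - used, "demo buffer too small");
        buf[used .. used + s.length] = s[];
        used += s.length;
    }

    describe(&append);
    assert(buf[0 .. used] == "int[16]");
    writeln(buf[0 .. used]);
}
```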

> Would you say that *one* delegate call per object output is OK?

I would say that an uncontrollable virtual call is NEVER okay,
especially in otherwise trivial and such core functions like toString
in druntime. But one is certainly better than many.
Remember I was arguing for final-by-default for years (because it's
really important)... and I'm still extremely bitter about that
outcome.

>>> What would you propose for such a mechanism? Maybe I'm not thinking of
>>> your ideal API.
>>
>>
>> I haven't thought of one I'm really happy with.
>> I can imagine some 'foolproof' solution at the API level which may
>> accept some sort of growable string object (which may represent a
>> stack allocation by default). This could lead to a virtual call if the
>> buffer needs to grow, but that's not really any worse than a delegate
>> call, and it's only in the rare case of overflow, rather than many
>> calls in all cases.
>>
>
> This is a typical mechanism that Tango used -- pass in a ref to a dynamic
> array referencing a stack buffer. If it needed to grow, just update the
> length, and it moves to the heap. In most cases, the stack buffer is enough.
> But the idea is to try and minimize the GC allocations, which are
> performance killers on the current platforms.

I wouldn't hard-code to overflow to the GC heap specifically. It
should be an API that the user may overflow to wherever they like.
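
One way this could look, purely as a sketch: a growable buffer whose growth
policy is a delegate supplied by the caller. `GrowableText` and
`MallocGrower` are hypothetical names invented here; the malloc-based policy
is just one example of overflowing somewhere other than the GC heap.

```d
import core.stdc.stdlib : free, malloc;
import std.stdio : writeln;

// Hypothetical growable buffer: the caller decides where overflow goes.
struct GrowableText
{
    char[] buf;
    size_t used;
    // Must return storage of at least minCapacity chars whose first
    // `filled` chars are copied from `current`.
    char[] delegate(char[] current, size_t filled, size_t minCapacity) grow;

    void put(const(char)[] s)
    {
        if (s.length > buf.length - used)
            buf = grow(buf, used, (used + s.length) * 2); // rare path
        buf[used .. used + s.length] = s[];
        used += s.length;
    }
}

// One possible policy: spill to the C heap instead of the GC heap.
struct MallocGrower
{
    void* block;    // last malloc'ed block, null until the first growth
    size_t calls;

    char[] grow(char[] current, size_t filled, size_t minCapacity)
    {
        ++calls;
        auto p = cast(char*) malloc(minCapacity);
        assert(p !is null, "out of memory");
        p[0 .. filled] = current[0 .. filled];
        free(block);            // free(null) is a no-op on the first growth
        block = p;
        return p[0 .. minCapacity];
    }
}

void main()
{
    char[4] stack = void;       // deliberately small to force one growth
    MallocGrower grower;
    scope (exit) free(grower.block);

    auto text = GrowableText(stack[], 0, &grower.grow);
    text.put("int");
    text.put("[");
    text.put("16");
    text.put("]");

    assert(text.buf[0 .. text.used] == "int[16]");
    assert(grower.calls == 1);  // a single indirect call, only on overflow
    writeln(text.buf[0 .. text.used]);
}
```

A different policy (an arena, a region allocator, or the GC) would only
need a different `grow` implementation; `GrowableText` itself would not
change.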

> I think adding the option of using a delegate is not limiting -- you can
> always, on a platform that needs it, implement an alternative protocol that
> is internal to druntime. We are not preventing such protocols by adding the
> delegate version.

You're saying that some platform may need to implement a further
completely different API? Then no existing code will compile for that
platform. This is madness. We already have more than enough APIs.

> But on our currently supported platforms, the delegate vs. GC call is so
> much better. I can't see any reason to avoid the latter.

The latter? (the GC?) .. Sorry, I'm confused.