What are best practices around toString?

Sat Oct 1 08:26:43 UTC 2022

On Friday, 30 September 2022 at 13:11:56 UTC, christian.koestlin 
wrote:
> Dear Dlang experts,
>
> up until now I was perfectly happy with implementing 
> `(override) string toString() const` or something to get nicely 
> formatted (mostly debug) output for my structs, classes and 
> exceptions.

Human beings read extremely slowly compared to how quickly the GC 
can allocate and free `string`s as needed, so there is no need to 
complicate your code with more text formatting strategies unless 
you want to generate this debug output far faster than a human 
can actually read it.

> But recently I stumbled upon 
> https://wiki.dlang.org/Defining_custom_print_format_specifiers 
> and additionally 
> https://github.com/dlang/dmd/blob/4ff1eec2ce7d990dcd58e5b641ef3d0a1676b9bb/druntime/src/object.d#L2637 
> which at first sight is great, because it provides the same 
> customization of an object's representation with fewer memory 
> allocations.
>
> When grepping through phobos, there are a bunch of "different" 
> signatures implemented for this, e.g.
>
> ```d
> ...
> phobos/std/typecons.d:        void toString(DG)(scope DG sink) 
> const
> ...
> phobos/std/typecons.d:        void toString(DG, Char)(scope DG 
> sink,  scope const ref FormatSpec!Char fmt) const
> ...
> phobos/std/typecons.d:        void toString()(scope void 
> delegate(const(char)[]) sink, scope const ref FormatSpec!char 
> fmt)
> ...
> phobos/std/sumtype.d:        void toString(this This, Sink, 
> Char)(ref Sink sink, const ref FormatSpec!Char fmt);
> ...
> ```
> to just show a few.

The `FormatSpec` parameter only belongs there if you're actually 
going to do something useful with it in your `toString` 
implementation. Even if you are going to use it, you should 
probably still provide a convenience overload with a default 
specifier.
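
As a rough sketch of that convenience-overload idea: the `Cheer` type and its fields below are made up for illustration, and I am using the output-range style signature that the `std.format` documentation describes, not any of the exact signatures from the grep output above. The full overload applies the caller's specifier to the integer field, and the short overload just forwards with a default `%s`.

```D
import std.format : FormatSpec, format, formatValue, singleSpec;
import std.range.primitives : put;

// Hypothetical example type, not taken from Phobos or the original question.
struct Cheer {
    string message;
    int enthusiasm;

    // Full overload: applies the caller's format specifier to the integer.
    void toString(W)(ref W sink, scope const ref FormatSpec!char fmt) const {
        put(sink, message);
        put(sink, " x ");
        formatValue(sink, enthusiasm, fmt);
        put(sink, "!");
    }

    // Convenience overload: forwards with a default "%s" specifier.
    void toString(W)(ref W sink) const {
        auto spec = singleSpec("%s");
        toString(sink, spec);
    }
}

unittest {
    import std.array : appender;

    // std.format picks up the overload that takes a FormatSpec.
    assert(format("%s", Cheer("hooray", 3)) == "hooray x 3!");
    assert(format("%03d", Cheer("hooray", 3)) == "hooray x 003!");

    // Direct callers can skip the specifier entirely.
    auto app = appender!string();
    Cheer("hooray", 3).toString(app);
    assert(app.data == "hooray x 3!");
}
```

If the type never looks at `fmt`, the second overload alone is enough.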

> Furthermore, when one works with instances of structs, objects 
> or exceptions, `aInstance.toString()` does not "work" when one 
> only implements the sink interface (which is to be expected), 
> whereas a `std.conv.to!string` or a formatted write with `%s` 
> always works (no matter what was used to implement the 
> toString).

I generally do something like this:

```D
struct A {
     string message;
     int enthusiasm;

     void toString(DG)(scope DG sink) scope const @safe
         if(is(DG : void delegate(scope const(char[])) @safe)
         || is(DG : void function(scope const(char[])) @safe))
     {
         import std.format : formattedWrite;
         sink(message);
         sink(" x ");
         formattedWrite!"%d"(sink, enthusiasm);
         sink("!");
     }
     string toString() scope const pure @safe {
         StringBuilder builder;
         toString(&(builder.opCall)); // Find the exact string length.
         builder.allocate();
         toString(&(builder.opCall)); // Actually write the chars.
         return builder.finish();
     }
}
```

So, the first `toString` overload defines how to format the value 
to text, while the second overload does memory management and 
forwards the formatting work to the first.

`StringBuilder` is a utility shared across the entire project:

```D
struct StringBuilder {
private:
     char[] buffer;
     size_t next;

public:
     void opCall(scope const(char[]) str) scope pure @safe nothrow @nogc {
         const curr = next;
         next += str.length;
         if(buffer !is null)
             buffer[curr .. next] = str[];
     }
     void allocate() scope pure @safe nothrow {
         buffer = new char[next];
         next = 0;
     }
     void allocate(const(size_t) maxLength) scope pure @safe nothrow {
         buffer = new char[maxLength];
         next = 0;
     }
     string finish() pure @trusted nothrow @nogc {
         assert(buffer !is null);
         string ret = cast(immutable) buffer[0 .. next];
         buffer = null;
         next = 0;
         return ret;
     }
}
```

The first formatting pass to find the required buffer length can 
be skipped if you can somehow pre-calculate the maximum possible 
length, or if you prefer the common strategy of repeatedly 
re-allocating the buffer with exponentially increasing size used 
by the likes of `std.array.Appender`. Since the API for 
`toString` remains the same regardless, you are free to choose 
the best strategy for each type.
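
For comparison, here is a minimal sketch of the `std.array.Appender` variant. The struct `B` is a hypothetical stand-in with the same fields as `A`, and I have left out the `scope`/`@safe` annotations to keep it short, so treat it as an illustration, not a drop-in replacement. The formatting overload is unchanged in spirit; only the allocating overload differs, using a single pass with a growing buffer.

```D
import std.array : appender;
import std.format : formattedWrite;

// Hypothetical example type with the same fields as `A` above.
struct B {
    string message;
    int enthusiasm;

    // Formatting logic: writes the text piece by piece into a sink delegate.
    void toString(DG)(scope DG sink) const
        if (is(DG : void delegate(const(char)[])))
    {
        sink(message);
        sink(" x ");
        formattedWrite(sink, "%d", enthusiasm);
        sink("!");
    }

    // Memory management: one pass, letting Appender grow its buffer as needed.
    string toString() const {
        auto app = appender!string();
        toString((const(char)[] s) { app.put(s); });
        return app.data;
    }
}

unittest {
    assert(B("hooray", 3).toString() == "hooray x 3!");
}
```

This trades the second formatting pass for possible re-allocations while the buffer grows.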