Semantics of toString
Denis Koroskin
2korden at gmail.com
Tue Nov 10 09:20:16 PST 2009
On Tue, 10 Nov 2009 15:30:20 +0300, Don <nospam at nospam.com> wrote:
> Bill Baxter wrote:
>> On Tue, Nov 10, 2009 at 2:51 AM, Don <nospam at nospam.com> wrote:
>>> Lutger wrote:
>>>> Justin Johansson wrote:
>>>>
>>>>> Lutger Wrote:
>>>>>
>>>>>> Justin Johansson wrote:
>>>>>>
>>>>>>> I assert that the semantics of "toString" or similarly
>>>>>>> named/purposed
>>>>>>> methods/functions in many PL's (including and not limited to D) is
>>>>>>> ill-defined.
>>>>>>>
>>>>>>> To put this statement into perspective, I would be most
>>>>>>> appreciative of
>>>>>>> D NG readers responding with their own idea(s) of what the
>>>>>>> semantics of
>>>>>>> "toString" are (or should be) in a language agnostic ideology.
>>>>>>>
>>>>>> My other reply didn't take the language agnostic into account,
>>>>>> sorry.
>>>>>>
>>>>>> Semantics of toString would depend on the object, I would think
>>>>>> there
>>>>>> are
>>>>>> three general types of objects:
>>>>>>
>>>>>> 1. objects with only one sensible or one clear default string
>>>>>> representations, like integers. Maybe even none of these exist
>>>>>> (except
>>>>>> strings themselves?)
>>>>>>
>>>>>> 2. objects that, given some formatting options or locale have a
>>>>>> clear
>>>>>> string representation. floating points, dates, currency and the like.
>>>>>>
>>>>>> 3. objects that have no sensible default representation.
>>>>>>
>>>>>> toString() would not make sense for 3) type objects and only for 2)
>>>>>> type
>>>>>> objects as part of a formatting / localization package.
>>>>>>
>>>>>> toString() as a debugging aid sometimes doubles as a formatter for
>>>>>> 1)
>>>>>> and
>>>>>> 2) class objects, but that may be more confusing than it's worth.
>>>>>>
>>>>> Thanks for that Lutger.
>>>>>
>>>>> Do you think it would make better sense if programming
>>>>> languages/their
>>>>> libraries separated functions/methods which are currently loosely
>>>>> purposed
>>>>> as "toString" into methods which are more specific to the types you
>>>>> suggest (leaving only the types/classifications and number thereof to
>>>>> argue about)?
>>>>>
>>>>> In my own D project, I've introduced a toDebugString method and left
>>>>> toString alone. There are times when I like D's default toString
>>>>> printing
>>>>> out the name of the object
>>>>> class. For debug purposes there are times also when I like to see a
>>>>> string printed
>>>>> out in quotes so you can tell the difference between "123" and 123.
>>>>> Then
>>>>> again, and since I'm working on a scripting language, sometimes I
>>>>> like to
>>>>> see debug output distinguish between different numeric types.
>>>>>
>>>>> Anyway going by the replies on this topic, looks like most people
>>>>> view
>>>>> toString as being good for debug purposes and that about it.
>>>>>
>>>>> Cheers
>>>>> Justin
>>>>>
>>>> Your design makes better sense (to me at least) because it is based
>>>> on why
>>>> you want a string from some object.
>>>> Take .NET for example: it does provide very elaborate and nice
>>>> formatting
>>>> options based on toString() with parameters. For some types however,
>>>> the
>>>> default toString() gives you the name of the type itself which is in
>>>> no way
>>>> related to formatting an object. You learn to work with it, but I
>>>> find it a
>>>> bit muddled.
>>>> As a last note, I think people view toString as a debug thing mostly
>>>> because it is very underpowered.
>>> There is a definite use for such a thing. But the existing toString()
>>> is
>>> much, much worse than useless. People think you can do something with
>>> it,
>>> but you can't.
>>> eg, people have asked for BigInt to support toString(). That is an
>>> over-my-dead-body.
>> You can definitely do something with it -- printf debugging. And if I
>> were using BigInt, that's exactly why I'd want BigInt to have a
>> toString.
>
> I almost always want to print the value out in hex. And with some kind
> of digit separators, so that I can see how many digits it has.
>
>> Just out of curiosity, how does someone print out the
>> value of a BigInt right now?
>
> In Tango, there's just .toHex() and .toDecimalString(). Needs proper
> formatting options, it's the biggest thing which isn't done. I hit one
> too many compiler segfaults and started patching the compiler instead
> <g>. But I really want a decent toString().
>
> Given a BigInt n, you should be able to just do
>
> writefln("%s %x", n, n); // Phobos
> formatln("{0} {0:X}", n); // Tango
>
> To solve this part of the issue, it would be enough to have toString()
> take a string parameter. (it would be "x" or "X" in this case).
>
> string toString(string fmt);
> But the performance would still be very poor, and that's much more
> difficult to solve.

Yes, it would solve half of the toString problems.

The other half (i.e. memory allocation) could be solved by passing an
optional buffer to toString:

    // "s" comes from %s, which is the default qualifier
    char[] toString(string format = "s", char[] buffer = null)
    {
        // operate on the buffer, possibly resizing it
        // which is safe and fast - it only allocates
        // when *really* necessary, instead of always, as now
        return buffer;
    }

You can use it almost the same way you used it before:

    // assumeUnique is needed because we return a mutable string now
    string s = assumeUnique(someObject.toString());

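
To make the contract concrete, here is a minimal sketch of one type that follows it. Counter is a hypothetical type made up for this example (it is not part of Phobos or Tango), and the only format strings it understands are "s", "x" and "X". It writes into the caller's buffer when the result fits and allocates only otherwise.

```d
import std.exception : assumeUnique;
import std.stdio : writeln;

// Hypothetical type used only for illustration.
struct Counter
{
    uint value;

    // fmt is "s" (decimal), "x" or "X" (hexadecimal).
    // Writes into `buffer` when it is large enough, allocates otherwise.
    char[] toString(string fmt = "s", char[] buffer = null) const
    {
        immutable uint radix = (fmt == "x" || fmt == "X") ? 16 : 10;
        immutable string digits = (fmt == "X") ? "0123456789ABCDEF"
                                               : "0123456789abcdef";

        // Produce the digits in reverse order into a small scratch array.
        char[32] scratch;
        size_t n = 0;
        uint v = value;
        do
        {
            scratch[n++] = digits[v % radix];
            v /= radix;
        } while (v != 0);

        // Allocate only when the caller's buffer is too small.
        if (buffer.length < n)
            buffer = new char[n];

        foreach (i; 0 .. n)
            buffer[i] = scratch[n - 1 - i];
        return buffer[0 .. n];
    }
}

void main()
{
    char[16] stackBuffer;
    auto c = Counter(255);

    // The result is built in the caller's buffer, so nothing is allocated.
    char[] inPlace = c.toString("x", stackBuffer[]);
    assert(inPlace == "ff");
    assert(inPlace.ptr is stackBuffer.ptr);

    // The old calling style still works; a null buffer forces an allocation.
    char[] tmp = c.toString();
    string s = assumeUnique(tmp);
    assert(s == "255");

    writeln(inPlace, " ", s);
}
```

The caller can tell whether an allocation happened by comparing the returned pointer with the pointer of the buffer it passed in.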
Optimization example:

    int sprintf(string format, ...)
    {
        char[512] preallocatedBuffer;
        char[] buffer = preallocatedBuffer[]; // buffer may grow, but
                                              // initially points to preallocatedBuffer
        char[] storage = buffer[]; // free space for the current element
        ...
        for (...) { // iterate over qualifiers (and arguments)
            string currentQualifier = format[i..j];
            auto currentArgument = argsTuple[n];
            char[] result = currentArgument.toString(currentQualifier, storage);
            if (result.ptr is storage.ptr) {
                // okay, string was constructed in-place
                storage = storage[result.length..$];
            } else {
                // storage didn't have enough space for the whole
                // string (toString allocated a new one)
                size_t offset = buffer.length - storage.length;
                // increase the capacity until the result fits
                while (buffer.length - offset < result.length)
                    buffer.length = buffer.length * 2;
                // append our string to the buffer
                buffer[offset..offset+result.length] = result[];
                // the rest of the buffer is the new free space
                storage = buffer[offset+result.length..$];
            }
        }
        ...
    }

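
For anyone who wants to play with the idea, here is a small self-contained sketch of the same loop with the format parsing left out. Item and joinItems are hypothetical names invented for this example, and Item simply copies its text, so this only demonstrates the buffer handling, not real formatting.

```d
import std.stdio : writeln;

// Hypothetical element type; stands in for any object with the proposed toString.
struct Item
{
    string text;

    // Copies `text` into `buffer` if it fits, otherwise returns a fresh array.
    char[] toString(string fmt = "s", char[] buffer = null) const
    {
        if (buffer.length < text.length)
            buffer = new char[text.length];
        buffer[0 .. text.length] = text[];
        return buffer[0 .. text.length];
    }
}

// Concatenates the items, starting in `initial` and moving to the heap
// only when an item does not fit in the remaining space.
char[] joinItems(const Item[] items, char[] initial)
{
    char[] buffer = initial;   // may be replaced by a heap array later
    size_t used = 0;           // number of characters written so far

    foreach (item; items)
    {
        char[] storage = buffer[used .. $];
        char[] result = item.toString("s", storage);
        if (result.length != 0 && result.ptr !is storage.ptr)
        {
            // The item did not fit: grow until it does, then copy it in.
            size_t needed = used + result.length;
            size_t newLength = buffer.length ? buffer.length : 16;
            while (newLength < needed)
                newLength *= 2;
            auto grown = new char[newLength];
            grown[0 .. used] = buffer[0 .. used];
            grown[used .. needed] = result[];
            buffer = grown;
        }
        used += result.length;
    }
    return buffer[0 .. used];
}

void main()
{
    char[8] stackBuffer;
    auto items = [Item("ab"), Item("cd"), Item("efghij")];

    // Ten characters do not fit in eight, so the result ends up on the heap.
    char[] joined = joinItems(items, stackBuffer[]);
    assert(joined == "abcdefghij");
    assert(joined.ptr !is stackBuffer.ptr);

    // Four characters fit, so everything is built in place.
    char[] small = joinItems(items[0 .. 2], stackBuffer[]);
    assert(small == "abcd");
    assert(small.ptr is stackBuffer.ptr);

    writeln(joined, " ", small);
}
```

The sketch allocates the grown buffer explicitly instead of resizing the slice, only to make it obvious where the single allocation happens.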
Another example:

    class Array(T)
    {
        // ...
        private T[] elements;

        char[] toString(string format = "s", char[] buffer = null) {
            auto builder = StringBuilder(buffer); // reallocates when no space left
            builder.append("[");
            foreach (i, o; elements) {
                if (i > 0) builder.append(", "); // separator
                // unused tail of the builder's buffer
                buffer = builder.getBuffer()[builder.length..$];
                char[] result = o.toString(format, buffer);
                if (result.ptr is buffer.ptr) {
                    // no reallocation
                    builder.length += result.length; // without copying
                } else {
                    builder.append(result);
                }
            }
            builder.append("]");
            return builder.toString();
        }
    }

    auto array = new Array!(int);
    array ~= [0, 1, 2, 3, 4];
    assert(array.toString() == "[0, 1, 2, 3, 4]");

It's not very easy to take advantage of, but it's usable the old way
(well, almost).
Any ideas?