Phobos' std.conv.to-conversion from enum to string doesn't scale beyond hundreds of enumerators
Steven Schveighoffer
schveiguy at yahoo.com
Fri Jun 22 18:15:22 UTC 2018
On 6/22/18 1:41 PM, Stefan Koch wrote:
> On Friday, 22 June 2018 at 00:50:05 UTC, Steven Schveighoffer wrote:
>> On 6/21/18 6:46 PM, Per Nordlöw wrote:
>>> I've discovered the annoying fact that std.conv.to doesn't scale for
>>> enum to string conversion when the enum has hundreds of members. This
>>> because of a call to `NoDuplicates` which has (at least) O(n*log(n)
>>> time and space complexity.
>>>
>>> So I've come up with
>>>
>>>
>>> /** Faster implementation of `std.conv.to`.
>>> */
>>> string toString(T)(T value) @safe pure nothrow @nogc
>>> if (is(T == enum))
>>> {
>>> final switch (value)
>>> {
>>> static foreach (member; __traits(allMembers, T))
>>> {
>>> case __traits(getMember, T, member):
>>> return member;
>>> }
>>> }
>>> }
>>>
>>> ///
>>> @safe pure nothrow @nogc unittest
>>> {
>>> enum E { unknown, x, y, z, }
>>> assert(E.x.toString == "x");
>>> assert(E.y.toString == "y");
>>> assert(E.z.toString == "z");
>>> }
>>>
>>>
>>> The question know is: How do I make this support enums with
>>> enumerator aliases without needing to call `NoDuplicates`?
>>
>> You can't. That's why it's in there. And I can't think of a better
>> algorithm than O(nlgn). That's what it would cost to sort anyway.
>>
>> The sucky thing is, the compiler is *already* doing a sort on the
>> items in the switch, and *already* doing the duplicate check. It would
>> be cool to be able to leverage this mechanism to avoid the library
>> solution, but I don't know how we can do that, as the semantics for
>> switch are well defined, and there's no other way to hook this builtin
>> functionality.
>>
>> One thing that may be worthwhile is checking to see if a CTFE solution
>> is better than a template solution. In other words:
>>
>> string[] dedup(string[] identifiers) { ... }
>>
>> enum items = dedup(__traits(allMembers, T));
>>
>> static foreach(member; items) ...
>>
>> Just avoiding all the template symbol generation may make it worth it,
>> even if the dedup function is O(n^2) complexity.
>>
>
>
> I do have my own ctfe utils for this
> {https://forum.dlang.org/post/bfnwstkafhfgihavtzsz@forum.dlang.org},
> however without dedup (because I'd rather be alerted when it happens)
> what you have to do is to sort the enum values.
> (I'd suggest a non-recursive mergesort.)
How will that perform in CTFE? I'm concerned about swapping values
making it allocate new arrays all over the place.
> However if there are no duplicates there's no need to deduplicate, the
> reason it is deduplicated it because the switch_statement forbids
> duplicates.
If you can sort, deduplicating is as easy as checking if the value is
the same as the previous. So really, we just need a good compile-time
sort for strings.
> I may add that this performs very well with newCTFE I don't know about
> the old engine, but I'd assume even there it's faster.
I think your algorithm is roughly the same as Nordlow's. Basically you
are using the compiler to sort the array (as it does anyway for a switch
statement).
-Steve
More information about the Digitalmars-d
mailing list