Mutable enums

Mon Nov 14 12:39:55 PST 2011

On Mon, 14 Nov 2011 14:59:50 -0500, Timon Gehr <timon.gehr at gmx.ch> wrote:

> On 11/14/2011 08:37 PM, Steven Schveighoffer wrote:
>> On Mon, 14 Nov 2011 13:37:18 -0500, Timon Gehr <timon.gehr at gmx.ch>  
>> wrote:
>>
>>> On 11/14/2011 02:13 PM, Steven Schveighoffer wrote:
>>>> On Mon, 14 Nov 2011 03:27:21 -0500, Timon Gehr <timon.gehr at gmx.ch>
>>>> wrote:
>>>>
>>>>> On 11/14/2011 01:02 AM, bearophile wrote:
>>>>>> Jonathan M Davis:
>>>>>>
>>>>>>>> import std.algorithm;
>>>>>>>> void main() {
>>>>>>>> enum a = [3, 1, 2];
>>>>>>>> enum s = sort(a);
>>>>>>>> assert(equal(a, [3, 1, 2]));
>>>>>>>> assert(equal(s, [1, 2, 3]));
>>>>>>>> }
>>>>>>>
>>>>>>> It's not a bug. Those are manifest constants. They're copy-pasted
>>>>>>> into whatever code you used them in. So,
>>>>>>>
>>>>>>> enum a = [3, 1, 2];
>>>>>>> enum s = sort(a);
>>>>>>>
>>>>>>> is equivalent to
>>>>>>>
>>>>>>> enum a = [3, 1, 2];
>>>>>>> enum s = sort([3, 1, 2]);
>>>>>>
>>>>>> You are right, there's no DMD bug here. Yet, it's a bit surprising
>>>>>> to sort in-place a "constant". I have to stop thinking of them as
>>>>>> constants. I don't like this design of enums...
>>>>>
>>>>> It is the right design. Why should enum imply const or immutable? (or
>>>>> inout, for that matter). They are completely orthogonal.
>>>>
>>>> There is definitely some debatable practice here for wherever enum is
>>>> used on an array.
>>>>
>>>> Consider that:
>>>>
>>>> enum a = "hello";
>>>>
>>>> foo(a);
>>>>
>>>> Does not allocate heap memory, even though "hello" is a reference  
>>>> type.
>>>>
>>>> However:
>>>>
>>>> enum a = ['h', 'e', 'l', 'l', 'o'];
>>>>
>>>> foo(a);
>>>>
>>>> Allocates heap memory every time a is *used*. This is
>>>> counter-intuitive: one uses enum to define things at compile time, not
>>>> at runtime. It's used to invoke CTFE, to avoid heap allocation. It's
>>>> not a glorified #define macro.
>>>>
>>>> The deep issue here is not that enum is used as a manifest constant,  
>>>> but
>>>> rather the fact that enum can map to a *function call* rather than the
>>>> *result* of that function call.
>>>>
>>>> Would you say this should be acceptable?
>>>>
>>>> enum a = malloc(5);
>>>>
>>>> foo(a); // calls malloc(5) and passes the result to foo.
>>>>
>>>> If the [...] form is an acceptable enum, I contend that malloc should  
>>>> be
>>>> acceptable as well.
>>>>
>>>
>>> a indeed refers to the result of the evaluation of ['h', 'e', 'l',
>>> 'l', 'o'].
>>>
>>> enum a = {return ['h', 'e', 'l', 'l', 'o'];}(); // also allocates on
>>> every use
>>>
>>> But malloc is not CTFE-able, that is why it fails.
>>
>> You are comparing apples to oranges here. Whether it's CTFE-able or not
>> has nothing to do with it, since the code is executed at runtime, not
>> compile time.
>>
>
> The code is executed at compile time. It is just that the value is later  
> created by allocating at runtime.
>
> enum foo = {writeln("foo"); return [1,2,3];}(); // fails, because  
> writeln is not ctfe-able.

Look at the code generated for enum a = [1, 2, 3].  Every use of a is
replaced with a call to _d_arrayliteral.  There is no CTFE going on.

>
>
>>>
>>>
>>>> My view is that enum should only be acceptable on data that is
>>>> immutable, or implicitly cast to immutable,
>>>
>>> Too restrictive imho.
>>
>> It allows the compiler to evaluate the enum at compile time, and store
>> any referenced data in ROM, avoiding frequent heap allocations (similar
>> to string literals).
>>
>> IMO, type freedom is lower on the priority list than performance.
>>
>> You can already define a symbol that calls arbitrary code at runtime:
>>
>> @property int[] a() { return [3, 1, 2];}
>>
>> Why should we muddy enum's goals with also being able to call functions
>> during runtime?
>>
>
> As I said, I would not miss the capability of enums to create mutable  
> arrays a lot. Usually you don't want that behavior, and explicitly  
> .dup-ing is just fine.
>
> But I think it is a bit exaggerated to say enums can call functions at  
> runtime. It is up to the compiler how to implement the array allocation.

The compiler has no choice.  It must build the array at runtime, or else
the type would allow one to modify the source value (just as in D1 you
could modify string literals).  In essence, the compiler is creating a new
copy for every usage (and building it from scratch).
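
A minimal sketch of that behaviour, assuming a current D2 compiler (the
names x and y are made up for illustration): each use of the enum yields
a separately allocated array, so mutating one copy leaves the others
untouched.

```d
enum a = [3, 1, 2];

void main()
{
    // Each use of `a` is replaced by a freshly built array literal.
    int[] x = a;
    int[] y = a;

    // The two uses refer to different memory blocks.
    assert(x.ptr !is y.ptr);

    // Mutating one copy affects neither the other copy nor later uses of `a`.
    x[0] = 42;
    assert(y[0] == 3);
    assert(a[0] == 3);
}
```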

>
>>>
>>>> and should *never* map to an
>>>> expression that calls a function during runtime.
>>>>
>>>
>>> Well, I would not miss that at all.
>>> But being stored as enum should not imply restrictions on type
>>> qualifiers.
>>
>> The restrictions are required in order to avoid calling runtime
>> functions for enum usage. Without the restrictions, you must necessarily
>> call runtime functions for any reference-based types (to avoid modifying
>> the original).
>
> Yes, I don't need that. But I don't really want compile time  
> capabilities hampered.
>
> enum a = [2,1,4];
> enum b = sort(a); // should be fine.

I was actually surprised that this compiles.  But this should not be a
problem even if a was immutable(int)[].  sort should be able to create a
copy of an immutable array in order to sort it.  The performance hit
doesn't matter, because this should all be done at compile time.

>>
>> Note that I'm not saying literals in general should not trigger heap
>> allocations, I'm saying assigning such literals to enums should require
>> unrestricted copying without runtime function calls.
>
> Yes, I get that. And I think it makes sense. But I am not (yet?)  
> convinced that the solution to make all enums non-assignable,  
> head-mutable and tail-immutable is satisfying.

When I see an enum, I think "evaluated at compile time".  No matter how
complex it is to build that value, it should be built at compile time and
*used* at runtime.  No complex function calls should be done at runtime;
an enum is a value.

I did an interesting little test:

import std.algorithm;
import std.stdio;

int[] foo(int[] x)
{
     return x ~ x;
}
enum a = [3, 1, 2];
enum b = sort(foo(foo(foo(a))));

void main()
{
     writeln(b);
}

Want to see the assembly generated for the writeln call?

                 push    018h
                 mov     EAX,offset FLAT:_D11TypeInfo_Ai6__initZ@SYM32
                 push    EAX
                 call    _d_arrayliteralTX@PC32
                 add     ESP,8
                 mov     ECX,1
                 mov     [EAX],ECX
                 mov     4[EAX],ECX
                 mov     8[EAX],ECX
                 mov     0Ch[EAX],ECX
                 mov     010h[EAX],ECX
                 mov     014h[EAX],ECX
                 mov     018h[EAX],ECX
                 mov     01Ch[EAX],ECX
                 mov     EDX,2
                 mov     020h[EAX],EDX
                 mov     024h[EAX],EDX
                 mov     028h[EAX],EDX
                 mov     02Ch[EAX],EDX
                 mov     030h[EAX],EDX
                 mov     034h[EAX],EDX
                 mov     038h[EAX],EDX
                 mov     03Ch[EAX],EDX
                 mov     EBX,3
                 mov     040h[EAX],EBX
                 mov     044h[EAX],EBX
                 mov     048h[EAX],EBX
                 mov     04Ch[EAX],EBX
                 mov     050h[EAX],EBX
                 mov     054h[EAX],EBX
                 mov     058h[EAX],EBX
                 mov     05Ch[EAX],EBX
                 mov     ECX,EAX
                 mov     EAX,018h
                 mov     -8[EBP],EAX
                 mov     -4[EBP],ECX
                 mov     EDX,-4[EBP]
                 mov     EAX,-8[EBP]
                 push    EDX
                 push    EAX
                 call    _D3std5stdio76__T7writelnTS3std5range37__T11SortedRangeTAiVAyaa5_61203c2062Z11SortedRangeZ7writelnFS3std5range37__T11SortedRangeTAiVAyaa5_61203c2062Z11SortedRangeZv@PC32

Really?  That's a better solution than using ROM space to store the result  
of the expression as evaluated at compile time?  The worst part is that  
this will be used *EVERY TIME* I use the enum b (even if I pass it as a  
const array).
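
For comparison, a rough sketch of what "store it once" could look like,
assuming a static immutable array is an acceptable stand-in for the enum
(the name table is hypothetical, and this is not how enum behaves): the
data lives in one place, every use sees the same memory, and a mutable
copy is requested explicitly with .dup.

```d
// Assumption: a module-level immutable array used instead of an enum.
static immutable int[] table = [3, 1, 2];

void main()
{
    // Both uses refer to the same statically stored data.
    const(int)[] p = table;
    const(int)[] q = table;
    assert(p.ptr is q.ptr);

    // A mutable copy requires an explicit, visible allocation.
    int[] m = table.dup;
    m[0] = 42;
    assert(table[0] == 3);
}
```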

>
>>
>> I don't think you would miss this as much as you think. Assigning a
>> non-immutable array from an immutable one is as easy as adding a .dup,
>> and then the code is more clear that an allocation is taking place.
>>
>
> It would be somewhat odd.
>
> enum a = [2,1,4];
> enum b = sort(a.dup); // what exactly is that 'a.dup' thing?

I don't think .dup should be necessary at compile time.  Creating a sorted  
copy of an immutable array should be quite doable.

> enum c = a.dup;   // does this implicitly convert to immutable, or what  
> happens here?

Either a compile error (cannot store mutable reference data as an enum),  
or an implicit conversion back to immutable.

> enum d = sort(c); // does not work?
>
> enum e = foo(a.dup, b.dup, c.dup, d.dup);

Again, I don't think .dup would be used for dependent enums, I was rather  
thinking dup would be used where you need a mutable copy of an array  
during enum usage in normal code.

-Steve