auto, var, raii,scope, banana

Wed Aug 2 16:09:57 PDT 2006

Regan Heath wrote:
> On Fri, 28 Jul 2006 04:44:44 -0400, Chad J  
> <gamerChad at _spamIsBad_gmail.com> wrote:
> 
>> Regan Heath wrote:
>>
>>>  To expand on why I think it's intuitive let's look at 2 scenarios, 
>>> first the C++ programmer trying D. Likely their first program 
>>> involving classes will look like this:
>>>  class A {
>>>   int a;
>>> }
>>>  void main() {
>>>   A a = A();
>>>   writefln(a.a);
>>> }
>>>  currently, this crashes with an access violation.
>>
>>
>> Actually, it doesn't compile.
>>
>> main.d(6): function expected before (), not A of type main.A
>> main.d(6): cannot implicitly convert expression ((A)()) of type int to
>> main.A
>>
>> Maybe not the best error messages for this case, but better than an
>> access violation.
> 
> 
> Err.. oops.. my example was in fact assuming the new "omit 'new'" 
> syntax  was added, then I went and confused it with:
> 
> A a;
> writefln(a.a);
> 
> which is actually what a C++ programmer would write to allocate a class 
> on  the stack and have it destroyed at end of scope.
> 
> The syntax:
> 
> A a;
> writefln(a.a);
> 
> will crash, the new syntax isn't involved at all, sorry.
> 
>>> Add the new RAII syntax  and not only does it not crash, but it 
>>> behaves  just like a C++ program  would with A being destroyed at the 
>>> end of  scope.
>>
>>
>> I hear that such syntax, in C++, means that 'A' will be stack allocated,
>> which AFAIK does not imply RAII.
> 
> 
> No, but the resulting behaviour is almost identical. In C++ the syntax:
> 
> A a;
> 
> causes 'a' to be allocated on the stack and the destructor called at 
> end  of scope. Apart from being allocated on the stack (which Walter 
> indicates  in the docs is something he plans to implement at a later 
> date for 'auto')  it's identical to the behaviour of D's 'auto'.
> 
> http://www.digitalmars.com/d/memory.html#stackclass
> 
> "To have a class allocated on the stack that has a destructor, this is 
> the  same as a declaration with the auto attribute. Although the 
> current  implementation does not put such objects on the stack, future 
> ones can."
> 
>> One example I can think of that would
>> break RAII is if the function gets inlined, then the stack allocated
>> object may not be deleted until some point outside of the scope's end.
>> It will /probably/ be deleted at the end of the function that the
>> enclosing function was inlined into.  I could be wrong about that though.
> 
> 
> I'm not sure what happens when something is inlined either.
> 
>> Also, does that C++ syntax allow class references to leak into code that
>> is executed after the class is deleted?
> 
> 
> I would expect not. They should be destroyed when the scope ends and 
> the  stack space is released.
> 
>>>  Now, take a Java/C# programmer, their first program with classes 
>>> will  be:
>>>  class A {
>>>   int a;
>>> }
>>>  void main() {
>>>   A a = new A();
>>>   writefln(a.a);
>>> }
>>>  No crash, and the behaviour they expect.
>>>  Then imagine the same programmers looking at each other's code, the 
>>> C++ programmer will understand 'new' to indicate heap allocation and 
>>> the Java/C# programmer will need to ask the C++ programmer what the 
>>> absence of 'new' means, he'll reply that the class is destroyed at 
>>> the end of scope.  He might also say it's stack allocated, but 
>>> that's fine, both of them have something to learn, and it has no 
>>> effect on the way the program behaves.
>>
>>
>> That learning process would be nice, but it assumes an extra programmer
>> being around.  As I learn D, I am alone.  Either I go to the spec or I
>> ask on the newsgroup.  I'm not sure how many others are in the same
>> situation.
> 
> 
> In the case where you're not looking at someone else's code you'll never  
> write:
> 
> A a = A();
> 
> so you will not learn about it until you go looking for RAII. There 
> you'll  find the syntax, learn it, and carry on. It's no harder to learn 
> than a  keyword 'local'.
> 

When reading other people's code, I find that most often I don't know or 
work with the other people.  I'd have to contact them to learn from 
them.  The easier option, for me, and probably them too, is to do a 
search in the spec, even if that is a bit difficult as-is.

>> I've explained below why I still don't think it's very obvious, which
>> means more potential newsgroup asking.  That doesn't bug me too much
>> though.
> 
> 
> I'm not especially bothered which syntax is chosen, I just like 
> discussing  stuff.
> 
>>>  Simple, intuitive and elegant no matter what your background. Of 
>>> course, if you're a Java/C# programmer you have to 'learn' what the 
>>> absence of 'new' means. Both have to learn that D might not stack 
>>> allocate the RAII class instance.
>>>  If we use a 'scope' keyword then both sets of programmers have 
>>> some   learning to do. It's not much worse, but this makes it my 2nd 
>>> choice.
>>>
>>
>> "Both have to learn that D might not stack allocate the RAII  class
>> instance."
>> "If we use a 'scope' keyword then both sets of programmers have some
>> learning to do."
>>
>> If they both have learning to do anyways, why not just have them learn
>> 'scope'?
> 
> 
> Right back at ya.. why not have them both learn to omit 'new'?
> 

My point is that it doesn't give you a win-lose; you're still stuck with 
a lose-lose.  If both coders have to learn anyway, then we lose the 
advantage of choosing this syntax so that C++ coders don't have to learn. 
You still have a point wrt similarity, though I dislike subtly different 
behaviours, as they can lead to bad assumptions.

>> I suppose it wouldn't be too bad otherwise, except for one problem:
>> The unfortunate C++ programmer's program worked as expected.
> 
> 
> Actually, my example was wrong and it doesn't. This poor guy still gets 
> an  access violation. Perhaps the new 'auto' syntax should just be:
> 
> A a;
> 
> then, my previous example and point would be valid.
> 
>> I say problem and unfortunate, because they may just assume that D
>> behaves like C++ in all of the corner cases, and if I'm correct about
>> the RAII vs stack alloc above, it may not.
> 
> 
> One of the things Walter tends to do (which bothers some D users) is to 
> emulate C/C++ where appropriate. This makes porting C/C++ to D easier 
> because there aren't corner cases where it does something subtly 
> different. Of course I may be wrong, there may be a corner case, but I 
> believe Walter intentionally tries to reduce these.
> 
> In our case, I believe the D syntax:
> 
> auto A a = new A();
> 
> and the C++ syntax:
> 
> A a;
> 
> behave almost identically. The difference being that D currently does not 
> allocate on the stack, but that may change, and if it does they will 
> behave identically.
> 

OK I'd like some clarification about the C++ side of things.
It seems to me that both syntaxes such as

A a = A();

and

A a;

have been said to cause stack allocation in C++.  Is this true?

A a;
Currently that is the syntax for declaring a variable.  Changing its 
meaning will break any code that relies on 'a' being initialized to null.
If that's alright, you could change it to mean RAII/stackalloc in the 
context of a function, but then you lose consistency with D's practice 
of initializing variables to known-invalid values.

>>>> For a D newbie reading code, having a keyword is good because it  
>>>> gives  them a keyword to search the spec for. Better yet if it is  
>>>> 'scope',  because that gives them an idea of where to look.  This 
>>>> is  also one of  those things that is used seldom enough that the  
>>>> succinctness gained  from implicit things is only slightly helpful.
>>>
>>>   The C++ programmer will already know what it does.
>>> The Java/C# programmer will search for 'new' (because it's absent).
>>> In either case they'll find the docs on RAII at the same time as 
>>> they  find  the docs on 'new'.
>>
>>
>> Searching for 'new' is a good idea.  What if they don't think of that?
> 
> 
> Then they'll look for the section on variables, or class references, or  
> declaring variables in function, or .. It shouldn't be all that hard to  
> find, assuming half decent documentation.
> 
>> The only reason I would associate the absence of 'new' with allocation
>> is because of the talk from C++ programmers on this newsgroup.
>> Otherwise it's some kind of wacky cool feature of D that they have not
>> read about yet (as it is now!).  This is why I prefer 'scope' - it would
>> make scope, unambiguously, THE keyword to search for to find out what's
>> going on.
> 
> 
> Have you tried searching for an existing D keyword in the docs? First 
> you  have to decide which section to go to; is it a declaration, is it a 
> type,  is it a property, is it .. Sure it could be made easier with 
> different  documentation but that's all this boils down to, ensuring the 
> information  is in a logical place, if that's the case it makes no 
> difference whether  you're searching for a specific keyword or 'RAII' or 
> 'new' or 'class' etc.  In an ideal world whatever you search for should 
> find the results you need.
> 

Yeah.  I find the current situation wrt searching the D spec pretty 
clumsy.  It would really help to be able to search the spec, and just 
the spec, and not the forums or anything else at the same time.  I think 
once that is handled, keyword searches will narrow things down to a very 
small number of results that someone could read through.  Having to 
figure out what section something is in is good in some cases, but in 
others is a real drag IMO :(

>>>> In a nutshell: assume a programmer of undefined language 
>>>> background,  and  I believe 'scope' will be much more informative 
>>>> (and therefore   intuitive) than a blank space.
>>>
>>>   I've assumed programmers from 2 types of background above, I still  
>>> think  removing 'new' is better than adding 'scope'.
>>>
>>
>> That's fine as long as you clear up the two things:
>> - What tells a non-C++ coder to look up 'new'? (besides another coder)
> 
> 
> See above. This is a documentation issue.
> 
>> - The possible assumption by a C++ coder that RAII syntax means stack
>> allocation.  (and subsequent whammy when they hit a pointy corner case)
>>
>> Proving that there are no corner cases in 'stack allocation implies
>> RAII' or me being wrong about that meaning in C++ should clear up that
>> second one.
> 
> 
> See above. I believe the behaviour is almost identical (and may become  
> identical in the future). I believe Walter intentionally avoids corner  
> cases WRT C++ behavior. That's the best I can do.. if I knew of a 
> corner  case I'd let you know ;)
> 
>>
>>
>> OK I think I have not explained my example very well.
>> First I'll simplify my original example and try harder to explain.  I've
>> added comments this time.
>> Here it is:
>>
>> import std.stdio;
>>
>> // class used as a function
>> class f
>> {
>>      int result;
>>      int tempVar1; // this is your non-static variable
>>
>>      static int opCall( int x )
>>      {
>>          // what do you know, not so 'static' after all!
>>          f instance = new f(x);
>>          int result = instance.result;
>>
>>          // cleanup
>>          delete instance;
>>          return result;
>>      }
>>
>>      this( int x )
>>      {
>>          // If this were a function with nested functions,
>>          //   then the main execution would occur here.
>>          result = x + internalFunc();
>>      }
>>
>>      // this is a non-static function
>>      int internalFunc()
>>      {
>>          return 314;
>>      }
>> }
>>
>> void main()
>> {
>>      int x = 42;
>>      x = f(x);
>>      writefln(x);
>> }
>>
>> Now I'll rewrite it into a free function:
>>
>> import std.stdio;
>>
>> int f( int x )
>> {
>>       // Error! internalFunc is not defined.
>>      return x + internalFunc();
>>
>>      // oh but here it is... nested functions don't allow this
>>      int internalFunc()
>>      {
>>          return 314;
>>      }
>> }
> 
> 
> Err.. you've re-written your constructor as a free function? not the  
> static opCall. Why?
> 
> I would have expected this free function:
> 
> int free_function( int x )
> {
>           // what do you know, not so 'static' after all!
>           f instance = new f(x);
>           int result = instance.result;
>           // cleanup
>           delete instance;
>           return result;
> }
> 
> void main()
> {
>      int x = 42;
>      x = free_function(x);
>      writefln(x);
> }
> 

That does work.  Forces you to put the function outside of the class, 
but meh, it's just aesthetics/syntax sugar.

> I've gotta go.. so, I'll just post what I've said so far and see what 
> you  respond with.. I may reply again to the stuff below.. :)
> 
> Regan
> 
>> In a trivial case like this all you have to do is move the nested
>> function internalFunc to the top.  In other cases though, where you want
>> to be able to place the functions where they intuitively belong, this
>> becomes annoying.  This is that forward referencing issue I was talking
>> about.
>>
>> Also, if 'f' were derived from another class, then the code executed 
>> in  f's constructor would have access to the super class's members.  
>> That's  more to do with OOP, but it's all hidden behind a static opCall.
>>
>> I suppose you could use a module function to construct and delete a  
>> private class that gets used for its own scope and inheritance.  I  
>> worry that I am missing some detail that stopped me from just doing 
>> that  before.
>>
>> Sean, perhaps you could share some of your uses for static opCall 
>> since  I'm doing such a bad job at this?
>>
>>
>> Only slightly related - I wonder if D's 'new' even implies heap  
>> allocation as it is.  That might just be a dmd thing, but I am 
>> looking  in the spec at Classes -> ctors/dtors, and it doesn't say 
>> where they get  allocated.  Maybe someone who is good at searching the 
>> spec can find if  it says where classes get allocated?
>> My thought, as xs0 has mentioned as well, is that 'new' could be 
>> defined  independently of those implementation details.  For example, 
>> just say that  'new' will allocate a class instance, and that the 
>> instance will be  cleaned up only after there are no more references 
>> to that class.  The  only exception being explicit deletion.  Since 
>> RAII guarantees, AFAIK,  that there are no references leaked into code 
>> that gets executed after  the scope is exited, then stack allocation 
>> would work here.  I'm finding  it hard to imagine code that would 
>> require an object to be in the heap's  address range or in the stack's 
>> address range.  xs0 says there are other  cases in which a class can 
>> be allocated on the stack without breaking  the guarantees of no 
>> premature deletion, and I wonder what they are.
> 
>