auto, var, raii, scope, banana

Fri Jul 28 21:51:19 PDT 2006

On Fri, 28 Jul 2006 04:44:44 -0400, Chad J  
<gamerChad at _spamIsBad_gmail.com> wrote:
> Regan Heath wrote:
>>
> ...
>> I've already said.. I think it's intuitive because it 'looks' like a  
>> class  declared on the stack, as in C++ and like those it will be  
>> destroyed at  the end of scope. Your counter argument is that not  
>> everyone is a  C++ programmer, correct? In which case, read on..
>>
> ...
>>  I come from a C background. I have done a little C++, a little Java  
>> and  very little C#.
>>   The absence of 'new' gives us the required information. You are
>> required to know what the absence of 'new' _means_ just like you are
>> required to know what keyword X means in any given context. In short,
>> some knowledge of programming in D is required to program in D.
>
> I agree, the point is to make it easier to obtain that knowledge while
> learning.
>
>>  To expand on why I think it's intuitive let's look at 2 scenarios, first
>> the C++ programmer trying D. Likely their first program involving
>> classes will look like this:
>>  class A {
>>   int a;
>> }
>>  void main() {
>>   A a = A();
>>   writefln(a.a);
>> }
>>  currently, this crashes with an access violation.
>
> Actually, it doesn't compile.
>
> main.d(6): function expected before (), not A of type main.A
> main.d(6): cannot implicitly convert expression ((A)()) of type int to
> main.A
>
> Maybe not the best error messages for this case, but better than an
> access violation.

Err.. oops.. my example was in fact assuming the new "omit 'new'" syntax  
was added, then I went and confused it with:

A a;
writefln(a.a);

which is actually what a C++ programmer would write to allocate a class on  
the stack and have it destroyed at end of scope.

The syntax:

A a;
writefln(a.a);

will crash (in D, 'a' is a null reference until something is assigned to
it); the new syntax isn't involved at all, sorry.

>> Add the new RAII syntax  and not only does it not crash, but it behaves  
>> just like a C++ program  would with A being destroyed at the end of  
>> scope.
>
> I hear that such syntax, in C++, means that 'A' will be stack allocated,
> which AFAIK does not imply RAII.

No, but the resulting behaviour is almost identical. In C++ the syntax:

A a;

causes 'a' to be allocated on the stack and the destructor called at end  
of scope. Apart from being allocated on the stack (which Walter indicates  
in the docs is something he plans to implement at a later date for 'auto')  
it's identical to the behaviour of D's 'auto'.

http://www.digitalmars.com/d/memory.html#stackclass

"To have a class allocated on the stack that has a destructor, this is the  
same as a declaration with the auto attribute. Although the current  
implementation does not put such objects on the stack, future ones can."
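
For anyone who hasn't seen the C++ side, here is a minimal sketch of the
behaviour I mean. It is illustration only: the 'events' log and the
'run_scope' function are hypothetical names I've made up to make the order
of construction and destruction visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log so the order of construction and destruction is visible.
static std::vector<std::string> events;

class A {
public:
    int a = 0;
    A()  { events.push_back("ctor"); }
    ~A() { events.push_back("dtor"); }
};

// Returns the recorded order of events for one pass through a block scope.
std::vector<std::string> run_scope() {
    events.clear();
    {
        A a;                         // automatic storage: no 'new', no 'delete'
        events.push_back("use");     // 'a' is alive here
        assert(a.a == 0);
    }                                // destructor runs here, at end of scope
    events.push_back("after");
    return events;
}
```

The destructor entry lands between "use" and "after", which is the end of
scope behaviour I'm comparing D's 'auto' to.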

> One example I can think of that would
> break RAII is if the function gets inlined, then the stack allocated
> object may not be deleted until some point outside of the scope's end.
> It will /probably/ be deleted at the end of the function that the
> enclosing function was inlined into.  I could be wrong about that though.

I'm not sure what happens when something is inlined either.

> Also, does that C++ syntax allow class references to leak into code that
> is executed after the class is deleted?

I would expect not. They should be destroyed when the scope ends and the  
stack space is released.

>>  Now, take a Java/C# programmer, their first program with classes will  
>> be:
>>  class A {
>>   int a;
>> }
>>  void main() {
>>   A a = new A();
>>   writefln(a.a);
>> }
>>  No crash, and the behaviour they expect.
>>  Then imagine the same programmers looking at each other's code, the
>> C++ programmer will understand 'new' to indicate heap allocation and
>> the Java/C# programmer will need to ask the C++ programmer what the
>> absence of 'new' means, he'll reply that the class is destroyed at the
>> end of scope. He might also say it's stack allocated, but that's fine,
>> both of them have something to learn, and it has no effect on the way
>> the program behaves.
>
> That learning process would be nice, but it assumes an extra programmer
> being around.  As I learn D, I am alone.  Either I go to the spec or I
> ask on the newsgroup.  I'm not sure how many others are in the same
> situation.

In the case where you're not looking at someone else's code you'll never
write:

A a = A();

so you will not learn about it until you go looking for RAII. There you'll  
find the syntax, learn it, and carry on. It's no harder to learn than a  
keyword 'local'.

> I've explained below why I still don't think it's very obvious, which
> means more potential newsgroup asking.  That doesn't bug me too much
> though.

I'm not especially bothered which syntax is chosen, I just like discussing  
stuff.

>>  Simple, intuitive and elegant no matter what your background. Of
>> course, if you're a Java/C# programmer you have to 'learn' what the
>> absence of 'new' means. Both have to learn that D might not stack
>> allocate the RAII class instance.
>>  If we use a 'scope' keyword then both sets of programmers have some   
>> learning to do. It's not much worse, but this makes it my 2nd choice.
>>
>
> "Both have to learn that D might not stack allocate the RAII  class
> instance."
> "If we use a 'scope' keyword then both sets of programmers have some
> learning to do."
>
> If they both have learning to do anyways, why not just have them learn
> 'scope'?

Right back at ya.. why not have them both learn to omit 'new'?

> I suppose it wouldn't be too bad otherwise, except for one problem:
> The unfortunate C++ programmer's program worked as expected.

Actually, my example was wrong and it doesn't. This poor guy still gets an  
access violation. Perhaps the new 'auto' syntax should just be:

A a;

then, my previous example and point would be valid.

> I say problem and unfortunate, because they may just assume that D
> behaves like C++ in all of the corner cases, and if I'm correct about
> the RAII vs stack alloc above, it may not.

One of the things Walter tends to do (which bothers some D users) is to  
emulate C/C++ where appropriate. This makes porting C/C++ to D easier  
because there aren't corner cases where it does something subtly
different. Of course I may be wrong, there may be a corner case, but I  
believe Walter intentionally tries to reduce these.

In our case, I believe the D syntax:

auto A a = new A();

and the C++ syntax:

A a;

behave almost identically. The difference being that D currently does not
allocate on the stack, but that may change, and if it does they will
behave identically.
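
To show why I don't think the storage location matters for the behaviour,
here is a rough C++ sketch. It is an analogy only, not a description of what
D does internally: 'std::unique_ptr' stands in for "heap storage, destroyed
at end of scope", and the 'live' counter and both function names are
hypothetical.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter of live instances.
static int live = 0;

struct A {
    int a = 0;
    A()  { ++live; }
    ~A() { --live; }
};

// Stack storage: destroyed at the closing brace.
int on_stack() {
    {
        A a;
        assert(live == 1);
    }
    return live;
}

// Heap storage owned by a scoped handle: also destroyed at the closing brace.
int on_heap_scoped() {
    {
        std::unique_ptr<A> a(new A());
        assert(live == 1);
    }
    return live;
}
```

Both functions return the number of instances still alive once the inner
scope has closed; in each case the object is gone by then, wherever its
storage came from.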

>>> For a D newbie reading code, having a keyword is good because it  
>>> gives  them a keyword to search the spec for. Better yet if it is  
>>> 'scope',  because that gives them an idea of where to look.  This is  
>>> also one of  those things that is used seldom enough that the  
>>> succinctness gained  from implicit things is only slightly helpful.
>>   The C++ programmer will already know what it does.
>> The Java/C# programmer will search for 'new' (because it's absent).
>> In either case they'll find the docs on RAII at the same time as they  
>> find  the docs on 'new'.
>
> Searching for 'new' is a good idea.  What if they don't think of that?

Then they'll look for the section on variables, or class references, or
declaring variables in functions, or .. It shouldn't be all that hard to
find, assuming half decent documentation.

> The only reason I would associate the absence of 'new' with allocation
> is because of the talk from C++ programmers on this newsgroup.
> Otherwise it's some kind of wacky cool feature of D that they have not
> read about yet (as it is now!).  This is why I prefer 'scope' - it would
> make scope, unambiguously, THE keyword to search for to find out what's
> going on.

Have you tried searching for an existing D keyword in the docs? First you  
have to decide which section to go to; is it a declaration, is it a type,  
is it a property, is it .. Sure, it could be made easier with different
documentation, but that's all this boils down to: ensuring the information
is in a logical place. If that's the case, it makes no difference whether
you're searching for a specific keyword or 'RAII' or 'new' or 'class' etc.
In an ideal world whatever you search for should find the results you need.

>>> In a nutshell: assume a programmer of undefined language background,  
>>> and  I believe 'scope' will be much more informative (and therefore   
>>> intuitive) than a blank space.
>>   I've assumed programmers from 2 types of background above, I still  
>> think  removing 'new' is better than adding 'scope'.
>>
>
> That's fine as long as you clear up the two things:
> - What tells a non-C++ coder to look up 'new'? (besides another coder)

See above. This is a documentation issue.

> - The possible assumption by a C++ coder that RAII syntax means stack
> allocation.  (and subsequent whammy when they hit a pointy corner case)
>
> Proving that there are no corner cases in 'stack allocation implies
> RAII' or me being wrong about that meaning in C++ should clear up that
> second one.

See above. I believe the behaviour is almost identical (and may become  
identical in the future). I believe Walter intentionally avoids corner  
cases WRT C++ behavior. That's the best I can do.. if I knew of a corner  
case I'd let you know ;)

>>>>
>>>>   What's the point of using static opCall here, as opposed to a  
>>>> normal   static method or even a free function?
>>>
>>>
>>> Why not a normal static method? - implementation hiding.
>>   I'm assuming this definition:
>> http://en.wikipedia.org/wiki/Information_hiding
>>  I'm not sure I follow your logic, did you mean "encapsulation" instead?
>>  If you did mean "information hiding" then it seems to me that using  
>> opCall  is a design decision, just as using a method called "Bob" would  
>> be. The advantage to using "Bob" is that you can define a 'stable'
>> interface eg.
>>  interface MyImplementation {
>>   int Bob(int x);
>> }
>>  and in that way protect your code against changes in implementation.  
>> In  short, this is a better "information hiding" solution than using  
>> static  opCall.
>>
>>> Why not a free function? - I'm assuming you mean a normal function   
>>> declared at module scope.
>>   Yes.
>>
>>> It allows you to have multiple functions with access to the same   
>>> non-static variables (member variables).
>>   What does?
>>  _static_ opCall does *not* have access to non-static variables, it  
>> can  only access static variables. Static methods and variables are  
>> very  similar to module level variables. The difference being how to  
>> access them  (class name prefix) and where they are declared (inside  
>> the class instead  of at module level). You can replace static methods  
>> and variables with  module level ones quite easily.
>>
>>> I suppose this is usually accomplished with nested functions, or OOP   
>>> that can't do implementation hiding.
>>   I'm obviously not following your point here.. can you elaborate,  
>> perhaps  another example?
>>
>>> Doing it this way allows me to avoid forward referencing issues
>>   What forward referencing issues does the code above have?
>>
>>> and also gives me more options about how my code behaves.
>>   What options? example?
>>
>>>> What problem does it solve?
>>>
>>>
>>> Annoyance from forward referencing in nested functions, but that's  
>>> just  one.  Maybe a small degree of implementation hiding for memory   
>>> management, though I haven't tried this enough to demonstrate it.  It  
>>> is  not really a solution.  It is a technique, a syntax sugar, or just  
>>> an  easy way to get D to do things you would not ordinarily make it do.
>>   So far:
>> - I'm not seeing the forward reference issue(s)
>> - It seems like a worse way to implement information hiding (than an   
>> interface and method).
>>
>>>> What new technique or method does it allow?
>>>
>>>
>>> Multiple functions using the same non-static variables with no need  
>>> for  forward referencing and possible code behaviour tweaks, etc yadda  
>>> yadda.    I admit it's not horribly important, but I don't think that  
>>> this  choice of syntax is either.  Most of all, I like it.
>>   I *like* non-static opCall, I can see benefits to that. But, so far,   
>> static opCall seems to be fairly useless to me.
>>
>>> *sigh* those were long winded answers, sorry about that.
>>   No worries. Long winded tends to imply more information and that's  
>> not a  bad thing.
>>
>>>>> I do like using the scope keyword for this btw.  It seems quite    
>>>>> consistent with D's meaning of 'scope'.
>>>>
>>>>   It's not bad but it's my 2nd choice currently.
>>>>  Regan
>>>
>>>
>>> Care to explain why static opCall is a hack and/or not useful outside  
>>> of  RAII or struct constructors?
>>   I'm trying, by learning when, where and why you use it. I can't think  
>> of a  place where _I_ would use it (which is why I think it's not  
>> useful).
>>  It's only a 'hack' when used with a struct to emulate a constructor.   
>> Strangely that 'hack' is the only vaguely good use I've seen for a   
>> *static* opCall.
>>  Regan
>
> OK I think I have not explained my example very well.
> First I'll simplify my original example and try harder to explain.  I've
> added comments this time.
> Here it is:
>
> import std.stdio;
>
> // class used as a function
> class f
> {
>      int result;
>      int tempVar1; // this is your non-static variable
>
>      static int opCall( int x )
>      {
>          // what do you know, not so 'static' after all!
>          f instance = new f(x);
>          int result = instance.result;
>
>          // cleanup
>          delete instance;
>          return result;
>      }
>
>      this( int x )
>      {
>          // If this were a function with nested functions,
>          //   then the main execution would occur here.
>          result = x + internalFunc();
>      }
>
>      // this is a non-static function
>      int internalFunc()
>      {
>          return 314;
>      }
> }
>
> void main()
> {
>      int x = 42;
>      x = f(x);
>      writefln(x);
> }
>
> Now I'll rewrite it into a free function:
>
> import std.stdio;
>
> int f( int x )
> {
>      // Error! internalFunc is not defined.
>      return x + internalFunc();
>
>      // oh but here it is... nested functions don't allow this
>      int internalFunc()
>      {
>          return 314;
>      }
> }

Err.. you've re-written your constructor as a free function, not the
static opCall. Why?

I would have expected this free function:

// Assumes 'import std.stdio;' and the class 'f' from your example above.
int free_function( int x )
{
     // what do you know, not so 'static' after all!
     f instance = new f(x);
     int result = instance.result;

     // cleanup
     delete instance;
     return result;
}

void main()
{
     int x = 42;
     x = free_function(x);
     writefln(x);
}
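
For comparison, the same shape as a rough C++ sketch. The names 'F' and
'demo' are hypothetical, and this is only an analogue of your D class, not a
translation of static opCall. The constructor calls a member function that is
declared further down the class, and the wrapper is an ordinary free
function.

```cpp
#include <cassert>

// Rough C++ counterpart of the class 'f' from the quoted example.
class F {
public:
    int result;

    explicit F(int x) {
        // Calls a member function declared later in the class body.
        result = x + internalFunc();
    }

private:
    int internalFunc() const { return 314; }
};

// The wrapper as a free function; 'instance' is destroyed at end of scope.
int free_function(int x) {
    F instance(x);
    return instance.result;
}

// Usage: mirrors main() from the example above.
int demo() {
    int x = 42;
    x = free_function(x);
    assert(x == 42 + 314);
    return x;
}
```

The caller sees a plain function call either way; the class stays an
implementation detail behind it.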

I've gotta go.. so, I'll just post what I've said so far and see what you  
respond with.. I may reply again to the stuff below.. :)

Regan

> In a trivial case like this all you have to do is move the nested
> function internalFunc to the top.  In other cases though, where you want
> to be able to place the functions where they intuitively belong, this
> becomes annoying.  This is that forward referencing issue I was talking
> about.
>
> Also, if 'f' were derived from another class, then the code executed in  
> f's constructor would have access to the super class's members.  That's  
> more to do with OOP, but it's all hidden behind a static opCall.
>
> I suppose you could use a module function to construct and delete a  
> private class that gets used for its own scope and inheritance.  I
> worry that I am missing some detail that stopped me from just doing that  
> before.
>
> Sean, perhaps you could share some of your uses for static opCall since  
> I'm doing such a bad job at this?
>
>
> Only slightly related - I wonder if D's 'new' even implies heap  
> allocation as it is.  That might just be a dmd thing, but I am looking  
> in the spec at Classes -> ctors/dtors, and it doesn't say where they get  
> allocated.  Maybe someone who is good at searching the spec can find if  
> it says where classes get allocated?
> My thought, as xs0 has mentioned as well, is that 'new' could be defined  
> independent of those implementation details.  For example, just say that
> 'new' will allocate a class instance, and that the instance will be  
> cleaned up only after there are no more references to that class.  The  
> only exception being explicit deletion.  Since RAII guarantees, AFAIK,
> that there are no references leaked into code that gets executed after  
> the scope is exited, then stack allocation would work here.  I'm finding  
> it hard to imagine code that would require an object to be in the heap's  
> address range or in the stack's address range.  xs0 says there are other  
> cases in which a class can be allocated on the stack without breaking  
> the guarantees of no premature deletion, and I wonder what they are.