[D.learn] Re: Walnut

Wed Jan 2 16:40:06 PST 2008

snip snip..  - lots of bits inline.

Actually I forgot to mention - have you seen ECMAScript 4 (the new one)? 
While aiming for that as a target may be a bit adventurous, it's 
probably worth thinking about how some of it will be implemented eventually.

> Yeah, the opAssign/opCall is only being used so I can go:
> Value v = cast(Value) 4;
>
> instead of:
> Value v;
> v.i = 4;
> v.type = TYPE.NUMBER;
>   
yeah I was hoping for something like:
auto v = new Value(4);
Which is roughly the same length, and is a little clearer.. - probably 
worth thinking about for new code.. (If you decide to go for it, I might 
get bored one day, and help you refactor the old code ;)
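
A minimal sketch of the constructor approach, written in current D syntax. 
The field names `i` and `type` and the `TYPE.NUMBER` member come from the 
quoted snippet; `TYPE.UNDEFINED`, `TYPE.STRING` and the `s` field are 
assumptions added for illustration. Since Value is a struct in the quoted 
code, the sketch constructs it without `new`.

```d
// Hypothetical sketch, not Walnut's actual Value definition.
enum TYPE { UNDEFINED, NUMBER, STRING }

struct Value
{
    TYPE type = TYPE.UNDEFINED;
    double i = 0;   // numeric payload (named `i` as in the quoted code)
    string s;       // string payload (assumed field)

    // Overloaded constructors set the payload and the type tag together,
    // so the tag can never be forgotten at the call site.
    this(double n)   { i = n;   type = TYPE.NUMBER; }
    this(string str) { s = str; type = TYPE.STRING; }
}

void main()
{
    auto v = Value(4);
    auto w = Value("four");
    assert(v.type == TYPE.NUMBER && v.i == 4);
    assert(w.type == TYPE.STRING && w.s == "four");
}
```

The only "magic" left is the same automatic type assignment, but it is now 
visible in one place rather than hidden behind a cast.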

> The only magic that ever happens there should be the automatic type property assignment.
>
> Then there's the opCall(Value, Value, Value[] ...) which is to call Functions, and the opIndex, opIndexAssign, opIn_r which are to use Values as Objects.
>
> The promotion of the Value struct to hold Function, Array and Object is a blatant disregard of the ECMAScript spec; however, it *is* semantically consistent, and consistent with the language itself.  It could be used to bring a significant structural advantage, as we now have a single primitive to work with, and the original form needed to disambiguate the type of a Value anyway.
>   
I suspect this may get into a bit of trouble when you deal with some of 
the weird and wonderful scoping stuff in JavaScript.
From what I remember:

CallableFunction extends Object
Value can hold an object...

CallableFunction holds a reference to the FunctionDefinition (code etc.)
FunctionDefinition holds a reference to the Creation scope...

>   
>> It may be better to switch to more obvious/classic methods, - overloaded 
>> constructors or an overloaded static method "construct()", / 
>> to[typename], index(int id) etc.
>>     
>
> I am actually starting to think that the Value.to[typename] format is cumbersome, as in all honesty, I'm not sure whether the output of a Value.toString() which is a Number object containing a value of 4 should be "4", "[object Number]", "4.0" (it's a double), or what.  I was then wondering how this should relate to the methods that we have; Object_prototype_toString, RegExp_prototype_source, etc.
>
> So, there'll be a semantic change there somewhere to disambiguate, as I'm sure we both agree that ambiguity is bad.
>   
as D uses the cast keyword, it's actually marginally shorter,
a = cast(String) theval
a = theval.toString();
obviously, you could use as[typename], to make it distinct...

>   
>> = This is going to make the future code a lot easier to read, and 
>> understand. (along with maintain, enable others to quickly work out what 
>> is going on )
>>
>> structure.d:
>> I would be tempted to create a method to generate this type of code, 
>>     
>
> It is very tedious to maintain that one.  I'll probably try to do something like that soon.
>
> To expand, I had originally hoped to be able to use Associative Arrays, but they apparently contain a pointer to a complex hashing structure with even more pointers below.  I had hoped to simply sort the char[] pointing structures based on the strings alphabetically and do a binary search; which is probably faster for the small sets typically used for ECMAScript objects.
>
> The structure.d file was an effort to create a static literal which wouldn't need any memcpy or anything of the sort; it would be loaded in via DMA straight from the file and be useable immediately.
>   
When you start binding something like gtk, with craploads of enums 
expressed as object properties, the whole lookup stuff gets even more 
complex. I suspect the answer is to have methods to add/get properties 
etc. and use assoc. arrays to start with, then optimize the crap out of 
it later..
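
A rough sketch of that idea, assuming hypothetical names (`PropertyTable`, 
`put`, `get`) that are not part of Walnut or DMDScript. The storage is a 
plain associative array hidden behind two methods, so it could later be 
swapped for a sorted array with binary search without touching callers. 
Values are plain doubles here to keep the example short.

```d
// Hypothetical sketch: names and value type are illustrative only.
struct PropertyTable
{
    private double[string] slots;   // associative array: name -> value

    void put(string name, double value) { slots[name] = value; }

    // Returns true and fills `result` when the property exists.
    bool get(string name, out double result)
    {
        if (auto p = name in slots) { result = *p; return true; }
        return false;
    }
}

void main()
{
    PropertyTable window;
    // Enum-style constants exposed as object properties.
    window.put("GTK_WINDOW_TOPLEVEL", 0);
    window.put("GTK_WINDOW_POPUP", 1);

    double v;
    assert(window.get("GTK_WINDOW_POPUP", v) && v == 1);
    assert(!window.get("missing", v));
}
```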
>   
>
> When I converted Walnut 1.x from DMDScript, I was mostly doing it to understand more of what Walter had written to learn how a good implementation looks.
>
> I noticed that there was a lot of redundancy in each of the files, and that my head was filling with all sorts of different constructs as I examined each file.  That's why I converted it to aspect oriented.  Now the code is so boringly simple that apart from value.d it reads like a list.
>
> The problem with encapsulating JavaScript classes with D classes is that the spec requires you to be able to expand JavaScript objects, so you eventually have to use an array notation inside that; as per DMDScript and Spidermonkey.  You end up duplicating several properties inside the array notation and class notation; and there's extensive code to look up the address of an ECMAScript property.  This is why even DMDScript property lookup is a few times slower than Lua or Io.
>   
Most of the classes are really just static classes - just used to tidy 
up the code, rather than actually doing real encapsulation.
eg.
static class Date {
      void  getHours(....) { }
      void  getTime(....) { }
}
Although I have to add prefixes in the code generator when doing 
bindings, as library writers seem to have a horrible habit of using D 
keywords or common method names ;)

>   
>> Not sure why your standard method call is using varargs (...) - unless I 
>> misread the code..
>>     
>
> The Value[] arguments was originally not a varargs, and you could pass it an array of Values just fine.  My interpretation of varargs is that it converts a set of Values to a Value[] at the caller by prepending the length?  So the varargs would simply mean you can now call the function passing:
>
> (self, cc, arg1, arg2, arg3), as well as:
>
> Value[] args = [ arg1, arg2, arg3 ];
> (self, cc, args)
>
> and that the call would look identical.
>   
mmh, kind of cute ;) - That reminds me of those cool language 
features that confuse other people when they first see them:
[string] / [number] => resulting in an array of strings  [Pike]
object += object (adding event listeners to objects - C#)
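
For reference, a small sketch of the D feature being discussed (a typesafe 
variadic parameter). The `Value` struct and the `call` function here are 
stand-ins invented for the example, not Walnut's real definitions; the 
function just reports how many arguments it received. As far as I 
understand it, the compiler builds the array at the call site when loose 
arguments are passed, so the callee sees an ordinary Value[] either way.

```d
// Hypothetical stand-ins for illustration only.
struct Value { double i; }

// Typesafe variadic parameter: callers may pass loose Values or one Value[].
size_t call(Value self, Value cc, Value[] args...)
{
    return args.length;
}

void main()
{
    Value self = Value(0);
    Value cc = Value(0);
    Value a = Value(1);
    Value b = Value(2);
    Value c = Value(3);

    // Loose arguments: collected into an array for the callee.
    assert(call(self, cc, a, b, c) == 3);

    // Explicit array: same function, same call shape.
    Value[] args = [a, b, c];
    assert(call(self, cc, args) == 3);

    // No extra arguments at all is also accepted.
    assert(call(self, cc) == 0);
}
```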

>   
>> interpreter.d
>> (might be better to rename it tokenizer.d)
>>     
>
> I'm (now) hoping to run the parser algorithms from the same file, and making sure it inlines the lexer.  I'm not sure if I want to generate tokens and then interpret them, or if I can use the finite state brought about by position in the lexer switch to somehow mean the same (preventing a double-switch).  The problem with that is that I can't seem to think beyond one token very well - the same one faced by the guys who invented separation of lexer, parser, interpreter.
>   
Actually having the tokenizer available is really useful -
http://www.akbkhome.com/blog.php/View/156/Script_Crusher.html

I think you are being a bit hopeful on that. - I face the same problem 
with the steps, by the time I've finished the parser, my brain is usually 
at exploding point and I give up ;)

I was wondering, although you can not copy DMDScript directly, if 
someone (eg. me) wrote a summary of the steps that were involved in the 
parser/code gen stage.. - and possibly the opcodes, then you would not be 
breaking copyright??? It would be based on code documentation, rather 
than actual code????
- It would save you a considerable amount of pain.....

>   
>> Scope Management?
>> Opcode runtime (~2800 lines of code in dmdscript)
>>     
>
> Yeah, I was hoping to tie scope in with something during parsing of {}.  I've already got a Global object which is already being looked at for non-keywords in my [rather pathetic so far] lexer.  I think what I want is a bunch of Values, which are of TYPE.OBJECT, or perhaps a new type just like it, which carry variables and stuff.
>   
I need to understand how Walter solved the closures bug in the last 
release - I copied the code into my repo, but did not have time to 
understand it.
The problem you get with JavaScript is that the scope is not only from 
Global, but also the creation scope (which may not be known at compile 
time).. and outer layers (eg. functions within functions etc.)
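
To make the lookup order concrete, here is a minimal sketch of a scope 
chain, assuming hypothetical names (`Scope`, `declare`, `lookup`) and plain 
doubles as values. It is not how DMDScript or Walnut does it, just the 
general shape: each scope points at the scope it was created in, and a 
name is resolved by walking outward until Global.

```d
// Hypothetical sketch: names and value type are illustrative only.
class Scope
{
    double[string] vars;
    Scope parent;   // enclosing (creation) scope, null for Global

    this(Scope parent) { this.parent = parent; }

    void declare(string name, double value) { vars[name] = value; }

    // Walk outward from the innermost scope until the name is found.
    bool lookup(string name, out double result)
    {
        for (Scope s = this; s !is null; s = s.parent)
            if (auto p = name in s.vars) { result = *p; return true; }
        return false;
    }
}

void main()
{
    auto global = new Scope(null);
    global.declare("x", 1);

    auto fnScope = new Scope(global);     // scope captured at function creation
    fnScope.declare("y", 2);

    auto callScope = new Scope(fnScope);  // fresh scope for one call
    callScope.declare("x", 10);           // shadows the global x

    double v;
    assert(callScope.lookup("x", v) && v == 10);  // innermost wins
    assert(callScope.lookup("y", v) && v == 2);   // found in creation scope
    assert(fnScope.lookup("x", v) && v == 1);     // falls through to Global
    assert(!callScope.lookup("z", v));            // not declared anywhere
}
```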
> If I compile all functions down to (unoptimized) native code with the same call interface as the natives (my dream) then I could probably just use the stack to handle scope as per the natural way instead of faking it like most interpreters.
>   
I've looked at this a few times, I don't think you will ever get native 
code out of a scripted language very well (let alone understanding 
gcc's internals to make it happen ;) - One thing to think about is how 
it may be possible to write your opcode arrays to memory, and how to 
duplicate your stack (so that your interpreter can eventually handle 
multi-threaded applications). Key to this is making the Value object 
serializable/unserializable..
>
>
> Actually, Walnut 1.0 is branched from DMDScript, but I reformatted it, cleaned it up and the likes.  It almost has native ActiveX, moreso than JScript.  But there are major bugs that I don't understand.  Perhaps you'd be more prone to help there than Walnut 2.x.
>   
Have you updated Walnut 1.0 with Walter's last change? - the closure fix?
Yes, there is a lot of other stuff I've added to DMDScript that could do 
with a better home ;)

Regards
Alan

> Well, that was a HUGE ramble.
> Regards,
> Dan
>