Walnut

Dan Lewis murpsoft at hotmail.com
Wed Jan 2 22:26:38 PST 2008


Alan Knowles Wrote:

> snip snip..  - lots of bits inline.
> 
> Actually I forgot to mention - have you seen ECMAscript 4 (the new one) 

Yeah, so far I've been targeting ECMAScript 3, but I've seen 4 and have it in mind.  I figured I'd worry about the difference after it was running js files.

> > Yeah, the opAssign/opCall is only being used so I can go:
> > Value v = cast(Value) 4;
> >
> > instead of:
> > Value v;
> > v.i = 4;
> > v.type = TYPE.NUMBER;
> >   
> yeah I was hoping for something like:
> auto v = new Value(4);

I wish I could, but auto only accepts "simple" data types, and constructors can't even be faked in structs.  One would need to store it in a class, which involves keeping unwieldy, opaque vtbls and forces us to use the heap and pass by reference (structs can go either way).

What we can do now is go:

Value v = 4;

and have it correctly use opAssign and opCall.  What it fails to do is handle things like:

Value myFunc() {
  return 4;
}
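
For illustration, a minimal sketch of the same idea, assuming a later D2 compiler where structs can declare constructors.  The field names (num, str) are hypothetical and not Walnut's actual layout; only Value, TYPE and myFunc come from this thread.

```d
// Hypothetical tagged value, loosely modelled on the Value/TYPE names above.
enum TYPE { UNDEFINED, NUMBER, STRING }

struct Value
{
    TYPE type = TYPE.UNDEFINED;
    union
    {
        double num;   // valid when type == TYPE.NUMBER
        string str;   // valid when type == TYPE.STRING
    }

    // Struct constructors (D2): these make `Value v = 4;` and `Value(4)` work.
    this(double n) { type = TYPE.NUMBER; num = n; }
    this(string s) { type = TYPE.STRING; str = s; }

    // Assignment from a raw number after construction: `v = 5;`
    void opAssign(double n) { type = TYPE.NUMBER; num = n; }
}

// A bare `return 4;` is rejected here, because D does no implicit
// construction on return; the conversion has to be spelled out.
Value myFunc()
{
    return Value(4);
}

unittest
{
    Value v = 4;              // calls this(double)
    assert(v.type == TYPE.NUMBER && v.num == 4);
    v = 5;                    // calls opAssign(double)
    assert(v.num == 5);
    assert(myFunc().num == 4);
}
```

The explicit Value(4) in myFunc is the price of keeping Value a plain struct.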

>(If you decide to go for it, I might 
> get bored one day, and help you refactor the old code ;)

You're always welcome to try refactoring it any which way.  If the resultant program is more elegant, it goes in my source.

> I suspect this may get into a bit of trouble when you deal with some of 
> the weird and wonderful scoping stuff with Javascript.
>  From what I remember:
> 
> CallableFunction extends Object
> Value can hold an object...
> 
> CallableFunction holds a reference to the FunctionDefinition (code etc.)
> FunctionDefinition holds a reference to the Creation scope...

Actually, that's pretty easy.  Value stores both callable js functions and js objects, and the js functions hold pointers to native functions.  

Getting more challenging, the scope for the function has two aspects: first, we have a bunch of identifiers that we need to know, and second, we need those to be local to the function.  My plan was to essentially push Values onto the local part of the call stack (below EBP).

> >> It may be better to switch to more obvious/classic methods, - overloaded 
> >> constructors or an overloaded static method "construct()", / 
> >> to[typename], index(int id) etc.
> as D uses the cast keyword, it's actually marginally shorter,
> a = cast(String) theval

The problem is that cast(string) X can only be defined either on the string type itself (which is native to D), or for *one* target type via the opCast method.  The reason it only works for one type is that D can't overload functions on return type alone, so a struct can only have a single opCast (I dunno, ask Walter?)

> a = theval.toString();
> obviously, you could use as[typename], to make it distinct...

Yeah, that could work.
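
A rough sketch of the as[typename] idea follows.  The names asNumber, asString and parseNumber are made up for illustration, and the conversion rules are much simpler than real ECMAScript.  It also assumes a later D2 compiler, where opCast is a template and can therefore serve more than one target type, unlike the single opCast discussed above.

```d
import std.conv : to, ConvException;

enum TYPE { UNDEFINED, NUMBER, STRING }

// Hypothetical helper: text that is not a number becomes NaN.
double parseNumber(string s)
{
    try { return to!double(s); }
    catch (ConvException e) { return double.nan; }
}

struct Value
{
    TYPE type = TYPE.UNDEFINED;
    double num = 0;
    string str;

    this(double n) { type = TYPE.NUMBER; num = n; }
    this(string s) { type = TYPE.STRING; str = s; }

    // Named conversions: one method per target type.
    double asNumber()
    {
        final switch (type)
        {
            case TYPE.NUMBER:    return num;
            case TYPE.STRING:    return parseNumber(str);
            case TYPE.UNDEFINED: return double.nan;
        }
    }

    string asString()
    {
        final switch (type)
        {
            case TYPE.NUMBER:    return to!string(num);
            case TYPE.STRING:    return str;
            case TYPE.UNDEFINED: return "undefined";
        }
    }

    // D2 only: opCast is a template, so each target type gets its own overload.
    T opCast(T)() if (is(T == double)) { return asNumber(); }
    T opCast(T)() if (is(T == string)) { return asString(); }
}

unittest
{
    assert(Value("42").asNumber() == 42);
    assert(cast(double) Value(1.5) == 1.5);
    assert(cast(string) Value("hi") == "hi");
}
```

The named methods work the same way on any compiler version; the cast forms are just sugar over them.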

> >> I would be tempted to create a method to generate this type of code, 
> > It is very tedious to maintain that one.  I'll probably try to do something like that soon.
> When you start binding something like gtk, with craploads of enum's 
> expressed as object properties, the whole lookup stuff gets even more 
> complex. I suspect the answer is to have a method to add/get  property 
> etc. and use assoc. arrays to start with, then optimize the crap out of 
> it later..

Yeah, I would, except I wouldn't be able to optimize out AA's.  They're part of D, not part of my code.  What I believe I could do is use AA's now, and later use opIndex, opAssign, opIn_r; but essentially this doesn't provide much benefit over what I've got now - the produced functionality is the same; and I *can* declare a static literal.
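
To make the AA-behind-operators idea concrete, here is a small sketch.  PropertyTable and the cut-down Value are hypothetical stand-ins, and the operator names are the D2 spellings (opIndexAssign, and opBinaryRight for "in" where D1 used opIn_r).  The point is that callers only see the operators, so the built-in AA could be swapped for a custom table later.

```d
// Cut-down stand-in for the real Value struct.
struct Value { double num; }

struct PropertyTable
{
    private Value[string] props;   // built-in associative array, for now

    // table["name"]  (throws a RangeError if the property is missing)
    Value opIndex(string name) { return props[name]; }

    // table["name"] = value
    void opIndexAssign(Value v, string name) { props[name] = v; }

    // "name" in table  -> pointer to the stored Value, or null
    Value* opBinaryRight(string op)(string name) if (op == "in")
    {
        return name in props;
    }
}

unittest
{
    PropertyTable t;
    t["length"] = Value(3);
    assert("length" in t);
    assert(t["length"].num == 3);
    assert(("missing" in t) is null);
}
```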

> Most of the classes are really just static classes - just used to tidy 
> up the code, rather than actually doing real encapsulation.
> eg.
> static class Date {
>       void  getHours(....) { }
>       void  getTime(....) { }
> }

Oh, I didn't know that got optimized out.

> Although I have to add prefixes in the code generator when doing 
> bindings, as library writers seem to have a horrible habit of using D 
> keywords or common method names ;)

Yeah, I don't mind folks using common names, as long as they tidy up their namespaces.  Walnut at this point is not tidy or threadsafe.  My program is using static global variables for now, and none of the modules identify themselves as walnut.module.

> >> interpreter.d
> >> (might be better to rename it tokenizer.d)
> Actually having the tokenizer available is really useful -
> http://www.akbkhome.com/blog.php/View/156/Script_Crusher.html


To follow up, today I created the functions interpret() and tokenize().  The tokenizer returns Values, which can now also be non-morphemic tokens (like '=' and 'a bunch of stuff in parens').  The tokenizer puts data into the Value for morphemic tokens (like numbers, strings, etc).
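
As a rough illustration of a tokenizer that hands back Values, here is a single-pass sketch.  It is not Walnut's tokenize(): the TYPE members and field names are assumptions, and it only handles numbers, identifiers and single-character punctuation (no strings, no grouping of parenthesized text).

```d
import std.ascii : isAlpha, isAlphaNum, isDigit, isWhite;
import std.conv : to;

enum TYPE { NUMBER, IDENTIFIER, PUNCTUATION }

struct Value
{
    TYPE type;
    double num = 0;   // payload for NUMBER tokens
    string text;      // slice of the source this token came from
}

Value[] tokenize(string source)
{
    Value[] tokens;
    immutable n = source.length;
    size_t i = 0;
    while (i < n)
    {
        immutable start = i;
        immutable char c = source[i];
        if (isWhite(c)) { ++i; continue; }

        if (isDigit(c))
        {
            // Integer part, then an optional ".digits" fraction.
            while (i < n && isDigit(source[i])) ++i;
            if (i + 1 < n && source[i] == '.' && isDigit(source[i + 1]))
            {
                ++i;
                while (i < n && isDigit(source[i])) ++i;
            }
            auto text = source[start .. i];
            tokens ~= Value(TYPE.NUMBER, to!double(text), text);
        }
        else if (isAlpha(c) || c == '_')
        {
            while (i < n && (isAlphaNum(source[i]) || source[i] == '_')) ++i;
            tokens ~= Value(TYPE.IDENTIFIER, 0, source[start .. i]);
        }
        else
        {
            // Anything else is a one-character token such as '=' or '('.
            ++i;
            tokens ~= Value(TYPE.PUNCTUATION, 0, source[start .. i]);
        }
    }
    return tokens;
}

unittest
{
    auto t = tokenize("x = 42");
    assert(t.length == 3);
    assert(t[0].type == TYPE.IDENTIFIER && t[0].text == "x");
    assert(t[1].text == "=");
    assert(t[2].type == TYPE.NUMBER && t[2].num == 42);
}
```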

This is actually quite a bit better than DMD because the source only needs to be read once.

> I think you are being a bit hopeful on that. - I face the same problem 
> with the steps, by the time I've finished the parser, my brain is usually 
> at exploding point and I give up ;)

Yeah, I realized that you can only efficiently have a single instruction address, which is why I couldn't move beyond the current token.  One could theoretically write a predictive parser, but those are evil.

> 
> I was wondering, although you can not copy DMDscript directly, If 
> someone (eg. me) wrote a summary of the steps that were involved in the 
> parser/code gen stage.. - and possibly to opcodes, then you would not be 
> breaking copyright??? based on code documentation, rather than actual 
> code????
> - It would save you a considerable amount of pain.....

Yes, except the objective isn't to copy DMDScript without the license; the objective is to create an engine that's significantly better.  At the moment, I would say roughly half the code is written and I'm using 108KB vs DMDScript's 513KB.  The parser is the only remaining component before it can (incorrectly) run javascript files.  The rest is debugging.

> I need to understand how Walter solved the closures bug in the last 
> release - I copied the code into my repo, but did not have time to 
> understand it.
> The problems you get with Javascript, is that the scope is not only from 
> Global, but also creation scope (which may not be a compile time).. and 
> outer layers (eg. functions within functions etc.)

You have a scope chain: essentially, every function containing this one (including global) is checked in order from this context up to global.  This can be done by examining the stack at runtime, which should only be storing Value structs, so it should be pretty readily understood.
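
The lookup order described above can be sketched as follows.  Note the swap: this uses an explicit linked list of scope records rather than walking the machine stack, purely to show the search order.  Scope, locals, parent and lookup are hypothetical names, and Value is a cut-down stand-in.

```d
// Cut-down stand-in for the real Value struct.
struct Value { double num; }

// One scope: its own identifiers plus a link to the enclosing scope.
struct Scope
{
    Value[string] locals;
    Scope* parent;          // null for the global scope

    // Search this scope first, then each enclosing scope up to global.
    // Returns null when the identifier is not defined anywhere.
    Value* lookup(string name)
    {
        for (Scope* s = &this; s !is null; s = s.parent)
            if (auto p = name in s.locals)
                return p;
        return null;
    }
}

unittest
{
    Scope global;
    global.locals["x"] = Value(1);
    global.locals["y"] = Value(2);

    Scope inner;
    inner.parent = &global;
    inner.locals["x"] = Value(10);          // shadows the global x

    assert(inner.lookup("x").num == 10);    // found locally
    assert(inner.lookup("y").num == 2);     // found in global
    assert(inner.lookup("z") is null);      // not defined
}
```

For closures, a function value would keep a pointer to the Scope it was created in and use that as the parent of each call's scope, which is the "creation scope" Alan mentions.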

> > If I compile all functions down to (unoptimized) native code with the same call interface as the natives (my dream) then I could probably just use the stack to handle scope as per the natural way instead of faking it like most interpreters.
> >   
> I've looked at this a few times, I dont think you will ever get native 
> code out of a scripted language very well. (let alone understanding 
> gcc's internals to make it happen ;) - One thing to think about is how 
> it may be possible to write your opcode arrays to memory, and how to 

The plan is to write opcodes to memory strictly operating on whatever's in my Value structs.  There isn't any ambiguity between types or sizes, and I can probably take advantage of D's inline asm statements to ease things a bit.

> duplicate your stack (so that your interpreter can eventually handle 
> multi-threaded applications), key to this is making the Value object 
> serializable/unserializable..

You mean like fork()?  I'm thinking more like being able to run multiple instances of ECMAScript by calling:

global = Global_init();
interpret(source,global,args); 

as many times as I like within the same program from different threads (that someone else can go ahead and figure out how to make)

Value is already serializable.  It's a struct.  One of the reasons I hate classes is because they're opaque, and thus very hard to serialize.

> Have you updated Walnut 1.0 with Walter's last change? - the closure fix?

Nope.  I should though, and I should make 1.x run on D 2.x

> Yes, there is a lot of other stuff I've added to DMDscript that could do 
> with a better home ;)

Highly interested.  Also, 1.x almost has native ActiveXObject.  It needs a few bugs worked out, and I haven't had the brainpower to face it again for a while.

At the moment, fromVariant isn't recognizing whatever type is being passed for numbers (as seen by running test\activex.nut).  It's recognizing functions and I think even letting you call them.  It's enumerating the properties perfectly.  I had to comment out the Put method, but I'd like to refactor set and setByRef into that.  That would make ActiveXObject more native to Walnut 1.x than JScript.

: p

Regards,
Dan

