[vworld-tech] Embedding JavaScript

Sat Jan 24 11:20:26 PST 2004

(All of this is about the C engine "SpiderMonkey", and not the Java engine
"Rhino".  I know much more about the C engine, and would recommend that
people ask the addresses listed on the Rhino pages for more information
about that system, lest I mislead them inadvertently.)

I'm sorry this is so long, and also that it's so short.  I wish I had
the time to make it both more thorough and more concise.  I should
apologize up front for any errors within, since I'm writing in a bit of
a rush, and editing haphazardly as I go.

On Jan 23, Bruce Mitchener wrote:
>     * Watchdog for preventing infinite loops.

As I mentioned previously, the JS engine can be configured to make a
call-out for every backward branch or function return, so that embedding
apps can police infinite or just over-long scripts.  Common choices for
that callout include a hard-and-fast limit, prompting the operator
through a dialog, or simply sleeping the thread in question for a second
to limit the damage it can do.
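
To make the hard-limit flavor of that call-out concrete, here is a
minimal sketch in plain C.  It does not use the real JSAPI: toy_run,
toy_branch_cb, toy_hard_limit and struct toy_budget are all made-up
names, with toy_run standing in for the engine's interpreter loop and
toy_hard_limit for the embedder's callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type (not the JSAPI one): return 0 to abort
 * the running script, nonzero to let it continue. */
typedef int (*toy_branch_cb)(void *closure);

/* Embedder-side state for a hard-and-fast limit. */
struct toy_budget {
    unsigned long branches_seen;
    unsigned long branch_limit;
};

/* The watchdog: count call-outs and refuse once over the limit. */
static int toy_hard_limit(void *closure)
{
    struct toy_budget *b = closure;
    assert(b != NULL);
    b->branches_seen++;
    return b->branches_seen <= b->branch_limit;
}

/* Stand-in for the engine: a "script" that loops `iterations` times
 * and calls out on every backward branch.  Returns 1 if the script ran
 * to completion, 0 if the callback aborted it. */
static int toy_run(unsigned long iterations, toy_branch_cb cb, void *closure)
{
    unsigned long i;
    for (i = 0; i < iterations; i++) {
        /* ... the script's loop body would execute here ... */
        if (cb != NULL && !cb(closure))
            return 0;
    }
    return 1;
}
```

The other choices mentioned above (prompting the operator, or sleeping
the thread) would only change the body of the callback; the engine side
stays the same.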

>     * Any other resource starvation prevention mechanisms.

There's a compile-time configurable limit on function-call depth, mainly
to complement the branch-callback and catch unbounded -- or
insufficiently bounded -- script-function recursion.
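
As a rough illustration of what such a depth limit does, here is a toy
model in C.  TOY_MAX_CALL_DEPTH, struct toy_context and toy_call are
invented for this sketch and are not the engine's names or its actual
limit; the point is only that every script-function entry is counted
and refused past a fixed bound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time limit; the engine's real value differs. */
#define TOY_MAX_CALL_DEPTH 1000

struct toy_context {
    int depth;       /* script-function frames currently active */
    int overflowed;  /* set once the limit has been hit */
};

/* Models a script function that recurses n more levels.  Returns 1 on
 * success, 0 if the depth limit stopped the recursion. */
static int toy_call(struct toy_context *cx, int n)
{
    int ok = 1;
    assert(cx != NULL);
    if (cx->depth >= TOY_MAX_CALL_DEPTH) {
        cx->overflowed = 1;   /* a real engine would raise an error */
        return 0;
    }
    cx->depth++;
    if (n > 0)
        ok = toy_call(cx, n - 1);
    cx->depth--;              /* unwind whether or not the callee failed */
    return ok;
}
```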

A recent addition is a C-stack-limit API, which allows the embedding to
specify the minimum stack address that C-frame recursion is allowed to
reach, bounding its depth.

>     * Support for threading, continuations, other means of
>       multitasking.

The JS threading and locking model is, IMO, one of the best available
for scripting engines.  Any number of contexts can operate concurrently
within a given runtime, permitting multithreaded apps to call back into
the engine as required, without worrying about Python-GIL-style
deadlocks, and the locking of object access is very efficient.  Races
against the garbage collector are mediated by the Request API[gctips],
which basically amounts to a reader/writer locking pattern.  The engine
is highly optimized for the (outrageously common) case that a given
object is only accessed on one thread: using the Request API to schedule
GC also lets us use it as a synchronization point for making an object
"multi-threaded" the first time that it's accessed from a second thread.
Prior to that, there is no locking overhead incurred for object property
access, which results in quite excellent performance without sacrificing
the safety of fine-grained "Bacon bit" locking.
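
The reader/writer shape of the request pattern can be shown with a
deliberately tiny, single-threaded model.  This is not the real Request
API: struct toy_runtime and the toy_* functions are hypothetical, there
is no actual locking here, and a real engine would block a waiting GC
rather than return failure.  It only shows the invariant that the
collector (the "writer") may run only when no request (a "reader") is
active.

```c
#include <assert.h>
#include <stddef.h>

struct toy_runtime {
    int active_requests;  /* "readers": contexts inside a request */
    int gc_count;         /* how many collections have run */
};

/* A context announces that it is about to touch JS objects. */
static void toy_begin_request(struct toy_runtime *rt)
{
    assert(rt != NULL);
    rt->active_requests++;
}

/* A context announces that it is done touching JS objects. */
static void toy_end_request(struct toy_runtime *rt)
{
    assert(rt != NULL && rt->active_requests > 0);
    rt->active_requests--;
}

/* The "writer": only proceeds when no request is active.
 * Returns 1 if a collection ran, 0 if it had to be deferred. */
static int toy_try_gc(struct toy_runtime *rt)
{
    assert(rt != NULL);
    if (rt->active_requests > 0)
        return 0;
    /* ... mark and sweep would happen here ... */
    rt->gc_count++;
    return 1;
}
```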

There's no directly-exposed threading primitive within the language,
though many embeddings provide their own.

(An aside: JS provides a very small set of objects -- String, Date, Function,
Object, Array, RegExp, Number, Script -- that are applicable to pretty
much any embedding, and which don't impose security or I/O models on
their hosts.  This has let us run virtually unmodified on security
domains ranging from electronic commerce servers to browsers to MUDs,
and on things as small as the Palm Pilot; the Avantgo browser uses the
SpiderMonkey engine, I think built simply without the debugging API and
regex support, possibly without the decompiler.)

>     * Impact of GC on a server app, and ways to minimize that impact.

This is one weak spot, which can require careful work to properly tune
for an application's use.  The GC excludes all other script execution on
that runtime (an application can host multiple runtimes, but cannot
share objects between them; they are entirely separate incarnations of
the JS engine) while it runs, so scheduling that operation frequently
enough to keep the pause time acceptable is a bit of an art.
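
One possible scheduling policy, sketched in C under my own assumptions:
count bytes allocated since the last collection, and collect at a quiet
point once a threshold is crossed, so that no single pause has too much
garbage to deal with.  struct toy_gc_policy, toy_note_alloc and
toy_maybe_gc are hypothetical helpers on the embedding side, not engine
calls, and the threshold is something you would have to tune.

```c
#include <assert.h>
#include <stddef.h>

struct toy_gc_policy {
    size_t bytes_since_gc;     /* allocation volume since last GC */
    size_t trigger_bytes;      /* tunable threshold */
    unsigned long collections; /* how many times we collected */
};

/* Record an allocation made on behalf of script. */
static void toy_note_alloc(struct toy_gc_policy *p, size_t nbytes)
{
    assert(p != NULL);
    p->bytes_since_gc += nbytes;
}

/* Call at a quiet point (for example, between commands).  Returns 1 if
 * a collection was due and was "run", 0 otherwise. */
static int toy_maybe_gc(struct toy_gc_policy *p)
{
    assert(p != NULL);
    if (p->bytes_since_gc < p->trigger_bytes)
        return 0;
    /* A real embedding would invoke the engine's collector here. */
    p->bytes_since_gc = 0;
    p->collections++;
    return 1;
}
```

A lower threshold means more frequent, shorter pauses; a higher one
means fewer but longer ones.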

There have been plans for a generational/incremental GC for the JS
engine for some time, but I haven't found the cycles to finish the job.

>     * Security principals: What are they? How do they work? What
>       do they really provide and what types of security models
>       are they useful for?

There is an almost-uselessly short description of the principals system
in the Embedder's Guide[embguide], upon which I will elaborate only briefly:

Scripts and objects have principals associated with them.  Objects
inherit their principals, generally, from the script that created them.
You would inherit from (or nest, in pure C) the JSPrincipals struct, and
then the engine can give you the principals object associated with the
current JS execution context and the objects that it's trying to operate
on.  Mozilla uses this mechanism to deny cross-site access to document
contents, while permitting privileged script -- the UI/"chrome" JS, for
example, or scripts that are signed and have been explicitly granted
additional privs -- to perform additional operations.  The JS principals
model is very much minimal, and designed to provide mechanism and not
policy.  (Principals are integrated into the woefully-underdocumented
XDR functionality for serializing and deserializing scripts, though, so
that you don't lose principals information across that round-trip, as
long as your embedding app implements the transcoding bits required to
serialize its private state.)
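
The "nest, in pure C" idiom looks roughly like the sketch below.  Note
that struct toy_principals is a stand-in and NOT the real JSPrincipals
layout, and my_principals and my_can_access are hypothetical embedder
code.  The access rule (same codebase, unless the subject is
privileged) is just one example of a policy the embedder could layer on
top of the mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the engine-defined base struct (not the real layout). */
struct toy_principals {
    const char *codebase;
};

/* The embedder nests the base as the FIRST member, so a pointer to the
 * base handed back by the engine can be cast to the full struct. */
struct my_principals {
    struct toy_principals base;
    int privileged;   /* e.g. chrome or signed-and-granted script */
};

/* Embedder policy: may `subject` touch an object owned by `object`?
 * Both pointers must point at the base of a struct my_principals. */
static int my_can_access(const struct toy_principals *subject,
                         const struct toy_principals *object)
{
    const struct my_principals *s;
    assert(subject != NULL && object != NULL);
    s = (const struct my_principals *)subject;
    if (s->privileged)
        return 1;
    return strcmp(subject->codebase, object->codebase) == 0;
}
```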

>     * The joys of XPConnect: Simpler than SWIG, Boost::Python and
>       most other binding systems!

XPCOM and the xptcall/xptinfo portions of it allow
"zero-generated-code"[zgc] bridging between XPCOM components written in
pretty much any language.  You
define an interface in XPIDL[xpidl], and use that to generate both C++
header files and language-independent typelibs.  Those typelibs are
loaded by the XPCOM runtime and are used to invoke and handle, via
ABI-specific marshalling code, XPCOM interface methods.  Implementations
are dynamically loaded and registered with a somewhat COM-esque system,
and can be added without modification of the embedding application.  (In
some applications, they don't even need to restart.)

XPConnect[xpconnect] is really just the XPCOM binding set for JavaScript
(specifically, Mozilla's JavaScript engine, which I think is the only
one authorized by the folks at Sun (!) to use the name "JavaScript", but
I promise to not bring that up again).

Also of possible interest is the JS debugging API, which has been
exposed in a variety of forms (gdb-like command line interface, the
"Venkman" Mozilla script debugger[venkman], itself written in JS and
XUL, CORBA-remoted for earlier incarnations of the Netscape
"LiveWire" application system, and wrapped in Java for the original
Netscape JavaScript Debugger).  This API is a little bit rough in
places, but quite usable, and under active maintenance by the Venkman
team.

> One of these days soon, I'll write up a post about why I no longer think 
> Python is a great choice for use in a server app for vworlds now that 
> I've written a fair bit of code using Python (but not for vworlds).

OOC, does Stackless Python make you happier?  Other than the fact that
it seems to be maintained quite sporadically, it seems somewhat
attractive, if only because it gets out from under the crushing weight
of the Python GIL.

> The docs on embedding JavaScript seem pretty decent on Mozilla.org. 
> What I find missing is what I find to also be missing from other 
> language embedding docs .. the tips and tricks for getting good/high 
> performance and good patterns to follow.

Back when I did Mozilla/JS for a living, I always wanted to write a
series of articles with Brendan and Rob Ginda and co in the vein of
"Effective C++", but aimed at JS engine embedders.  There's a lot of
"Effective JSAPI" knowledge out there that isn't captured outside
newsgroup posts and the like, unfortunately, and it would be nice to fix
that up.  Maybe Mr. Crowder has some old IRC logs with advice that he
would codify more properly. =)

[gctips] http://www.mozilla.org/js/spidermonkey/gctips.html, esp #5.
[embguide] http://www.mozilla.org/js/spidermonkey/apidoc/jsguide.html
[xpidl] http://mozilla.org/scriptable/xpidl/index.html
[zgc] http://mozilla.org/scriptable/zero-generated-code-proposal.html
[xpconnect] http://www.mozilla.org/scriptable/
[venkman] http://www.mozilla.org/projects/venkman/

Mike