How To Dynamic Web Rendering?

Adam D. Ruppe destructionator at gmail.com
Sun May 15 11:54:22 PDT 2011


Alexander wrote:
> If application is a set of pages, there should be a way to share
> data between pages without using complicated or expensive
> persistent storage.

FYI, PHP uses files on the hard drive for sessions by default...
optionally, it can use a database too. AFAIK, there is no
in-memory option beyond the options the kernel or database
provide for file/table caching.

> Again - session data is something that is transparently (more or
> less) accessible to all pages. dhp doesn't do this, as far as I can
> see.

Nope, it'd be provided by an external library.

Like I said in another sub-thread, I thought about cloning PHP's
approach for a while. (There's a small amount of code to this
end still commented out in my cgi.d and phphelpers.d modules)

But I never finished it, because I use sessions so rarely. Most
uses of them don't match well with the web's stateless design -
if someone opens a link on your site in a new window, can they
browse both windows independently? Changing one *usually* shouldn't
change the other.

There are two big exceptions. One is logging in, which barely
requires session data at all - matching a session id and other
identifiers to a user id is all that's really necessary. The other
is something like a shopping cart, and in cases like that, I prefer
to use the database to store the cart anyway.

So I've had no need for a fancy session system, and thus never
finished mine. It provides some session ID functions, but lets you
do whatever you want with them beyond that. You can use them as
names for temporary files, as keys into the database, whatever.

(BTW, PHP's automatic session handling *only* uses the session id.
This leaves it open to trivial session hijacking. In web.d, the
session functions automatically check IP address and user agent
as well as cookies. It can still be hijacked in some places, but
it's a little harder. To prevent hijacking in all situations,
https is a required part of the solution, and the cgi library can't
force that unilaterally. Well, maybe it could, but it'd suck.)
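
For illustration, the kind of check I mean looks roughly like this
(a sketch only, with made-up names - not the actual web.d code):

// Tie the session to more than the cookie: store the client's IP
// address and user agent next to the session data when the session
// is created, and reject later requests where they no longer match,
// so a stolen cookie alone isn't enough.
struct SessionRecord {
    string sessionId;     // the random id handed out in the cookie
    string remoteAddress; // REMOTE_ADDR when the session was created
    string userAgent;     // HTTP_USER_AGENT when it was created
}

bool sessionLooksValid(in SessionRecord stored, string cookieSessionId,
        string remoteAddress, string userAgent)
{
    return stored.sessionId == cookieSessionId
        && stored.remoteAddress == remoteAddress
        && stored.userAgent == userAgent;
}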


I have considered adding a library for some persistent stuff though.
One example is a service that pushes updates to clients. (The client
uses ajax long polling or something similar to work around the
limitations of the web environment.)

I've written two programs in D that work like this - a real time
web chat server and the more generic (and incomplete) D Windowing
System's ajax bridge.

But neither one simply broadcasts messages to waiting http clients.

This would be useful for doing live updates like Facebook's. If
someone posts a new wall post, it shows up in everyone's browser
pretty quickly.

The usage code would look like this:

broadcaster.push(channelId, message);


The waiting clients run javascript that looks like so:

var listener = new Listener("channelId");
listener.messageReceived = function(message) { ... };


Very simple. Maybe I'll write that today.
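
The in-process core of such a broadcaster might look something like
this sketch (names made up; nothing here is existing library code):

// One message list per channel. The long polling handler blocks on a
// condition variable until something newer than what the client last
// saw shows up, or a timeout passes, then returns whatever is new.
import core.sync.mutex : Mutex;
import core.sync.condition : Condition;
import core.time : seconds;

final class Broadcaster {
    private Mutex mutex;
    private Condition updated;
    private string[][string] messages; // channelId -> messages so far

    this() {
        mutex = new Mutex();
        updated = new Condition(mutex);
    }

    // called by whatever produces events, e.g. after a new wall post
    void push(string channelId, string message) {
        synchronized(mutex) {
            messages[channelId] ~= message;
            updated.notifyAll();
        }
    }

    // called by the long polling handler; returns anything the client
    // hasn't seen yet, waiting up to 30 seconds for something new
    string[] waitForMessages(string channelId, size_t alreadySeen) {
        synchronized(mutex) {
            if(auto list = channelId in messages)
                if((*list).length > alreadySeen)
                    return (*list)[alreadySeen .. $].dup;
            updated.wait(30.seconds);
            if(auto list = channelId in messages)
                if((*list).length > alreadySeen)
                    return (*list)[alreadySeen .. $].dup;
            return null;
        }
    }
}

The push() method is what broadcaster.push() above would end up
calling, and the long polling URL handler would call
waitForMessages() with however many messages the client's Listener
said it had already seen.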



There are two other things a longer-running process is good for:

1) Session storage (see, on topic!). You speak a protocol to it...
it's basically a reinvention of a database engine using memory
tables. Might as well just use a real database engine with
in-memory tables, or temporary files, relying on the kernel to keep
them in memory.

I'm in no rush to write this, since there's little need and existing
stuff does it well when you do want it. If you have complex objects,
you might want to keep them in memory.

In that case, you can just use an embedded http server instead of
cgi. My cgi class lets you do this without changing the bulk of
your client code, and I've written an embedded http server in D
to go with it.
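
To sketch the idea (using the current arsd.cgi module name and
compile switch, which may not match exactly what existed at the
time):

// The same handler function serves as either a plain cgi program or
// an embedded http server, chosen when you compile.
import arsd.cgi;

void handler(Cgi cgi) {
    cgi.write("Hello from the same code either way!\n");
}

mixin GenericMain!handler;

// Build it as a cgi binary:
//     dmd yourapp.d cgi.d
// Or as a standalone embedded httpd:
//     dmd yourapp.d cgi.d -version=embedded_httpd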


2) Doing work that takes a long time or needs to be scheduled.
There are three approaches to this, all of which work today:

a) fork() your process, letting the child do the work while the
main cgi program finishes.

This is very convenient to program:

// prepare stuff

runInBackground( {
      // write code here to run, using variables prepared above
});

// terminate, not worrying about the background process. It will
// live on without you.
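
The guts of a fork-based helper like that are roughly this on Posix
(a sketch with a made-up name, not the actual library code):

import core.sys.posix.unistd : fork, _exit;
import core.stdc.stdio : perror;

// The parent returns immediately so the cgi program can finish its
// response; the forked child keeps running and does the slow work.
void runInBackground(scope void delegate() work) {
    auto pid = fork();
    if(pid == -1) {
        perror("fork"); // couldn't fork; run in the foreground or give up
        return;
    }
    if(pid == 0) {
        // child process: do the work, then exit without running the
        // parent's shutdown code (hence _exit instead of returning)
        work();
        _exit(0);
    }
    // parent process: fall through and finish the request as usual.
    // (A real version would also detach the child from the web
    // server's pipes so it doesn't hold the response open.)
}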



The downside is I *believe* it doesn't scale to massive traffic -
though I haven't actually tried, so I can't say for sure. Then
again, most of our sites aren't massive anyway, so does it matter?
The most I've actually done with it for real was about 20 concurrent
users. It did fine there, but everything does fine for small n.

Another downside is that communicating with the background process
is hard. That could be solved by adding some message passing
functions to the library, or by combining it with the persistent
notifier service described above. You would have to store its
process ID, or the name of a pipe to talk to it, though.

Also, remember that if it is already busy doing the work, you don't
want to kick that work off again. A cross-process lock is probably
desired, and that isn't handled automatically either.
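
A lock file is usually enough for that. Something along these lines
(a sketch, not library code):

import core.sys.posix.fcntl : open, O_CREAT, O_EXCL, O_WRONLY;
import core.sys.posix.unistd : close, unlink;
import std.conv : octal;
import std.string : toStringz;

// Cross-process lock via an O_EXCL lock file: if the file already
// exists, another worker is presumably still busy, so skip this run.
// (A real version would also deal with stale files left by crashes.)
bool tryLock(string lockFileName) {
    auto fd = open(lockFileName.toStringz, O_CREAT | O_EXCL | O_WRONLY, octal!644);
    if(fd == -1)
        return false; // someone else holds the lock
    close(fd);
    return true;
}

void unlock(string lockFileName) {
    unlink(lockFileName.toStringz);
}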

Finally, long-running processes can't be updated in place. You have
to kill and restart them, and if their state is only in memory, that
means you lose data.

Still, it is quite convenient!


b) The way most PHP sites do it is to write the work to be done
later out to the database, then depend on a cron job to check the
list and execute the stored commands.

While it still takes some care to ensure things are locked,
communication is a little easier - instead of talking to a process,
you poll the work list in the database. It could also use the
persistent notification service.
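
As a toy example of the shape of that loop (the worklist here is
just an in-memory array standing in for the database table, and all
the names are made up):

import std.process : executeShell;
import std.stdio : writeln;

struct WorkItem {
    int id;
    string command; // the stored command to run later
    bool claimed;
    bool finished;
}

// In practice the worklist is a database table and the "claim" step
// is an atomic UPDATE so two workers can't grab the same row; this
// just shows the shape of what the cron job does each run.
void runPendingWork(WorkItem[] worklist) {
    foreach(ref item; worklist) {
        if(item.claimed || item.finished)
            continue;
        item.claimed = true;  // in SQL: the locking UPDATE
        auto result = executeShell(item.command);
        writeln("work item ", item.id, " exited with status ", result.status);
        item.finished = true; // in SQL: mark the row done
    }
}

void main() {
    auto worklist = [WorkItem(1, "echo hello from the worklist")];
    runPendingWork(worklist);
}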

The cron job may be triggered by a lot of things: page views (ugh),
including automatic refreshes (double ugh), a real cron daemon, or
user action, such as hitting OK on a schedule screen and then doing
the work in the foreground, or a desktop application.


One of my work projects went with the latter. The server cron app
was moved to the users' desktops. Thanks to it all being written
in a real programming language like D, reusing the code from the
server app in a compiled GUI application was no trouble. I imported
the same modules and compiled them for distribution. The users
didn't have to install an interpreter or an httpd.

c) Use a persistent process in the first place, and run the long
work in a background thread, or even a foreground thread that isn't
killed off by timeouts or disconnections.

An embedded http server and std.concurrency can help with this.
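
For example, something like this sketch (the job here is just a
string and a sleep, and the names are made up):

import std.concurrency : spawn, send, receive;
import std.stdio : writeln;
import core.thread : Thread;
import core.time : seconds;

// A long-lived worker thread; request handlers just send() it work
// and return right away, so timeouts and disconnections don't kill
// the job.
void worker() {
    bool running = true;
    while(running) {
        receive(
            (string job) {
                writeln("starting long job: ", job);
                Thread.sleep(2.seconds); // stand-in for the real work
                writeln("finished: ", job);
            },
            (bool stop) { if(stop) running = false; }
        );
    }
}

void main() {
    auto workerTid = spawn(&worker);

    // a request handler would do something like this and return at once
    send(workerTid, "rebuild the report for customer 123");

    send(workerTid, true); // tell the worker to shut down when it's done
}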




While all of these are possible, I usually stick with the
traditional cron approach. It's simple enough, it's effective,
and it's pretty resilient to things like the server going down.
The worklist is still in the db so it can be restarted later.



wow, I went off at length again.

