XML API

Michel Fortin michel.fortin at michelf.com
Sun May 24 11:13:31 PDT 2009


On 2009-05-24 12:51:43 -0400, Daniel Keep <daniel.keep.lists at gmail.com> said:

> Michel Fortin wrote:
>> ...
>> 
>> A callback API isn't necessarily SAX. A callback API doesn't necessarily
>> have to parse everything until completion, it could parse only the next
>> token and call the appropriate callback.
> 
> When I talk "callback api," I mean something fundamentally like SAX.

SAX is definitely a popular callback API for XML, but to me a callback 
API just implies that some callback gets called.


> The reason is that if your callback api only does a single callback, all
> you've really done is move the switch statement inside the function call
> at the cost of having to define a crapload of functions outside of it.

The thing is that inside the parser code there is already a separate 
code path for dealing with each type of token, and a different callback 
can be called from each of these paths. Once the parser returns after 
parsing one token, those paths have merged again, so the caller needs 
an extra switch statement that wouldn't be there if a callback had been 
called from the right code path.


>> If I can construct a range class/struct over my callback API I'll be
>> happy. And if I can recursively call the parser API inside a callback
>> handler so I can reuse the call stack while parsing then I'll be very
>> happy.
> 
> I don't see how constructing a range over a callback api will work.
> Callback apis are inversion of control, ranges aren't.

Your definition of a callback API is about inversion of control. My 
definition is just that it parses one token and calls a function for 
that token. If you read what I wrote using your definition, it 
obviously can't work.


> ...
> 
> Like I said, this seems like a lot of work to bolt a callback interface
> onto something a pull api is designed for.
> 
> At best, you'll end up rewriting this:
> 
>> foreach( tt ; pp )
>> {
>> switch( tt )
>> {
>> case XmlTokenType.StartElement: blah(pp.name); break;
>> ...
>> }
>> }
> 
> to this:
> 
>> pp.parse
>> (
>> XmlToken(Type.StartElement, {blah(pp.name);}),
>> ...
>> );
> 
> Except of course that you now can't easily control the loop, nor can
> you do fall-through on the cases.

Again, my definition of a callback API doesn't include an implicit 
loop, just a callback. And I intend the callback to be a template 
argument so it can be dispatched using function overloading and/or 
function templates. So you'll have this instead:

	bool more = true;
	do
		more = pp.readNext!(callback)();
	while (more);

	void callback(OpenElementToken t) { blah(t.name); }
	void callback(CloseElementToken t) { ... }
	void callback(CharacterDataToken t) { ... }
	...

No switch statement and no inversion of control.
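
To make this concrete, here is a minimal self-contained sketch of the 
same pattern. It is only an illustration under stated assumptions: 
ToyParser, handle, events and parseAll are hypothetical names invented 
for the example, and the toy parser only understands <name>, </name> 
and character data (no attributes, entities or error reporting).

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical token types for this sketch (names follow the post).
struct OpenElementToken  { string name; }
struct CloseElementToken { string name; }
struct CharDataToken     { string data; }

// Toy parser, assumed for illustration only.
struct ToyParser
{
    string input;

    // Parses exactly one token, passes it to `callback`, and returns
    // false once there is nothing left to parse. The caller owns the loop.
    bool readNext(alias callback)()
    {
        if (input.length == 0)
            return false;

        if (input[0] == '<')
        {
            auto end = input.indexOf('>');
            if (end < 0)
                return false; // malformed tag: stop parsing
            auto tag = input[1 .. end];
            input = input[end + 1 .. $];
            // Each branch already knows the token type, so it calls the
            // matching overload directly; no switch on a token kind.
            if (tag.length > 0 && tag[0] == '/')
                callback(CloseElementToken(tag[1 .. $]));
            else
                callback(OpenElementToken(tag));
        }
        else
        {
            auto end = input.indexOf('<');
            if (end < 0)
                end = input.length; // text runs to the end of the input
            auto data = input[0 .. end];
            input = input[end .. $];
            callback(CharDataToken(data));
        }
        return true;
    }
}

string[] events; // records what each handler saw

// One overload per token type; the compiler picks the right one.
void handle(OpenElementToken t)  { events ~= "open:" ~ t.name; }
void handle(CloseElementToken t) { events ~= "close:" ~ t.name; }
void handle(CharDataToken t)     { events ~= "text:" ~ t.data; }

string[] parseAll(string xml)
{
    events = null;
    auto pp = ToyParser(xml);
    bool more = true;
    do
    {
        more = pp.readNext!(handle)();
    }
    while (more);
    return events;
}

void main()
{
    writeln(parseAll("<a>hi</a>"));
}
```

The loop lives in parseAll, not in the parser, so the caller can stop 
after any token or call readNext again from inside a handler.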

And here's my current prototype for a range:

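	// One type that can hold any kind of token.
	alias Algebraic!(
		CharDataToken, CommentToken, PIToken, CDataSectionToken, AttrToken,
		XMLDeclToken, OpenElementToken, CloseElementToken, EmptyElementToken
		) XMLToken;

	struct XMLForwardRange(Parser)
	{
		bool empty;      // true once the parser has no more tokens
		XMLToken front;  // most recently parsed token
		Parser parser;

		this(Parser parser)
		{
			this.parser = parser;
			popFront(); // parse first token
		}

		void popFront()
		{
			// readNext returns false when there is nothing left to parse
			empty = !parser.readNext!(callback)();
		}

		// Accepts any token type and stores it in the Algebraic.
		private void callback(T)(T token)
		{
			front = token;
		}
	}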

Constructing a pull parser using the same pattern should be pretty easy 
if you wanted to.


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



