[Issue 2979] Xml tags with only attributes return as without attributes ElementParser.parse

d-bugmail at puremagic.com d-bugmail at puremagic.com
Fri May 15 06:11:33 PDT 2009


http://d.puremagic.com/issues/show_bug.cgi?id=2979





--- Comment #3 from hed010gy <y0uf00bar at gmail.com>  2009-05-15 06:11:32 PDT ---
(In reply to comment #1)
> why does the code use new Tag instead of tag_ ?
> 

Because that's in the original code! And I do not yet know the problem, D 2.0
code, or the module's integral design well enough to say whether or not it's
the best thing. It seems to work.

My current position is that of a user of the code, playing around to see if I
can do anything with the module.  While parsing and debugging my own code in
the onStartTag delegate callback, I noticed that if the tag was Empty but had
attributes, the ElementParser xml object returned has a Tag object attached,
but with no attributes. That is not what I want in an XML parser.

I do not wish to extensively revise the module, only to make it work for what I
want it to do, otherwise I will be using an Expat parser. Writing a good parser
takes a long time.

The original author(s)' design is to be judged at the moment by the usability
of its external interface (D-friendliness, similar to the idea of Pythonesque
code), the correctness of its results, and how easily it can be enhanced or
have bugs fixed without breaking the rest of it.  So it does parse the EMPTY
tag and the attributes OK, but then creates a new Tag using only the name from
the parsed tag, as if it were really, really empty without any attributes at
all. Empty means no markup content, but attributes are allowed, hence I add
here the lines to copy any attributes found.

Because the new tag defaults to START, all the Tags passed to the onStartTag
callback are marked as START even if the tag is empty.  This seems to deprive
the module user of useful information: why bother setting up for more work in
the callback if the Tag is actually empty?
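
To make the two points above concrete, here is a minimal sketch of copying the
attributes (and, optionally, the tag type) onto the new Tag. The TagType and
Tag definitions are reduced, hypothetical stand-ins modelled on the module's
types, and copyForCallback and its keepType flag are names made up for this
illustration; this is not the actual patch.

```d
// Hypothetical stand-ins, reduced to the fields discussed in this comment.
enum TagType { START, END, EMPTY }

class Tag
{
    TagType type = TagType.START;
    string name;
    string[string] attr;

    this(string name, TagType type = TagType.START)
    {
        this.name = name;
        this.type = type;
    }
}

// Build the Tag handed to the callback from the parsed one, carrying over
// the attributes instead of only the name. keepType is an assumed option:
// when false the copy is marked START, matching the behaviour described.
Tag copyForCallback(Tag parsed, bool keepType = false)
{
    auto fresh = new Tag(parsed.name, keepType ? parsed.type : TagType.START);
    foreach (key, value; parsed.attr)
        fresh.attr[key] = value;
    return fresh;
}

void main()
{
    auto parsed = new Tag("img", TagType.EMPTY);
    parsed.attr["src"] = "a.png";

    auto plain = copyForCallback(parsed);
    assert(plain.attr["src"] == "a.png");   // attributes survive the copy
    assert(plain.type == TagType.START);    // but the EMPTY marker is lost

    auto typed = copyForCallback(parsed, true);
    assert(typed.type == TagType.EMPTY);    // callback can now skip setup work
}
```

Copying into a fresh object keeps the parser's own Tag out of the callback's
reach, at the cost of one extra allocation per tag.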

So why a new Tag?  I know this makes for more CPU work.  But the module writer
has had to make some robust assumptions, especially in an early draft in a
strange new language. (To me lots of it is still strange.) The passed
ElementParser object is not a const object, because the onStartTag callback
likes to set various properties and delegates.  Perhaps the user can modify the
passed Tag at this point, and so the module functions are partly protected by
passing objects whose modification will not hurt its code.

Before making more of this, I personally need to learn to create D unit test
cases. All that fine contract and assertion stuff, doc comments, and unit tests
takes time and learning. My coding efforts usually stop as soon as the code
appears to do what the test cases require, so more test cases are good.

Having a more complete validation suite of files would be essential to bringing
the parser up to a reasonable state. I found the module's pretty output will
not create the Empty Tag style. Possibly that's why reading it wasn't tested
either.

I found I could improve the pretty function to make the output look like what I
call pretty.

The isEmptyXML for Tag, which is used by pretty, always returns false.
The function could instead do a (items.length == 0) ? true : false;
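
A minimal sketch of that change, assuming items is the list of child items
held by the object that pretty walks. The Item and Element classes below are
reduced, hypothetical models and not the module's own definitions.

```d
// Hypothetical, reduced model; not the module's classes themselves.
class Item {}

class Element : Item
{
    string name;
    Item[] items;   // child items (text, nested elements, ...)

    this(string name) { this.name = name; }

    // Report emptiness from the actual content instead of always
    // returning false. The comparison already yields a bool, so the
    // "? true : false" part is not needed.
    bool isEmptyXML() const
    {
        return items.length == 0;
    }
}

void main()
{
    auto e = new Element("br");
    assert(e.isEmptyXML());           // no children: empty

    e.items ~= new Element("child");
    assert(!e.isEmptyXML());          // one child: not empty
}
```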

There is some argument as to whether a space is necessary before the /> of an
Empty tag. Such EmptyTags, with or without attributes, are a compatibility
hazard with SGML and HTML variant parsers. Output style needs to be modified
depending on the intended consumer of the file. I am an XML abuser who happens
to like empty tag style, and who uses XML in a loose way not liked by markup
purists and SGML, HTML and XHTML lovers.

So customizable bits for pretty for me.

Emit Empty Tags ?   true | false
Space before /> in empty tag?  true | false
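
A rough sketch of how those two switches might look. PrettyOptions and
renderContentless are hypothetical names invented for this illustration, not
part of the module, and attribute values are written out without any escaping
to keep it short.

```d
import std.algorithm : sort;
import std.stdio : writeln;

// Hypothetical option set covering the two switches listed above.
struct PrettyOptions
{
    bool emitEmptyTags = true;      // <a/> instead of <a></a>
    bool spaceBeforeSlash = false;  // <a /> instead of <a/>
}

// Render a tag that has no content, according to the options.
string renderContentless(string name, string[string] attr, PrettyOptions opt)
{
    string s = "<" ~ name;
    auto keys = attr.keys;
    sort(keys);   // associative array order is unspecified; sort for stable output
    foreach (k; keys)
        s ~= " " ~ k ~ "=\"" ~ attr[k] ~ "\"";
    if (opt.emitEmptyTags)
        return s ~ (opt.spaceBeforeSlash ? " />" : "/>");
    return s ~ "></" ~ name ~ ">";
}

void main()
{
    string[string] attr = ["src": "a.png"];
    writeln(renderContentless("img", attr, PrettyOptions(true, true)));   // <img src="a.png" />
    writeln(renderContentless("img", attr, PrettyOptions(true, false)));  // <img src="a.png"/>
    writeln(renderContentless("img", attr, PrettyOptions(false, false))); // <img src="a.png"></img>
}
```

Passing the options as a struct keeps the formatter's signature stable if more
switches are wanted later.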

Having a few different pretty formatters, or a single customizable one, would
be nice. It's easy enough to write another as well.

Now I shall go back to using it for my code, using my slightly modified
version, and get on with my own project, for I am far more interested in
abusing XML than writing and fixing parsers.
