Subject: XML Packages - Proposed Changes - DN [1]
Steve Ball <Steve.Ball@zveno.com> - 15 Nov 1999 - comp.lang.tcl
There are now a number of packages floating around that provide
variants on XML parsers and the DOM for Tcl. It's about time
these were rationalised and brought into a single code base so
that we can collaborate on making the tools better.
Following is my audit of the packages/code that I know of,
and my plans for the next couple of months to bring these
together. CVS will be used to allow collaboration between
implementors. Either the Zveno CVS server, waycool.zveno.com,
or the Scriptics CVS server will be used.
This message is being sent to c.l.t, but I would prefer that
discussions were directed to the tclxml@egroups.com mailing list.
XML Parsers
===========
There are now three XML parsers for Tcl: TclExpat, TclXML and
tDOM's simple XML parser. TclExpat has two variants: the main
package available from Scriptics' CVS server (TEA-compliant)
and tDOM's modified TclExpat.
I'd like to fold TclExpat and TclXML into one package - TclXML.
The new package will automatically load the C extension if it is
available; otherwise it will fall back to the Tcl version, which is
always present.
TclExpat will benefit by being able to use the Tcl code for some
parsing tasks, such as interpreting Document Type Declarations,
entity references, etc, and by combining the test suites.
Some means of determining which code has been loaded (C or Tcl)
will be made available.
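
A minimal sketch of that loading strategy follows. It assumes a hypothetical package name `xml::c` for the compiled parser and hypothetical names `::xml::implementation` and `::xml::whichparser` for the query; none of these are taken from the actual TclXML distribution.

```tcl
namespace eval ::xml {
    # Records which parser implementation was loaded: "c" or "tcl".
    variable implementation
}

# Try the compiled extension first (hypothetical package name).
# If it is not installed, fall back to the pure-Tcl parser.
if {[catch {package require xml::c}]} {
    # A real package would source its Tcl parser at this point.
    set ::xml::implementation tcl
} else {
    set ::xml::implementation c
}

# Hypothetical query command: reports which code is in use.
proc ::xml::whichparser {} {
    variable implementation
    return $implementation
}

puts "XML parser implementation: [::xml::whichparser]"
```

A script that cares about speed could check the result and warn when only the Tcl parser is present.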
The main tDOM mod is to bind the callback commands to a single Tcl
command rather than feed them to Tcl_EvalObj. This is to avoid
object shimmering and other overheads. Jeff Hobbs and I have discussed
this, and the solution is to fix Tcl_EvalObj rather than change
TclExpat. I propose to retain the current approach (Tcl_EvalObj)
in anticipation of the fix to Tcl. However, to satisfy Jochen's
performance requirements we can introduce a SAX interface which
accepts a Tcl command as the callback. For example,

    proc SAXHandler {method args} {
        switch $method {
            startelement {}
            endelement {}
            cdata {}
            ...
        }
    }
    xml::parser myparser -saxcommand SAXHandler
    myparser parse $xml

The disadvantage with this approach is that you can't pass extra
arguments to the callback command as you normally can with eval'd
commands.
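
To make the trade-off concrete, here is a rough sketch in plain Tcl that imitates the two callback styles without any parser. The procs `StartElement` and `SAXHandler`, the `doc1` tag and the hand-written "events" are all invented for illustration; the real callback signatures may differ.

```tcl
# Script-style callback: the callback is a list, so extra leading
# arguments (here "doc1") can be carried along with the command name.
proc StartElement {docId name attlist} {
    lappend ::log "$docId start $name"
}
set scriptCallback [list StartElement doc1]

# Command-style callback: only a command name is stored, so the
# handler sees the method and the event arguments, nothing more.
proc SAXHandler {method args} {
    switch -- $method {
        startelement { lappend ::log "start [lindex $args 0]" }
        endelement   { lappend ::log "end [lindex $args 0]" }
        default      {}
    }
}
set saxCommand SAXHandler

set log {}
# Stand-in for a parser reporting <para> followed by </para>.
uplevel #0 [concat $scriptCallback [list para {}]]
$saxCommand startelement para {}
$saxCommand endelement para

puts [join $log \n]
```

With the command form, any per-parse context has to live somewhere else, for example in a namespace variable or in a handler proc defined per parser.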
tDOM also has a simple, fast parser. I haven't looked at this much
yet, but the idea is good. Expat has a lot of code for handling
character encodings, but Tcl 8.1+ already has that so why slow down
the parser by doing work that Tcl takes care of? This needs further
investigation.
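
As a small illustration of the point about encodings: Tcl 8.1 and later provide the `encoding` command, so a parser working on Tcl strings could leave byte decoding to the core. The sample bytes below are made up for the example.

```tcl
# Bytes as they might arrive from a file read in binary mode:
# "cafe" with an acute accent on the e, encoded in ISO 8859-1
# (the accented letter is the single byte 0xE9).
set raw [binary format a* "caf\xe9"]

# Let Tcl decode the bytes into an ordinary Tcl string.
set text [encoding convertfrom iso8859-1 $raw]
puts [string length $text]

# Re-encode as UTF-8; the accented letter now takes two bytes.
set utf8 [encoding convertto utf-8 $text]
puts [string length $utf8]
```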
TclDOM
======
There are three DOM implementations: TclDOM, tDOM and an internal
Scriptics implementation of TclDOM. The latter two are in C and
hence much faster than TclDOM. However, tDOM presents a markedly
different API to TclDOM. TclDOM's API uses handles for each node
and passes them around a la Tcl channel identifiers, whereas tDOM
takes an OO approach and creates a Tcl command for each DOM node,
a la Tk or TclBlend.
Rather than argue about which approach is better, it is clear that
people want either or both. Let's embrace both approaches and
allow them to work together. Accordingly, my plan is to take
the tDOM source code and add in the TclDOM API to make it compatible.
When (or if) Scriptics make their C-based TclDOM implementation
available we can merge that code in if appropriate.
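
One way the two styles might coexist over a single tree is sketched below with a toy node store. The `::toydom` namespace and its commands are hypothetical and are not the TclDOM or tDOM API; the sketch only shows that a per-node command can be a thin wrapper around a handle.

```tcl
namespace eval ::toydom {
    variable nodes      ;# array: handle -> list {name parentHandle}
    variable counter 0
}

# Handle style: create a node and return an opaque token.
proc ::toydom::create {name {parent {}}} {
    variable nodes
    variable counter
    set handle node[incr counter]
    set nodes($handle) [list $name $parent]
    return $handle
}

# Handle style accessor: the handle is passed as an argument.
proc ::toydom::get {handle field} {
    variable nodes
    switch -- $field {
        name    { return [lindex $nodes($handle) 0] }
        parent  { return [lindex $nodes($handle) 1] }
        default { error "unknown field \"$field\"" }
    }
}

# Command style: wrap an existing handle in a Tcl command of the same name.
proc ::toydom::asCommand {handle} {
    interp alias {} ::$handle {} ::toydom::get $handle
    return ::$handle
}

set root  [::toydom::create book]
set child [::toydom::create chapter $root]

puts [::toydom::get $child name]     ;# handle passed as an argument
set cmd [::toydom::asCommand $child]
puts [$cmd parent]                   ;# node used as a command
```

Both calls read the same underlying record, so code written in either style sees the same tree.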
The 0.4alpha tDOM source code is now checked in to the Zveno CVS
repository. Set your CVSROOT to :pserver:cvs@waycool.zveno.com:/cvsroot,
run 'cvs login' and give the password 'cvs'. You can then do
'cvs checkout tDOM'.
XML Data Type
=============
My dream, which I have expressed here before, is to make XML/DOM
a Tcl data type just like strings and lists. There was some talk
a while ago about new data types for Tcl, in particular trees,
but I haven't seen anything appear along those lines.
An obvious major problem with this is the Tcl object Copy-On-Write
semantics. A DOM object reference would refer to a live tree, and
when a script modifies the tree through that reference, the change
should update the tree itself rather than a private copy.
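
The contrast can be seen in plain Tcl. This sketch uses an ordinary list for value semantics and an array element named by a handle for reference semantics; the `tree` array and handle names are invented for the example.

```tcl
# Value semantics: a Tcl list behaves as if copied on assignment.
set a {x y z}
set b $a          ;# storage is shared until one side is modified
lappend b w       ;# the write works on a copy; a is unaffected
puts "a = $a"
puts "b = $b"

# Reference semantics: a handle names live state kept elsewhere.
array set tree {root {x y z}}
set h1 root
set h2 $h1        ;# copying the handle copies the name, not the tree
lappend tree($h2) w
puts "via h1: $tree($h1)"
```

A DOM stored as a Tcl object's internal representation would need the second behaviour while living inside a mechanism designed for the first.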
As part of the work on tDOM to make it TclDOM compatible I will
write some experimental code to make DOM a Tcl internal object
representation. If anyone has some thoughts on this, either for
or against, I'd appreciate hearing them.
Cheers,
Steve Ball
--
Steve Ball | Swish XML Editor | Training & Seminars
Zveno Pty Ltd | Web Tcl Complete | XML XSL
http://www.zveno.com/ | TclXML TclDOM | Tcl, Web Development
Steve.Ball@zveno.com +-----------------------+---------------------
Ph. +61 2 6242 4099 | Mobile (0413) 594 462 | Fax +61 2 6242 4099
Last modified
1999-11-25