dom(n)
NAME
dom - Create an in-memory DOM tree from XML
SYNOPSIS
package require tdom
dom method ?arg arg ...?
DESCRIPTION
This command creates complete DOM trees in memory. In the usual case, a
string containing XML data is parsed and converted into a DOM tree.
method indicates a specific subcommand.
The valid methods are:
dom parse ?options? ?data?
Parses the XML information and builds up the DOM tree in memory
providing a Tcl object command to this DOM document object.
Example:
dom parse $xml doc
$doc documentElement root
parses the XML in the variable xml, creates the DOM tree in memory,
makes a reference to the document object, visible in Tcl as a
document object command, and assigns this new object name to the
variable doc. When doc gets freed, the DOM tree and the associated
Tcl command objects (document and all node objects) are freed
automatically.
set document [dom parse $xml]
set root [$document documentElement]
parses the XML in the variable xml, creates the DOM tree in memory,
makes a reference to the document object, visible in Tcl as a
document object command, and returns this new object name, which is
then stored in document. To free the underlying DOM tree and the
associated Tcl object commands (document + nodes + fragment nodes),
the document object command has to be explicitly deleted by:
$document delete
or
rename $document ""
The valid options are:
-simple
If -simple is specified, a simple but fast parser is used
(it does not fully conform to the XML recommendation). That
should double parsing and DOM generation speed. The encoding
of the data is not transformed inside the parser. The simple
parser does not respect any encoding information in the
XML declaration. It skips over the internal DTD subset
and ignores any information in it. Therefore it doesn't
include defaulted attribute values in the tree, even if
the corresponding attribute declaration is in the internal
subset. It also doesn't expand internal or external
entity references other than the predefined entities and
character references.
-html If -html is specified, a fast HTML parser is used, which
tries to parse even badly formed HTML into a DOM tree.
-keepEmpties
If -keepEmpties is specified, text nodes that contain
only whitespace will be part of the resulting DOM tree.
By default (-keepEmpties not given), those empty text
nodes are removed at parsing time.
-channel <channel-ID>
If -channel <channel-ID> is specified, the input to be
parsed is read from the specified channel. The encoding
setting of the channel (via fconfigure -encoding) is
respected, i.e. the data read from the channel is
converted to UTF-8 according to the encoding settings before
the data is parsed.
-baseurl <baseURI>
If -baseurl <baseURI> is specified, the baseURI is used
as the base URI of the document. External entities refer‐
enced in the document are resolved relative to this base
URI. This base URI is also stored within the DOM tree.
-feedbackAfter <#bytes>
If -feedbackAfter <#bytes> is specified, the Tcl command
::dom::domParseFeedback is evaluated after parsing every
#bytes. If you use this option, you have to create a Tcl
proc named ::dom::domParseFeedback, otherwise you will
get an error. Please note that the calls of
::dom::domParseFeedback are not made exactly every
#bytes, but always at the first element start after every
#bytes.
-externalentitycommand <script>
If -externalentitycommand <script> is specified, the
specified Tcl script is called to resolve any external
entities of the document. The command actually evaluated
consists of this option's value followed by three arguments:
the base URI, the system identifier of the entity and the
public identifier of the entity. The base URI and the
public identifier may be the empty list. The script has
to return a Tcl list consisting of three elements. The
first element of this list signals how the external
entity is returned to the processor. At the moment, the
two allowed types are "string" and "channel". The second
element of the list has to be the (absolute) base URI of
the external entity to be parsed. The third element of
the list is the data: either the content of the external
entity, already read, as a string in the case of type
"string", or the name of a Tcl channel in the case of
type "channel". Note that if the script returns a Tcl
channel, it will not be closed by the processor. It must
be closed separately when it is no longer required.
-useForeignDTD <boolean>
If <boolean> is true and the document does not have an
external subset, the parser will call the -externalenti‐
tycommand script with empty values for the systemId and
publicID arguments. Please note that, if the document
also doesn't have an internal subset, the -startdoctype‐
declcommand and -enddoctypedeclcommand scripts, if set,
are not called. The -useForeignDTD respects
-paramentityparsing <always|never|notstandalone>
The -paramentityparsing option controls whether the parser
tries to resolve the external entities (including the
external DTD subset) of the document while building the
DOM tree. -paramentityparsing requires an argument, which
must be either "always", "never", or "notstandalone". The
value "always" means that the parser tries to resolve
(recursively) all external entities of the XML source.
This is the default if -paramentityparsing is omitted.
The value "never" means that only the given XML
source is parsed and no external entity (including the
external subset) will be resolved and parsed. The value
"notstandalone" means that all external entities will be
resolved and parsed, with the exception of documents
which explicitly state standalone="yes" in their XML
declaration.
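The following is a minimal sketch of two of the options above, not a definitive recipe. The sample XML strings, the proc name resolveEntity, the system identifier greet.dtd and the base URI file:///example/ are illustrative assumptions. It shows the effect of -keepEmpties on whitespace-only text nodes, and a resolver script that returns the three-element list described for -externalentitycommand:

```tcl
package require tdom

# -keepEmpties: whitespace-only text nodes are dropped unless requested.
set xml "<a>\n  <b/>\n</a>"
set stripped [dom parse $xml]
set kept     [dom parse -keepEmpties $xml]
set strippedCount [llength [[$stripped documentElement] childNodes]]
set keptCount     [llength [[$kept documentElement] childNodes]]
$stripped delete
$kept delete

# -externalentitycommand: hypothetical resolver that ignores the
# identifiers and answers every request with a fixed, in-memory DTD.
# It returns the list {type baseURI data} with type "string".
proc resolveEntity {base systemId publicId} {
    set dtd {<!ENTITY greet "hello">}
    return [list string $base $dtd]
}

set xml2 {<!DOCTYPE r SYSTEM "greet.dtd"><r>&greet;</r>}
set doc [dom parse -baseurl file:///example/ \
             -externalentitycommand resolveEntity $xml2]
set greeting [[$doc documentElement] text]
$doc delete
```

With the default parser, strippedCount counts only the element b, while keptCount also counts the two whitespace text nodes around it. A resolver that returns a channel instead of a string would have to close that channel itself after parsing.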
dom createDocument docElemName ?objVar?
Creates a new DOM document object with one element node with
node name docElemName. The objVar controls the memory handling
as explained above.
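As a small sketch of createDocument without the objVar argument (the element name order and the attribute id are made up for illustration), the document element can be fetched, modified and serialized with the usual domNode methods, and the document then has to be deleted explicitly:

```tcl
package require tdom

# Create a document whose only node is the document element <order>.
set doc  [dom createDocument order]
set root [$doc documentElement]

# Decorate the document element and serialize it without indentation.
$root setAttribute id 42
set xml [$root asXML -indent none]

# No objVar was given, so the tree must be freed by hand.
$doc delete
```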
dom createDocumentNS uri docElemName ?objVar?
Creates a new DOM document object with one element node with
node name docElemName. Uri gives the namespace of the document
element to create. The objVar controls the memory handling as
explained above.
dom createDocumentNode ?objVar?
Creates a new, 'empty' DOM document object without any element
node. objVar controls the memory handling as explained above.
dom setResultEncoding ?encodingName?
If encodingName is not given the current global result encoding
is returned. Otherwise the global result encoding is set to
encodingName. All character data, attribute values, etc. will
then be converted from UTF-8, which is delivered from the Expat
XML parser, to the given 8 bit encoding at XML/DOM parse time.
Valid values for encodingName are: utf-8, ascii, cp1250, cp1251,
cp1252, cp1253, cp1254, cp1255, cp1256, cp437, cp850, en,
iso8859-1, iso8859-2, iso8859-3, iso8859-4, iso8859-5,
iso8859-6, iso8859-7, iso8859-8, iso8859-9, koi8-r.
dom createNodeCmd ?-returnNodeCmd? (element|comment|text|cdata|pi)Node
commandName
This method creates Tcl commands, which in turn create tDOM
nodes. Tcl commands created by this command are only available
inside a script given to the domNode method appendFromScript. If
a command created with createNodeCmd is invoked in any other
context, it will return an error. The created command commandName
replaces any existing command or procedure with that name. If
the commandName includes any namespace qualifiers, it is created
in the specified namespace.
If such a command is invoked inside a script given as argument to
the domNode method appendFromScript, it creates a new node and
appends this node at the end of the child list of the invoking
element node. If the option -returnNodeCmd was given, the command
returns the created node as a Tcl command. If this option was
omitted, the command returns nothing. Each command always creates
the same type of node. Which type of node is created by
the command is determined by the first argument to createNodeCmd.
The syntax of the created command depends on the type of
the node it creates.
If the first argument of the method is elementNode, the created
command will create an element node. The tag name of the created
node is commandName without namespace qualifiers. The syntax of
the created command is:
elementNodeCmd ?attributeName attributeValue ...? ?script?
elementNodeCmd ?-attributeName attributeValue ...? ?script?
elementNodeCmd name_value_list script
The command syntax allows three different ways to specify the
attributes of the resulting element. These can be specified
with attributeName attributeValue argument pairs, in an "option
style" way with -attributeName attributeValue argument pairs
(the '-' character is only syntactic sugar and will be
stripped off) or as a Tcl list with elements interpreted as
attribute name and the corresponding attribute value. The
attribute name elements in the list may have a leading '-'
character, which will be stripped off.
Every elementNodeCmd accepts an optional Tcl script as last
argument. This script is evaluated as recursive appendFromScript
script with the node created by the elementNodeCmd as parent of
all nodes created by the script.
If the first argument of the method is textNode, the command
will create a text node. The syntax of the created command is:
textNodeCmd ?-disableOutputEscaping? data
If the optional flag -disableOutputEscaping is given, the escap‐
ing of the ampersand character (&) and the left angle bracket
(<) inside the data is disabled. You should use this flag care‐
fully.
If the first argument of the method is commentNode or cdataNode,
the command will create a comment node or a CDATA section
node, respectively. The syntax of the created command is:
nodeCmd data
If the first argument of the method is piNode, the command will
create a processing instruction node. The syntax of the created
command is:
piNodeCmd target data
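The pieces above can be sketched as follows. The command names item and t and the element name list are arbitrary choices for this illustration. The sketch creates one element node command and one text node command, then uses them inside an appendFromScript script, once with plain attribute pairs and once in the "option style":

```tcl
package require tdom

# Node-creating commands; they only work inside appendFromScript.
dom createNodeCmd elementNode item
dom createNodeCmd textNode t

set doc  [dom createDocument list]
set root [$doc documentElement]

# Each item command appends an <item> child to the invoking node;
# its trailing script is evaluated with that new <item> as parent.
$root appendFromScript {
    item id 1  { t "first" }
    item -id 2 { t "second" }
}

set xml [$root asXML -indent none]
$doc delete
```

Because createNodeCmd replaces any existing command of the same name, it is usually safer to create such commands in a dedicated namespace than in the global one.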
dom setStoreLineColumn ?boolean?
If switched on, the DOM nodes will contain line and column posi‐
tion information for the original XML document after parsing.
The default is, not to store line and column position informa‐
tion.
dom setNameCheck ?boolean?
If NameCheck is true, every method which expects an XML Name, a
fully qualified name or a processing instruction target will
check whether the given string is valid according to its production
rule. For commands created with the createNodeCmd method to be
used in the context of appendFromScript, the state of the flag
at creation time decides: if NameCheck is true at creation time,
the command will check its arguments, otherwise not. The
setNameCheck method sets this flag. It returns the current
NameCheck flag state. The default state for NameCheck is true.
dom setTextCheck ?boolean?
If TextCheck is true, every command which expects XML Chars, a
comment, a CDATA section value or a processing instruction value
will check whether the given string is valid according to its
production rule. For commands created with the createNodeCmd method
to be used in the context of appendFromScript, the state of the
flag at creation time decides: if TextCheck is true at creation
time, the command will check its arguments, otherwise not. The
setTextCheck method sets this flag. It returns the current
TextCheck flag state. The default state for TextCheck is true.
dom setObjectCommands ?(automatic|token|command)?
Controls whether documents and nodes are created as Tcl commands or
as tokens to be used with the domNode and domDoc commands. If the
mode is 'automatic', methods invoked on Tcl commands will create
Tcl commands and methods invoked on doc or node tokens will create
tokens. If the mode is 'command', Tcl commands will always
be created. If the mode is 'token', tokens will always
be created. The method returns the current mode. This method is
an experimental interface.
dom isName name
Returns 1, if name is a valid XML Name according to production 5
of the XML 1.0 recommendation. This means, that name is a valid
XML element or attribute name. Otherwise it returns 0.
dom isPIName name
Returns 1, if name is a valid XML processing instruction target
according to production 17 of the XML 1.0 recommendation. Other‐
wise it returns 0.
dom isNCName name
Returns 1, if name is a valid NCName according to production 4
of the Namespaces in XML recommendation. Otherwise it
returns 0.
dom isQName name
Returns 1, if name is a valid QName according to production 6 of
the Namespaces in XML recommendation. Otherwise it
returns 0.
dom isCharData string
Returns 1, if every character in string is a valid XML Char
according to production 2 of the XML 1.0 recommendation. Other‐
wise it returns 0.
dom isComment string
Returns 1, if string is a valid comment according to production
15 of the XML 1.0 recommendation. Otherwise it returns 0.
dom isCDATA string
Returns 1, if string is valid according to production 20 of the
XML 1.0 recommendation. Otherwise it returns 0.
dom isPIValue string
Returns 1, if string is valid according to production 16 of the
XML 1.0 recommendation. Otherwise it returns 0.
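A short sketch of how the checking methods might be used to validate strings before building nodes from them. The sample strings and the variable name checks are chosen for illustration only:

```tcl
package require tdom

# Collect the results of a few checks in a dict, keyed by a label.
set checks [dict create]
dict set checks name_ok     [dom isName foo]      ;# plain XML Name
dict set checks name_digit  [dom isName 1foo]     ;# may not start with a digit
dict set checks ncname_col  [dom isNCName a:b]    ;# NCName forbids ':'
dict set checks qname_col   [dom isQName a:b]     ;# prefix:localname
dict set checks comment_dd  [dom isComment a--b]  ;# "--" not allowed in comments
```

Each value is 1 or 0 as described above, so the results can be used directly in if conditions.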
KEYWORDS
XML, DOM, document, node, parsing