Use Case: Getting a Single Element
Consider the following XML given as a string named xmlSource:
<?xml version="1.0"?><SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<SOAP-ENV:Body>
<SendRequestRequest xmlns="urn://some-schema" xmlns:ns0="urn://another-schema">
<SenderProvidedRequestData Id="Ue7e71ce1-7ce3-4ca5-a689-1a8f2edbb1af">
<MessageID>3931cda8-3245-11ec-b0bc-000c293433a0</MessageID>
<ns0:MessagePrimaryContent>
<ExportDebtRequestsRequest>
<!-- ... and so on ... -->
Find the first SenderProvidedRequestData element and get it as a JavaScript object of the following structure:
{
Id: "Ue7e71ce1-7ce3-4ca5-a689-1a8f2edbb1af",
MessageID: "3931cda8-3245-11ec-b0bc-000c293433a0",
MessagePrimaryContent: {
ExportDebtRequestsRequest: {
// and so on
}
}
}
const {XMLReader, XMLNode} = require ('xml-toolkit')
const data = await new XMLReader ({
filterElements : 'SenderProvidedRequestData',
map : XMLNode.toObject ({})
}).process (xmlSource).findFirst ()
Here:
- an XMLReader is created:
  - with the filterElements option, which tells it to ignore everything until the first SenderProvidedRequestData element occurs;
  - and with the map option, which requests the XMLNode.toObject transformation;
- the process method implicitly creates an XMLLexer instance and performs all the necessary piping;
- the findFirst asynchronous method waits for the first result, then frees all the resources used and returns the result.
By default, namespaces are ignored in the resulting object (which is standard practice in XML to JSON mapping).
But they are easy to reveal by using a custom getName for XMLNode.toObject. For example:
map: XMLNode.toObject ({
  getName: (localName, namespaceURI) => `{${namespaceURI}}${localName}`,
  //...
}),
To match an element by namespace as well as by name, supply filterElements in the form of a function mapping XMLNode to Boolean:
filterElements: e =>
e.localName === 'SenderProvidedRequestData' &&
e.namespaceURI === 'urn://some-schema'
Same as above: by specifying all the necessary conditions in the filterElements function. See the XMLNode page for more details on its properties.
Yes, filterElements is necessary in any case.
Without filterElements, the first object emitted by XMLReader will describe the <?xml version="1.0"?> prolog, not an element at all. If the prolog is missing, it will be the root element, but without any inner content (corresponding to the StartElement event instead of EndElement).
To get the root element without mentioning its name, one can use
filterElements: e => e.level === 0
If no matching element is found, the result will be null.
If the element contains only text, the result will be a string representing its text content. For instance, in the example above:
const id = await new XMLReader ({
filterElements : 'MessageID',
map : XMLNode.toObject ({})
}).process (xmlSource).findFirst () // will be '3931cda8-3245-11ec-b0bc-000c293433a0'
If the string is empty (zero length), null will be returned, as if the element were missing.
Using filterElements implicitly turns on the stripSpace option, which causes trimming of all (merged) text nodes.
For example:
<Poem>
Onion juice in the eyes
Is the reason she cries.
</Poem>
will be translated to 'Onion juice in the eyes\nIs the reason she cries.', not '\nOnion juice in the eyes\nIs the reason she cries.\n'.
Note: the inner line feed is guaranteed to be preserved. Usually, this is the desired behavior.
But the application developer is always free to explicitly set
stripSpace: false,
filterElements: // ...
and to process all the characters read from the XML source on their own.
XMLNode.toObject cannot properly handle mixed content, that is, text interleaved with child elements (typical of documentation and other loosely structured, not database-bound, XML).
Without a map option set, the result will be an XMLNode instance. All the parsed content is available from there. See the corresponding docs for more details on the API.
To find a small piece of a huge XML, one can specify a readable stream instead of a string as xmlSource. The process will stop right after finding the element, and all the resources will be released immediately.
If the desired element is missing, the source will be scanned completely, but with only a minimal memory buffer in use.
But when the root element is required, XMLReader has no other option than to build the complete document tree.