February 18, 2004

Streaming XInclude and Intra-document References

It's definitely love-to-streaming-strikes-back day today. Here is another example of how the streaming approach to XML processing fails. The only XInclude feature still not implemented in the XInclude.NET project is intra-document references. And basically I have no idea how to implement it in the .NET pull environment (as well as Elliotte Rusty Harold has ...

XInclude allows the following constructs:

<root>
   <element id="bar"/>
   <xi:include xpointer="bar"/>
</root>
After XInclude processing, the above XML should resolve to:
<root>
   <element id="bar"/>
   <element id="bar"/>
</root>
This is called an intra-document reference. An <xi:include> instruction with no href attribute refers to the document currently being processed. That opens a Pandora's box of implications that basically prevents streaming XInclude processing altogether, as one obviously can't navigate arbitrarily over an XML stream, neither with XmlReader nor with SAX. "bar" as an XPointer is a shorthand pointer, pointing to the element whose ID is "bar". (Btw, XInclude processing is recursive, so in the same way it may point to another <xi:include> element in the same document, causing the same <xi:include> instruction to be processed twice.)

As the core class of XInclude.NET, XIncludingReader, is just an XmlReader, how on earth can I go backward in a forward-only XmlReader??? It seems that to implement this feature I have to cache the whole source XML document. Too bad.
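To make the "cache the whole document" option concrete, here is a minimal sketch in Python (not XInclude.NET code). It assumes the XInclude namespace from the final W3C Recommendation, treats any attribute literally named "id" as an ID (no DTD or schema is consulted), and does not reprocess included content recursively. The function name resolve_intra_document is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

# Namespace of the final XInclude Recommendation (an assumption for this sketch).
XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"

def resolve_intra_document(xml_text):
    """Replace href-less xi:include elements that carry a shorthand pointer."""
    root = ET.fromstring(xml_text)  # the whole document is cached in memory
    # Assumption: every attribute named "id" is an ID; no DTD/schema is used.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    for parent in list(root.iter()):            # snapshot, since we edit the tree
        for index, child in enumerate(list(parent)):
            if child.tag != XI_INCLUDE or child.get("href") is not None:
                continue
            target = by_id.get(child.get("xpointer"))
            if target is None:
                continue                        # unresolved pointer: left untouched
            clone = copy.deepcopy(target)
            clone.tail = child.tail             # keep the surrounding whitespace
            parent[index] = clone               # swap the include for the copy
    return root
```

Because the ID lookup table is built from the complete tree, the pointer can refer to an element before or after the <xi:include>, which is exactly what a forward-only reader cannot offer.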

ForwardXPathNavigator vs XSE: a class vs API

Meanwhile I managed to create a simple dummy online demo of ForwardXPathNavigator (the XPathNavigator implementation over XmlReader) I was talking about. Here it is. ...

It lets you test what ForwardXPathNavigator can and cannot select. Upload the XML document you'd like to test (please don't abuse it by loading huge ones), then enter an XPath expression and click the "RunQuery" button. I know it looks bad in Mozilla, but I have no idea how to insert a transformation result into an HTML page so that it gets styled in Mozilla too. There are lots of issues, such as namespace declarations not being shown etc., but come on, that's not an online XPath tutorial, just a simple demo.

Talking about the difference between XSE and ForwardXPathNavigator, Daniel writes:

Back to the issue, there's a fundamental difference in the approach between his class and my XSE API: his will consume the stream with a single query. Mine supports multiple handlers matching multiple elements at the same time. And it's still a pull-based API, where you have to iterate results, instead of being called when something you care happened (was matched).
Well, ForwardXPathNavigator wasn't designed to be compared with XSE! It's a simple poor man's (XmlReader) XPathNavigator. But as an XPathNavigator it allows you not only to evaluate XPath queries, but also to navigate node by node over XML. I was planning to build an XSE-like system based on ForwardXPathNavigator. Actually I must admit I didn't get much further than a proof of concept and don't have code to publish yet (in the face of the brilliant XSE impl :). The idea behind XmlUpdater/XPathFilter was the following: just navigate over the XML using ForwardXPathNavigator and check whether each node matches any of the registered XPath patterns. On each matched node, call the callback method associated with the pattern, providing it with enough context to do what it wants: to skip the node (transparency), to modify it, etc.
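The "walk once, match patterns, fire callbacks" idea can be sketched roughly like this in Python. This is not the XmlUpdater/XPathFilter prototype: the function name run_handlers is hypothetical, and the simplified absolute paths such as "/root/item" only stand in for real XPath patterns.

```python
import io
import xml.etree.ElementTree as ET

def run_handlers(xml_text, handlers):
    """Walk the document once and call handlers[path](element) on each match.

    `handlers` maps simplified absolute paths (a hypothetical stand-in for
    XPath patterns) to callbacks taking the matched element.
    """
    path = []  # names of the elements that are currently open
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            path.append(elem.tag)
            continue
        # "end" event: the element is complete, so check it against the patterns.
        callback = handlers.get("/" + "/".join(path))
        if callback is not None:
            callback(elem)
        path.pop()
        # Release the element's content; a handler on an ancestor will therefore
        # see its descendants already emptied.
        elem.clear()
```

Several patterns can be registered at once, and each costs one dictionary lookup per element, which fits the observation that matching itself is cheap. The price is the same as with any forward-only design: a handler only ever sees what has already been read.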

I found pattern matching to be a cheap enough operation and the whole prototype quite satisfying. What I dislike is the overly fragile nature of ForwardXPathNavigator. It's forward-only, so XPath patterns and the whole application must be defined very carefully with the forward-only concept in mind, which is not the usual way of working with XPath, right? Funny thing: ForwardXPathNavigator may move irrevocably when you are just inspecting its properties in the debugger! The Count property of XPathNodeIterator obviously becomes unusable too. To put it another way, it's too hard to work with this stuff. And the benefits are not so striking, by the way. Maybe that's my bad design, dunno...
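The Count problem is easy to reproduce by analogy with any forward-only sequence. The Python sketch below is only an analogy, not the .NET XPathNodeIterator API; a plain iterator stands in for the reader, and matching_nodes is a hypothetical helper.

```python
def matching_nodes(reader, wanted):
    """Yield nodes named `wanted` from a forward-only source (hypothetical helper)."""
    for name in reader:
        if name == wanted:
            yield name

# A plain iterator stands in for a forward-only reader over element names.
reader = iter(["root", "item", "other", "item"])
matches = matching_nodes(reader, "item")

count = sum(1 for _ in matches)  # counting has to walk every match
leftover = list(matches)         # nothing is left to iterate afterwards
```

Once the count is known, the underlying source has been read to the end, so the matches themselves can no longer be visited.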

XSE idea

Here Daniel clarifies things about XSE: XSE is not about querying with an specific expression language/format (i.e. XPath or SXPath). XSE is just a mechanism for encapsulating state machines checking for matches against a given expression. What the expression looks like depends on the factory that creates the strategy ...

The Man's patenting XML?

Looks like Microsoft's patenting its XML investments. Recently we had a hubbub about the patenting of Office 2003 schemas, then XML scripting. Daniel, like many others, feels alarmed; you too? Well, I'm not. Patenting software ideas is a stupid thing, but that's a matter of the imperfect reality we live in. Everything is patented ...

New XQuery book

Michael Brundage's excellent XQuery reference book is finally available. [Via Michael Rys] Dr. Rys is talking about the just-published (February 2004) book "XQuery: The XML Query Language". Michael Brundage is Technical Lead for XQuery processing at Microsoft, and the recommendations are so weighty... I feel I want this book ...