XInclude allows the following constructs:
    <root>
      <element id="bar"/>
      <xi:include xpointer="bar"/>
    </root>

After XInclude processing, the above XML should resolve to

    <root>
      <element id="bar"/>
      <element id="bar"/>
    </root>

This is called an intra-document reference. An <xi:include> instruction with no href attribute refers to the document currently being processed. That opens a Pandora's box of implications that basically prevents streaming XInclude processing altogether, since one obviously can't navigate arbitrarily over an XML stream, neither with XmlReader nor with SAX. "bar" as an XPointer is a shorthand pointer, pointing to the element whose ID is "bar". (Btw, XInclude processing is recursive, so in the same way it may point to another <xi:include> element in the same document, causing the same <xi:include> instruction to be processed twice.)
Since the core class of XInclude.NET, XIncludingReader, is just an XmlReader, how on earth can I go backward in a forward-only XmlReader? It seems that to implement this feature I have to cache the whole source XML document. Too bad.
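To make the caching point concrete, here is a minimal sketch of resolving an href-less include against a fully loaded document. It is not how XIncludingReader works; the class name IntraDocumentInclude is hypothetical, only shorthand pointers are handled, any attribute literally named "id" is assumed to be the ID (there is no DTD to say otherwise), and a single non-recursive pass is made.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class IntraDocumentInclude
{
    const string XIncludeNs = "http://www.w3.org/2001/XInclude";

    public static string Resolve(string xml)
    {
        // The whole document is cached in memory; this is exactly the cost
        // a forward-only reader would like to avoid.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNamespaceManager nsm = new XmlNamespaceManager(doc.NameTable);
        nsm.AddNamespace("xi", XIncludeNs);

        // Snapshot the include elements first, because the tree is modified below.
        List<XmlElement> includes = new List<XmlElement>();
        foreach (XmlNode n in doc.SelectNodes("//xi:include[not(@href)][@xpointer]", nsm))
            includes.Add((XmlElement)n);

        foreach (XmlElement include in includes)
        {
            // Assumption: the xpointer value is a shorthand pointer (a bare name).
            string id = include.GetAttribute("xpointer");
            XmlNode target = doc.SelectSingleNode("//*[@id='" + id + "']");
            if (target == null)
                throw new XmlException("No element with id '" + id + "'");

            // Replace the include instruction with a deep copy of the target.
            include.ParentNode.ReplaceChild(target.CloneNode(true), include);
        }
        return doc.OuterXml;
    }
}
```

The target may appear before or after the include instruction in document order, which is why random access to the whole tree is needed here.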
It lets you test what ForwardXPathNavigator can and cannot select. Upload the XML document you'd like to test (please don't abuse it by loading huge ones), then enter an XPath expression and click the "RunQuery" button. I know it looks bad in Mozilla, but I have no idea how to insert the transformation result into an HTML page so that it gets styled in Mozilla too. There are lots of issues, e.g. namespace declarations aren't shown; come on, it's not an online XPath tutorial, just a simple demo.
Talking about the difference between XSE and ForwardXPathNavigator, Daniel writes:

"Back to the issue, there's a fundamental difference in the approach between his class and my XSE API: his will consume the stream with a single query. Mine supports multiple handlers matching multiple elements at the same time. And it's still a pull-based API, where you have to iterate results, instead of being called when something you care happened (was matched)."

Well, ForwardXPathNavigator wasn't designed to be compared with XSE! It's a simple poor man's XPathNavigator over XmlReader. But being an XPathNavigator, it allows not only evaluating XPath queries, but also navigating node by node over XML. I was planning to build an XSE-like system based on ForwardXPathNavigator. Actually, I must admit I didn't get far beyond proving the concept and don't have code to publish yet (in the face of the brilliant XSE impl :). The idea behind XmlUpdater/XPathFilter was the following: just navigate over the XML using ForwardXPathNavigator and check whether each node matches any of the registered XPath patterns. On each matched node, call the callback method associated with the pattern, providing it with enough context to do what it wants: skip the node (transparency), modify it, etc.
I found pattern matching to be a cheap enough operation and the whole prototype quite satisfying. What I dislike is the overly fragile nature of ForwardXPathNavigator. It's forward-only, so XPath patterns and the whole application must be defined very carefully with the forward-only concept in mind, which is not the usual way of working with XPath, right? Funny thing: ForwardXPathNavigator may move irrevocably when you're just inspecting its properties in the debugger! The Count property of XPathNodeIterator obviously becomes unusable too. To put it another way, it's too hard to work with this stuff. And the benefits are not so striking, by the way. Maybe that's my bad design, dunno...
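The node-by-node matching idea described above can be sketched with the standard System.Xml.XPath classes. This is only an illustration of the dispatch loop, not the XPathFilter prototype: PatternDispatcher is a hypothetical name, and an in-memory XPathDocument stands in for ForwardXPathNavigator, so the sketch is not streaming.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

// Hypothetical class, for illustration only.
public class PatternDispatcher
{
    private readonly List<XPathExpression> patterns = new List<XPathExpression>();
    private readonly List<Action<XPathNavigator>> callbacks = new List<Action<XPathNavigator>>();

    // Registers a callback for nodes matching the given XPath pattern.
    public void AddHandler(string pattern, Action<XPathNavigator> callback)
    {
        patterns.Add(XPathExpression.Compile(pattern)); // compile once, match many times
        callbacks.Add(callback);
    }

    public void Run(TextReader xml)
    {
        // Assumption: the document fits in memory; a forward-only navigator
        // would replace this XPathDocument in a streaming version.
        XPathNavigator root = new XPathDocument(xml).CreateNavigator();

        // Visit every element in document order.
        XPathNodeIterator it = root.SelectDescendants(XPathNodeType.Element, false);
        while (it.MoveNext())
        {
            for (int i = 0; i < patterns.Count; i++)
            {
                // Matches() tests the current node against the pattern.
                if (it.Current.Matches(patterns[i]))
                    callbacks[i](it.Current.Clone()); // clone so callbacks can't move the iterator
            }
        }
    }
}
```

A handler is registered with something like AddHandler("/tstmt/book/chapter/chtitle", nav => Console.WriteLine(nav.Value)). Several handlers can be active in one pass, since each node is tested against every registered pattern.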
So it's a forward-only XPath subset, and BizTalk's XPathReader isn't hidden. Nice to hear.
I wonder who this guy is. He's definitely an expert in the area. Why doesn't he blog? I'm looking forward to seeing the article; what a pity the XML dev center is postponed.

"When the article describing the XPathReader is done it will provide source and if there is interest I'll create a GotDotNet Workspace for the project although it is unlikely I nor the dev who originally wrote the code will have time to maintain it."

I'm volunteering here. I think it's an important option to have in XML processing under .NET.
Meanwhile Daniel has released the XSE stuff at last (btw, I'm musing whether I should adopt a hype-before-release strategy? :). Really interesting. But I still believe XPath (a forward-only subset, of course) is the way to go.
Anyway, here is the ForwardXPathNavigator I was talking about - ForwardXPathNavigator.zip. It's written by my buddy dev Vladimir Nesterovsky. And here are some basic samples.
Selecting feed titles from the RSSBandit feed list (pure forward-only selection):
    // Requires: using System; using System.Xml; using System.Xml.XPath;
    XmlReader r = new XmlTextReader("feedlist.xml");
    ForwardXPathNavigator nav = new ForwardXPathNavigator(r);
    XmlNamespaceManager nsm = new XmlNamespaceManager(nav.NameTable);
    nsm.AddNamespace("r", "http://www.25hoursaday.com/2003/RSSBandit/feeds/");
    // Compile the query and bind the "r" prefix to it
    XPathExpression expr = nav.Compile("/r:feeds/r:feed/r:title");
    expr.SetContext(nsm);
    XPathNodeIterator ni = nav.Select(expr);
    while (ni.MoveNext())
    {
        Console.WriteLine(ni.Current.Value);
    }

Obviously, ForwardXPathNavigator doesn't allow you to peek at nodes ahead of or behind the current one. All it stores is the current node the XmlReader is positioned at and some details about its direct ancestors. As Dare pointed out, expressions such as /r:feeds/r:feed[count(r:stories-recently-viewed)>10]/r:title are not supported, because they cannot be evaluated in a forward-only manner. That wasn't ForwardXPathNavigator's goal anyway. In fact, such a query can be done in a forward-only way to some extent, but not without help from the host environment. E.g. to select the most viewed feeds, one can select each feed, store its title, then calculate count(r:stories-recently-viewed/r:story) and determine whether the feed is popular enough to be selected:
    XmlReader r = new XmlTextReader("feedlist.xml");
    ForwardXPathNavigator nav = new ForwardXPathNavigator(r);
    XmlNamespaceManager nsm = new XmlNamespaceManager(nav.NameTable);
    nsm.AddNamespace("r", "http://www.25hoursaday.com/2003/RSSBandit/feeds/");
    XPathExpression expr = nav.Compile("/r:feeds/r:feed");
    expr.SetContext(nsm);
    XPathExpression countExpr = nav.Compile("count(r:stories-recently-viewed/r:story)");
    countExpr.SetContext(nsm);
    XPathExpression titleExpr = nav.Compile("string(r:title)");
    titleExpr.SetContext(nsm);
    XPathNodeIterator ni = nav.Select(expr);
    while (ni.MoveNext())
    {
        // Store the title first, then count the stories that follow it
        string title = ni.Current.Evaluate(titleExpr) as string;
        if ((double)ni.Current.Evaluate(countExpr) > 20)
            Console.WriteLine(title);
    }

Not so elegant (mostly because of the lack of an XPathNavigator.Select(string, XmlNamespaceManager) method), but still feasible. Btw, introducing some extension function that could control ForwardXPathNavigator's cache would be quite interesting. Something like /r:feeds/r:feed[ext:store(r:title)][count(r:stories-recently-viewed)>10]/r:title. It's a pity XPath doesn't allow creating variables...
As I said, ForwardXPathNavigator keeps some track of ancestor nodes (name, attributes, etc.), thus enabling some limited backward selections, such as /r:feeds/r:feed[r:title='The XML Files']/@category! I'm going to provide a small aspx page where ForwardXPathNavigator can be tested online by anyone interested.
Tomorrow I'll go on spinning up the topic by presenting XmlUpdater (which is based on ForwardXPathNavigator), a SAX-filter-like approach to modifying XML on the fly.
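The ancestor tracking mentioned above is what makes such a "backward" attribute selection possible in a single pass. Here is a rough sketch of the idea over a plain XmlReader. It is not ForwardXPathNavigator's actual code: the AncestorTracking and Frame names are hypothetical, namespaces are ignored (only local names are compared), and each title is assumed to arrive as a single text node.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class AncestorTracking
{
    // What is remembered about one open element: its name and attributes.
    private class Frame
    {
        public string LocalName;
        public Dictionary<string, string> Attributes = new Dictionary<string, string>();
    }

    // Returns the "category" attribute of every <feed> whose <title> child
    // has the given text, reading the input strictly forward.
    public static List<string> CategoriesOf(TextReader input, string wantedTitle)
    {
        List<string> result = new List<string>();
        List<Frame> open = new List<Frame>(); // open elements, innermost last

        using (XmlReader r = XmlReader.Create(input))
        {
            while (r.Read())
            {
                if (r.NodeType == XmlNodeType.Element)
                {
                    // Read IsEmptyElement before moving onto the attributes.
                    bool empty = r.IsEmptyElement;
                    Frame f = new Frame();
                    f.LocalName = r.LocalName;
                    while (r.MoveToNextAttribute())
                        f.Attributes[r.Name] = r.Value;
                    if (!empty)
                        open.Add(f); // empty elements have no EndElement node
                }
                else if (r.NodeType == XmlNodeType.EndElement)
                {
                    open.RemoveAt(open.Count - 1);
                }
                else if (r.NodeType == XmlNodeType.Text && open.Count >= 2)
                {
                    Frame current = open[open.Count - 1];
                    Frame parent = open[open.Count - 2];
                    string category;
                    // The feed's start tag is long gone, but its attributes were kept.
                    if (current.LocalName == "title" && parent.LocalName == "feed"
                        && r.Value == wantedTitle
                        && parent.Attributes.TryGetValue("category", out category))
                    {
                        result.Add(category);
                    }
                }
            }
        }
        return result;
    }
}
```

Only the chain of currently open elements is kept, so memory use grows with document depth rather than document size.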
Maybe I'm mistaken, but anyway, here is the idea: "ForwardOnlyXPathNavigator" is an XPathNavigator implementation over XmlReader, which obviously supports only a forward-only XPath subset. My fellow developer wrote one, so maybe we should publish it anyway. With such a navigator it's easy to write a class (I called it XPathFilter) that allows registering callbacks for specific nodes, identified by XPath patterns. XPathFilter traverses the XML document, moving ForwardOnlyXPathNavigator in document order, and on each node matching any registered pattern it calls the callback method. In the callback it's possible to skip or modify the matched node, just like in an ordinary SAX filter. I've implemented an XmlUpdater class based on this technique, and it has proven to be effective at modifying huge XML documents on the fly. For instance, here is how I can change an element into an attribute:
    FileStream output = File.Create("ot2.xml");
    XmlUpdater updater = new XmlUpdater(File.OpenRead("otbig.xml"), output);
    updater.AddHandler("/tstmt/book/chapter/chtitle", new NodeMatchedEventHandler(MyHandler));
    updater.Start();
    ...
    public static void MyHandler(XmlUpdater xu, XPathNavigator nav, XmlWriter w)
    {
        w.WriteAttributeString("title", nav.Value);
    }
And after I had played enough with that stuff and implemented it, I discovered that the BizTalk 2004 Beta classes contain a much better implementation of the same functionality in such gems as XPathReader, XmlTranslatorStream, XmlValidatingStream and XPathMutatorStream. They're amazing classes that enable streaming XML processing in a much richer way than the trivial XmlReader stack does. I only wonder why they are not in System.Xml v2? Are there any reasons why they are still hidden deep inside BizTalk 2004? Probably I have to evangelize them a bit, as I really like this idea.
Anyway, back to XSEReader. What I like about this approach is that it's a streaming, event-based one (do I still miss SAX?). What I dislike are the proprietary XPath-like patterns like ":*" (why not *.* ?), "^kzu:*", and XPath-like sugar like RootedPath(), RelativePath(), etc. I think XPath is the way to go; no need to reinvent the wheel. Anyway, let's wait until Daniel unveils all the API and impl details.