RE: Streaming XPath and ForwardXPathNavigator


OK, Dare clarified things a great deal in his "Combining XPath-based Filtering with Pull-based XML Parsing" post:

Actually Oleg is closer and yet farther from the truth than he realizes. Although I wrote about a hypothetical ForwardOnlyXPathNavigator in my article entitled Can One Size Fit All? for XML Journal my planned article which should show up when the MSDN XML Developer Center launches in a month or so won't be using it. Instead it will be based on an XPathReader that is very similar to the one used in BizTalk 2004, in fact it was written by the same guy. The XPathReader works similarly to Daniel Cazzulino's XseReader but uses the XPath subset described in Arpan Desai's Introduction to Sequential XPath paper instead of adding proprietary extensions to XPath as Daniel's does.

So it's a forward-only XPath subset, and BizTalk's XPathReader isn't hidden. Nice to hear.
I wonder who this guy is. He's definitely an expert in the area. Why doesn't he blog? I'm looking forward to seeing the article; what a pity the XML Developer Center is postponed.

When the article describing the XPathReader is done it will provide source and if there is interest I'll create a GotDotNet Workspace for the project although it is unlikely I nor the dev who originally wrote the code will have time to maintain it.

I'm volunteering here. I think it's an important option to have for XML processing under .NET.

Meanwhile Daniel has released his XSE stuff at last (btw, I'm wondering whether I should adopt the hype-before-release strategy too :). Really interesting. But I still believe XPath (the forward-only subset, of course) is the way to go.

Anyway, here is the ForwardXPathNavigator I was talking about: ForwardXPathNavigator.zip. It was written by my buddy, developer Vladimir Nesterovsky. And here are some basic samples.

Selecting feed titles from an RSS Bandit feed list (pure forward-only selection):

// Wrap a plain forward-only reader in the navigator
XmlReader r = new XmlTextReader("feedlist.xml");
ForwardXPathNavigator nav = new ForwardXPathNavigator(r);
// Bind the "r" prefix used in the expression
XmlNamespaceManager nsm = new XmlNamespaceManager(nav.NameTable);
nsm.AddNamespace("r", 
    "http://www.25hoursaday.com/2003/RSSBandit/feeds/");
XPathExpression expr = 
    nav.Compile("/r:feeds/r:feed/r:title");
expr.SetContext(nsm);
// Titles are reported as the reader moves forward through the file
XPathNodeIterator ni = nav.Select(expr);
while (ni.MoveNext()) {
    Console.WriteLine(ni.Current.Value);
}

Obviously ForwardXPathNavigator doesn't let you peek at nodes ahead of or behind the current one. All it stores is the current node the XmlReader is positioned at, plus some details about its direct ancestors. As Dare pointed out, expressions such as /r:feeds/r:feed[count(r:stories-recently-viewed)>10]/r:title are not supported, because they cannot be evaluated in a forward-only manner: the predicate can only be decided after reading the feed's children, by which point the reader may already be past r:title. That wasn't ForwardXPathNavigator's goal anyway.

Such a query can in fact be done in a forward-only way to some extent, but not without help from the host environment. E.g. to select the most viewed feeds, one can select each feed, store its title, then calculate count(r:stories-recently-viewed/r:story) and decide whether the feed is popular enough to be selected:

XmlReader r = new XmlTextReader("feedlist.xml");
ForwardXPathNavigator nav = new ForwardXPathNavigator(r);
XmlNamespaceManager nsm = new 
    XmlNamespaceManager(nav.NameTable);
nsm.AddNamespace("r", 
    "http://www.25hoursaday.com/2003/RSSBandit/feeds/");
// Selects each feed in turn
XPathExpression expr = 
    nav.Compile("/r:feeds/r:feed");
expr.SetContext(nsm);
// Evaluated relative to the current feed
XPathExpression countExpr = 
    nav.Compile("count(r:stories-recently-viewed/r:story)");
countExpr.SetContext(nsm);
XPathExpression titleExpr = 
    nav.Compile("string(r:title)");
titleExpr.SetContext(nsm);
XPathNodeIterator ni = nav.Select(expr);
while (ni.MoveNext()) {
    // Store the title first; the host code, not XPath, remembers it
    string title = ni.Current.Evaluate(titleExpr) as string;
    // Then count the viewed stories and decide whether to report the feed
    if ((double)ni.Current.Evaluate(countExpr) > 20)
        Console.WriteLine(title);
}

Not so elegant (mostly because of the lack of an XPathNavigator.Select(string, XmlNamespaceManager) method), but still feasible. Btw, introducing an extension function that could control ForwardXPathNavigator's cache would be quite interesting. Something like /r:feeds/r:feed[ext:store(r:title)][count(r:stories-recently-viewed)>10]/r:title. It's a pity XPath doesn't allow creating variables...

As I said, ForwardXPathNavigator keeps some track of ancestor nodes (name, attributes, etc.), thus enabling some limited backward selections, such as /r:feeds/r:feed[r:title='The XML Files']/@category! I'm going to provide a small aspx page where ForwardXPathNavigator can be tested online by anyone interested.
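
To make the ancestor-tracking idea concrete, here is a rough sketch of the same query done by hand with a plain XmlReader. This is not how ForwardXPathNavigator is implemented; it only illustrates why remembering the names and attributes of open ancestor elements is enough to answer /r:feeds/r:feed[r:title='The XML Files']/@category without going backward. The class name AncestorStackSketch, the Frame helper and the inline sample document are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class AncestorStackSketch
{
    const string FeedsNs = "http://www.25hoursaday.com/2003/RSSBandit/feeds/";

    // What we remember about each open element (hypothetical helper type).
    sealed class Frame
    {
        public string LocalName;
        public string Namespace;
        public bool Matched; // category already reported for this element
        // Keyed by qualified attribute name, a simplification for this sketch.
        public readonly Dictionary<string, string> Attributes = new Dictionary<string, string>();
        public readonly StringBuilder Text = new StringBuilder();
    }

    static bool Is(Frame f, string localName)
    {
        return f.LocalName == localName && f.Namespace == FeedsNs;
    }

    // Returns the category of every r:feed that has an r:title equal to wantedTitle.
    public static List<string> CategoriesOf(TextReader xml, string wantedTitle)
    {
        var result = new List<string>();
        var open = new List<Frame>(); // open[0] is the document element
        using (XmlReader reader = XmlReader.Create(xml))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Must be read before the reader moves onto the attributes.
                        bool isEmpty = reader.IsEmptyElement;
                        var frame = new Frame { LocalName = reader.LocalName, Namespace = reader.NamespaceURI };
                        while (reader.MoveToNextAttribute())
                            frame.Attributes[reader.Name] = reader.Value;
                        // Empty elements produce no EndElement, so they are never pushed.
                        if (!isEmpty) open.Add(frame);
                        break;
                    case XmlNodeType.Text:
                    case XmlNodeType.CDATA:
                        if (open.Count > 0) open[open.Count - 1].Text.Append(reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        // At </r:title> directly under /r:feeds/r:feed the parent's
                        // attributes are still on the stack, so @category is available.
                        if (open.Count == 3 && Is(open[0], "feeds") && Is(open[1], "feed")
                            && Is(open[2], "title") && !open[1].Matched
                            && open[2].Text.ToString() == wantedTitle)
                        {
                            string category;
                            if (open[1].Attributes.TryGetValue("category", out category))
                            {
                                result.Add(category);
                                open[1].Matched = true;
                            }
                        }
                        open.RemoveAt(open.Count - 1);
                        break;
                }
            }
        }
        return result;
    }

    public static void Main()
    {
        // Made-up sample data, not a real RSS Bandit feed list.
        string xml =
            "<feeds xmlns='http://www.25hoursaday.com/2003/RSSBandit/feeds/'>" +
            "<feed category='XML'><title>The XML Files</title></feed>" +
            "<feed category='Misc'><title>Other</title></feed>" +
            "</feeds>";
        foreach (string c in CategoriesOf(new StringReader(xml), "The XML Files"))
            Console.WriteLine(c);
    }
}
```

The memory used is proportional to the depth of the document rather than its size, which is the whole point of the forward-only approach. The cost is that anything not kept on the ancestor stack is gone once the reader has passed it.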

Tomorrow I'll keep the topic going by presenting XmlUpdater (which is based on ForwardXPathNavigator), a SAX-filter-like approach to modifying XML on the fly.
