February 22, 2005

On XmlBookmarkReader

Helena Kupkova, previously known for FastXML (which she claimed was 5x faster than MSXML), now works at Microsoft on XmlReader and is behind the amazing "Microsoft XML Diff and Patch 1.0" tool. She has published an article at the MSDN XML Dev Center called "XML Reader with Bookmarks". ...

In the article Helena discusses XmlBookmarkReader, which is an XmlReader implementation that lets you set a bookmark at an XML node, read on, and then rewind the reader back to the bookmarked node if you wish. That's really cool. It's implemented by caching all XML nodes after the bookmarked one along with their context (such as Depth, attributes, namespaces, etc.). If you think for a moment about how XmlReader works, you realize that it can be modeled as traversing a non-circular singly linked list of nodes. The nodes in that list are in document order (except for attributes and namespaces), which is usually called preorder tree traversal in non-XML circles. So at any moment you can start recording the nodes XmlReader reads into a linked list, and then come back and replay them by reading from that list instead of from the source XML.
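
Here's a minimal sketch of that record-and-replay idea, assuming a single bookmark and a plain sequence of items standing in for XML nodes. ReplayCursor and its members are names I made up for illustration; Helena's XmlBookmarkReader is a full XmlReader subclass that also caches each node's context, and its API may well differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; this is NOT the article's XmlBookmarkReader.
// It wraps any forward-only sequence and supports a single bookmark.
public sealed class ReplayCursor<T>
{
    private readonly IEnumerator<T> source;          // the forward-only input
    private readonly List<T> cache = new List<T>();  // bookmarked item plus everything read after it
    private int pos = -1;        // index of Current inside cache, or -1 when nothing is cached
    private bool bookmarked;     // true while newly read items must be recorded
    private bool hasCurrent;     // true once MoveNext has succeeded at least once

    public ReplayCursor(IEnumerable<T> items)
    {
        source = items.GetEnumerator();
    }

    public T Current { get; private set; }

    public bool MoveNext()
    {
        // Replay phase: the next item is already in the cache.
        if (pos >= 0 && pos + 1 < cache.Count)
        {
            Current = cache[++pos];
            return true;
        }

        // Live phase: pull the next item from the underlying source.
        if (!source.MoveNext()) return false;
        Current = source.Current;
        hasCurrent = true;

        if (bookmarked)
        {
            cache.Add(Current);      // record it for a later rewind
            pos = cache.Count - 1;
        }
        else
        {
            cache.Clear();           // nothing can rewind to here, so drop the cache
            pos = -1;
        }
        return true;
    }

    public void SetBookmark()
    {
        if (!hasCurrent) throw new InvalidOperationException("Call MoveNext first.");
        if (pos < 0) { cache.Clear(); cache.Add(Current); }  // live: start a fresh recording
        else cache.RemoveRange(0, pos);                      // cached: keep Current and what follows
        pos = 0;
        bookmarked = true;
    }

    public void ReturnToBookmark()
    {
        if (!bookmarked) throw new InvalidOperationException("No bookmark is set.");
        pos = 0;
        Current = cache[0];
        bookmarked = false;          // one-shot: the bookmark is removed on return
    }
}

public static class Program
{
    public static void Main()
    {
        var cursor = new ReplayCursor<string>(new[] { "books", "book", "title", "price" });
        cursor.MoveNext();                  // books
        cursor.MoveNext();                  // book
        cursor.SetBookmark();               // remember the position at "book"
        cursor.MoveNext();                  // title, recorded
        cursor.MoveNext();                  // price, recorded
        cursor.ReturnToBookmark();
        Console.WriteLine(cursor.Current);  // prints "book"
        cursor.MoveNext();
        Console.WriteLine(cursor.Current);  // prints "title", replayed from the cache
    }
}
```

I used a List here rather than a linked list just to keep the sketch short; the point is only that once a bookmark is set, every node read gets appended to the cache, and after a rewind the cursor drains the cache before touching the source again.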

As a demonstration, Helena shows an example of XML filtering that requires look-ahead, such as selecting "/books/book[contains(title, 'XML')]". By the time a forward-only reader gets to the title, the book start tag has already been consumed, so obviously this can't be done with XmlTextReader, nor with XPathReader, but it is easily done with XmlBookmarkReader. As a matter of fact, XmlBookmarkReader is the feature XPathReader really needs. We can leverage XmlBookmarkReader when evaluating predicates in XPathReader, so that we can get back to the context node once we are done with a predicate. Then XPathReader will finally be able to work with the notorious "book[contains(title, 'XML')]". That's the way to go. With "look ahead" and "look back through ancestors" features, XPathReader can finally be really useful. Btw, the XPathReader workspace is open to everybody interested in participating. I just found out I'm an admin there :)

PS. There is a small typo in XmlBookmarkReader.cs - the line

if ( bookmarks.Count > 0 != null ) {
should probably be
if ( bookmarks != null ) {