December 22, 2004

XQuery in .NET story isn't over yet

Btw, talking with .NET developers recently (XML geeks and non-geeks) about XQuery and XSLT support in .NET 2.0, I realized a shocking fact: about 80% of the devs I talked to still have no idea that XQuery support in .NET 2.0 was cut. They were listening all the road to ...

Another Microsoft XML blogger

More good news - Dave Remy, a Lead Program Manager on Core XML Technologies at Microsoft, is blogging. Subscribed. ...

December 20, 2004

XInclude goes W3C Recommendation!

Hey, what a surprise from the W3C! XInclude 1.0 has been published as a W3C Recommendation today. That was fast! Less than 3 months in Proposed Rec status - and here it is, XInclude 1.0 - another standard XML Core technology. ...

I was about to release, tomorrow, an XInclude.NET version conforming to September's XInclude spec. So it's just in time. As far as I can see no significant changes were introduced, so give me a couple of days for aligning and fixing documentation - and then expect a new release of the Mvp.Xml library (including the Common, XInclude.NET and XPointer.NET modules) and then an nxslt.exe update.

For those unfamiliar with XInclude, take a look at my MSDN article "Combining XML Documents with XInclude".

Btw, AFAIR Microsoft cut its XInclude implementation too, because of the same "not Rec yet" issue. Now that XInclude is a Recommendation, and a small, simple (read: not contaminated by XML Schema), nice, useful core technology, and there is still plenty of time until .NET 2.0 is out, I hope the XML Team will consider implementing it. I'd like to see support for XInclude in .NET 2.0. Would you?
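
For readers who have never seen XInclude at work, here is a minimal sketch in Python rather than .NET, using the standard library's `xml.etree.ElementInclude` module. That module covers only a small subset of XInclude (no XPointer, for instance), and the document names and contents below are made up for illustration; the in-memory loader stands in for files on disk.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files" standing in for documents on disk.
DOCUMENTS = {
    "disclaimer.xml": "<disclaimer>Prices may change without notice.</disclaimer>",
}

# A host document that pulls in another document via xi:include.
MAIN = """
<catalog xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Winter catalog</title>
  <xi:include href="disclaimer.xml"/>
</catalog>
"""


def memory_loader(href, parse, encoding=None):
    """Resolve an xi:include href against DOCUMENTS instead of the file system."""
    text = DOCUMENTS[href]
    if parse == "xml":
        return ET.fromstring(text)
    return text  # parse="text": include the raw characters


def merge(xml_text):
    """Parse xml_text and replace every xi:include with the referenced content."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=memory_loader)
    return root


merged = merge(MAIN)
print(ET.tostring(merged, encoding="unicode"))
```

After `merge` runs, the `xi:include` element is gone and the `disclaimer` element sits in its place, which is the whole point of the technology: one logical document assembled from several physical ones.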

December 19, 2004

Kurt Cagle makes a business case for XSLT 2.0

As usual, a very long post (an article, actually) by Kurt Cagle on "The Business Case for XSLT 2.0". It explains why XSLT 2.0 is good and why Microsoft should implement it. With Michael Champion's comments, worth reading. ...

Would you like to see XSLT1.1 + EXSLT in .NET2?

Hey, I've got another idea. XQuery and XSLT2 are surely huge undertakings (we can truly thank W3C for that), but still there are plenty of plain poor .NET devs struggling with the limitations of XSLT 1.0 and XPath 1.0. What if Microsoft implements XSLT 1.1 + EXSLT in .NET 2.0, would ...

XSLT 1.1 is that officially frozen XSLT version which was supposed to improve XSLT in an evolutionary way, by solving only the most irritating problems in XSLT 1.0. The changes from XSLT 1.0 are small:

- the nasty result tree fragment data type is eliminated, so there is no need for the xxx:node-set() function;
- XML namespace quirks are fixed;
- support for XML Base is added;
- multiple output is supported via the new xsl:document instruction;
- xsl:apply-imports can have parameters;
- a standard way to define embedded extension functions is defined - xsl:script;
- a new "external object" data type is defined for better interop with extension languages.

No big deal to implement IMO, but what a relief for .NET devs working with XSLT. And it's quite stable - XSLT 1.1 isn't actually a Recommendation, but it's officially frozen and won't change anymore. Saxon and jd.xslt support it. And implementing EXSLT would provide a rich function library that makes XSLT developers much more productive by eliminating the boring need to reimplement from scratch, every time, such trivial tasks as tokenizing strings, formatting dates or getting a list of unique values. The EXSLT.NET project proves it's pretty implementable.
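
To give a feel for how small these "trivial tasks" are once a library provides them, here is a rough Python sketch of two of them: tokenizing a string and getting unique values. These are stand-ins written for illustration, not the EXSLT functions themselves; in particular, dropping empty tokens is an assumption of this sketch, and the function names are hypothetical.

```python
def tokenize(text, delimiters=" \t\r\n"):
    """Split text at any character in delimiters.

    Empty tokens between adjacent delimiters are dropped; that is a choice
    made for this sketch, not a statement about any XSLT processor.
    """
    tokens, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def distinct(values):
    """Keep the first occurrence of each value, preserving the original order."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


print(tokenize("2004-12-19T10:30", "-T:"))
print(distinct(["xml", "xslt", "xml", "xquery", "xslt"]))
```

In plain XSLT 1.0 each of these takes a recursive template or a key-based grouping trick, which is exactly the boilerplate a built-in function library removes.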

Yes, one can use EXSLT.NET right now, but the EXSLT.NET library has some serious limitations. It's perf and security problems I'm talking about. The main problem is how EXSLT.NET is implemented. The main idea behind EXSLT was that XSLT vendors would implement it, while EXSLT.NET is just an external layer on top of the XslTransform class. It's implemented as user extension functions, not system extensions like the msxsl:node-set() function. Hence an awful lot of reflection work is done during each function call and on returning a node-set, and of course there is the FullTrust security demand, which makes EXSLT.NET plain useless in any environment that is not fully trusted, such as ASP.NET. All these problems could be fixed easily by just moving EXSLT.NET into the core of the XSLT implementation - it would make it faster, safer and more reliable.

Well, just an idea to evaluate actually.

In other .NET related XML news

Some XML news in no particular order: Irwin Dolobowsky says we should expect very interesting articles at the MSDN XML Dev Center. I'm especially looking forward to this one - "Helena Kupkova will show us how to create bookmarks in XML Streams with the ResetableXmlReader." Hmmm, sweet. AFAIR we've been discussing ...

Red pill for Michael Champion

Oh, that's big news - Michael Champion is now Program Manager for XML Standards in Microsoft's XML WebData team. Wow, wow, wow - those are the only words I can say. Here is his intro on his new blog (hey, he is a Microsoft employee, so it's http://blogs.msdn.com/mikechampion, not http://weblogs.asp.net/mikechampion ...

Architecture of the World Wide Web, Volume One

W3C has at last published the "Architecture of the World Wide Web, Volume One" as a W3C Recommendation. It was cooked in long hot discussions by Web heavyweights and geeks. Here is what it's about: This document describes the properties we desire of the Web and the design choices that have been ...

It's 47 printed pages and I haven't had time to read it thoroughly yet, but I skimmed the XML-related parts. Finally there are some normative answers to some bloated questions.

Binary vs Text data formats:

The trade-offs between binary and textual data formats are complex and application-dependent. Binary formats can be substantially more compact, particularly for complex pointer-rich data structures. Also, they can be consumed more rapidly by agents in those cases where they can be loaded into memory and used with little or no conversion. Note, however, that such cases are relatively uncommon as such direct use may open the door to security issues that can only practically be addressed by examining every aspect of the data structure in detail.

Textual formats are usually more portable and interoperable. Textual formats also have the considerable advantage that they can be directly read by human beings (and understood, given sufficient documentation). This can simplify the tasks of creating and maintaining software, and allow the direct intervention of humans in the processing chain without recourse to tools more complex than the ubiquitous text editor. Finally, it simplifies the necessary human task of learning about new data formats; this is called the "view source" effect.

It is important to emphasize that intuition as to such matters as data size and processing speed is not a reliable guide in data format design; quantitative studies are essential to a correct understanding of the trade-offs. Therefore, designers of a data format specification should make a considered choice between binary and textual format design.
Oh yeah, well said.
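
The "measure, don't guess" advice is easy to follow in practice. Here is a small Python sketch that encodes the same made-up data set (1000 integer triples) once as packed binary and once as XML and reports the sizes. The data, the element names and the binary layout are all assumptions picked for illustration; a real study would also measure parsing time on representative data.

```python
import struct
import xml.etree.ElementTree as ET


def make_points(count):
    """Build a toy data set: count triples of small integers."""
    return [(i, i * 2, i * 3) for i in range(count)]


def encode_binary(points):
    """Pack each point as three little-endian 32-bit signed integers."""
    return b"".join(struct.pack("<3i", *point) for point in points)


def decode_binary(data):
    """Inverse of encode_binary."""
    return list(struct.iter_unpack("<3i", data))


def encode_xml(points):
    """Serialize each point as a <p> element with x, y and z attributes."""
    root = ET.Element("points")
    for x, y, z in points:
        ET.SubElement(root, "p", x=str(x), y=str(y), z=str(z))
    return ET.tostring(root, encoding="utf-8")


points = make_points(1000)
binary_size = len(encode_binary(points))
xml_size = len(encode_xml(points))
print(f"binary: {binary_size} bytes, XML: {xml_size} bytes")
```

The numbers for one toy data set prove nothing general, which is rather the point of the quoted passage: the trade-off has to be measured for the data and the application at hand.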

When to use XML:

XML defines textual data formats that are naturally suited to describing data objects which are hierarchical and processed in a chosen sequence. It is widely, but not universally, applicable for data formats; an audio or video format, for example, is unlikely to be well suited to expression in XML. Design constraints that would suggest the use of XML include:

1. Requirement for a hierarchical structure.
2. Need for a wide range of tools on a variety of platforms.
3. Need for data that can outlive the applications that currently process it.
4. Ability to support internationalization in a self-describing way that makes confusion over coding options unlikely.
5. Early detection of encoding errors with no requirement to "work around" such errors.
6. A high proportion of human-readable textual content.
7. Potential composition of the data format with other XML-encoded formats.
8. Desire for data easily parsed by both humans and machines.
9. Desire for vocabularies that can be invented in a distributed manner and combined flexibly.

On linking in XML:

Designers of XML-based formats may consider using XLink and, for defining fragment identifier syntax, using the XPointer framework and XPointer element() Schemes.
Note that "may". It means "we'd like to see at least somebody using XLink, though we admit it's not so good." It's still an issue.
XLink is not the only linking design that has been proposed for XML, nor is it universally accepted as a good design.

On our favorite nightmare - XML namespaces. It's always an issue (aka it's too long), so go read it. Here is a part related to the misunderstanding Dare was writing about:

Attributes are always scoped by the element on which they appear. An attribute that is "global," that is, one that might meaningfully appear on elements of many types, including elements in other namespaces, should be explicitly placed in a namespace. Local attributes, ones associated with only a particular element type, need not be included in a namespace since their meaning will always be clear from the context provided by that element.
The type attribute from the W3C XML Schema Instance namespace "http://www.w3.org/2001/XMLSchema-instance" ([XMLSCHEMA], section 4.3.2) is an example of a global attribute. It can be used by authors of any vocabulary to make an assertion in instance data about the type of the element on which it appears. As a global attribute, it must always be qualified. The frame attribute on an HTML table is an example of a local attribute. There is no value in placing that attribute in a namespace since the attribute is unlikely to be useful on an element other than an HTML table.
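
The global versus local distinction is easy to see once a namespace-aware parser has expanded the names. Here is a small Python sketch using the standard `xml.etree.ElementTree` module; the `report` and `table` elements and the `SummaryTable` type name are made up for illustration.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical vocabulary: a table carrying one global and one local attribute.
DOC = """
<report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <table xsi:type="SummaryTable" frame="box"/>
</report>
"""


def attribute_kinds(element):
    """Split attributes into namespace-qualified (global) and unqualified (local)."""
    global_attrs, local_attrs = {}, {}
    for name, value in element.attrib.items():
        # ElementTree spells a qualified name as "{namespace-uri}local-name".
        if name.startswith("{"):
            global_attrs[name] = value
        else:
            local_attrs[name] = value
    return global_attrs, local_attrs


table = ET.fromstring(DOC).find("table")
global_attrs, local_attrs = attribute_kinds(table)
print(global_attrs)
print(local_attrs)
```

The qualified `xsi:type` keeps its namespace URI wherever it appears, while `frame` is just a bare name whose meaning comes from the `table` element it sits on.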

And here are some new definitions for a very bloated topic:

Another benefit of using URIs to build XML namespaces is that the namespace URI can be used to identify an information resource that contains useful information, machine-usable and/or human-usable, about terms in the namespace. This type of information resource is called a namespace document. When a namespace URI owner provides a namespace document, it is authoritative for the namespace.

There are many reasons to provide a namespace document. A person might want to:

- understand the purpose of the namespace,
- learn how to use the markup vocabulary in the namespace,
- find out who controls it and associated policies,
- request authority to access schemas or collateral material about it, or
- report a bug or situation that could be considered an error in some collateral material.
A processor might want to:

- retrieve a schema, for validation,
- retrieve a style sheet, for presentation, or
- retrieve ontologies, for making inferences.
In general, there is no established best practice for creating representations of a namespace document; application expectations will influence what data format or formats are used. Application expectations will also influence whether relevant information appears directly in a representation or is referenced from it.
Well, I'm not sure I fully agree with this practice, but at least it sounds reasonable and clear.

On the QNames-in-content problem:

Do not allow both QNames and URIs in attribute values or element content where they are indistinguishable.

XML ID problem - still not solved.

Media types for XML:

In general, a representation provider SHOULD NOT assign Internet media types beginning with "text/" to XML representations.
Read that again. Use what RFC 3023 says - "application/xml" and all that jazz with the "+xml" suffix (e.g. "image/svg+xml"). Also:
In general, a representation provider SHOULD NOT specify the character encoding for XML data in protocol headers since the data is self-describing.
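
The two SHOULD NOTs quoted above are simple enough to check mechanically. Here is a minimal Python sketch that inspects a Content-Type header value and reports which piece of advice it goes against. The function names and warning texts are made up, and `is_xml_media_type` recognizes only "application/xml", "text/xml" and the "+xml" suffix, not every XML-related type RFC 3023 registers.

```python
def split_content_type(value):
    """Split a Content-Type value into (media type, {parameter: value})."""
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    params = {}
    for raw in raw_params:
        name, _, val = raw.partition("=")
        params[name.strip().lower()] = val.strip().strip('"')
    return media_type.lower(), params


def is_xml_media_type(media_type):
    """True for application/xml, text/xml and any type with the '+xml' suffix."""
    return media_type in ("application/xml", "text/xml") or media_type.endswith("+xml")


def webarch_warnings(content_type):
    """List the quoted SHOULD NOTs that this Content-Type value goes against."""
    media_type, params = split_content_type(content_type)
    warnings = []
    if not is_xml_media_type(media_type):
        return warnings  # the advice only concerns XML representations
    if media_type.startswith("text/"):
        warnings.append("XML served with a text/ media type")
    if "charset" in params:
        warnings.append("character encoding specified in the protocol header")
    return warnings


for header in ("text/xml; charset=utf-8", "application/xml", "image/svg+xml"):
    print(header, "->", webarch_warnings(header))
```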

So lots of cool stuff to read and follow.

December 18, 2004

Don't open till Xmas - free XSL-FO Debugger from Altsoft

Hmmm, debugging XSL-FO... That might be a great idea actually. Here is an interesting innovation from Altsoft N.V. (maker of the Xml2PDF formatting engine for .NET) - an XSL-FO debugger. And it's even free! ...

December 16, 2004

What's wrong with XslTransform's API?

I wonder if there is something inherently wrong with the XslTransform class API. I was stunned again today reading this post in the microsoft.public.dotnet.xml newsgroup: I still don't see any way to create a XslTransform from a XmlDocument? That's not the first time I've seen it, actually. The answer of course ...

On Introduction to MSIL by Kenny Kerr

Kenny Kerr has posted another instalment in his amazing "Introduction to MSIL" blog series. It's about the brilliant for-each construct, which was introduced by Visual Basic and is now adopted by VB.NET, C#, C++ and even Java. Worth reading. Besides, I really like that idea of learning from blogs - you know ...

On XmlPreprocess tool

That guy Loren Halvorson has released the XmlPreprocess tool for preprocessing XML files, e.g. config files in .NET. It lets you perform tricks like the following:

    <configuration>
      <system.web>
        <!-- ifdef ${production} -->
        <!-- <compilation defaultLanguage="c#" debug="false"/> -->
        <!-- else -->
        <compilation defaultLanguage="c#" debug="true"/>
        <!-- endif -->
      </system.web>
    </configuration>

As you can see ...

December 14, 2004

A letter from a dead house

I was catching up on the feeds I'm subscribed to and I found one item that gave me a sort of bitter nostalgia. It's right on the MSDN TV site, a new episode where Mark Fussell explains new XML features in the upcoming .NET 2.0. The episode is dated December ...

Mark is a very energetic guy and talks nicely about the editable XPathDocument as the preferred XML store in .NET2 (it has been cut; stick with DOM for several more years) and of course the XQuery implementation in .NET2 (cut). So basically all the new XML features that are still not cut are about performance, some API rearranging and XSLT 1.0. Performance was an old pain and had to be fixed anyway. The XSLT 1.0 stuff is good - finally an XSLT debugger (which actually has been hidden as internal stuff within System.Xml.dll since .NET 1.0 - just run Reflector to see it), a better XML and XSLT editor in Visual Studio .NET 2005, and a brand new XSLT 1.0 processor, which compiles XSLT down to MSIL - this one is just perfect and I'm looking forward to giving it a whirl. So, as a matter of curiosity, .NET 2.0 is going to include two different XSLT 1.0 processors (one obsoleted, though).

Well, I'm not saying that's nothing. But let me be harsh - it's all catch-up stuff. It would have been excellent to have in .NET 1.1, but for the year 2005 it's quite disappointing. Maybe that's because of all that hype about XQuery and how it's better than XSLT. There were really hot debates on Microsoft's decision to implement XQuery but not XSLT 2.0, and now guess what - neither XQuery nor XSLT 2.0, with developers running out.

I can imagine how many resources have been spent on XQuery! That's not a small one. Apparently managed XQuery turned out to be just another black hole project. Who knew, right?

Oh well, there is Saxon.NET and maybe there will be XQP. But I feel a slightly bitter taste when I realize I'll still be helping people struggle with XSLT 1.0 limitations for at least the next 3 years, while XSLT 2.0 is shining in a better world. And my fellow XML MVPs share my feelings. Still - what a sad irony - to listen to Mark, who has left the XML team, talking about the XQuery implementation, which was cut...

December 12, 2004

Some attractive XQuery papers

Some goodies from Daniela Florescu and the Database Group at the University of Heidelberg: "The BEA Streaming XQuery Processor" (full version?), D. Florescu, C. Hillery, D. Kossmann, P. Lucas, F. Riccardi, T. Westmann, M.J. Carey, A. Sundararajan. VLDB Journal. "Implementing Memoization in a Streaming XQuery Processor", Y. Diao, D. Florescu ...

Adam Kinney's new site is running XSLT2 (Saxon.NET engine)

This is amazing. Adam Kinney (Xamlon guy) runs his new blogsite on XSLT 2.0 (using Saxon.NET as the XSLT engine): Adam Kinney.com has been redesigned, restructured and refactored. The new site has been inspired by my hate for comment spam, interest in XSLT 2.0, desire to lose SQL and move to ...

December 8, 2004

Mike Kay on benefits of using XML syntax for XSLT

Here is a really nice wrap up by Mike Kay on what benefits XSLT gets from using XML syntax: I think the benefits are: (a) many stylesheets consist of two-thirds data to be copied into the result tree, and one-third instructions to extract data from the source document. An XML-based ...

Jerusalem - satellite view

Cool new image from the NASA Earth Observatory - a satellite view of the city of Jerusalem and the surrounding area. ...

December 6, 2004

Understanding XSLT project starts on Monday

m.david starts his new project on Monday - a sort of community XSLT learning effort using the wonderful "Beginning XSLT" book by Jeni Tennison: Anyone and everyone is welcome to join in this effort to become better XSLT programmers. While I intend to do all I can to keep things moving forward throughout ...

December 5, 2004

Quotes of the day

I arrived at work and found 200+ new posts in the xml-dev list. Lovely. XML is still an extra hot topic. Here are some nice quotes: For my money, XQuery is a heroic effort by a bunch of incredibly smart people which is crippled - we don't know how seriously - by ...

Client-side: XSLT is coming

Another non-obvious outcome of the recent browser-war wave and the rise of the Firefox browser is a growing appreciation of XSLT as a useful client-side Web technology. That "An Introduction to Client-Side XSLT: It's Not Just for Server Geeks Anymore" article at digital-web.com is making me believe XSLT is finally ...

December 2, 2004

Don't miss chat with C# IDE Team today

C# Chat: The C# IDE Have some questions about expansions, intellisense, or type colorization? Have some suggestions for or comments about refactoring support? Join the C# IDE team to discuss the past, present and future of the C# IDE. December 2, 2004 1:00 - 2:00 P.M. Pacific time Add to ...

December 1, 2004

MSN is about to strike back

They say Microsoft is about to unveil its MSN Spaces blogging service, maybe even this week: Microsoft's MSN division is expected to take the wraps off its MSN Spaces blogging service this week, according to sources close to the company. MSN is expected to tout MSN Spaces as a direct ...

Hardware XSLT Acceleration

Wow, I've heard about some hardware XML routers, but today I saw an ad banner for a hardware XSLT accelerator. Holy cow! Here is some marketing blah-blah-blah: Standards based XSLT processing is computationally intensive - it overburdens the server infrastructure resulting in poor user experience, high server infrastructure costs and scalability ...