I was about to release an XInclude.NET version conforming to the September XInclude spec tomorrow, so this is just in time. As far as I can see, no significant changes were introduced, so a couple of days for aligning and fixing documentation - and then expect a new release of the Mvp.Xml library (including the Common, XInclude.NET and XPointer.NET modules), followed by an nxslt.exe update.
For those unfamiliar with XInclude, take a look at my MSDN article "Combining XML Documents with XInclude".
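For a flavor of what XInclude does, here's a minimal sketch - the file names are made up for illustration, but the xi:include/xi:fallback markup is the standard syntax:

```xml
<!-- A document assembled from parts via XInclude.
     chapter1.xml and chapter2.xml are hypothetical resources. -->
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="chapter1.xml"/>
  <xi:include href="chapter2.xml">
    <!-- Optional fallback, used if the resource can't be fetched -->
    <xi:fallback><para>Chapter 2 is unavailable.</para></xi:fallback>
  </xi:include>
</book>
```

An XInclude processor replaces each xi:include element with the content of the referenced resource, producing a single merged infoset.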
Btw, AFAIR Microsoft cut its XInclude implementation too, because of the same "not a Rec yet" issue. Now that XInclude is a Recommendation - a small, simple (read: not contaminated by XML Schema), nice, useful core technology - and there is still plenty of time till .NET 2.0 is out, I hope the XML Team will consider implementing it. I'd like to see support for XInclude in .NET 2.0 - do you?
XSLT 1.1 is that officially frozen XSLT version which was supposed to improve XSLT in an evolutionary way - by solving only the most irritating problems in XSLT 1.0. Changes from XSLT 1.0 are small:

- the nasty result tree fragment data type is eliminated, so no need for the xxx:node-set() function;
- XML namespaces quirks are fixed;
- support for XML Base is added;
- multiple output is supported via the new xsl:document instruction;
- xsl:apply-imports can have parameters;
- a standard way to define embedded extension functions is defined - xsl:script;
- a new "external object" data type is defined for better interop with extension languages.
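The multiple-output part alone would be worth it. A rough sketch of what xsl:document looks like per the XSLT 1.1 working draft (the input structure and output file names here are invented):

```xml
<!-- Split one source document into one HTML file per chapter.
     Assumes an input like <chapters><chapter><title>...</title></chapter>...</chapters>. -->
<xsl:stylesheet version="1.1"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/chapters">
    <xsl:for-each select="chapter">
      <!-- xsl:document directs this result subtree to a separate output -->
      <xsl:document href="chapter{position()}.html">
        <html><body><h1><xsl:value-of select="title"/></h1></body></html>
      </xsl:document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

In XSLT 1.0 you need a processor-specific extension (or multiple transformation passes) to get this effect.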
No big deal to implement IMO, but what a relief for .NET devs working with XSLT. And it's quite stable - XSLT 1.1 never actually became a Recommendation, but it's officially frozen and won't change anymore. Saxon and jd.xslt support it. And implementing EXSLT would provide a rich function library allowing XSLT developers to be much more productive, eliminating the boring need to reimplement from scratch, every single time, such trivial tasks as string tokenizing, formatting dates or getting a list of unique values. The EXSLT.NET project proves it's quite implementable.
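Take string tokenizing as an example. With EXSLT's str:tokenize it's one call instead of a recursive template; a minimal sketch (the input string and output element names are illustrative):

```xml
<!-- Tokenize a comma-separated string with EXSLT str:tokenize,
     which returns a node-set of token elements. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:str="http://exslt.org/strings">
  <xsl:template match="/">
    <words>
      <xsl:for-each select="str:tokenize('red,green,blue', ',')">
        <word><xsl:value-of select="."/></word>
      </xsl:for-each>
    </words>
  </xsl:template>
</xsl:stylesheet>
```

In plain XSLT 1.0 the same job takes a recursive named template with substring-before/substring-after - exactly the kind of boilerplate EXSLT exists to kill.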
Yes, one can use EXSLT.NET right now, but the EXSLT.NET library has some serious limitations - it's the perf and security problems I'm talking about. The main problem is how EXSLT.NET is implemented. The main idea behind EXSLT was that XSLT vendors would implement it, while EXSLT.NET is just an external layer on top of the XslTransform class. It's implemented as user extension functions, not system extensions like the msxsl:node-set() function. Hence an awful lot of reflection work is done on each function call and on returning a node-set, and of course the FullTrust security demand, which makes EXSLT.NET plain useless in any not-fully-trusted environment such as ASP.NET. All these problems could be fixed easily by just moving EXSLT.NET into the core of the XSLT implementation - that would make it faster, safer and more reliable.
Well, just an idea to evaluate actually.
It's 47 printed pages and I've had no time to read it thoroughly yet, but I did skim the XML-related parts. There are, finally, some normative answers to some long-debated questions.
Binary vs Text data formats:
The trade-offs between binary and textual data formats are complex and application-dependent. Binary formats can be substantially more compact, particularly for complex pointer-rich data structures. Also, they can be consumed more rapidly by agents in those cases where they can be loaded into memory and used with little or no conversion. Note, however, that such cases are relatively uncommon as such direct use may open the door to security issues that can only practically be addressed by examining every aspect of the data structure in detail.

Oh yeah, well said.
Textual formats are usually more portable and interoperable. Textual formats also have the considerable advantage that they can be directly read by human beings (and understood, given sufficient documentation). This can simplify the tasks of creating and maintaining software, and allow the direct intervention of humans in the processing chain without recourse to tools more complex than the ubiquitous text editor. Finally, it simplifies the necessary human task of learning about new data formats; this is called the "view source" effect.
It is important to emphasize that intuition as to such matters as data size and processing speed is not a reliable guide in data format design; quantitative studies are essential to a correct understanding of the trade-offs. Therefore, designers of a data format specification should make a considered choice between binary and textual format design.
When to use XML:
XML defines textual data formats that are naturally suited to describing data objects which are hierarchical and processed in a chosen sequence. It is widely, but not universally, applicable for data formats; an audio or video format, for example, is unlikely to be well suited to expression in XML. Design constraints that would suggest the use of XML include:
1. Requirement for a hierarchical structure.
2. Need for a wide range of tools on a variety of platforms.
3. Need for data that can outlive the applications that currently process it.
4. Ability to support internationalization in a self-describing way that makes confusion over coding options unlikely.
5. Early detection of encoding errors with no requirement to "work around" such errors.
6. A high proportion of human-readable textual content.
7. Potential composition of the data format with other XML-encoded formats.
8. Desire for data easily parsed by both humans and machines.
9. Desire for vocabularies that can be invented in a distributed manner and combined flexibly.
On linking in XML:
Designers of XML-based formats may consider using XLink and, for defining fragment identifier syntax, using the XPointer framework and XPointer element() Schemes.

Note that "may". It means "we'd like to see at least somebody using XLink, though we admit it's not so good." It's still an issue.
XLink is not the only linking design that has been proposed for XML, nor is it universally accepted as a good design.
On our favorite nightmare - XML namespaces. It's always an issue (aka the section is too long to quote here), so go read it. Some of it is related to the misunderstanding Dare was writing about:
Attributes are always scoped by the element on which they appear. An attribute that is "global," that is, one that might meaningfully appear on elements of many types, including elements in other namespaces, should be explicitly placed in a namespace. Local attributes, ones associated with only a particular element type, need not be included in a namespace since their meaning will always be clear from the context provided by that element.
The type attribute from the W3C XML Schema Instance namespace "http://www.w3.org/2001/XMLSchema-instance" ([XMLSCHEMA], section 4.3.2) is an example of a global attribute. It can be used by authors of any vocabulary to make an assertion in instance data about the type of the element on which it appears. As a global attribute, it must always be qualified. The frame attribute on an HTML table is an example of a local attribute. There is no value in placing that attribute in a namespace since the attribute is unlikely to be useful on an element other than an HTML table.
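The distinction is easy to see in markup. A quick sketch of both cases (the "intlOrder" type name is invented for illustration; xsi:type and the HTML frame attribute are real):

```xml
<!-- Global attribute: xsi:type comes from another vocabulary's
     namespace, so it must always be prefixed wherever it appears. -->
<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:type="intlOrder"/>

<!-- Local attribute: frame only makes sense on an HTML table,
     so it stays unqualified - the element provides the context. -->
<table frame="border">
  <tr><td>item</td></tr>
</table>
```

An unprefixed attribute is in no namespace at all, not in its element's namespace - which is precisely the misunderstanding this passage addresses.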
And here are some new definitions for a very bloated topic:
Another benefit of using URIs to build XML namespaces is that the namespace URI can be used to identify an information resource that contains useful information, machine-usable and/or human-usable, about terms in the namespace. This type of information resource is called a namespace document. When a namespace URI owner provides a namespace document, it is authoritative for the namespace.

Well, I'm not sure I fully agree with this practice, but at least it sounds reasonable and clear.
There are many reasons to provide a namespace document. A person might want to:
- understand the purpose of the namespace,
- learn how to use the markup vocabulary in the namespace,
- find out who controls it and associated policies,
- request authority to access schemas or collateral material about it, or
- report a bug or situation that could be considered an error in some collateral material.
A processor might want to:
- retrieve a schema, for validation,
- retrieve a style sheet, for presentation, or
- retrieve ontologies, for making inferences.
In general, there is no established best practice for creating representations of a namespace document; application expectations will influence what data format or formats are used. Application expectations will also influence whether relevant information appears directly in a representation or is referenced from it.
On QNames in content problem:
Do not allow both QNames and URIs in attribute values or element content where they are indistinguishable.
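The problem in one picture - in the hypothetical markup below, a generic processor has no way to tell whether the attribute value is a QName (prefix resolved against in-scope namespace bindings) or just an opaque string:

```xml
<!-- Is "xs:string" a QName referring to the XML Schema string type,
     or a literal string that happens to contain a colon? Only
     vocabulary-specific rules can say - generic XML tools can't. -->
<element xmlns:xs="http://www.w3.org/2001/XMLSchema"
         type="xs:string"/>
```

Worse, QNames in content survive copy-paste only if the namespace declarations travel along, which generic tools don't know to do.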
XML ID problem - still not solved.
Media types for XML:
In general, a representation provider SHOULD NOT assign Internet media types beginning with "text/" to XML representations.

Read that again. Use what RFC 3023 says - "application/xml" and all that jazz with the "+xml" suffix (e.g. "image/svg+xml"). Also:
In general, a representation provider SHOULD NOT specify the character encoding for XML data in protocol headers since the data is self-describing.
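Putting the two rules together, a well-behaved response for, say, an SVG image would look something like this (headers abbreviated; the exact header set is illustrative):

```
HTTP/1.1 200 OK
Content-Type: image/svg+xml

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg"/>
```

Note: no "text/" media type, and no charset parameter on the header - the XML declaration inside the document already carries the encoding, and a conflicting protocol-level charset is exactly the kind of mess RFC 3023 warns about.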
So lots of cool stuff to read and follow.
Mark is a very energetic guy and talks nicely about the editable XPathDocument as the preferred XML store in .NET 2.0 (it has been cut - stick with the DOM for several more years) and of course about the XQuery implementation in .NET 2.0 (also cut). So basically all the new XML features that haven't been cut are about performance, some API rearranging and XSLT 1.0. Performance was an old pain and had to be fixed anyway. The XSLT 1.0 stuff is good - finally an XSLT debugger (which was actually hidden as internal stuff within System.Xml.dll since .NET 1.0 - just run Reflector to see it), a better XML and XSLT editor in Visual Studio .NET 2005, and a brand new XSLT 1.0 processor which compiles XSLT down to MSIL - this one is just perfect and I'm looking forward to giving it a whirl. So, as a matter of curiosity, .NET 2.0 is going to include two different XSLT 1.0 processors (one obsolete though).
Well, I'm not saying that's nothing. But let me be harsh - it's all catch-up stuff. It would have been excellent to have in .NET 1.1, but for year 2005 it's quite disappointing. Maybe that's because of all that hype about XQuery and how it's better than XSLT. There were really hot debates over Microsoft's decision to implement XQuery but not XSLT 2.0, and now guess what - neither XQuery nor XSLT 2.0, and developers are left with neither.
I can imagine how many resources have been spent on XQuery! That's no small spec. Apparently managed XQuery turned out to be just another black-hole project. Who knew, right?
Oh well, there is Saxon.NET, and maybe there will be XQP. But I feel a bitter taste when I realize I'll still be helping people struggle with XSLT 1.0 limitations for at least the next 3 years, while XSLT 2.0 is shining in a better world. And my fellow XML MVPs share my feelings. Still - what a sad irony - listening to Mark, who has left the XML team, talking about the XQuery implementation, which was cut...