December 20, 2005

The Rise of XSLT Compilation

Slowly, gradually and without much buzz, both modern managed platforms - Java and .NET - have switched to compiling XSLT implementations by default. First, Java 5.0 made the compiling Apache XSLTC processor the default transformer in JAXP 1.3 (instead of the interpreting Apache XALAN). Then Microsoft released .NET 2.0 with new ...

Both Java and .NET declare the same reason for adopting XSLT compilation - performance. Here is a snippet from the JAXP 1.3 documentation:

o XSLTC, the fast, compiling transformer, which is now the default engine for XSLT processing.
The XSLTC transformer generates a transformation engine, or translet, from an XSL stylesheet. This approach separates the interpretation of stylesheet instructions from their runtime application to XML data.

XSLTC works by compiling a stylesheet into Java byte code (translets), which can then be used to perform XSLT transformations. This approach greatly improves the performance of XSLT transformations where a given stylesheet is compiled once and used many times. It also generates an extremely lightweight translet, because only the XSLT instructions that are actually used by the stylesheet are included.
And here is what the Microsoft XML Team writes about XslCompiledTransform:
To improve XSLT execution performance in the .NET Framework version 2.0, the XslTransform class has been replaced with a new XSLT 1.0 implementation: the XslCompiledTransform class. XslCompiledTransform compiles XSLT stylesheets to Microsoft Intermediate Language (MSIL) methods and then executes them. Execution time of the new processor is on average 4 times better than XslTransform and matches the speed of MSXML, the native XML processor.
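
The "compile once, use many times" model that both quotes describe can be sketched on the Java side with the standard JAXP Templates interface. This is only a minimal illustration: the class name CompileOnceDemo, the tiny stylesheet and the input documents are all made up for the example, and whether the Templates object is really backed by a compiled translet depends on which XSLT engine your JAXP installation picks up.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CompileOnceDemo {

    // Hypothetical stylesheet, invented for this example.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>Hello, <xsl:value-of select='@to'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    // The expensive step: parse (and, with a compiling engine, compile) the stylesheet once.
    static Templates compile() throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(XSLT)));
    }

    // The cheap step: create a lightweight Transformer from the prepared Templates
    // and apply it to one input document.
    static String run(Templates templates, String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Templates templates = compile();          // done once
        System.out.println(run(templates, "<greeting to='Java'/>"));   // reused...
        System.out.println(run(templates, "<greeting to='.NET'/>"));   // ...many times
    }
}
```

A Templates object is safe to share between threads, while each Transformer created from it should be used by one thread at a time, so caching the Templates is the usual way to get the benefit of compilation.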

Is it true that only XSLT compilation can provide the best XML transformation performance on managed platforms like Java and .NET? I have no fresh benchmark results, but AFAIR XSLTC was always one of the fastest XSLT processors, undeservedly underused because of its unique processing model. And Microsoft also claims that the new XslCompiledTransform now matches the speed of MSXML4. But what about Saxon? It's an interpreting XSLT engine and it's pretty fast. I believe Saxon is fast only thanks to numerous very smart and unique optimizations, and so it can't beat a compiling, optimizing XSLT processor.

The idea that the ideal XSLT engine is an optimizing, compiling one sounds pretty obvious. XSLT was and is meant to be compiled, not interpreted, and despite the fact that for years there was only a single semi-experimental compiling XSLT engine around - Sun's XSLTC (now Apache XSLTC) - XSLT 2.0 still looks more like a traditional compiled language than a dynamic one.

OK, but what about the future? I think it's safe to say that in the future XSLT compilation will be even more pervasive. The Apache community (with IBM behind the contributing developers) has chosen XSLTC, not XALAN, as the basis for its future XSLT 2.0 implementation, and I have no doubt that Microsoft will implement XSLT 2.0 only as a compiling engine too. And I love it. I predict that in the near future we will be compiling XSLT stylesheets as we do ordinary Java or C# classes and calling "translets" at run time like ordinary classes, without bothering to load stylesheet sources first.

Btw, it should be noted that for Java users the switch to another default XSLT engine went mostly unnoticed thanks to JAXP, while Microsoft has no JAXP analog, so users have to migrate to XslCompiledTransform explicitly by modifying their code. I'll address that in a separate post though.
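
To see why the Java switch was invisible, here is a small sketch showing that typical JAXP code never names an XSLT engine. The class name WhichEngine is made up for the example, and the class name it prints depends entirely on your JDK and classpath, so no particular output should be expected.

```java
import javax.xml.transform.TransformerFactory;

public class WhichEngine {

    // Ask JAXP for "a" TransformerFactory and report which implementation it returned.
    // Application code normally stops at newInstance() and never looks at the class.
    static String engineName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("XSLT engine in use: " + engineName());
    }
}
```

JAXP resolves the implementation at run time, and the javax.xml.transform.TransformerFactory system property can be set to select a different one, so the same application code keeps working when the default changes.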

RSS Bandit Nightcrawler release in Russian

RSS Bandit users who are interested in Russian localization were probably disappointed to find no Russian language support in the Nightcrawler release. Sorry about that, I was too late for the deadline. The good news is that an RSS Bandit bugfix release with Russian localization is expected really soon - most likely before ...