XSLT scripting (msxsl:script) in .NET - pure fast evil


Another coding horror story was reported in the microsoft.public.dotnet.xml newsgroup:

I've been experiencing OutOfMemory errors on our production webserver for a few weeks now. I've finally managed to isolate (I think) the problem to our use of C# script blocks in our xsl files.
While debugging I discovered that the app domain for one of our sites had 13000+ assemblies loaded.

Cool. This is just a reminder for those who use XSLT scripting (msxsl:script) in .NET: watch out, this feature can be pure evil if used unwisely. It leaks memory, and there is nothing you can do about it.

The problem is that when an XSLT stylesheet is loaded in .NET, each msxsl:script block is compiled into an assembly via CodeDOM, and that assembly is then loaded into the current application domain. Each time the stylesheet is loaded, the process is repeated: a new assembly is generated and loaded into the application domain. But .NET cannot unload an individual assembly from an application domain; the memory is released only when the whole application domain is unloaded.
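To make the mechanism concrete, here is a minimal sketch of how one might observe it on the .NET Framework. The class name ScriptLeakDemo, the urn:example-script namespace and the Hello() script function are all made up for illustration; only XslCompiledTransform, XsltSettings and AppDomain.GetAssemblies are real APIs. Script compilation is a .NET Framework feature, so this is not expected to work on newer runtimes that dropped msxsl:script support.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical demo class, not part of any library.
public static class ScriptLeakDemo
{
    // A tiny stylesheet with one C# script block (illustrative only).
    const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:msxsl='urn:schemas-microsoft-com:xslt'
    xmlns:my='urn:example-script'>
  <msxsl:script language='C#' implements-prefix='my'>
    public string Hello() { return ""hello""; }
  </msxsl:script>
  <xsl:template match='/'><xsl:value-of select='my:Hello()'/></xsl:template>
</xsl:stylesheet>";

    // Number of assemblies currently loaded into this application domain.
    public static int LoadedAssemblies()
    {
        return AppDomain.CurrentDomain.GetAssemblies().Length;
    }

    // Loads (and therefore compiles) the stylesheet once, with scripts enabled.
    public static void LoadOnce()
    {
        var xslt = new XslCompiledTransform();
        using (var reader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            // XsltSettings(enableDocumentFunction: false, enableScript: true)
            xslt.Load(reader, new XsltSettings(false, true), null);
        }
    }

    // Loads the stylesheet repeatedly and returns how many assemblies were added.
    public static int AssembliesAddedBy(int loads)
    {
        int before = LoadedAssemblies();
        for (int i = 0; i < loads; i++)
        {
            LoadOnce();
        }
        return LoadedAssemblies() - before;
    }
}
```

Calling ScriptLeakDemo.AssembliesAddedBy(100) in a .NET Framework console application should, if the behaviour described above holds, return a number that grows with the number of loads, even though every XslCompiledTransform instance created in the loop has long gone out of scope.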

Here is a KB article on the topic. It says it applies to .NET 1.0 only, but don't be confused: the problem exists in .NET 1.1 and 2.0 too. Moreover, I'm rather pessimistic about whether it's going to be fixed in the future.

The solution is simple: just don't use script in XSLT unless you really, really, really have to. Especially on the server side: XSLT script and ASP.NET should never meet unless you take full responsibility for caching the compiled XslCompiledTransform instances. Use XSLT extension objects instead.
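Both pieces of advice can be sketched in a few lines. This is only one possible shape, under stated assumptions: the names CachedTransforms, StringFunctions and the urn:example-ext namespace are hypothetical, and ConcurrentDictionary and Lazy require .NET 4.0 or later (on .NET 2.0 a Dictionary guarded by a lock would play the same role). The cache compiles each stylesheet once per path and reuses it; the extension object supplies a custom function without any msxsl:script block.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical extension object: its public instance methods become callable
// from XPath (for example ext:Upper(...)) once registered under a namespace URI.
public class StringFunctions
{
    public string Upper(string s) { return s.ToUpperInvariant(); }
}

// Hypothetical helper, not a framework class.
public static class CachedTransforms
{
    // Assumed namespace URI; the stylesheet must declare the same one.
    public const string ExtNamespace = "urn:example-ext";

    // One compiled transform per stylesheet path. Lazy makes sure that two
    // concurrent requests for the same path compile the stylesheet only once.
    private static readonly ConcurrentDictionary<string, Lazy<XslCompiledTransform>> Cache =
        new ConcurrentDictionary<string, Lazy<XslCompiledTransform>>(StringComparer.OrdinalIgnoreCase);

    public static XslCompiledTransform Get(string stylesheetPath)
    {
        string key = Path.GetFullPath(stylesheetPath);
        return Cache.GetOrAdd(key, k => new Lazy<XslCompiledTransform>(() => Compile(k))).Value;
    }

    private static XslCompiledTransform Compile(string path)
    {
        var xslt = new XslCompiledTransform();
        xslt.Load(path); // default settings: scripts and document() disabled
        return xslt;
    }

    // Runs the cached transform over inputXml with the extension object attached.
    public static string Run(string stylesheetPath, string inputXml)
    {
        var args = new XsltArgumentList();
        args.AddExtensionObject(ExtNamespace, new StringFunctions());
        using (var input = XmlReader.Create(new StringReader(inputXml)))
        using (var output = new StringWriter())
        {
            Get(stylesheetPath).Transform(input, args, output);
            return output.ToString();
        }
    }
}
```

A stylesheet would declare xmlns:ext="urn:example-ext" and call, say, ext:Upper(string(/a)). Note that this sketch never evicts entries and does not notice when a stylesheet file changes on disk; a real server would probably want a file dependency or a timestamp check, and every reload of a stylesheet that does contain script would still add a new assembly.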

Update. Of course, Yuriy reminds me that msxsl:script runs faster than an extension object, because msxsl:script is available at compile time, so the XSLT compiler can generate direct calls, while extension objects are only available at run time and so can only be called via reflection.

That makes msxsl:script a preferable but dangerous solution when your stylesheet makes lots of calls to extension functions.

In a perfect world, of course, msxsl:script would be compiled into dynamic methods (just like XSLT itself), which can be reclaimed by the GC, but I don't think CodeDOM is currently capable of doing this. I wonder if it's possible to compile C#/VB/J# method source into a dynamic method at all?

It's also interesting to think about how to improve extension object performance. What if extension objects could be passed at compile time? They are usually available at that time anyway. Or what if a compiled stylesheet could be "JITted" to use direct calls instead of reflection?

Sergey, Anton, can you please comment on this?


10 Comments

Another idea to improve the situation might be to use the XSLTC.exe tool to pre-compile the XSLT (and associated XSLT scripts) into an assembly and then use XslCompiledTransform's new Load(...) method to avoid the run-time compilation. More info at: http://blogs.msdn.com/antosha/archive/2006/07/16/667221.aspx.

That said, I still have to vet that out. Thoughts?

Can you provide more details? If you prefer email, you can contact "oleg" at this domain name.

This is not isolated to script in the XSLTs; there's a major bug somewhere in the XslCompiledTransform code. Our site's content all runs through an XSLT to clean up the HTML created by the content teams. There is NO script in the XSLTs (only 2 are used for the whole site) and the XslCompiledTransform is cached with a file dependency. Admittedly, the problem only occurs when both the GoogleBot and the Yahoo Slurp bot happen to crawl the site at the same time, but the 1k-2k exception emails we get as a result are rather irritating!!! If anyone has any ideas how to fix this, other than removing the transforms (too much work and no time to do it), they'd be very welcome!!

Thanks for the great comment, Sergey!
It's amazing that there is still no compiler that targets dynamic methods; somebody has to work on this.
If nobody does, that would be an interesting project, if only I had any spare time :(

I wouldn't agree that script in XSLT is evil. I like the feature. The fact that the implementation of this feature in .NET Framework 1 and 2 has a serious usability problem doesn't mean the feature is bad and can't be implemented better.

In the days of MSXML, scripts were slow because they required starting a scripting environment to interpret VBScript or JScript. This is quite expensive, and extension objects were introduced as a workaround.
.NET allowed compiling all scripts to assemblies, and scripts became much faster than extension objects, especially in XslCompiledTransform, which binds to the script functions at compile time. This reflects the real advantage of scripts over extension objects: scripts are available at compile time. That makes possible name and type verification, early binding and many forms of optimization, including type conversion elimination.
With scripts, I also like the ability to write them in the same file where I call them. (And you can still put scripts in a separate file using xsl:include.)

The well-known problem with scripts in .NET Framework 2.0 is that CodeDOM generates types in order to compile scripts, and types can be unloaded only together with the entire AppDomain. The problem is not with the scripts themselves but with the lack of an adequate technology in .NET to compile them. The script unloading problem would not exist if the CLR solved the type unloading problem (there is no evidence that this is going to happen soon), or if CodeDOM or some other technology allowed compiling script blocks to dynamic methods (from my perspective, the C# and Co. compilers should be managed classes that take a TextReader as input and write results to a MethodBuilder).
The script unloading problem also disappears with the XSLT compiler. In this case compiled stylesheets become normal assemblies and can't be unloaded either, so scripts are no longer special. (In this case we have a packaging problem: a stylesheet with script blocks compiles to multiple assemblies instead of one, but that is a different story.)

Thus, scripts have a chance to overcome the usability problem. Extension objects are unlikely to get better, given the way they are designed.
Instead, we are discussing adding another CLR binding mechanism, where users would be able to describe in the stylesheet which CLR types they want to use and call methods of these types from XPath expressions in the stylesheet. (Similar to what Common Java Binding does.) In this case XslCompiledTransform would be able to bind to methods at compile time.

In addition, I'd like to point to one more victim of the script unloading problem:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=846988&SiteID=1

Sergey

maxtoroq, it doesn't matter which language you are using. The above applies to both the XslTransform and XslCompiledTransform classes in .NET.

Do you know if this applies if you use JScript or JavaScript as the msxsl:script language?

Good point, Yuriy! Now I have to update my post.

Compare the following call stacks to see how differently the same function is invoked depending on whether it lives in msxsl:script or in an XSLT extension object.

EXTENSION OBJECT:
=================
} net2xslt.exe!net2xslt.Program.MSE.test(object o = Position=0, Current=null) Line 19
[Native to Managed Transition]
[Managed to Native Transition]
System.Data.SqlXml.dll!System.Xml.Xsl.Runtime.XmlExtensionFunction.Invoke(object extObj, object[] args) + 0x42 bytes
System.Data.SqlXml.dll!System.Xml.Xsl.Runtime.XmlQueryContext.InvokeXsltLateBoundFunction(string name, string namespaceUri, System.Collections.Generic.IList{System.Xml.XPath.XPathItem}[] args) + 0x345 bytes
System.Xml.Xsl.CompiledQuery!{xsl:template match="/"}() Line 18 + 0x12b bytes XSLT

MSXSL:SCRIPT
============
} l_yjrpgf.dll!System.Xml.Xsl.CompiledQuery.Script1.test(object o = Position=0, Current=null) Line 10
System.Xml.Xsl.CompiledQuery!{xsl:template match="/"}() Line 18 + 0x123 bytes XSLT
System.Xml.Xsl.CompiledQuery!System.Xml.Xsl.CompiledQuery.Query.{xsl:apply-templates}(System.Xml.Xsl.Runtime.XmlQueryRuntime {urn:schemas-microsoft-com:xslt-debug}runtime = {System.Xml.Xsl.Runtime.XmlQueryRuntime}) + 0xc5 bytes


However, if used carefully, msxsl:script provides much value. The XSLT compiler generates direct calls to msxsl:script functions, while for XSLT extension objects it generates invocations via the reflection API. If you replace calls to EXSLT.NET string handling functions with C# equivalents in msxsl:script blocks, you get a significant performance boost if they are invoked often. So, if you can manage to cache compiled XSLT instances in memory, script blocks are very useful. We dramatically increased the performance of our application by replacing the EXSLT.NET calls with msxsl:script blocks. This applies to XslCompiledTransform only.
