<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:random="http://exslt.org/random"
    exclude-result-prefixes="random">
  <xsl:template match="/">
    10 random numbers:
    <xsl:for-each select="random:random-sequence(10)">
      <xsl:value-of select="format-number(., '##0.000')"/>
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

The result is:
10 random numbers: 0.311, 0.398, 0.698, 0.929, 0.418, 0.523, 0.667, 0.215, 0.915, 0.007

The function accepts an optional count of random numbers to generate (1 by default) and an optional seed (DateTime.Now.Ticks by default), and returns a node-set of <random> elements, each one containing a generated random number.
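To make the described contract concrete, here is a minimal C# sketch of a function with that shape. It is not the EXSLT.NET source: the class name RandomSequenceSketch, the wrapper element <randoms> and the truncation of the seed to an int are all assumptions made for illustration.

```csharp
using System;
using System.Globalization;
using System.Xml;
using System.Xml.XPath;

// Illustrative only: mimics the described behaviour, not the real implementation.
public static class RandomSequenceSketch
{
    // Default count is 1; default seed comes from the clock, as described above.
    public static XPathNodeIterator RandomSequence()
    {
        return RandomSequence(1);
    }

    public static XPathNodeIterator RandomSequence(int count)
    {
        return RandomSequence(count, DateTime.Now.Ticks);
    }

    public static XPathNodeIterator RandomSequence(int count, long seed)
    {
        XmlDocument doc = new XmlDocument();
        // Hypothetical wrapper element, needed only to hold the <random> nodes.
        XmlElement root = doc.CreateElement("randoms");
        doc.AppendChild(root);

        // System.Random takes an int seed, so the long is truncated (an assumption).
        Random rng = new Random(unchecked((int)seed));
        for (int i = 0; i < count; i++)
        {
            XmlElement item = doc.CreateElement("random");
            item.InnerText = rng.NextDouble().ToString("R", CultureInfo.InvariantCulture);
            root.AppendChild(item);
        }

        // Return the <random> elements as a node-set.
        return doc.CreateNavigator().Select("/randoms/random");
    }

    public static void Main()
    {
        XPathNodeIterator it = RandomSequence(10);
        while (it.MoveNext())
        {
            // Same formatting idea as format-number(., '##0.000') in the stylesheet.
            Console.Write(it.Current.ValueAsDouble.ToString("0.000", CultureInfo.InvariantCulture));
            Console.Write(it.CurrentPosition != it.Count ? ", " : Environment.NewLine);
        }
    }
}
```

Passing an explicit seed makes the sequence repeatable, which is handy for tests; leaving it out gives a different sequence on each run.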
EXSLT.NET team members are encouraged to review my implementation in the project's source repository and, if nobody objects, we can release EXSLT.NET version 1.1.
I'm talking about the "keep reading till element foo" pattern we are all familiar with:
while (reader.Read())
{
    if (reader.NodeType==XmlNodeType.Element && reader.Name=="foo")
    {
        ...
    }
}

The reader.Name=="foo" comparison is the crux here. The reader.Name property returns the parsed element name, atomized in the XmlReader's NameTable, while the "foo" string doesn't belong to any NameTable. That means the usual string comparison (pointers/length/char-by-char) occurs, which is obviously slow. That's not how it was meant to be! The "Object Comparison Using XmlNameTable with XmlReader" article of the .NET Framework Developer's Guide suggests a different usage pattern:
object cust = reader.NameTable.Add("Customer");
while (reader.Read())
{
    // The "if" uses efficient pointer comparison.
    if (cust == reader.Name)
    {
        ...
    }
}

Both strings compared here belong to the same NameTable, which reduces the comparison to a single cheap pointer comparison!
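For reference, here is the same pattern as a small self-contained program. The class and method names (NameTableDemo, CountElements) and the inline sample XML are made up for illustration, and the sketch uses object.ReferenceEquals to make the pointer comparison explicit instead of relying on == between an object and a string.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NameTableDemo
{
    // Counts elements named elementName by comparing references
    // against the reader's own NameTable.
    public static int CountElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            // Add() returns the atomized instance owned by this reader's NameTable.
            object atom = reader.NameTable.Add(elementName);
            int count = 0;
            while (reader.Read())
            {
                // Names returned by the reader are atomized in the same table,
                // so reference equality is enough here.
                if (reader.NodeType == XmlNodeType.Element &&
                    object.ReferenceEquals(atom, reader.Name))
                {
                    count++;
                }
            }
            return count;
        }
    }

    public static void Main()
    {
        string xml = "<bookstore><book><price>8.99</price></book>" +
                     "<book><price>11.99</price></book></bookstore>";
        Console.WriteLine(CountElements(new StringReader(xml), "price")); // prints 2
    }
}
```

The name has to be added to the NameTable of the reader that does the parsing; an atom taken from some other NameTable would never compare equal by reference.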
And what do you think Sun does in their "XML Processing Performance Java and .NET" comparison? The same reader.Name != "LineItemCountValue" stuff! It would be interesting to run their tests with such lines fixed.
According to my rough measurements, this unfortunate usage pattern costs about 1-20% of the parsing time, depending on many factors. For my test I parse the books.xml document and count "price" elements.
The result on my Win2K box is:
D:\projects\Test\bin\Release>Test.exe
Warming up...
Testing...
Time with NameTable: 1308.86 ms
Time with no NameTable: 1403.60 ms

Benchmarking is really fragile stuff and I'm sure the results will differ drastically, but basically what I wanted to say is that something needs to be done to fix this particular usage pattern of XmlReader so that it doesn't ignore the great NameTable idea. I encourage fellow MVPs, XmlInsiders and others not to post XmlReader samples where the NameTable is neglected.
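For anyone who wants to repeat the measurement, a comparison of this kind can be sketched roughly as follows. This is not the program that produced the numbers above: the class name NameTableBenchmark, the synthetic in-memory document standing in for books.xml and the iteration counts are all assumptions, and the timings will vary from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml;

public static class NameTableBenchmark
{
    // Builds a synthetic document; a stand-in for books.xml (an assumption).
    public static string BuildSampleXml(int books)
    {
        StringBuilder sb = new StringBuilder("<bookstore>");
        for (int i = 0; i < books; i++)
        {
            sb.Append("<book><title>Title</title><price>9.99</price></book>");
        }
        return sb.Append("</bookstore>").ToString();
    }

    // Reference comparison against an atomized name.
    public static int CountWithNameTable(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            object price = reader.NameTable.Add("price");
            int count = 0;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    object.ReferenceEquals(price, reader.Name))
                    count++;
            }
            return count;
        }
    }

    // Ordinary string comparison against a literal.
    public static int CountWithStringCompare(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            int count = 0;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "price")
                    count++;
            }
            return count;
        }
    }

    // Runs the counter repeatedly and returns the elapsed time in milliseconds.
    public static double TimeMs(Func<string, int> counter, string xml, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            counter(xml);
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        string xml = BuildSampleXml(1000);

        Console.WriteLine("Warming up...");
        TimeMs(CountWithNameTable, xml, 50);
        TimeMs(CountWithStringCompare, xml, 50);

        Console.WriteLine("Testing...");
        Console.WriteLine("Time with NameTable: {0:F2} ms", TimeMs(CountWithNameTable, xml, 500));
        Console.WriteLine("Time with no NameTable: {0:F2} ms", TimeMs(CountWithStringCompare, xml, 500));
    }
}
```

Both variants pay the same parsing cost, so whatever difference shows up comes from the name comparison alone. That is why the gap stays modest.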