December 27, 2006

The contest winners

And the winners are Dave Pawson and Leon Bambrick. Both of them are getting Visual Studio 2005 Team Suite with 1 year MSDN Premium Subscription. Congrats guys! I hope it will help with your work and so benefit the community.

Sorry to the rest - I only have 2 cards to give away...

Now, Dave and Leon, please contact me ASAP. I'm on vacation in rain-soaked Seattle and tomorrow I start a two-day flight back to Israel, while your offer expires Dec 31.

December 19, 2006

Using ms:string-compare() and the rest of the MS extension functions in an XPath-only context

XslCompiledTransform implements the following useful MSXML extension functions. But what if you need to use them in an XPath-only context - when evaluating XPath queries using XPathNavigator?

Function, signature and description:

ms:string-compare
    number ms:string-compare(string x, string y[, string language[, string options]])
    Performs lexicographical string comparison.
ms:utc
    string ms:utc(string time)
    Converts the prefixed date/time related values into Coordinated Universal Time and into a fixed (normalized) representation that can be sorted and compared lexicographically.
ms:namespace-uri
    string ms:namespace-uri(string name)
    Resolves the prefix part of a qualified name into a namespace URI.
ms:local-name
    string ms:local-name(string name)
    Returns the local name part of a qualified name by stripping out the namespace prefix.
ms:number
    number ms:number(string value)
    Takes a string argument in XSD format and converts it into an XPath number.
ms:format-date
    string ms:format-date(string datetime[, string format[, string locale]])
    Converts standard XSD date formats to characters suitable for output.
ms:format-time
    string ms:format-time(string datetime[, string format[, string locale]])
    Converts standard XSD time formats to characters suitable for output.

Here is a quick sketch of how to leverage the XslCompiledTransform implementation of these functions to create a custom XsltContext class. The code below implements only ms:string-compare(), but other functions can be added in a similar way. Here is how you use it:

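// Sample input: the original markup was lost, so the element names here are
// assumed from the XPath expression (a root element with two <value> children).
string xml = "<root><value>ABCD</value><value>EFGH</value></root>";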
XPathExpression expr = 
    XPathExpression.Compile("ms:string-compare(value[1], value[2])");
MsXsltContext ctx = new MsXsltContext();
ctx.AddNamespace("ms", "urn:schemas-microsoft-com:xslt");
expr.SetContext(ctx);
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
XPathNavigator nav = doc.DocumentElement.CreateNavigator();
Console.WriteLine(nav.Evaluate(expr));

And here is a sample MsXsltContext implementation:

using System;
using System.Xml.Xsl;
using System.Xml.XPath;
using System.Xml;
using System.Reflection;
using System.Xml.Xsl.Runtime;
using System.Globalization;

public class MsXsltContext : XsltContext
{
    // Function to resolve references to my custom functions.
    public override IXsltContextFunction ResolveFunction(string prefix, 
        string name, XPathResultType[] argTypes)
    {
        string namespaceUri = this.LookupNamespace(prefix);
        if (namespaceUri == "urn:schemas-microsoft-com:xslt")
        {
            switch (name)
            {
                case "string-compare":
                    return new MsExtensionFunction(name, 2, 4,
                        new XPathResultType[] { XPathResultType.String, 
                        XPathResultType.String, XPathResultType.String, 
                        XPathResultType.String },
                        XPathResultType.Number);
            }
        }

        return null;
    }

    public override IXsltContextVariable ResolveVariable(string prefix, 
        string name)
    {
        return null;
    }

    public override int CompareDocument(string baseUri, string nextBaseUri)
    {
        return 0;
    }

    public override bool PreserveWhitespace(XPathNavigator node)
    {
        return true;
    }

    public override bool Whitespace
    {
        get
        {
            return true;
        }
    }
}

public class MsExtensionFunction : IXsltContextFunction
{
    private XPathResultType[] argTypes;
    private XPathResultType returnType;
    private string name;
    private int minArgs;
    private int maxArgs;
    private MethodInfo method;

    public int Minargs
    {
        get
        {
            return minArgs;
        }
    }

    public int Maxargs
    {
        get
        {
            return maxArgs;
        }
    }

    public XPathResultType[] ArgTypes
    {
        get
        {
            return argTypes;
        }
    }

    public XPathResultType ReturnType
    {
        get
        {
            return returnType;
        }
    }

    public MsExtensionFunction(string name, int minArgs, 
        int maxArgs, XPathResultType[] argTypes, XPathResultType returnType)
    {
        this.name = name;
        this.minArgs = minArgs;
        this.maxArgs = maxArgs;
        this.argTypes = argTypes;
        this.returnType = returnType;
    }

    public object Invoke(XsltContext xsltContext, object[] args, 
        XPathNavigator docContext)
    {
        switch (name)
        {
            case "string-compare":
                if (method == null)
                {
                    method = typeof(XsltFunctions).GetMethod("MSStringCompare");
                }

                object[] fullArgs = new object[maxArgs];
                fullArgs[0] = ConvertToString(args[0]);
                fullArgs[1] = ConvertToString(args[1]);
                fullArgs[2] = args.Length > 2 ? ConvertToString(args[2]) : "";
                fullArgs[3] = args.Length > 3 ? ConvertToString(args[3]) : "";

                return method.Invoke(null, fullArgs);
        }
        return null;
    }

    private static string ConvertToString(object argument)
    {
        XPathNodeIterator it = argument as XPathNodeIterator;
        if (it != null)
        {
            return IteratorToString(it);
        }
        else
        {
            return ToXPathString(argument);
        }
    }

    private static string IteratorToString(XPathNodeIterator it)
    {
        if (it.MoveNext())
        {
            return it.Current.Value;
        }
        return string.Empty;
    }

    private static String ToXPathString(Object value)
    {
        string s = value as string;
        if (s != null)
        {
            return s;
        }
        else if (value is double)
        {
            return ((double)value).ToString("R", 
                NumberFormatInfo.InvariantInfo);
        }
        else if (value is bool)
        {
            return (bool)value ? "true" : "false";
        }
        else
        {
            return Convert.ToString(value, 
                NumberFormatInfo.InvariantInfo);
        }
    }
}

Don't forget to add a reference to System.Data.SqlXml.dll.

December 14, 2006

My daily WTF: Gmail for mobile stores password in clear text???

The Gmail client for mobile devices was released by Google a month ago. It's a Java ME MIDP2 application, as cool looking as one would expect from Google. I went and installed it last week on my Motorola V3X.

Well, I found out that while Gmail for mobile works on hundreds of different mobile devices, it doesn't work on mine. I got a weird error message: "Sorry, the Gmail mobile app will not work on your phone. Your phone doesn't have the appropriate certificate to communicate with Gmail. Try accessing Gmail on your mobile browser at http://m.gmail.com". It sucks.

Apparently my phone lacks the Verisign Class 3 public certificate. Apparently that's a known problem, and on some phones it can be solved by adding the certificate, which is available from Verisign. Alas, it seems to be impossible to add another root certificate to a Motorola V3X phone - I tried every single way - via Motorola Phone Tools, Bluetooth obex, P2K drivers - nothing helps. Even if I put the new certificate into the /a/mobile/certs/root/x509/kjava/ folder, the phone still won't recognize it. Motodev support didn't help - "Can I help you? What is Gmail for mobile? Give me URL. It clearly says Download Gmail for the Motorola V3 RAZR (US/Canada). You are from Israel. Issue closed." Well, I still hope someone will solve this problem for Motorola phones too.

Anyway, while digging around my phone's filesystem I found the folder where J2ME applications are installed (/a/mobile/kjava/installed/), and there I found the Gmail jar, an image png file and other working files, including an RMS file. RMS stands for the MIDP Record Management System - persistent storage for J2ME MIDlets. Seeing the string "Login store" inside it, I couldn't resist scanning it. What I found, though, was my Gmail username and password in clear text!

0000000FF0:  FF FF FF FF FF FF FF FF │ FF FF FF FF FF FF FF FF                  
0000001000:  00 10 6F 6C 65 67 74 6B │ 40 67 6D 61 69 6C 2E 63   ►olegtk@gmail.c
0000001010:  6F 6D 00 0A 6D 79 70 61 │ 73 73 77 6F 72 64 FF FF  om ◙mypassword  
0000001020:  FF FF FF FF FF FF FF FF │ FF FF FF FF FF FF FF FF                  

WTF???

Well, I can't say for sure that the Gmail for mobile application really stores the user's password in clear text, because I never got it fully working on my phone. Chances are they encrypt it after the first successful login.

I need somebody to confirm this. If you've got the Gmail for mobile application installed on your mobile, please take a look at how your password is stored. I have no idea which mobile devices allow direct access to the file system, but at least it's very easy on Motorola phones. Just install P2K drivers and P2K Phone File Manager, run it and open the /a/mobile/kjava/installed/ folder. Find Gmail's RMS file and inspect it.

PS. I did contact Google about this issue, but never got any response.

December 12, 2006

HtmlAgilityPack - DOM and XPath over HTML

Today I saw Josh Christie's post about "Better HTML parsing and validation with HtmlAgilityPack".

HtmlAgilityPack is an open source project on CodePlex.  It provides standard DOM APIs and XPath navigation -- even when the HTML is not well-formed!

Well, DOM and XPath over malformed HTML isn't a new idea. I've been using XPath when screenscraping HTML for years - it seems to me a way more reliable method than regular expressions. All you need in .NET is to read HTML as XML using the wonderful SgmlReader from Chris Lovett. SgmlReader is an XmlReader API over any SGML document such as HTML.

But what I don't get is why anyone (but browser vendors) would want to implement DOM and XPath over HTML as is. Reimplementing not-so-simple XML specs over a malformed source instead of making it well-formed and using the standard API? Maybe I'm not agile enough, but I don't think that's a good idea. I prefer a standard, proven XML API.

Here is Josh's sample, which validates that Microsoft's home page lists Windows as the first item in the navigation sidebar, reimplemented using SgmlReader:

SgmlReader r = new SgmlReader();
r.Href = "http://www.microsoft.com";                        
XmlDocument doc = new XmlDocument();
doc.Load(r);                
//pick the first <li> element in navigation section
XmlNode firstNavItemNode = 
  doc.SelectSingleNode("//div[@id='Nav']//li");
//validate the first list item in the Nav element says "Windows"        
Debug.Assert(firstNavItemNode.InnerText == "Windows"); 

I'm staying with SgmlReader.

December 10, 2006

nxslt v2.1 released - now including NAnt/MSBuild task

I just uploaded the nxslt v2.1 release. In addition to the nxslt.exe command line tool, it now also includes an nxslt task implementation for NAnt and MSBuild.

Why another XSLT task? Because the existing ones suck. NAnt includes a standard "style" task, but it uses the obsolete, slow and buggy XslTransform engine to perform transformations. MSBuild doesn't include an XSLT task at all, while the Xslt task from the MSBuild Community Tasks Project is broken. Not to mention that these tasks are bare-bones ones. If you need a better XSLT task for NAnt or MSBuild - the nxslt task is for you.

Here are some highlights of this new nxslt task.

The nxslt task is a free, feature-rich task for NAnt and MSBuild that lets you perform XSL Transformations (XSLT) using the .NET Framework 2.0 XSLT 1.0 implementation - the XslCompiledTransform class. The nxslt task supports plenty of advanced features:

  • XML Base, XInclude, XPointer
  • Embedded stylesheets
  • <?xml-stylesheet?> processing instruction
  • Multiple output documents via exsl:document extension element
  • Custom URI resolving
  • Custom extension functions
  • 70+ EXSLT and EXSLT.NET extension functions
  • Credentials to access XML documents and XSLT stylesheets
  • Pretty printing
  • Batch processing

nxslt and nxslt task are free tools under BSD license. Download here.

Btw, besides transforming XML documents nxslt task can also be used for pretty printing or resolving XIncludes. I'll post on this later.

December 6, 2006

The Coolest XML Project Contest

I completely forgot that I still have one Visual Studio 2005 Team Suite with MSDN Premium Subscription gift card to give away. And it expires 12/31! Oh boy, what do I do now??? So for the next 2 weeks I'll be holding "The Coolest XML Project Contest".

Here is the deal. If you are working on a cool project, product, web site, service or whatever, no matter whether open source or commercial, which uses any XML technology in any way (hey, doesn't everything match this description nowadays?), you have a chance to win this $10K worth gray box from Microsoft called "Visual Studio 2005 Team Suite with MSDN Premium Subscription for one year". Everybody is eligible, no limitations or restrictions (well, microsofties and devs working for Al-Qaeda are obviously out).

To enter the contest you only need to describe your thing (project/product/web site/service/whatever) to me. Keep it simple. Don't forget to mention how any XML technology is involved. Just post it as a comment on this page. Or if you need pictures to describe it - write on your blog and post a link here. If you are not ready to disclose your stuff to the public eye, send me an email (and then if you win, I promise not to unveil project details until you tell me so).

It's me who will decide which entry is the coolest and so who is the winner. I'll probably consult my friends though.

Well, I'm an XML and XSLT geek doing open source projects, so I'm naturally biased toward such kinds of things, but that means nothing. What I really want is to give it away to somebody who actually needs it and who is going to use it to build something cool, preferably to benefit the community.

I'm accepting entries till December 24, 2006. Hurry up. Take a chance.

Java 6 gets pull XML API

Better late than never - the forthcoming Java 6 (currently a Release Candidate) will include StAX, a pull-based streaming XML API. .NET has had a pull-based XML parser (XmlReader) from the very beginning, and Microsoft has been arguing that .NET's XmlReader is better than SAX since at least 2002. No, I'm not saying Java is catching up with .NET on one more feature, no. I'm just glad I will be able to parse XML using the same model and a very similar API on both platforms.
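For anyone who hasn't used the pull model: the consumer drives the parser by asking for the next node in a loop, instead of registering callbacks as with SAX. Here is a minimal sketch with XmlReader; the PullDemo class and CountElements method are made-up names for illustration only.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class PullDemo
{
    // Counts elements with the given local name by pulling nodes one at a time.
    public static int CountElements(string xml, string localName)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Read() advances to the next node; the caller decides when to pull.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == localName)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

A StAX loop in Java has the same shape: ask the reader for the next event, look at its type, move on.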

December 5, 2006

NAnt doesn't suck, but MSBuild does it big time

I've been building NAnt and MSBuild tasks for the nxslt tool for the last two days, and the bottom line of my experience is "previously I thought NAnt sucks, but now I know NAnt is brilliant and it's MSBuild that sucks in a really big way".

My complaints about NAnt were that

  1. NAnt being .NET Ant clone somehow has different license - while Java Ant is under Apache License, NAnt is under GPL. Now that Sun GPL-ed Java it might sound no big deal, but I personally was in a situation when a project manager said no we won't use NAnt because it's GPL and we don't want such a component in our big bucks product.
  2. NAnt core dlls aren't signed. That in turn means I can't sign my assembly and so can't put it into GAC. Weird.

Really minor ones as I realize now. Besides - NAnt is brilliant. While MSBuild appears to be more rigid and limited. Apparently it's impossible to create MSBuild task that uses something more than just attributes. I mean in NAnt I have this:

<nxslt in="books.xml" style="books.xsl" out="out/params1.html">
  <parameters>
    <parameter name="param2" namespaceuri="foo ns" value="param2 value"/>
    <parameter name="param1" namespaceuri="" value="param1 value"/>
  </parameters>
</nxslt>

MSBuild doesn't seem to support this kind of task. An MSBuild task can only have attributes, not child elements. It can have references to some global entities defined at the project level, such as properties and task items. At first I thought task items seemed like good candidates for holding XSLT parameters, because task items can have arbitrary metadata. And that's exactly how the Xslt task from the MSBuild Community Tasks Project passes XSLT parameters:

<ItemGroup>
  <MyXslFile Include="foo.xsl">
    <param>value</param>
  </MyXslFile>
</ItemGroup>
            
<Target Name="report" >
  <Xslt Inputs="@(XmlFiles)"
    Xsl="@(MyXslFile)" 
    Output="$(testDir)\Report.html" />
</Target>

Parameters here get attached to an XSLT file item definition, which seems reasonable until you realize that you might want to run the same stylesheet with different parameters.

And what's worse - the above is actually plain wrong, because it only provides "name=value" for a parameter, while in XSLT a parameter name is a QName, i.e. an XSLT parameter is "{namespace URI}localname=value". And item metadata happens to be limited to plain name=value. A metadata element can't have attributes or a namespace prefix or be in a namespace... It's clear that an MSBuild task item is a bad place to define XSLT parameters for my task.

The last option I tried, and the one I settled on, is defining XSLT task parameters as global MSBuild project properties. Thank God at least properties can have arbitrary XML substructure! Here is how it looks:

<PropertyGroup>
  <XsltParameters>
    <Parameter Name="param1" Value="value111"/>
    <Parameter Name="param2" NamespaceUri="foo ns" Value="value222"/>
  </XsltParameters>
</PropertyGroup>

<Target Name="transform">
  <Nxslt In="books.xml" Style="books.xsl" Out="Out/params1.html" 
    Parameters="$(XsltParameters)"/>
</Target>

And here is how you implement it: create a string property "Parameters" in your task class. At task execution time this property will receive the <XsltParameters> element content (as a string!). Parse it with XmlReader and you are done. Beware - it's an XML fragment, so parse it as such (ConformanceLevel.Fragment).

Two problems with this approach. First, it makes me define parameters globally, not locally (as in NAnt) - hence if I have several transformations in one project, I have to watch carefully which parameters are for which transformation. Second - XML content as a string??? Otherwise it's good enough.
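The parsing step could look roughly like this. It is only a sketch, not the actual nxslt task code: XsltParameter and XsltParameterParser are names I made up for illustration, and the string passed to Parse() is assumed to be whatever the task's "Parameters" property received. Elements are matched by local name, so the sketch doesn't care which namespace the fragment's elements end up in.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical holder for one parsed <Parameter> element.
public class XsltParameter
{
    public string Name;
    public string NamespaceUri;
    public string Value;
}

// Hypothetical helper, not part of nxslt.
public static class XsltParameterParser
{
    public static List<XsltParameter> Parse(string fragment)
    {
        List<XsltParameter> result = new List<XsltParameter>();
        if (string.IsNullOrEmpty(fragment))
        {
            return result;
        }

        // The property value is a sequence of elements with no single root,
        // so the reader must be told to accept an XML fragment.
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ConformanceLevel = ConformanceLevel.Fragment;

        using (XmlReader reader =
            XmlReader.Create(new StringReader(fragment), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == "Parameter")
                {
                    XsltParameter p = new XsltParameter();
                    p.Name = reader.GetAttribute("Name");
                    // Missing attributes come back as null; default to "".
                    p.NamespaceUri = reader.GetAttribute("NamespaceUri") ?? "";
                    p.Value = reader.GetAttribute("Value") ?? "";
                    result.Add(p);
                }
            }
        }
        return result;
    }
}
```

Each parsed item can then be handed to XsltArgumentList.AddParam(name, namespaceUri, value) before running the transformation.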

Tomorrow I'm going to finish documenting the nxslt NAnt/MSBuild task and release it.

December 4, 2006

MSBuild custom task with a subtree?

I'm missing something obvious and have already spent about two hours on this simple problem. I hope somebody proficient in MSBuild drops me a line. How do I build an MSBuild custom task that has an XML subtree?

Here is my NAnt task:

<nxslt in="books.xml" style="books.xsl" out="out/catalog.html">
  <parameters>
    <parameter name="param1" namespaceuri="" value="val1"/>
  </parameters>
</nxslt>

How do I build a custom MSBuild task that accepts such a <parameters> subtree??? The documentation on MSBuild sucks. I mean, it's fine if you're just using tasks, but if you want to build your own task you're screwed.