June 21, 2006

Rotor 2.0

This is old news, but I somehow missed it, so I'll post it for the news-challenged like me. Microsoft has released "Shared Source Common Language Infrastructure 2.0" aka Rotor 2.0 - buildable source code of the ECMA CLI and ECMA C#. This is roughly the .NET 2.0 sources with the original comments. Priceless! It's released under the "MICROSOFT SHARED SOURCE CLI, C#, AND JSCRIPT LICENSE".

New in this release:

  • Full support for Generics.
  • New C# 2.0 features like Anonymous Methods, Anonymous Delegates and Generics
  • BCL additions.
  • Lightweight Code Generation (LCG).
  • Stub-based dispatch. (What the hell is that?)
  • Numerous bug fixes.

There is always Reflector, but Rotor is different - you build it, debug with it, learn and extend the CLI. Now what do I want to play with? An editable XPathDocument, an XSLT2DLL compiler or an extendable XmlReader factory maybe...

foActive <X>Styler

And continuing the Word-as-XSL-FO-editor theme - take a look at a brand new tool called foActive <X>Styler:

foActive <X>Styler is a plug-in for Microsoft Word 2003 Professional which allows a user to design and test dynamic document templates right from within the Word authoring environment.

<X>Styler is used to create XSL templates for server-based transformation for high-volume dynamic document print applications such as direct mail, correspondence, invoicing, statements, contracts, and legal forms.
And more:
Writing XSL templates that generate XSL FO output can be a difficult task, one suited for an engineer and not a marketing person. What the industry needed was an easy-to-use tool for designing templates to convert XML to XSL FO using XSL. There are applications that have recently emerged to do just this, however these are standalone applications designed from the ground-up for just this purpose. As such, they can be unnecessarily complex and require specific custom training to master. They expose all the functionality and complexities of XSL to the end-user.

And so foActive designed <X>Styler, merging the most common desktop application in use -- Microsoft Word -- with the difficult to master XSL design. We coupled the whole system to the industry's best XSL FO engine -- RenderX -- to deliver a complete solution for a wide variety of XSL design tasks.
That's what I was talking about all the way.

The price is set at $199 and the beta program is open. Sounds really cool.

XSLfast 3.0 released - XSL-FO WYSIWYG editor

jCatalog Software AG has released XSLfast 3.0 - an XSL-FO WYSIWYG editor. See what's new in version 3.0. In general XSL-FO isn't meant to be authored by hand, the idea is that XSL-FO is generated using XSLT. Unfortunately that requires knowledge of the twisted XSL-FO vocabulary and, well, XSLT. I always knew a WYSIWYG editor could save XSL-FO, and XSLfast might be the one. If only the price wasn't a freaking 890,00 EUR per license. And that probably doesn't include the XSL-FO formatter itself!

Btw, after years and years the Apache FOP team is finally discussing a 1.0 release...

June 15, 2006

New Microsoft XML API - XmlLite

And you thought XML is done? No way. It's an alive and kicking technology. And here is just one more proof: yet another new XML API from Microsoft - XmlLite. It's a native library for building high-performance secure XML-based applications. The XmlLite library is small by design - it only includes a pull XML parser (a native analog of .NET's XmlReader), an XML writer (a native analog of .NET's XmlWriter) and an XML resolver (similar to .NET's XmlResolver). XmlLite is meant to be a small, simple, secure, standards-compliant but damn fast library to read and write XML. It's claimed to be able to parse XML even faster than MSXML. What I found especially compelling is the XmlLite API's similarity to .NET - no need to learn yet another way to read and write XML, it's a lite version of .NET's XmlReader/XmlWriter, but for native programming. It's "lite", so: no validation, very limited DTD processing (entity expansion and defaults for attributes only), no ActiveX, no scripting languages, not thread-safe etc.

Why another XML API?

XmlLite doesn't use or link to MSXML, it's a separate standalone DLL. The reason why it's a separate DLL and not a part of MSXML is probably the size of the MSXML DLL and its many dependencies, which not all applications are willing to tolerate. The latest msxml6.dll is 1.3 MB and it depends on mlang.dll, wininet.dll and urlmon.dll (about 700 KB each). XmlLite.dll is just 115 KB and depends on nothing.

How do I develop with XmlLite?

The XmlLite SDK is part of the "Microsoft® Windows® Software Development Kit (SDK) for Beta 2 of Windows Vista and WinFX Runtime Components" aka Windows SDK. That of course doesn't mean XmlLite works only on Windows Vista (although it's expected to ship with Vista). It's a plain Win32 DLL you can work with even in Visual Studio 6. So - install the Windows SDK (don't forget to check the "Windows Vista Headers and Libs" option while installing). That will give you XmlLite.h, XmlLite.lib and the documentation. That's enough for compiling and linking your application. In order to run it you also need the XmlLite runtime - the DLL. Currently it only comes with the IE7 and Vista betas, but if you don't want to install either of these, here is a trick - download the latest IE7 installer, but don't run it. Unzip it instead and extract xmllitesetup.exe. This is the XmlLite runtime installer, which will install XmlLite.dll into your system.

The XmlLite reader is a pull-based (as opposed to SAX, which is push-based), non-caching, forward-only XML parser. If you are not familiar with pull-based parsing, read this. Pull XML parsing is so much easier to program with that I think everybody using SAX/MSXML should consider switching to XmlLite - now you've got an alternative which is not only faster but also better.
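
To make the push/pull difference concrete, here is a minimal portable sketch. It is not XmlLite or SAX code: the "document" is a pre-tokenized list of nodes, and every name in it (Node, PushParse, ToyPullReader, FirstTitlePush, FirstTitlePull) is made up for illustration. It only shows who owns the loop in each model.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy event stream standing in for a parsed document (not a real XML parser).
enum class NodeKind { StartElement, Text, EndElement };

struct Node {
    NodeKind kind;
    std::string value;  // element name, or text content for Text nodes
};

// Push model (SAX style): the parser owns the loop and calls the handler.
void PushParse(const std::vector<Node>& nodes,
               const std::function<void(const Node&)>& handler)
{
    for (const Node& node : nodes)
        handler(node);
}

// Pull model (XmlReader style): the caller owns the loop and asks for nodes.
class ToyPullReader {
public:
    explicit ToyPullReader(const std::vector<Node>& nodes) : nodes_(nodes) {}

    // Advances to the next node; returns false at the end of input.
    bool Read() {
        if (next_ >= nodes_.size()) return false;
        current_ = &nodes_[next_++];
        return true;
    }
    const Node& Current() const {
        assert(current_ != nullptr);  // Read() must have succeeded first
        return *current_;
    }

private:
    const std::vector<Node>& nodes_;
    std::size_t next_ = 0;
    const Node* current_ = nullptr;
};

// Push: parsing state has to live outside the callback, and the handler
// keeps being called for the rest of the input even after the answer is found.
std::string FirstTitlePush(const std::vector<Node>& nodes)
{
    bool inTitle = false;
    bool found = false;
    std::string result;
    PushParse(nodes, [&](const Node& n) {
        if (found) return;
        if (n.kind == NodeKind::StartElement && n.value == "title") {
            inTitle = true;
        } else if (n.kind == NodeKind::Text && inTitle) {
            result = n.value;
            found = true;
        } else if (n.kind == NodeKind::EndElement && n.value == "title") {
            inTitle = false;
        }
    });
    return result;
}

// Pull: read what you need and return as soon as you have it.
std::string FirstTitlePull(const std::vector<Node>& nodes)
{
    ToyPullReader reader(nodes);
    while (reader.Read()) {
        if (reader.Current().kind == NodeKind::StartElement &&
            reader.Current().value == "title") {
            if (reader.Read() && reader.Current().kind == NodeKind::Text)
                return reader.Current().value;
        }
    }
    return std::string();
}
```

Both functions return the same text for the same node list; the pull version just needs no flags and stops reading as soon as it has the answer.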

Here is my sample (you can find more samples in the Windows SDK documentation) of extracting a value from an XML file. Sample XML document:

<config>    
    <key name="mykey" value="myval"/>
    <key name="foo" value="bar"/>
</config>
And I want to read the value of the key named "foo":
#include "stdafx.h"
#include <atlbase.h>
#include "xmllite.h"
#include <strsafe.h>
#include <shlwapi.h>  // SHCreateStreamOnFile (link with shlwapi.lib)

int _tmain(int argc, _TCHAR* argv[])
{
  HRESULT hr;
  CComPtr<IStream> pFileStream;
  CComPtr<IXmlReader> pReader;
  XmlNodeType nodeType;
  const WCHAR* pName;
  const WCHAR* pValue;


  //Open XML document
  if (FAILED(hr = SHCreateStreamOnFile(L"config.xml", 
    STGM_READ, &pFileStream)))
  {
    wprintf(L"Error opening XML document, error %08.8lx", hr);
    return -1;
  }

  if (FAILED(hr = CreateXmlReader(__uuidof(IXmlReader), 
    (void**)&pReader, NULL)))
  {
    wprintf(L"Error creating XmlReader, error %08.8lx", hr);
    return -1;
  }

  if (FAILED(hr = pReader->SetInput(pFileStream)))
  {
    wprintf(L"Error setting input for XmlReader, error %08.8lx", hr);
    return -1;
  }

  while (S_OK == (hr = pReader->Read(&nodeType))) 
  {
    switch (nodeType)
    {
    case XmlNodeType_Element:
      if (FAILED(hr = pReader->GetQualifiedName(&pName, NULL)))                      
      {
        wprintf(L"Error reading element name, error %08.8lx", hr);
        return -1;
      }
      if (wcscmp(pName, L"key") == 0)
      {
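        // MoveToAttributeByName returns S_FALSE when the attribute is missing,
        // so compare against S_OK rather than using SUCCEEDED().
        if (S_OK == (hr = 
          pReader->MoveToAttributeByName(L"name", NULL)))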
        {
          if (FAILED(hr = pReader->GetValue(&pValue, NULL)))                      
          {
            wprintf(L"Error reading attribute value, error %08.8lx", hr);
            return -1;
          }
          if (wcscmp(pValue, L"foo") == 0) 
          {
            //That's an element we are looking for
            // S_FALSE (attribute not found) is treated as an error here too
            if (S_OK != (hr = 
              pReader->MoveToAttributeByName(L"value", NULL)))
            {
              wprintf(L"Error reading attribute \"value\", error %08.8lx", hr);
              return -1;
            }
            if (FAILED(hr = pReader->GetValue(&pValue, NULL)))                      
            {
              wprintf(L"Error reading attribute value, error %08.8lx", hr);
              return -1;
            }
            wprintf(L"Key \"foo\"'s value is \"%s\"", pValue);
          }
        }    
      }
      break;
    }
  }
  return 0;
}

The XmlLite reader and writer work on an instance of IStream. You can use the standard SHCreateStreamOnFile/CreateStreamOnHGlobal functions to read from a file or from memory. If you need something else, e.g. reading from a socket, the XmlLite SDK contains a handy sample of a class implementing IStream you can start with.

Instances of XmlLite's XmlReader and XmlWriter are meant to be reusable - first you create an XmlReader/XmlWriter instance and then attach a stream to read from or write to using the SetInput() or SetOutput() methods. At any time you can reset the XmlReader/XmlWriter and start working with a new stream.

Security. Of course DTD processing is turned off by default, just like in .NET 2.0 and MSXML6. Besides, XmlLite also supports fairly advanced security features such as limiting memory consumption, limiting maximum element depth and limiting entity expansion. The latter makes XmlLite immune to the notorious billion laughs attack - a 1 KB well-formed XML document that kills IE, Visual Studio and almost any other tool which tries to parse it and expand the laughing entities. Even with DTD support turned on, XmlLite just stops parsing when the entity expansion limit is reached. Cool, huh?
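
A bit of arithmetic shows why an expansion limit matters. The sketch below assumes the usual shape of the billion laughs document: a 3-character base entity ("lol") and nine further entity levels, each referencing the previous level ten times. The function names and the depth-first counter are made up for illustration; this is not how XmlLite implements its limit, just a portable toy version of the idea.

```cpp
#include <cassert>
#include <cstdint>

// Length of the fully expanded top-level entity: baseLength * fanOut^levels.
std::uint64_t ExpandedLength(std::uint64_t baseLength, std::uint64_t fanOut,
                             unsigned levels)
{
    std::uint64_t length = baseLength;
    for (unsigned i = 0; i < levels; ++i) {
        assert(fanOut == 0 || length <= UINT64_MAX / fanOut);  // no overflow
        length *= fanOut;  // each level repeats the previous one fanOut times
    }
    return length;
}

// Toy expansion counter (an assumption, not XmlLite's algorithm): resolves an
// entity at the given level depth-first and gives up once more than `limit`
// entity references have been resolved. `resolved` accumulates the count.
bool ExpandWithLimit(unsigned level, unsigned fanOut, std::uint64_t limit,
                     std::uint64_t& resolved)
{
    if (++resolved > limit)
        return false;  // limit hit: stop instead of expanding further
    if (level == 0)
        return true;   // level 0 is plain text, nothing nested inside
    for (unsigned i = 0; i < fanOut; ++i)
        if (!ExpandWithLimit(level - 1, fanOut, limit, resolved))
            return false;
    return true;
}
```

Under those assumptions ExpandedLength(3, 10, 9) comes to three billion characters from a document of about 1 KB, while ExpandWithLimit(9, 10, 1000, resolved) bails out after a thousand or so references instead of a billion.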

One more cool feature - XmlLite can read XML fragments, which is a recommended way to store frequently updated data such as log files.

And that's not all. What about a random access mode in which the XmlLite parser stores not attribute values, but attribute positions in the stream instead? Non-Extractive XML Parsing comes true. This can seriously reduce memory consumption in some scenarios. Of course the underlying stream must be seekable for this.

Reading attribute or element values in chunks, IXmlResolver for total control over resolving external entities, IMalloc to control reader/writer memory allocation. That's all great stuff.

Now about the dark side. IXmlReader provides the bare minimum needed for XML parsing. After all, it's XmlLite. I'm used to the luxury of .NET's XmlReader and miss utility methods such as GetAttribute(), MoveToAttribute(), MoveToContent(), ReadElementString(), ReadInnerXml(), ReadOuterXml(), ReadToDescendant(), Skip() etc. XmlLite doesn't implement those, trying to keep the API as small as possible. But I believe these methods, while being pure utility, are essential to the very nature of pull XML parsing. Without them pull parsing erodes into pseudo-push parsing - you gotta build a pseudo-push engine (I mean that "while (reader.Read())" loop with a switch within) and hook handlers for the nodes you are interested in - instead of reading the data you need directly. Not to mention that implementing these methods properly can be tricky and error-prone. I think I'll provide an IXmlReader helper class with those missing methods.
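
Here is a rough portable sketch of what such helpers buy you. It is not written against IXmlReader: MinimalReader is a made-up reader over pre-tokenized nodes that offers nothing but Read() and two accessors, and ReadToFollowing/SkipSubtree are hypothetical helpers loosely modeled on the .NET methods (SkipSubtree leaves the reader on the matching end tag, which is a simplification of what .NET's Skip() does).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class NodeKind { StartElement, Text, EndElement };

struct Node {
    NodeKind kind;
    std::string value;  // element name, or text content for Text nodes
};

// Hypothetical bare-minimum pull reader: Read() plus accessors, nothing else.
class MinimalReader {
public:
    explicit MinimalReader(const std::vector<Node>& nodes) : nodes_(nodes) {}

    bool Read() {
        if (next_ >= nodes_.size()) return false;
        ++next_;
        return true;
    }
    NodeKind Kind() const { assert(next_ > 0); return nodes_[next_ - 1].kind; }
    const std::string& Value() const {
        assert(next_ > 0);
        return nodes_[next_ - 1].value;
    }

private:
    const std::vector<Node>& nodes_;
    std::size_t next_ = 0;
};

// Advances to the next start element with the given name.
bool ReadToFollowing(MinimalReader& reader, const std::string& name)
{
    while (reader.Read())
        if (reader.Kind() == NodeKind::StartElement && reader.Value() == name)
            return true;
    return false;
}

// Skips the subtree of the current start element, leaving the reader on the
// matching end tag. Returns false if the input ends first.
bool SkipSubtree(MinimalReader& reader)
{
    assert(reader.Kind() == NodeKind::StartElement);
    int depth = 1;
    while (depth > 0) {
        if (!reader.Read()) return false;
        if (reader.Kind() == NodeKind::StartElement) ++depth;
        else if (reader.Kind() == NodeKind::EndElement) --depth;
    }
    return true;
}

// With helpers the code reads top to bottom, no switch and no state flags:
// return the text of the first <key> that follows the <junk> subtree.
std::string KeyOutsideJunk(const std::vector<Node>& nodes)
{
    MinimalReader reader(nodes);
    if (ReadToFollowing(reader, "junk") && SkipSubtree(reader) &&
        ReadToFollowing(reader, "key") && reader.Read() &&
        reader.Kind() == NodeKind::Text)
        return reader.Value();
    return std::string();
}
```

The depth counting in SkipSubtree is exactly the kind of thing that is easy to get subtly wrong, which is why I'd rather have it written once in a helper than in every reading loop.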

So meet XmlLite, the tiny but mighty third Microsoft XML library.

June 13, 2006

Bruce Eckel's general purpose XML manipulation library

Bruce Eckel doesn't like XML. But alas - it's everywhere and he has to deal with it. So as you can expect, he goes and creates a "general purpose XML manipulation library called xmlnode" for Python. That should be easy, right? Just one class, no need for more. Alas, it doesn't support namespaces, mixed content, CDATA sections, comments, processing instructions, DTD or Doctype, and doesn't check well-formedness rules such as element and attribute names or the characters allowed in XML etc. Well, that must be version 0.0...

XSLT2/XPath2/XQuery1 fresh CRs

W3C has released fresh versions of the Candidate Recommendations of XML Query 1.0, XSLT 2.0, XPath 2.0 and supporting documents. No big changes - the xdt:* types have been moved to the xs:* namespace (damn XML Schema). See the new XQuery1/XPath2 type system below. Looks like XSLT2/XPath2/XQuery1 are moving fast toward Proposed Recommendation. What's weird is that the new documents all say "This specification will remain a Candidate Recommendation until at least 28 February 2006." Must be a mistake. Anyway, what are the chances now for XSLT 2.0 in .NET? The next major .NET release (Orcas) is expected in October 2007 or so (forget the newly announced .NET 3.0, which is actually .NET 2.0 + Avalon + Indigo). Plenty of time for XSLT2 to reach Recommendation status, even assuming that Microsoft actually freezes the codebase 6 months before shipping.

We all remember that the major arguments for Microsoft not implementing XSLT 2.0 were XQuery (they decided it's better) and XSLT2's draft status (so as not to repeat the WD-XSL story). Now that XQuery is wiped out from .NET and XSLT2 is becoming a full Recommendation, what could be the next argument against implementing it? Probably XLinq. But as XLinq evolves it becomes clear that XLinq doesn't really replace XSLT.

.NET provides amazing support for XSLT1: developing and debugging with Visual Studio, plus one of the best XSLT processors - XslCompiledTransform. That's great, and hence XSLT is everywhere nowadays, and you know what - we want more! XSLT 1.0 sucks, give us XSLT 2.0!

Here is a new XPath2/XQuery1 type system. I'd say it's definitely more elegant than it was before. Ready to go I think.

June 12, 2006

MSDN Wiki

Hmmm, community-driven MSDN documentation... tempting.
Microsoft has launched the MSDN Wiki Beta - sort of a wrapper around the MSDN documentation site, which adds a "Community Content" section to the bottom of each MSDN page. Anybody can contribute any content to that section. Here is my test contribution to the "XslCompiledTransform Class" page. Basically such community-driven documentation could be awesome. MSDN documentation is huge, and usually the subject you desperately need happens to be covered scarcely or even in a cryptic way. Microsoft admits they are just unable to cover all topics. Sad but true. So at least they can provide a centralized way for the community to contribute. One big question though is community content quality - somebody has to moderate all that stuff, otherwise it's gonna be filled with spam and lame questions in just a week.

June 7, 2006

4 months database query

I was querying a remote and probably distributed database recently. I entered my name, sent the query request and got a "No such record" response - four freaking months later! What kind of crazy database is it? That's Russia's police database. I requested a police certificate from the Russian embassy in Israel and it took them four months to query that information. Wow. No doubt they are still using a legacy database called "a huge pile of paper files" out there in Russia. As a matter of interest, the same query in Ukraine takes 1 day, while in Israel - 5 minutes (mostly to print the results).

June 4, 2006

New Microsoft certification - MCPD

I had a voucher for a free Microsoft certification exam which I got at the MVP summit last year, and it was due to expire May 31. So I went to see how I could use it. As you probably know, Microsoft has launched a new wave of certifications with the .NET 2.0 and Visual Studio 2005 release. So I found out that the "Microsoft Certified Professional Developer" series is now my target in this game. I'm an MCAD already, and in order to upgrade to MCPD I have to take 3 upgrade exams (there is no single MCPD credential, it's MCPD Web, MCPD Windows and MCPD Enterprise). The problem is that those upgrade exams aren't released yet. Happily I managed not to waste my expiring voucher though. MCPD Windows requires 3 exams and two of them I've already passed in beta form, which actually counts. So last week I took the 70-526 exam, "TS: Microsoft .NET Framework 2.0 - Windows-Based Client Development". I don't write much for Windows these days, so it wasn't a piece of cake. But not rocket science either. I should admit the new exams are much more comprehensive and tough. Well, anyway I'm MCPD Windows now.