January 2007 Archives

Microsoft to implement XSLT 2.0

Now it's official, from the Microsoft XML Team:

Our users have made it very clear that they want an XSLT 2.0 implementation once the Recommendation is complete.   A team of XSLT experts is now in place to do this, the same people who have been working on  the XSLT enhancements that will be shipped in the forthcoming "Orcas" release of Visual Studio / .NET 3.5.  Orcas development work is winding down in advance of Beta releases over the next several months, so there is no possibility of shipping  XSLT 2.0 in Orcas.   The XSLT team will, however, be putting out Community Technology Previews (CTP) with the XSLT 2 functionality and appropriate tooling as the implementation matures.  The eventual release date and ship vehicles (e.g. a future version of .NET or a standalone release over the Web) have not been determined, and depend on technical progress, customer demand, and other currently unknowable factors. 

Good. Very good news for those who invested in XSLT. XSLT 2.0 is sooooo much better, so much easier a language to develop with. And I'm sure this new Microsoft XSLT 2.0 engine is gonna rock.

PDF to be ISO standard too

Today's news from Adobe:

SAN JOSE, Calif. — Jan. 29, 2007 — Adobe Systems Incorporated (Nasdaq:ADBE) today announced that it intends to release the full Portable Document Format (PDF) 1.7 specification to AIIM, the Enterprise Content Management Association, for the purpose of publication by the International Organization for Standardization (ISO).

Looks like everybody nowadays wants to be open and ISO standardized. ODF is already an ISO standard, OOXML is on the way, and now PDF joins the club.

Btw, the Wikipedia article on PDF is definitely wrong (or written by Adobe): how on earth can this fully proprietary document format be called "an open file format created and controlled by Adobe Systems"?

Given that Adobe forced Microsoft to remove the "Save as PDF" feature from Office 2007 because they wanted to charge a fee for it, PDF clearly cannot be called an "open format": it's a proprietary format controlled by Adobe, and they wanted a fee from at least one vendor trying to implement it. I don't think that is an open format.

I'm going to try to change the Wikipedia article on PDF to see how it works. I'll report my progress.

And finally, one more curious comparison showing how heavily biased Wikipedia is: PDF vs RTF. Both are proprietary document formats, published and widely implemented by both commercial and open tools. But guess what:

Portable Document Format (PDF) is an open file format created and controlled by Adobe Systems, for representing two-dimensional documents in a device independent and resolution independent fixed-layout document format.

and

The Rich Text Format (often abbreviated to RTF) is a proprietary document file format developed by Microsoft since 1987 for cross-platform document interchange. Most word processors are able to read and write RTF documents.

This is a disturbing story. A phisher collected 56,000 MySpace user names and passwords and posted them to the "Full-Disclosure" mailing list, an open "unmoderated mailing list for the discussion of security issues" that anybody can subscribe to.

Now, of course the mailing list is open and is archived by dozens of sites, and of course MySpace could just change the passwords of the compromised users, but no: they instead decided to shut down one particular security site (seclists.org, why only this one?) that happens to also archive the "Full-Disclosure" mailing list.

And MySpace wanted it done real fast, so, not bothering with bullshit like contacting the seclists.org site owner or hosting company, they contacted the domain name registrar (!), which happens to be the well respected (so far) Go-Daddy.com, and somehow convinced them to remove the whole seclists.org domain name from the DNS. Now that's cool.

The site is back up now, but Go-Daddy still defends the seclists.org takedown, which smells worse and worse. Go-Daddy used to be my favorite domain name registrar. Now I (and probably many others) am not so sure. It's amazing how Go-Daddy turned MySpace's problem into their own problem.

While the OOXML/ODF war starts to heat up, Microsoft has published a new version of another of their document formats, "Word 2007: Rich Text Format (RTF) Specification, version 1.9":

The Rich Text Format (RTF) Specification provides a format for text and graphics interchange that can be used with different output devices, operating environments, and operating systems. Version 1.9 of the specification contains the latest updates introduced by Microsoft Office Word 2007.

In case somebody forgot, RTF is a proprietary but widely supported non-XML document markup format, which looks like this:

{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0
Hello!\par
This is some {\b bold} text.\par
}

This was meant to be one big huge milestone. If only it had been done 3 years ago. I hope it's not too late though:

XQuery, XSLT 2 and XPath 2 Are W3C Recommendations

2007-01-22: The World Wide Web Consortium has published eight new standards in the XML family for data mining, document transformation, and enterprise computing from Web services to databases. "Over 1,000 comments from developers helped ensure a resilient and implementable set of database technologies," said Jim Melton (Oracle). XSLT transforms documents into different markup or formats. XML Query can perform searches, queries and joins over collections of documents. Using XPath expressions, XSLT 2 and XQuery can operate on XML documents, XML databases, relational databases, search engines and object repositories.

Wow. Congrats to everybody involved. Lots of reading now.

Everybody knows that an XSLT stylesheet can be embedded into an assembly by setting its "Build Action" property in Visual Studio to "Embedded Resource". Such a stylesheet can then be loaded using the Assembly.GetManifestResourceStream() method.

But in fact this is the wrong way of loading embedded stylesheets, because once a smarty-pants XML developer goes and breaks the XSLT stylesheet into modules, it suddenly stops working: xsl:import/xsl:include are not compatible with loading a stylesheet via Assembly.GetManifestResourceStream(). A stylesheet read from a bare stream carries no base URI, and nothing tells the XSLT processor that the imported modules live inside the assembly too.

The right way of loading embedded stylesheets is via an XmlResolver. A custom XmlResolver that loads stylesheets from an assembly solves the problem. And even better, you can use such a resolver to load the main XSLT stylesheet too:

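using (XmlReader doc = XmlReader.Create("books.xml"))
using (XmlWriter output = XmlWriter.Create(Console.Out))
{
  XslCompiledTransform xslt = new XslCompiledTransform();
  EmbeddedResourceResolver resolver = 
    new EmbeddedResourceResolver();
  // The resolver loads Catalog.xslt and every module it imports or includes
  xslt.Load("Catalog.xslt",
    XsltSettings.TrustedXslt, resolver);
  // Disposing the writer at the end of the block flushes the output
  xslt.Transform(doc, output);
}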

The EmbeddedResourceResolver class can be as simple as:

using System;
using System.Xml;
using System.Reflection;
using System.IO;

namespace EmbeddedStylesheetSample
{
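  public class EmbeddedResourceResolver : XmlUrlResolver
  {
    // Called for the main stylesheet and for each xsl:import/xsl:include
    public override object GetEntity(Uri absoluteUri, 
      string role, Type ofObjectToReturn)
    {
      Assembly assembly = Assembly.GetExecutingAssembly();
      // Only the file name is used; the resource is looked up
      // in the namespace of this resolver type
      return assembly.GetManifestResourceStream(this.GetType(), 
        Path.GetFileName(absoluteUri.AbsolutePath));
    }
  }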
}

Obviously the above implementation is way too simple. In particular, it only loads resources embedded into the assembly where the EmbeddedResourceResolver class is defined. This can be parametrized so that, e.g., the resolver accepts an assembly name and an optional culture name and loads resources from the appropriate assembly. I think I need to generalize EmbeddedResourceResolver and include it in the Mvp.Xml library so we can just use it instead of reinventing it again and again.
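Just to illustrate the idea, here is a rough sketch of what a parametrized resolver might look like. The class name AssemblyResourceResolver, its constructor parameters and the BuildResourceName helper are all made up for this example (this is not the Mvp.Xml API), and culture handling is left out entirely. It assumes resources are named "Prefix.FileName", which is how Visual Studio names a resource sitting in the project root.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml;

namespace EmbeddedStylesheetSample
{
  // Hypothetical generalization: the assembly and the resource name prefix
  // are passed in instead of being inferred from the resolver's own type.
  public class AssemblyResourceResolver : XmlUrlResolver
  {
    private readonly Assembly assembly;
    private readonly string resourcePrefix;

    public AssemblyResourceResolver(Assembly assembly, string resourcePrefix)
    {
      if (assembly == null) throw new ArgumentNullException("assembly");
      this.assembly = assembly;
      this.resourcePrefix = resourcePrefix ?? string.Empty;
    }

    // Maps a URI such as file:///tmp/Catalog.xslt to "Prefix.Catalog.xslt".
    public static string BuildResourceName(string prefix, Uri absoluteUri)
    {
      string path = Uri.UnescapeDataString(absoluteUri.AbsolutePath);
      string fileName = Path.GetFileName(path);
      return string.IsNullOrEmpty(prefix) ? fileName : prefix + "." + fileName;
    }

    public override object GetEntity(Uri absoluteUri,
      string role, Type ofObjectToReturn)
    {
      Stream stream = assembly.GetManifestResourceStream(
        BuildResourceName(resourcePrefix, absoluteUri));
      if (stream != null)
      {
        return stream;
      }
      // Nothing embedded under that name: fall back to normal URL resolution.
      return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
  }
}
```

It would be used exactly like the simple resolver above, e.g. new AssemblyResourceResolver(someAssembly, "EmbeddedStylesheetSample") passed as the third argument of xslt.Load(). Falling back to the base XmlUrlResolver is a design choice of this sketch; a stricter resolver could throw when the resource is missing.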

Seattle trip

We spent this holiday season in the Seattle area. For my wife and our little Cat it was the first trip to the USA. And traveling between the Middle East and the Northwest USA with a 20-month-old girl is a challenge (Scott gives pretty good advice on traveling with kids). She did very well though, mostly sleeping during those 33 hours in the sky.

Happily we missed the record storm that hit Western Washington, so when we arrived things were back to normal: we had electricity and heat in the hotel, and all our local friends had electricity at home again.

The weather was Seattlish: rain, rain, rain and lots of gray in the sky. There were a couple of days, though, when we could see the sun.

Xmas evening we were at Dimitre Novatchev's house. It was the first time I met Dimitre in person. We talked about everything. He showed me some great stuff he's working on, including a new XPath-related application he's developed and is having trouble releasing because he works for Microsoft now. That was a really nice evening.

I also met my friend Sergey Dubinets, who does XSLT at Microsoft. They are all about this new important XSLT debugger feature, which I will be posting about as soon as it goes public.

All in all it was a nice trip, and we managed to get home without anybody catching even a cold.

2006

Well, the new year is here and I couldn't agree more with Kent Tegels: I'm glad 2006 is finally over. 2006 sucked on so many levels, but mostly on a personal one. 2007 must be a better year. This is my unexpressive but hopefully achievable New Year's resolution.

As a good sign, the year started with a Microsoft MVP award. This is the fourth year in a row I'm getting the MVP award (for XML of course), and this time I was really worried about getting reawarded: it was a lousy year and I didn't accomplish much. Congratulations to all fellow MVPs getting their awards again this year.