August 22, 2006

Free Microsoft Training CD-ROMs

AppDev is giving away these Microsoft training CDs. Free shipping in the US, nominal shipping charge outside. Quite an impressive list:

  • Visual C# 2005: Developing Applications
  • Visual Basic 2005: Developing Applications
  • ASP.NET Using Visual C# 2005
  • ASP.NET Using Visual Basic 2005
  • Visual Studio 2005 Tools for Microsoft Office
  • Exploring ASP.NET "Atlas" and Web 2.0
  • Exploring Visual C# 2005
  • Exploring Visual Basic 2005
  • Exploring ASP.NET Using Visual C# 2005
  • Exploring ASP.NET Using Visual Basic 2005
    Visual Studio .NET
  • Developing Applications Using Visual C# .NET
  • Visual Basic .NET
  • ASP.NET Using Visual C# .NET
  • ASP.NET Using Visual Basic .NET
  • Exploring BizTalk Server 2006
  • Exploring Microsoft SQL Server 2005
  • Microsoft SQL Server 2005
  • Microsoft SQL Server 2000
  • Managing and Maintaining Windows Server 2003 (for MCSE or MCSA)
  • Developing Applications Using Visual C# .NET (for MCSD or MCAD)
  • Visual Basic .NET (for MCSD or MCAD)
  • Microsoft SQL Server 2000 (for MCDBA)

Extreme Markup Languages 2006 Proceedings Online

This is just a paradise for XML geeks: Extreme Markup Languages 2006 Conference Proceedings Online. Happy reading:

Blazevic, Mario. "Streaming component combinators." In Proceedings of Extreme Markup Languages 2006.

Brown, Alex. "Frozen streams: an experimental time- and space-efficient implementation for in-memory representation of XML documents using Java." In Proceedings of Extreme Markup Languages 2006.

Bryan, Martin. "DSRL - Bringing Revolution to XML Workers." In Proceedings of Extreme Markup Languages 2006.

Chatti, Noureddine, Sylvie Calabretto and Jean Marie Pinon. "MultiX: an XML based formalism to encode multi-structured documents." In Proceedings of Extreme Markup Languages 2006.

Clark, John L. "Structured Software Assurance." In Proceedings of Extreme Markup Languages 2006.

Collins, Brad. "Sticky Stuff: An Introduction to the Burr Metadata Framework." In Proceedings of Extreme Markup Languages 2006.

Dubin, David, Joe Futrelle and Joel Plutchak. "Metadata Enrichment for Digital Preservation." In Proceedings of Extreme Markup Languages 2006.

Freese, Eric. "From Metadata to Personal Semantic Webs." In Proceedings of Extreme Markup Languages 2006.

Gangemi, Joseph V. "XML for Publishing." In Proceedings of Extreme Markup Languages 2006.

Gutentag, Eduardo. "Intellectual property policy for the XML geek." In Proceedings of Extreme Markup Languages 2006.

Halpin, Harry. "XMLVS: Using Namespace Documents for XML Versioning." In Proceedings of Extreme Markup Languages 2006.

Hennum, Erik. "Representing Discourse Models in RDF." In Proceedings of Extreme Markup Languages 2006.

Lubell, Joshua, Boonserm (Serm) Kulvatunyou, KC Morris and Betty Harvey. "Implementing XML Schema Naming and Design Rules: Perils and Pitfalls." In Proceedings of Extreme Markup Languages 2006.

Marcoux, Yves. "A natural-language approach to modeling: Why is some XML so difficult to write?" In Proceedings of Extreme Markup Languages 2006.

Müldner, Tomasz, Gregory Leighton and Jan Krzysztof Miziolek. "Using Multi-Encryption to Provide Secure and Controlled Access to XML Documents." In Proceedings of Extreme Markup Languages 2006.

Novatchev, Dimitre. "Higher-Order Functional Programming with XSLT 2.0 and FXSL." In Proceedings of Extreme Markup Languages 2006.

Pepper, Steve, Valentina Presutti, Lars Marius Garshol and Fabio Vitali. "Reusing data across Topic Maps and RDF." In Proceedings of Extreme Markup Languages 2006.

Quin, Liam. "Microformats: Contaminants or Ingredients? Introducing MDL and Asking Questions." In Proceedings of Extreme Markup Languages 2006.

Souzis, Adam. "RxPath: a mapping of RDF to the XPath Data Model." In Proceedings of Extreme Markup Languages 2006.

Sperberg-McQueen, C. M. "Rabbit/duck grammars: a validation method for overlapping structures." In Proceedings of Extreme Markup Languages 2006.

Tennison, Jeni. "Datatypes for XML: the Datatyping Library Language (DTLL)." In Proceedings of Extreme Markup Languages 2006.

Wrightson, Ann. "Conveying Meaning through Space and Time using XML: Semantics of Interoperability and Persistence." In Proceedings of Extreme Markup Languages 2006.


FXSL 2.0

Dimitre Novatchev has uploaded another FXSL 2.0 release. FXSL is the best ever XSLT library:

The FXSL functional programming library for XSLT provides XSLT programmers with a powerful reusable set of functions and a way to implement higher-order functions and use functions as first class objects in XSLT.

Now XPath 2.0 functions, operators and constructors as well as XSLT 2.0 functions have "higher-order FXSL wrappers that makes possible to use them as higher order functions and to create partial applications from them".

To fully understand the value of this stuff take a look at Dimitre's article "Higher-Order Functional Programming with XSLT 2.0 and FXSL".
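For readers who don't live in XSLT, here is a rough sketch of what "higher-order functions" and "partial application" mean, written in Java rather than XSLT. Nothing here is FXSL's API: `partial` and `map` are hypothetical helper names made up for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.stream.Collectors;

class PartialApplicationDemo {
    // Partial application: fix the first argument of a two-argument function
    // and get back a one-argument function.
    static <A, B, R> Function<B, R> partial(BiFunction<A, B, R> f, A first) {
        return second -> f.apply(first, second);
    }

    // Higher-order function: takes another function as an argument.
    static <T, R> List<R> map(Function<T, R> f, List<T> items) {
        return items.stream().map(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        BiFunction<Integer, Integer, Integer> add = (x, y) -> x + y;
        Function<Integer, Integer> addTen = partial(add, 10);
        System.out.println(map(addTen, Arrays.asList(1, 2, 3))); // [11, 12, 13]
    }
}
```

FXSL gives XSLT the same kind of building blocks, just expressed with templates and functions instead of lambdas.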

August 17, 2006

AdSense I18n

The first hundred users of the AdSense Watch Toolbar, and the first nasty bug: when a language other than English is set up for an AdSense account at Google, the CSV report cannot be parsed. Apparently Googlers generate the CSV report in a localized form - headers are translated and numbers are in a locale-specific format. Weird. What about separation of data and presentation, huh? Why on earth does a CSV report need to be localized? It must be pure data in an easy to process form. Google doesn't think so.
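To show why locale-specific numbers break a parser, here is a small Java sketch (the toolbar itself is C++, this is only an illustration). The sample values and the `LocalizedNumbers` class are made up, not taken from a real AdSense report.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class LocalizedNumbers {
    // Parses a number string using the conventions of the given locale.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        // The same value written the US way and the German way.
        System.out.println(parse("1,234.56", Locale.US));      // 1234.56
        System.out.println(parse("1.234,56", Locale.GERMANY)); // 1234.56
    }
}
```

A parser that assumes one convention and gets the other will either fail or, worse, quietly read a different number.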

Ok, I figured out that if I add "&hl=en" to the end of the CSV report request, I get the CSV file in English no matter what language the AdSense account works with. Good enough.
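The workaround boils down to making sure the request URL carries hl=en. A minimal Java sketch of that idea follows; the `ReportUrl` class, the `withEnglishLocale` name and the example.com URLs are all hypothetical, and the real toolbar code is C++.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReportUrl {
    private static final Pattern HL_PARAM = Pattern.compile("([?&])hl=[^&]*");

    // Returns the URL with hl=en set, replacing an existing hl parameter if present.
    static String withEnglishLocale(String url) {
        Matcher m = HL_PARAM.matcher(url);
        if (m.find()) {
            return m.replaceFirst("$1hl=en");
        }
        return url + (url.contains("?") ? "&" : "?") + "hl=en";
    }

    public static void main(String[] args) {
        // Placeholder URL, not the real AdSense report address.
        System.out.println(withEnglishLocale("https://example.com/report?fmt=csv"));
    }
}
```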

I updated the AdSense Watch Toolbar installation to fix this issue. If anybody got the "Cannot parse AdSense data" error, please download the new version.

August 16, 2006

SPI Labs: AJAX Opens Up Whole New Opportunities for Hacker Attacks

SPI Dynamics has published a whitepaper "Ajax Security Dangers":

While Ajax can greatly improve the usability of a Web application, it can also
create several opportunities for possible attack if the application is not
designed with security in mind. Since Ajax Web applications exist on both the
client and the server, they include the following security issues:


• Create a larger attack surface with many more inputs to secure
• Expose internal functions of the Web application server
• Allow a client-side script to access third-party resources with no builtin
security mechanisms

Of all the dangers, one sounds the most horrible - the authors claim that "Ajax Amplifies XSS". Ajax allows cross-site scripting (XSS) attacks to spread like a virus or worm. And that's not an imaginary threat, the attacks are already happening.

The first widely known Ajax worm, the "Samy worm" (also known as the "JS.Spacehero worm"), hit 1,000,000+ MySpace users in less than 20 hours back in 2005, and then again.

In 2006 "The Yamanner worm" infested Yahoo Mail and managed to capture thousands of email addresses and upload them to a still unidentified Web site.

And the problem wasn't that the Yahoo or MySpace staff is incompetent:

"The problem isn't that Yahoo is incompetent. The problem is that filtering JavaScript to make it safe is very, very hard," said David Wagner, assistant professor of computer science at the University of California at Berkeley
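A tiny sketch of why this is so hard, in Java for illustration only. The `NaiveScriptFilter` class is made up and is certainly not what Yahoo or MySpace used; it just shows how an obvious-looking filter gets defeated.

```java
class NaiveScriptFilter {
    // Naive approach: delete literal script tags in a single pass.
    static String stripScriptTags(String html) {
        return html.replace("<script>", "").replace("</script>", "");
    }

    public static void main(String[] args) {
        // The attacker nests a tag inside a tag; removing the inner one
        // glues the outer pieces back together into a working script tag.
        String attack = "<scr<script>ipt>alert(1)</scr</script>ipt>";
        System.out.println(stripScriptTags(attack)); // <script>alert(1)</script>
    }
}
```

And that is before upper-case tags, event handler attributes, javascript: URLs and all the other places a browser is willing to run script from.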

It's for sure just a matter of time before Google or Microsoft Ajax-based applications get hacked, not to mention vendors with less experienced developers, driven to Ajax by the hype and widely leveraging the "cut and paste" coding technique.

"JavaScript was dangerous before Ajax came around," noted Billy Hoffman, lead R&D researcher at SPI Dynamics Inc., a computer security firm. With the addition of Ajax functionality in many other Web applications, the problem is going to get worse before it gets better, he said.

Pessimistic summary, but what would you expect in a "Worse is Better" world?

Dimitre Novatchev is blogging

Congratulations to all XSLT geeks - Dimitre Novatchev, XSLT extraordinaire, is blogging! Whoha! Subscribed.

Ward Cunningham: Wiki is the original Web 2.0 application

Ward Cunningham: "Wiki is the original Web 2.0 application."

Read Ward Cunningham talking on "Wikis, Patterns, Mashups and More". Interesting one.

August 15, 2006

Visual Studio 2005 XSLT Run Plugin Coming Soon

I'm finishing another plugin for Visual Studio 2005, which will allow running XSLT transformations in one click using different XSLT processors. Visual Studio 2005 can only perform XSLT transformations using XslCompiledTransform, and that's not enough for XSLT geeks. To become a real XSLT IDE, Visual Studio 2005 must be able to run different XSLT engines, including XSLT 2.0 engines. The idea for such a plugin comes from Dimitre Novatchev.

I really hope to release the first beta next week.

Testing Windows Live Writer Beta

This is a pretty cool blog post editor. I'm gonna test it, and if it's ok I'll switch to Windows Live Writer, cause wbloggar seems to be dead. It works fine with my MovableType-powered blog and has all the features I need.

It allows plugins to be added, cool huh? I want a plugin for autolinking certain words, wouldn't that be cool? I wonder if the SDK allows such plugins. What other plugins would be useful?

August 13, 2006

Introducing AdSense Watch Toolbar

This Lebanon war had a terrible impact on my personal productivity. Too much TV, too much internet, too much pain, too little work. Hope it ends soon. Anyway, I decided I need some short victorious war, oops, I mean a small interesting project to get me back on track. I saw the AdSense Notifier plugin for Firefox the other day and thought: cool, but I don't run Firefox 100% of the time, and I want it on the Windows taskbar, not in a browser status bar. So I had a spike project and got it working in just one night. Then I spent another two weeks polishing it. Ahhhhh, the joy of good old pure win32, MFC-free, just Windows and you and nothing in between. Unmanaged C++, LPTSTR, HWND, messages, win32 multithreading - sweet, I'm in The Old New Thing world again. The result is AdSense Watch Toolbar.

AdSense Watch is a Windows Explorer toolbar (a desk band technically speaking), usually docked to the Windows taskbar. AdSense Watch displays your current "Google AdSense for content" report - Page impressions, Clicks, Page CTR, Page eCPM and Earnings. The data is updated automatically or on demand. More info on AdSense Watch Toolbar usage can be found at the XML Lab site.

The latest AdSense Watch installation is available at the XML Lab Downloads page. The latest version is currently 1.0b, and like any other beta software AdSense Watch is currently free (but not open-source). AdSense Watch is written in C++ in Visual Studio 2005. AdSense Watch was tested on Windows 2000, Windows XP Pro and Windows Server 2003.

Any suggestions, bug reports and comments are welcome at the AdSense Watch Toolbar forum.

Sorry in advance to Allen G Holman that AdSense Watch looks similar to his great AdSense Notifier. Basic things usually look similar in any environment...

I wasn't aware of the Google AdSense API (and I'm still unaware of what it provides), so I implemented the AdSense login basically using a screen-scraping technique. I tried to make the login code as robust as possible and I think I succeeded, at least AdSense Watch survived the latest changes in the Google AdSense login procedure that AdSense Notifier stumbled upon. As for report data - AdSense Watch uses CSV data for reliability.

Btw, AdSense Watch Toolbar is a Windows Explorer desk band, but from an implementation perspective it's not much different from an Internet Explorer toolbar, so with minimal changes (mostly WRT registering) I can actually make an IE toolbar version of AdSense Watch.

I want to investigate what the Google AdSense API makes possible and add more features in the next version if there is any interest in this tool.

Anyway, download AdSense Watch Toolbar for free and enjoy. Any comments are welcome!

Streaming XML filtering in Java and .NET

XML processing is changing. In Java, SAX is slowly but steadily going away, or at least moving down to a low level, and nowadays Java with StAX is not so different from the .NET XmlReader. I found it pretty interesting to compare approaches to streaming XML filtering in Java and .NET. Filtering is a very useful technique for transforming XML on the fly, while it is being read. Filtering out the parts or branches an application isn't interested in is a great way to simplify XML reading code, which is especially important in streaming XML processing, which tends to be more complicated than in-memory (XML DOM) processing.

Let's say we have this dummy XML and we want to extract "interesting data" out of it.

<root>
  <ignoreme>junk</ignoreme>
  <data>interesting data</data>
</root>

The StAX API has a dedicated built-in facility for filtering - StreamFilter/EventFilter (as often happens in the Java world, StAX is a bit overengineered and actually contains two APIs, an iterator-style one and a cursor-based one). Here is how it looks in Java with the wonderful StAX:
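// Imports needed by this snippet
import javax.xml.stream.StreamFilter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.stream.StreamSource;

XMLInputFactory xif = XMLInputFactory.newInstance();
XMLStreamReader reader = xif.createXMLStreamReader(
    new StreamSource("foo.xml"));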
reader = xif.createFilteredReader(reader, new StreamFilter() {
    private int ignoreDepth = 0;

    public boolean accept(XMLStreamReader reader) {
        if (reader.isStartElement()
            && reader.getLocalName().equals("ignoreme")) {
            ignoreDepth++;
            return false;
        } else if (reader.isEndElement()
           && reader.getLocalName().equals("ignoreme")) {
           ignoreDepth--;
           return false;
        }
        return (ignoreDepth == 0);
    }
});
// move to <root>
moveToNextTag(reader);
// move to <data>
moveToNextTag(reader);
// read data
System.out.println(reader.getElementText());
reader.close();
Where moveToNextTag() is a utility method doing what its name says:
do {
    reader.next();
} while (!reader.isStartElement() && !reader.isEndElement());
XMLStreamReader actually provides a nextTag() method, but weirdly enough it can't skip non-whitespace text (even text filtered out by an underlying filter!) and throws an exception.
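Here is a small sketch of that nextTag() behavior on a plain (unfiltered) reader, reading from a string so it runs on its own. The `NextTagDemo` class and the sample documents are made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class NextTagDemo {
    // Moves to the root start tag, then reports whether a second nextTag() throws.
    static boolean secondNextTagThrows(String xml) {
        XMLStreamReader reader;
        try {
            reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            reader.nextTag(); // from the start of the document to the root start tag
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not reach the root element", e);
        }
        try {
            reader.nextTag(); // only whitespace, comments and PIs may be skipped here
            return false;
        } catch (XMLStreamException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondNextTagThrows("<root> <data/></root>"));    // false
        System.out.println(secondNextTagThrows("<root>text<data/></root>")); // true
    }
}
```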

Now the .NET code. Unlike StAX, .NET doesn't provide any facility for XML filtering, so the usual approach is to implement a filter as a full-blown custom XmlReader and then chain it to another XmlReader instance. As I said before, implementing a custom XmlReader still sucks even in .NET 2.0 (holy cow - 26 abstract methods, or deriving from the legacy nonconformant XmlTextReader). So I'm going to use the XmlWrappingReader helper I recommended earlier:

public class Test
{
    private class XmlFilter : XmlWrappingReader
    {
        public XmlFilter(string uri)
            : base(XmlReader.Create(uri)) { }

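        public override bool Read()
        {
            bool more = base.Read();
            // Skip() already leaves the reader on the node that follows
            // </ignoreme>, so another Read() here would jump over that node.
            // Loop to handle several <ignoreme> elements in a row.
            while (more && NodeType == XmlNodeType.Element &&
                LocalName == "ignoreme")
            {
                Skip();
                more = !EOF;
            }
            return more;
        }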
    }

    static void Main(string[] args)
    {
        XmlFilter filter = new XmlFilter("../../foo.xml");
        XmlReader r = XmlReader.Create(filter, null);
        //move to <root>
        r.MoveToContent();
        //Move to <data>
        MoveToNextTag(r);
        Console.WriteLine(r.ReadString());
    }

    private static void MoveToNextTag(XmlReader r)
    {
        do
        {
            r.Read();
        } while (!(r.NodeType == XmlNodeType.Element) &&
        !(r.NodeType == XmlNodeType.EndElement));

    }
}
Amazingly similar, but not so cool because of the lack of anonymous classes in .NET 2.0 (expected in .NET 3.0).

In short, what I like in the Java version: built-in support for XML filtering and anonymous classes. What I don't like in the Java version: the filter can be called more than once on the same position, which means a real filter implementation must support such a scenario; and the API is very ascetic, with too few utility methods. What I like in the .NET version: lots of useful methods in XmlReader such as Skip(), ReadToXXX() etc. What I don't like: no built-in support for filters and no anonymous classes.
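To make the Skip() point concrete, here is a sketch of the same extraction in Java without a StreamFilter, using a hand-written helper that mimics what XmlReader.Skip() does. The `SkipDemo` class and its `skipSubtree` and `extractData` methods are hypothetical names, not part of StAX, and the XML is read from a string so the example runs on its own.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SkipDemo {
    // Expects the reader on a START_ELEMENT; consumes events up to and
    // including the matching END_ELEMENT.
    static void skipSubtree(XMLStreamReader reader) throws XMLStreamException {
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Returns the text of the first <data> element found outside any
    // <ignoreme> subtree, or null if there is none.
    static String extractData(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    String name = reader.getLocalName();
                    if (name.equals("ignoreme")) {
                        skipSubtree(reader);
                    } else if (name.equals("data")) {
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Cannot read XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractData(
            "<root> <ignoreme>junk</ignoreme> <data>interesting data</data> </root>"));
    }
}
```

The skipping here is driven by the reading loop rather than by a filter callback, so it is not affected by the filter being invoked more than once per position. The price is that the filtering is no longer a separate, chainable layer.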

Besides, if you work with StAX you can readily work with the .NET XmlReader and vice versa. Great unification saves developers hours of learning. I wonder if the streaming XML processing API should be standardized?

August 7, 2006

Service Modeling Language based on XML Schema and Schematron

Microsoft, BEA, IBM, Cisco, Intel, HP etc. mix XML Schema, Schematron and XPointer to create a draft of

the Service Modeling Language (SML) used to model complex IT services and systems, including their structure, constraints, policies, and best practices.
A model in SML is realized as a set of interrelated XML documents. The XML documents contain information about the parts of an IT service, as well as the constraints that each part must satisfy for the IT service to function properly. Constraints are captured in two ways:
1. Schemas - these are constraints on the structure and content of the documents in a model. SML uses a profile of XML Schema 1.0 [2,3] as the schema language. SML also defines a set of extensions to XML Schema to support inter-document references.

2. Rules - are Boolean expressions that constrain the structure and content of documents in a model. SML uses a profile of Schematron [4,5,6] and XPath 1.0 [9] for rules.

Once a model is defined, one of the important operations on the model is to establish its validity. This involves checking whether all data in a model satisfies the schemas and rules declared.
This specification focuses primarily on defining the profile of XML Schema and Schematron used by SML, as well as the process of model validation.
Sort of XML Schema without some crappy features, enhanced with Schematron rules and XPointer-based partial inclusions. Sounds cool, and not only in the domain of service modeling. I wish I could use it for plain XML validation.
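To give a feel for what a "rule as a Boolean expression" looks like in practice, here is a sketch that evaluates XPath 1.0 expressions as true/false checks over a document using the JDK's XPath API. This is not Schematron and not SML; the `RuleCheck` class, the sample model and the rules are all made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RuleCheck {
    // Evaluates an XPath 1.0 expression against the document as a Boolean.
    static boolean satisfies(String xml, String rule) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return (Boolean) XPathFactory.newInstance().newXPath()
                .evaluate(rule, doc, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot evaluate rule: " + rule, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical model fragment: a service with two named hosts.
        String model = "<service><host name='a'/><host name='b'/></service>";
        System.out.println(satisfies(model, "count(/service/host) >= 2"));  // true
        System.out.println(satisfies(model, "/service/host[not(@name)]")); // false
    }
}
```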

[Via Don Box]