March 28, 2007

SourceForge Marketplace

Apparently SourceForge.net is planning a feature that would let you buy or sell services or support for open source projects. Here is the email I received:

Dear SourceForge.net community member,

As an active participant in the Open Source community, you may be excited
to learn about a new feature that we will add to SourceForge.net in late
spring/early summer. This feature will allow you to buy or sell services
for Open Source software on SourceForge.net.

Interested? Follow the link below and we'll keep you updated as we move
towards the official launch of this feature:

https://ostg.wufoo.com/forms/marketplace-interest-list/

Thank you for your continued support,
The SourceForge.net Team

Sounds interesting. Another way to get rich - create a great open source product, make your code unreadable, provide no documentation and then sell support :)

How to automatically register an HTTP handler required by a Web server control

In ASP.NET, when you are building a server control that includes an HTTP handler, you have this problem - the HTTP handler has to be registered in Web.config. That means it's not enough that your customer developer drops the control on her Web form and sets up its properties. One more step is required - manual editing of the config, which is a usability horror.

How do you make your customer aware she needs to perform this additional action? Documentation? Yes, but who reads documentation on controls? I know I never do; I usually just drop a control on the page and poke around its properties to figure out what I need to set up to make it work asap.

So here is a nice trick to avoid manual Web.config editing (found it in the ScriptAculoUs autocomplete web control).

  1. Make sure your control has a designer.
  2. In your control's designer class override the ControlDesigner.GetDesignTimeHtml() method, which is called each time your control needs to be represented in design mode.
  3. In the GetDesignTimeHtml() method check if your HTTP handler is already registered in Web.config and if it isn't - just register it.

Here is some sample code that is worth a hundred words:
using System;
using System.Web.UI.Design;
using System.Security.Permissions;
using System.Configuration;
using System.Web.Configuration;
using System.Windows.Forms;

namespace XMLLab.WordXMLViewer
{
    [SecurityPermission(SecurityAction.Demand, 
        Flags = SecurityPermissionFlag.UnmanagedCode)]
    public class WordXMLViewerDesigner : ControlDesigner
    {
        private void RegisterImageHttpHandler()
        {
            IWebApplication webApplication = 
                (IWebApplication)this.GetService(typeof(IWebApplication));

            if (webApplication != null)
            {
                Configuration configuration = webApplication.OpenWebConfiguration(false);
                if (configuration != null)
                {
                    HttpHandlersSection section = 
                        (HttpHandlersSection)configuration.GetSection(
                        "system.web/httpHandlers");
                    if (section == null)
                    {
                        section = new HttpHandlersSection();
                        ConfigurationSectionGroup group = 
                            configuration.GetSectionGroup("system.web");
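                        if (group == null)
                        {
                            // Create the missing group and keep a reference to it,
                            // otherwise the Sections.Add call below hits a null group
                            group = new ConfigurationSectionGroup();
                            configuration.SectionGroups.Add("system.web", group);
                        }
                        group.Sections.Add("httpHandlers", section);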
                    }
                    section.Handlers.Add(Action);
                    configuration.Save(ConfigurationSaveMode.Minimal);
                }
            }
        }


        private bool IsHttpHandlerRegistered()
        {
            IWebApplication webApplication = 
                (IWebApplication)this.GetService(typeof(IWebApplication));

            if (webApplication != null)
            {
                Configuration configuration = 
                    webApplication.OpenWebConfiguration(true);

                if (configuration != null)
                {
                    HttpHandlersSection section = 
                        (HttpHandlersSection)configuration.GetSection(
                        "system.web/httpHandlers");

                    if ((section != null) && (section.Handlers.IndexOf(Action) >= 0))
                        return true;
                }
            }
            return false;
        }


        static HttpHandlerAction Action
        {
            get
            {
                return new HttpHandlerAction(
                    "image.ashx", 
                    "XMLLab.WordXMLViewer.ImageHandler, XMLLab.WordXMLViewer", 
                    "*"
                );
            }
        }

        public override string GetDesignTimeHtml(DesignerRegionCollection regions)
        {
            if (!IsHttpHandlerRegistered() && 
                (MessageBox.Show(
                "Do you want to automatically register the HttpHandler needed by this control in the web.config?", 
                "Confirmation", MessageBoxButtons.YesNo, 
                MessageBoxIcon.Exclamation) == DialogResult.Yes))
                RegisterImageHttpHandler();
            return base.CreatePlaceHolderDesignTimeHtml("Word 2003 XML Viewer");
        }
    }
}
Obviously it only works if your control gets rendered at least once in Design mode, which isn't always the case. Some freaks (including /me) prefer to work with Web forms in Source mode, so you still need to explain in the documentation how to update Web.config to make your control work.

March 26, 2007

Word 2003 XML Viewer Control v1.0 released

I was cleaning up my backyard and found this control I never finished. So I did. Here is Word 2003 XML Viewer Control v1.0 just in case somebody needs it. It's an ASP.NET 2.0 Web server control that displays arbitrary Microsoft Word 2003 XML documents (aka WordML aka WordprocessingML) on the Web, so people who don't have Microsoft Office 2003 installed can browse documents using only a browser.

The control renders Word 2003 XML documents by transforming content to HTML preserving styling and extracting images. Both Internet Explorer and Firefox are supported.

Word 2003 XML Viewer Control is the Web version of the Microsoft Word 2003 XML Viewer tool and uses the same WordML to HTML transformation stylesheet, thus providing the same rendering quality.

The control is free and open source; download it here, find documentation here.

I'm doing an interesting trick with images in this control. The problem is that in WordML images are embedded into the document, so they need to be extracted when transforming to HTML. And I wanted to avoid writing images to the file system. So the trick is to extract each image when generating HTML (via XSLT), assign it a GUID, put it into the session and generate an <img> src attribute that requests the image by GUID. Then when the browser renders the HTML it requests images by GUID and a custom HTTP handler gets them from the session.
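The GUID bookkeeping behind this trick can be sketched without any ASP.NET dependency. This is a minimal sketch only: a plain dictionary stands in for the session, and the names ImageStore, Register and Resolve, as well as the "id" query parameter, are assumptions made for illustration, not the control's actual code. In the real control the lookup would happen inside the HTTP handler.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the session: image bytes keyed by a GUID string.
public class ImageStore
{
    private readonly Dictionary<string, byte[]> images =
        new Dictionary<string, byte[]>();

    // Called while generating HTML: stores the image and returns
    // the value for the <img> src attribute ("id" is an assumed parameter name).
    public string Register(byte[] image)
    {
        string id = Guid.NewGuid().ToString("N");
        images[id] = image;
        return "image.ashx?id=" + id;
    }

    // Called when the browser requests the image: returns the bytes,
    // or null if the id is unknown.
    public byte[] Resolve(string id)
    {
        byte[] image;
        return images.TryGetValue(id, out image) ? image : null;
    }
}

public static class ImageStoreDemo
{
    public static void Main()
    {
        ImageStore store = new ImageStore();
        string src = store.Register(new byte[] { 1, 2, 3 });
        // Pull the GUID back out of the generated URL, as the handler would
        string id = src.Substring(src.IndexOf('=') + 1);
        Console.WriteLine(store.Resolve(id).Length); // prints 3
    }
}
```

Keying by GUID keeps the URLs unguessable enough for a per-session cache and avoids touching the file system at all.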

Having an HTTP handler in an ASP.NET control posed another problem - how do you register the HTTP handler in Web.config automatically? AFAIK there is no out-of-the-box solution for the problem, but happily I found a solution that covers the major use case. Here is a piece of the documentation:

When you add the first Word 2003 XML Viewer Control to your Web project, you should see the following confirmation dialog: "Do you want to automatically register the HttpHandler needed by this control in the web.config?". You must answer Yes to allow the control to register the image handler in the Web.config. If you don't answer Yes or if you add the control outside Design mode, you have to add the following definition to the Web.config in the <system.web> section:
<httpHandlers>
   <add path="image.ashx" verb="*" type="XMLLab.WordXMLViewer.ImageHandler, XMLLab.WordXMLViewer" />
</httpHandlers>

Yep, the hint is the Design mode. I'll post about this trick tomorrow.

The usage is simple - just drop the control on a page and set the "DocumentSource" property (the Word 2003 XML file you want to show).

I deliberately named this control "Word 2003 XML Viewer Control" to avoid confusion. But I'll update it to support Word 2007 as soon as there is a solution to the Word 2007 to HTML transformation problem.

Any comments are welcome. Enjoy.

March 20, 2007

substring-before()/substring-after() for C#

XSLT isn't the best language for string processing, but it (XPath actually) has a very handy pair of substring functions: substring-before() and substring-after(). I used to write XSLT a lot and always missed them in C# or Java. Yes, C# and Java have indexOf() and regex, but indexOf() is too low-level and so makes code more complicated than it needs to be, while regular expressions are overkill for such a simple but very common operation.

Anyway, as an exercise in C# 3.0 I built these string extension functions following XPath 1.0 semantics for substring-before() and substring-after(). Now I can have the clean and nice

arg.SubstringAfter(":")

instead of ugly

arg.Substring(arg.IndexOf(":")+1)

and shorter and more natural than

StringUtils.SubstringAfter(arg, ":")

Anyway, here is the code:

using System;
using System.Linq;
using System.Globalization;

namespace XmlLab.Extensions
{
    public static class StringExtensions
    {
        public static string SubstringAfter(this string source, string value)
        {
            if (string.IsNullOrEmpty(value))
            {
                return source;
            }
            CompareInfo compareInfo = CultureInfo.InvariantCulture.CompareInfo;            
            int index = compareInfo.IndexOf(source, value, CompareOptions.Ordinal);
            if (index < 0)
            {
                //No such substring
                return string.Empty;
            }
            return source.Substring(index + value.Length);            
        }

        public static string SubstringBefore(this string source, string value)
        {
            if (string.IsNullOrEmpty(value))
            {
                // XPath 1.0: substring-before(s, "") is the empty string
                return string.Empty;
            }
            CompareInfo compareInfo = CultureInfo.InvariantCulture.CompareInfo;
            int index = compareInfo.IndexOf(source, value, CompareOptions.Ordinal);
            if (index < 0)
            {
                //No such substring
                return string.Empty;                    
            }
            return source.Substring(0, index);
        }
    }
}
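A quick usage sketch, pulling apart a "/name:value" style command-line argument. The two methods are restated here in compact form (with a hypothetical class name, XPathStrings, and string.IndexOf with ordinal comparison) only so the sample compiles on its own; the behavior shown is the XPath 1.0 behavior, where a missing separator yields an empty string.

```csharp
using System;

// Compact restatement of the two methods so the sample is self-contained.
public static class XPathStrings
{
    public static string SubstringBefore(this string source, string value)
    {
        if (string.IsNullOrEmpty(value)) return string.Empty;
        int index = source.IndexOf(value, StringComparison.Ordinal);
        return index < 0 ? string.Empty : source.Substring(0, index);
    }

    public static string SubstringAfter(this string source, string value)
    {
        if (string.IsNullOrEmpty(value)) return source;
        int index = source.IndexOf(value, StringComparison.Ordinal);
        return index < 0 ? string.Empty : source.Substring(index + value.Length);
    }
}

public static class XPathStringsDemo
{
    public static void Main()
    {
        string arg = "/out:report.txt";
        Console.WriteLine(arg.SubstringBefore(":"));        // prints /out
        Console.WriteLine(arg.SubstringAfter(":"));         // prints report.txt
        Console.WriteLine(arg.SubstringAfter("=").Length);  // prints 0 (no match gives "")
    }
}
```

Note that only the first occurrence of the separator counts, so "a:b:c".SubstringAfter(":") gives "b:c".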

March 16, 2007

How you can have Ruby-style enumerations in C# 3.0

Well, I missed the MVP Summit this year, so while fellow MVPs are enjoying themselves in Redmond I'm playing with C# 3.0 at home. And I'm in the process of learning Ruby, so what I spotted immediately is the lack (correct me if I'm wrong) of Each() and Map() support in .NET 3.5 collections.

In Ruby you can apply a block of code to each element in a collection using very elegant each() method:

[1,2,3].each { |item| puts item*item }

The Each() method is basically a Visitor pattern implementation. I wonder why no such handy method exists in .NET 3.5? Dozens and dozens of new extension methods on collections covering every single aspect of collection manipulation from filtering to aggregation, but no basic functional programming facilities like Each() and Map()? Probably somebody at Microsoft decided the foreach loop is still the preferable solution for processing each element in a collection. That's what foreach does and does well.

What would be the advantages of having an Each() method instead of a foreach loop?

  1. Syntactically a foreach loop is a statement, while an Each() call is an expression. Expressions are usually structurally simpler than statements and more readable.
  2. The foreach loop follows and encourages an imperative programming style, Each() - a functional one.
  3. It's clean and beautiful.

So I put together a quick implementation of IEnumerable.Each(subroutine) and IEnumerable.EachIndex(subroutine) extension methods modeled after Ruby's each() and each_index().

The IEnumerable.Each(subroutine) extension method calls the given subroutine (just a function returning void) once for each element in the collection, passing that element as a parameter.

The IEnumerable.EachIndex(subroutine) extension method does the same, but passes the index of the element instead of the element itself.

With those two methods I can process collections in a Ruby-like functional style:

int[] squares = new int[10];
squares.EachIndex(i => squares[i] = i * i);
squares.Each(val => Console.WriteLine(val)); 

The implementation is suspiciously easy:

using System;
using System.Collections.Generic;
using System.Linq;

namespace Test
{
    public static class MyIEnumerableExtensions
    {        
        public static void Each<TSource>(this IEnumerable<TSource> source, Action<TSource> action) 
        {
            if (source == null)
            {
                throw new ArgumentNullException("source");
            }
            if (action == null)
            {
                return;
            }
            foreach (TSource item in source)
            {
                action(item);
            }
        }

        public static void EachIndex<TSource>(this IEnumerable<TSource> source, Action<int> action)
        {
            if (source == null)
            {
                throw new ArgumentNullException("source");
            }
            if (action == null)
            {
                return;
            }
            int i = 0;
            // Dispose the enumerator when done, just as foreach would
            using (IEnumerator<TSource> enumerator = source.GetEnumerator())
            {
                while (enumerator.MoveNext())
                {
                    action(i++);
                }
            }
        }        
    }
}
Cool. Now what about transforming collection elements, I mean Map()?
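For what it's worth, LINQ's Select() already does the job of Ruby's map: it projects each element through a function and returns the resulting sequence. A minimal sketch of a Ruby-flavoured alias follows; the names MapExtensions and Map are hypothetical, and the method does nothing more than forward to Select().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MapExtensions
{
    // Hypothetical alias: simply forwards to LINQ's Select, so it is lazy too.
    public static IEnumerable<TResult> Map<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        return source.Select(selector);
    }
}

public static class MapDemo
{
    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };
        // Like Ruby's [1,2,3].map { |n| n * n }
        int[] squares = numbers.Map(n => n * n).ToArray();
        string[] parts = squares.Select(n => n.ToString()).ToArray();
        Console.WriteLine(string.Join(",", parts)); // prints 1,4,9
    }
}
```

Unlike Each(), the result is evaluated lazily, so nothing runs until the sequence is enumerated (here by ToArray()).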

March 15, 2007

Fast clean way to check if an array contains a particular element in C# 3.0

I just started coding a real application with C# 3.0 (this is my favorite way of learning things) and immediately I can say I love it. C# 3.0 is going to be the best C# version ever. For me personally C# 1.0 was "hey look, kinda Java for Windows", C# 2.0 was "finally generics" and C# 3.0 is "wow, that's cool!".

Anyway, how do you check if a regular unsorted array contains a particular element? Loop over elements checking each? Boring imperative crap. C# 3.0 provides a nice declarative solution via the Contains<T>(this IEnumerable<T> source, T value) extension method:

if (args.Contains("/help")) 
    PrintUsage();

That exactly matches Ruby's include?() method:

PrintUsage() if args.include?("/help")

Alternatively you can use the Any() extension method with a lambda expression:

if (args.Any(arg => arg == "/help" || arg == "/?"))

"using System.Linq;" is all you need to turn this magic on.

I was wondering how fast extension methods are. So I put together some quick benchmark code checking if a random data array contains some value using a foreach loop, a for loop, the Contains() extension method and the Any() extension method. The results I got were somewhat surprising:

foreach loop search:            35 ms
for loop search:                28 ms
Contains() method search:       15 ms
Any() method search:            152 ms

Well, the Any() method calls the lambda through a delegate for every single element, so it's no surprise it's the slowest one, but the Contains() extension method somehow wins!

How come Contains() can be almost twice as fast as a plain dumb for loop? I always knew declarative style is the most optimization-friendly, but that was always kinda theory only.

Unfortunately I have no time to disassemble things, gotta go right now. Could be boxing or array bounds checking or inlining. Can somebody shed some light on this? I'm including my benchmark code, chances are it's not fully correct. Run it in the latest Visual Studio Orcas March CTP.

Anyway, it's good to know that a really fast and the most elegant way to check if an array contains a particular element in C# 3.0 seems to be the Contains() extension method.

using System;
using System.Diagnostics;
using System.Linq;

namespace Test
{
    class Program
    {        
        static void Main(string[] args)
        {
            int[] data = new int[10000000];
            //Fill in some random data
            Random rand = new Random();
            for (int i=0; i<data.Length; i++)
            {
                data[i] = rand.Next();
            }
            //Select some element for search
            int value = data[data.Length/2];

            //Warm up
            ContainsViaForEachLoop(data, value);
            ContainsViaForLoop(data, value);
            ContainsViaContainsExtMethod(data, value);
            ContainsViaAnyExtMethod(data, value);

            Stopwatch watch = new Stopwatch();
            watch.Start();
            if (ContainsViaForEachLoop(data, value))
            {
                watch.Stop();
                Console.WriteLine("foreach loop search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();            
            watch.Start();
            if (ContainsViaForLoop(data, value))
            {
                watch.Stop();
                Console.WriteLine("for loop search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();
            watch.Start();
            if (ContainsViaContainsExtMethod(data, value))
            {
                watch.Stop();
                Console.WriteLine("Contains() method search:\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();
            watch.Start();
            if (ContainsViaAnyExtMethod(data, value))
            {
                watch.Stop();
                Console.WriteLine("Any() method search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
        }

        public static bool ContainsViaForEachLoop(int[] data, int value)
        {
            foreach (int i in data)
            {
                if (i == value)
                {
                    return true;
                }
            }
            return false;
        }


        public static bool ContainsViaForLoop(int[] data, int value)
        {            
            for (int i=0; i<data.Length; i++)
            {
                if (data[i] == value)
                {
                    return true;
                }
            }
            return false;
        }

        public static bool ContainsViaContainsExtMethod(int[] data, int value)
        {
            return data.Contains(value);
        }

        public static bool ContainsViaAnyExtMethod(int[] data, int value)
        {
            return data.Any(i => i == value);
        }

    }
}
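Single-shot timings like the ones above are easily skewed by JIT compilation, garbage collection or whatever else the machine is doing. One common way to reduce the noise, shown here only as a sketch, is to repeat each measurement and keep the best time. The helper name BestOf and the run count of 5 are assumptions, not part of the original benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class Bench
{
    // Hypothetical helper: runs the search 'runs' times (runs >= 1)
    // and returns the fastest run in milliseconds.
    public static double BestOf(int runs, Func<bool> search)
    {
        double best = double.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            search();
            watch.Stop();
            best = Math.Min(best, watch.Elapsed.TotalMilliseconds);
        }
        return best;
    }

    public static void Main()
    {
        int[] data = Enumerable.Range(0, 1000000).ToArray();
        int value = data[data.Length / 2];
        Console.WriteLine("Contains(): {0:F2} ms",
            BestOf(5, () => data.Contains(value)));
        Console.WriteLine("Any():      {0:F2} ms",
            BestOf(5, () => data.Any(i => i == value)));
    }
}
```

The absolute numbers will differ from machine to machine; only the relative order within one run is meaningful.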

March 14, 2007

Google Mobile Proxy

This is not particularly new, but I didn't know about it before today and it did save my ass when I really badly needed to browse a non-mobile-friendly site on my phone this morning.

It's http://www.google.com/gwt/n - the Google mobile proxy service. It lets you browse any web site on your mobile by "adapting it" - reformatting the content and sort of squeezing it. As a matter of interest it also strips out Google AdSense ads. Cool.

W3C wakes up before they lose control over HTML

After years wasted on XHTML and XForms development and before WHATWG totally takes over HTML, W3C woke up and restarted their HTML activity. Just about time.

Yes, believe it or not, but after HTML 4.01, which was finished back in 1999, W3C did nothing to improve HTML.

Meanwhile Apple, Mozilla and Opera, disappointed by the W3C's lack of interest in further HTML development, created WHATWG (Web Hypertext Application Technology Working Group), whose tagline is no less than "Maintaining and evolving HTML since 2004".

It's interesting to note that another major browser vendor never participated in WHATWG, and guess who is chairing the new W3C HTML working group? Chris Wilson (Microsoft) and Dan Connolly (W3C/MIT), and if you look as far back as 2006/11, it was Chris Wilson only.

The new W3C HTML working group is scheduled to deliver a new HTML version (in both classic HTML and XML syntaxes) by 2010. That's only 3 years. I doubt W3C as it is now can deliver something as important as HTML5 in just 3 years.

WHATWG must be pretty much pissed off now. From the WHATWG blog:

Surprisingly, the W3C never actually contacted the WHATWG during the chartering process. However, the WHATWG model has clearly had some influence on the creation of this group, and the charter says that the W3C will try to “actively pursue convergence with WHATWG”. Hopefully they will get in contact soon.

Well, actually the charter says more:

Web Hypertext Application Technology Working Group (WHATWG)
The HTML Working Group will actively pursue convergence with WHATWG, encouraging open participation within the bounds of the W3C patent policy and available resources.

Good enough. 

I'm only afraid that W3C can kill WHATWG and then bury HTML5 in endless meetings settling dependencies, IP issues, conflicting corporate interests and such. W3C can easily spend 5-10 years on HTML5.