March 15, 2007

Fast clean way to check if an array contains particular element in C# 3.0

I just started coding some real application with C# 3.0 (this is my favorite way of learning things) and immediately I can say I love it. C# 3.0 is going to be the best C# version ever. For me personally C# 1.0 was "hey look, kinda Java for Windows", C# 2.0 was "finally generics" and C# 3.0 is "wow, that's cool!".

Anyway, how do you check if a regular unsorted array contains particular element? Loop over elements checking each? Boring imperative crap.  C# 3.0 provides nice declarative solution via Contains<T>(this IEnumerable<T>, <T> value) extension method:

...
if (args.Contains("/help")) 
    PrintUsage();

That matches exactly Ruby include?() method:

PrintUsage() if args.include?("/help")

Alternatively you can use Any() extension method with lambda expression:

if (args.Any(arg => arg == "/help" || arg == "/?"))

"using System.Linq;" is all you need to turn this magic on.

I was wondering how fast extension methods are. So I put some quick benchmark code checking if a random data array contains some value using foreach loop, for loop, Contains() extension method and Any() extension method. The results I got were somehow surprising:

foreach loop search:            35 ms
for loop search:                28 ms
Contains() method search:       15 ms
Any() method search:            152 ms

Well, Any() method builds dynamic lambda expression in memory, so it's obviously the slowest one, but Contains() extension method somehow wins!

How come Contains() can beat almost twice in speed plain dumb for loop? I always new declarative style is the most optimization-friendly, but that was always kinda theory only.

Unfortunately I have no time to disassemble things, gotta go right now. Could be boxing or array boundaries checking or inlining. Can somebody shed some light into this? I'm including my benchmark code, chances are it's not fully correct. Run it in latest Visual Studio Orcas March CTP.

Anyway, it's good to know that really fast and the most elegant way to check if an array contains particular element in C# 3.0 seems to be using Contains() extension method.

using System;
using System.Diagnostics;
using System.Linq;

namespace Test
{
    class Program
    {        
        static void Main(string[] args)
        {
            int[] data = new int[10000000];
            //Fill in some random data
            Random rand = new Random();
            for (int i=0; i<data.Length; i++)
            {
                data[i] = rand.Next();
            }
            //Select some element for search
            int value = data[data.Length/2];

            //Warm up
            ContainsViaForEachLoop(data, value);
            ContainsViaForLoop(data, value);
            ContainsViaContainsExtMethod(data, value);
            ContainsViaAnyExtMethod(data, value);

            Stopwatch watch = new Stopwatch();
            watch.Start();
            if (ContainsViaForEachLoop(data, value))
            {
                watch.Stop();
                Console.WriteLine("foreach loop search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();            
            watch.Start();
            if (ContainsViaForLoop(data, value))
            {
                watch.Stop();
                Console.WriteLine("for loop search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();
            watch.Start();
            if (ContainsViaContainsExtMethod(data, value))
            {
                watch.Stop();
                Console.WriteLine("Contains() method search:\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();
            watch.Start();
            if (ContainsViaAnyExtMethod(data, value))
            {
                watch.Stop();
                Console.WriteLine("Any() method search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
        }

        public static bool ContainsViaForEachLoop(int[] data, int value)
        {
            foreach (int i in data)
            {
                if (i == value)
                {
                    return true;
                }
            }
            return false;
        }


        public static bool ContainsViaForLoop(int[] data, int value)
        {            
            for (int i=0; i<data.Length; i++)
            {
                if (data[i] == value)
                {
                    return true;
                }
            }
            return false;
        }

        public static bool ContainsViaContainsExtMethod(int[] data, int value)
        {
            return data.Contains(value);
        }

        public static bool ContainsViaAnyExtMethod(int[] data, int value)
        {
            return data.Any(i => i == value);
        }

    }
}