Fast clean way to check if an array contains particular element in C# 3.0

| 4 Comments | No TrackBacks

I just started coding some real application with C# 3.0 (this is my favorite way of learning things) and immediately I can say I love it. C# 3.0 is going to be the best C# version ever. For me personally C# 1.0 was "hey look, kinda Java for Windows", C# 2.0 was "finally generics" and C# 3.0 is "wow, that's cool!".

Anyway, how do you check if a regular unsorted array contains particular element? Loop over elements checking each? Boring imperative crap.  C# 3.0 provides nice declarative solution via Contains<T>(this IEnumerable<T>, <T> value) extension method:

if (args.Contains("/help")) 
    PrintUsage();

That matches exactly Ruby include?() method:

PrintUsage() if args.include?("/help")

Alternatively you can use Any() extension method with lambda expression:

if (args.Any(arg => arg == "/help" || arg == "/?"))

"using System.Linq;" is all you need to turn this magic on.

I was wondering how fast extension methods are. So I put some quick benchmark code checking if a random data array contains some value using foreach loop, for loop, Contains() extension method and Any() extension method. The results I got were somehow surprising:

foreach loop search:            35 ms
for loop search:                28 ms
Contains() method search:       15 ms
Any() method search:            152 ms

Well, Any() method builds dynamic lambda expression in memory, so it's obviously the slowest one, but Contains() extension method somehow wins!

How come Contains() can beat almost twice in speed plain dumb for loop? I always new declarative style is the most optimization-friendly, but that was always kinda theory only.

Unfortunately I have no time to disassemble things, gotta go right now. Could be boxing or array boundaries checking or inlining. Can somebody shed some light into this? I'm including my benchmark code, chances are it's not fully correct. Run it in latest Visual Studio Orcas March CTP.

Anyway, it's good to know that really fast and the most elegant way to check if an array contains particular element in C# 3.0 seems to be using Contains() extension method.

using System;
using System.Diagnostics;
using System.Linq;

namespace Test
{
    class Program
    {        
        static void Main(string[] args)
        {
            int[] data = new int[10000000];
            //Fill in some random data
            Random rand = new Random();
            for (int i=0; i<data.Length; i++)
            {
                data[i] = rand.Next();
            }
            //Select some element for search
            int value = data[data.Length/2];

            //Warm up
            ContainsViaForEachLoop(data, value);
            ContainsViaForLoop(data, value);
            ContainsViaContainsExtMethod(data, value);
            ContainsViaAnyExtMethod(data, value);

            Stopwatch watch = new Stopwatch();
            watch.Start();
            if (ContainsViaForEachLoop(data, value))
            {
                watch.Stop();
                Console.WriteLine("foreach loop search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();            
            watch.Start();
            if (ContainsViaForLoop(data, value))
            {
                watch.Stop();
                Console.WriteLine("for loop search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();
            watch.Start();
            if (ContainsViaContainsExtMethod(data, value))
            {
                watch.Stop();
                Console.WriteLine("Contains() method search:\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
            watch.Reset();
            watch.Start();
            if (ContainsViaAnyExtMethod(data, value))
            {
                watch.Stop();
                Console.WriteLine("Any() method search:\t\t{0} ms", 
                    watch.ElapsedMilliseconds);
            }
        }

        public static bool ContainsViaForEachLoop(int[] data, int value)
        {
            foreach (int i in data)
            {
                if (i == value)
                {
                    return true;
                }
            }
            return false;
        }


        public static bool ContainsViaForLoop(int[] data, int value)
        {            
            for (int i=0; i<data.Length; i++)
            {
                if (data[i] == value)
                {
                    return true;
                }
            }
            return false;
        }

        public static bool ContainsViaContainsExtMethod(int[] data, int value)
        {
            return data.Contains(value);
        }

        public static bool ContainsViaAnyExtMethod(int[] data, int value)
        {
            return data.Any(i => i == value);
        }

    }
}

Related Blog Posts

No TrackBacks

TrackBack URL: http://www.tkachenko.com/cgi-bin/mt-tb.cgi/684

4 Comments

Check this: http://msdn.microsoft.com/en-us/library/system.collections.sortedlist.contains.aspx

Towards the middle of the page you will find this:

"This method uses a binary search algorithm; therefore, this method is an O(log n) operation..."

The loop methods that you tried are linear search algorithms O(n). The Any method, God knows!

Hi Oleg,

This is indeed very interesting finding!
I tried to replicate your results and found out that what you're saying is true for Debug solution configuration, but things look different if you set you config to Release from the VS toolbar. Here is a quick comparison on my box:

Debug x86:
foreach loop search: 35 ms
for loop search: 31 ms
Contains() method search: 6 ms
Any() method search: 66 ms

Release x86:
foreach loop search: 4 ms
for loop search: 6 ms
Contains() method search: 7 ms
Any() method search: 57 ms

Nikola Mihaylov
Visual Studio Product Team
Microsoft

Very good!!
I expanded your example by investigating also Array.IndexOf() method.
Well, it seems to be that Contains() in some way uses IndexOf().
I increased the size of array to 100 Millions!
...
 int[] data = new int[100000000];
...
 ContainsViaIndex(data,value);
...
  if (ContainsViaIndex(data, value))
  {
   watch.Stop();
   Console.WriteLine("Via Index:\t\t{0} ms", watch.ElapsedMilliseconds);
  }
  }
  public static bool ContainsViaIndex(int[] data, int value)
 {
  return Array.IndexOf(data, value)>-1 ? true : false;
 }
...

Results:

foreach loop search:    391 ms
for loop search:    412 ms
Contains() method search:  185 ms
Any() method search:   1011 ms
Via Index:     141 ms

Strange you didn't mention Array.IndexOf method

Leave a comment