C# 3.0 and LINQ
Peter Himschoot
Applies to:U2U - Brussels
Anders Hejlsberg, chief designer of C#, unveiled C#’s newest version at PDC2005. Some of the most notable innovations are extension methods, lambda expressions, anonymous types, type inference, and LINQ (.NET Language Integrated Query). In this article we are going to look at these new features of C#, focusing on LINQ.
Meet LINQ
When we want to look up elements in an array that meet certain requirements, we typically write this in C# 2.0 using the foreach statement:
string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };
foreach (string n in names)
{
    if (n.Length == 3)
        Console.WriteLine(n);
}
Using C# 3.0 and LINQ we can write this like so:
string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };
var result = from n in names
             where n.Length == 3
             select n;
foreach (string s in result)
    Console.WriteLine(s);
Let’s now see how this simple syntax works through the new language extensions in C# 3.0.
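As a runnable sketch, the two versions can be placed side by side to confirm that they select the same names. The class and method names (MeetLinq, WithForeach, WithQuery) are illustrative and not part of the article, and the sketch assumes the System.Linq namespace that the released .NET Framework uses for the query operators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MeetLinq
{
    // C# 2.0 style: an explicit loop with an if test.
    public static List<string> WithForeach(string[] names)
    {
        List<string> result = new List<string>();
        foreach (string n in names)
        {
            if (n.Length == 3)
                result.Add(n);
        }
        return result;
    }

    // C# 3.0 style: the same selection written as a query expression.
    public static List<string> WithQuery(string[] names)
    {
        var result = from n in names
                     where n.Length == 3
                     select n;
        return result.ToList();
    }

    public static void Main()
    {
        string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };
        foreach (string s in WithQuery(names))
            Console.WriteLine(s); // prints Jan, Wim and Ann, one per line
    }
}
```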
C# 2.0 Conditional Enumerations
When we write query code with somewhat complex filtering, combinations of foreach and if quickly become hard to read. If we could build the condition into the foreach, we would get simpler code, as in the following pseudo-code:
foreach ( string n in names.Where( n.Length == 3 ))
{
    Console.WriteLine(n);
}
How could we build something like this in C# 2.0? Let’s start by looking at the equivalent statement written without the foreach keyword, i.e. using the IEnumerator interface:
IEnumerator ie = (names as IEnumerable).GetEnumerator();
while (ie.MoveNext())
{
    string n = (string) ie.Current;
    if (n.Length == 3)
        Console.WriteLine(n);
}
The C# compiler actually transforms a foreach statement into this kind of code; foreach is called syntactic sugar because it makes writing this kind of code a lot easier (developers like their code sweet and their coffee black). An enumerator in .NET is an object that implements the IEnumerator interface and works like a cursor on our data, allowing iteration over a collection; the collection hands out its enumerator through the IEnumerable interface. So if we want to enumerate over a subset of the collection, we can do this by building our own enumerator. C# 2.0 generics and iterators (the yield keyword) allow us to build a collection with a Where method that takes a filter delegate as argument:
public delegate bool Filter<T>(T element);
public class MyStringCollection
{
    private string[] elements;
    public MyStringCollection(string[] elements)
    {
        this.elements = elements;
    }
    public IEnumerable<string> Where(Filter<string> f)
    {
        foreach (string element in elements)
            if (f(element))
                yield return element;
    }
}
The MyStringCollection class’ Where method returns an enumerable sequence that yields only the strings that match the condition, for example only strings with length equal to 3:
MyStringCollection col = new MyStringCollection(names);
foreach (string n in col.Where(delegate(string s) { return s.Length == 3; }))
{
    Console.WriteLine(n);
}
Please note that we’re using a C# 2.0 anonymous method to implement the filter.
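To see what the anonymous method saves us, the following sketch passes the same filter twice: once as a delegate created from a named method and once as an inline anonymous method. FilterDemo, HasLengthThree and Collect are hypothetical names introduced only for this illustration; Filter<T> and MyStringCollection follow the definitions above.

```csharp
using System;
using System.Collections.Generic;

public delegate bool Filter<T>(T element);

public class MyStringCollection
{
    private readonly string[] elements;

    public MyStringCollection(string[] elements)
    {
        this.elements = elements;
    }

    // Iterator: yields only the elements accepted by the filter.
    public IEnumerable<string> Where(Filter<string> f)
    {
        foreach (string element in elements)
            if (f(element))
                yield return element;
    }
}

public static class FilterDemo
{
    // Hypothetical named method doing the work of the anonymous method.
    public static bool HasLengthThree(string s)
    {
        return s.Length == 3;
    }

    // Copies a filtered sequence into a list so the variants can be compared.
    public static List<string> Collect(IEnumerable<string> source)
    {
        return new List<string>(source);
    }

    public static void Main()
    {
        string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };
        MyStringCollection col = new MyStringCollection(names);

        // Delegate created explicitly from a named method.
        List<string> named = Collect(col.Where(new Filter<string>(HasLengthThree)));

        // Same filter written inline as a C# 2.0 anonymous method.
        List<string> inline = Collect(col.Where(delegate(string s) { return s.Length == 3; }));

        Console.WriteLine(string.Join(", ", named.ToArray()));  // Jan, Wim, Ann
        Console.WriteLine(string.Join(", ", inline.ToArray())); // Jan, Wim, Ann
    }
}
```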
Lambda-expressions
Look again at the anonymous method from the previous example:
delegate(string s){return s.Length == 3;}
We have to write an anonymous method just to check the length of a string. Lambda-expressions make this a lot easier; the previous piece of code looks like this using a lambda-expression:
s => s.Length == 3
Lambda-expressions really come in handy when writing filter expressions like in our Where method:
MyStringCollection col = new MyStringCollection(names);
foreach (string n in col.Where( s => s.Length == 3 ))
{
    Console.WriteLine(n);
}
Lambda-expressions are really syntactic sugar for creating an instance of a delegate defined like:
delegate R Func< T, R >(T element);
T is the type of the parameter on the left side of the => symbol, and R is the type of the expression on the right side. So in our example:
s => s.Length == 3
T is type string, and R is type bool. So we can create a reference to our filter expression like this (a reference to a delegate object):
Func<string, bool> filter = s => s.Length == 3;
So using a lambda-expression we can change our Where method to look like this:
public IEnumerable<string> Where(Func<string, bool> f) {...}
Let’s take this one step further and create a generic static Where method to be used for any collection:
public static class Query
{
    public static IEnumerable<T> Where<T>(IEnumerable<T> col, Func<T,bool> f)
    {
        foreach (T element in col)
        {
            if (f(element))
                yield return element;
        }
    }
}
C# 2.0 generics allow us to define a class with a static generic method. Now to iterate over our names collection where strings have length equal to 3:
foreach (string s in Query.Where<string>(names, s => s.Length == 3)) {...}
The C# compiler is even capable of inferring the type argument from the collection we pass in, so we can write this even shorter:
foreach (string s in Query.Where(names, s => s.Length == 3)) {...}
Wouldn’t we like to make this even shorter, using a Where method of the collection itself?
foreach ( string s in names.Where( s => s.Length == 3 )) {...}
However this would mean that every collection needs to have a Where method, and we can only do this for our own custom collections because .NET’s built-in collections don’t have this. Or we can use C# 3.0’s extension methods.
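A small sketch of the static generic helper in use may make the "any collection" claim concrete: the same Query.Where call filters a string array and a List<int>, with the type argument inferred in both cases. QueryDemo and its sample data are assumptions made for illustration; Func<T, bool> is taken from the System namespace of the released framework.

```csharp
using System;
using System.Collections.Generic;

public static class Query
{
    // Generic filter: works for any IEnumerable<T>.
    public static IEnumerable<T> Where<T>(IEnumerable<T> col, Func<T, bool> f)
    {
        foreach (T element in col)
        {
            if (f(element))
                yield return element;
        }
    }
}

public static class QueryDemo
{
    public static void Main()
    {
        string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };
        List<int> numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

        // T is inferred as string from the array argument.
        foreach (string name in Query.Where(names, n => n.Length == 3))
            Console.WriteLine(name);

        // T is inferred as int from the list argument.
        foreach (int number in Query.Where(numbers, x => x % 2 == 0))
            Console.WriteLine(number);
    }
}
```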
Extension methods
Extension methods let us add methods to an existing type without changing that type, for example:
public static class StringExtensions
{
    public static int ToInt32( this string s )
    {
        return int.Parse(s);
    }
}
Note that an extension method is a static method, and the first argument is of the type we want to extend. Also note the special this keyword. When the class StringExtensions is in scope (so the compiler can find it) we can write:
string five = "5";
int i = five.ToInt32();
So now we can rewrite our Where method as an extension method like this:
public static IEnumerable<T> Where<T>(
    this IEnumerable<T> col, Func<T, bool> f)
{
    foreach (T element in col)
    {
        if (f(element))
            yield return element;
    }
}
Because the this argument is of type IEnumerable<T> our Where method can be used to extend any IEnumerable<T> type, for example:
foreach ( string s in names.Where( s => s.Length == 3 )) {...}
Type Inference
When declaring variables in C# 2.0 (and in 1.0 too) we have to state the variable’s type explicitly, even when we immediately initialize it with an object of that same type, for example:
List<Employee> employees = new List<Employee>();
If we want to, we can ask the C# 3.0 compiler to infer the type of the variable by looking at the right side of the assignment, for example:
var employees = new List<Employee>();
Please note that the employees variable is still of type List<Employee>; it is not a new kind of VARIANT:
employees = new Department(); // Compiler error
This syntax simply makes it a lot easier to declare and initialize variables, for example:
var department = new Dictionary< Department, List<Employee>>();
instead of:
Dictionary< Department, List<Employee>> department = new Dictionary< Department, List<Employee>>();
Object Initialization Expressions and Anonymous Types
Taking the iteration example a little further, we could also add an extension method to convert a list of one type to a list of another type. The Project method converts a collection of T’s to a collection of K’s:
public static IEnumerable<K> Project<T, K>(
    this IEnumerable<T> col, Func<T, K> p)
{
    foreach (T element in col)
    {
        yield return p(element);
    }
}
For example, if we have an Employee type:
class Employee
{
    public Employee(string name) { this.name = name; }
    private string name;
    public string Name { get { return name; } set { name = value; } }
    private int age;
    public int Age { get { return age; } set { age = value; } }
}
We can convert our list of names to a list of Employees (but only with names of length 3):
var employees = names.Where(n => n.Length == 3)
                     .Project( empName => new Employee(empName));
foreach (Employee e in employees) { ... }
If our Employee class didn’t have a constructor taking a string argument, we could use an object initialization expression instead, for example:
new Employee { Name = empName , Age = 18 }
You may note that this syntax looks a lot like the initialization of attributes; it is conceptually equivalent to:
Employee e = new Employee();
e.Name = empName;
e.Age = 18;
In some cases the Employee class is only used as a kind of temporary storage, and we can use an anonymous type (again a new C# feature):
...( new { Name = empName, Age = 18 } )
This anonymous type has two properties: Name of type string and Age of type int.
Our previous example thus becomes:
var employees = names.Where(n => n.Length == 3)
                     .Project( empName => new {Name = empName, Age = 18});
The importance of the var keyword now becomes clear, because declaring an IEnumerable<T> where T is an anonymous type wouldn’t work (how would you write this?).
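The pieces introduced so far (extension methods, lambda-expressions, anonymous types and var) can be combined into one small program. This is only a sketch: the Sketch namespace, the SequenceExtensions and AnonymousTypeDemo classes and the Describe helper are hypothetical names. The hand-written Where and Project live in the same namespace as the calling code so that they are found before any framework operators with the same name.

```csharp
using System;
using System.Collections.Generic;

namespace Sketch
{
    // Hand-written operators, following the Where and Project methods above.
    public static class SequenceExtensions
    {
        public static IEnumerable<T> Where<T>(
            this IEnumerable<T> col, Func<T, bool> f)
        {
            foreach (T element in col)
            {
                if (f(element))
                    yield return element;
            }
        }

        public static IEnumerable<K> Project<T, K>(
            this IEnumerable<T> col, Func<T, K> p)
        {
            foreach (T element in col)
            {
                yield return p(element);
            }
        }
    }

    public static class AnonymousTypeDemo
    {
        // Filters the names, projects them to an anonymous type,
        // and formats each element as "Name (Age)".
        public static List<string> Describe(string[] names)
        {
            // var is required: the element type has no name we could write.
            var employees = names.Where(n => n.Length == 3)
                                 .Project(empName => new { Name = empName, Age = 18 });

            List<string> lines = new List<string>();
            foreach (var e in employees)
                lines.Add(e.Name + " (" + e.Age + ")");
            return lines;
        }

        public static void Main()
        {
            string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };
            foreach (string line in Describe(names))
                Console.WriteLine(line); // Jan (18), Wim (18), Ann (18)
        }
    }
}
```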
Language Integrated Query (LINQ)
C# 3.0 also gives us a new syntax (LINQ) that gets translated to the appropriate extension methods, lambda-expressions, etc. For example:
var result = from n in names
             where n.Length == 3
             select n;
… is equivalent to:
var result = names.Where(n => n.Length == 3);
Note that result is an enumerable query and not the final result. The selection only takes place when you start enumerating it, for example using a foreach statement:
foreach ( string s in result ) ...
This construct calls the IEnumerator<T>’s MoveNext method, which (simply put) runs the body of the Where extension method:
IEnumerator<string> ie = result.GetEnumerator();
while( ie.MoveNext() ) ...
So the extension methods are only executed when you iterate over the result. This way execution is a lot more memory-efficient, because there are far fewer intermediate results, or even none. Let’s look at a more complex example:
from n in names
    where n.Length == 3
    orderby n
    select new { Name = n, Length = n.Length };
In this case we get an enumeration over a collection of anonymously typed objects in alphabetical (ascending) order.
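The deferred execution described above can be observed directly. In the following sketch (DeferredDemo and Run are illustrative names, and the System.Linq namespace of the released framework is assumed), the source array is changed after the query is defined but before it is enumerated, and the change shows up in the output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static List<string> Run()
    {
        string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };

        // Defining the query does not run the filter yet.
        var query = from n in names
                    where n.Length == 3
                    orderby n
                    select new { Name = n, Length = n.Length };

        // Change the source after the query has been defined.
        names[0] = "Bob";

        // Enumeration happens here, so the filter sees "Bob".
        List<string> seen = new List<string>();
        foreach (var item in query)
            seen.Add(item.Name);
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run().ToArray())); // Ann, Bob, Jan, Wim
    }
}
```

If the filter had run when the query was defined, "Bob" would be missing from the output.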
Let’s now take a look at how to do a join of two collections:
var pairs = from a in names, b in children
            where b.Father.Name == a
            select new { Name = a, Child = b };
We get an enumeration over a collection of pairs of the father’s name and the child.
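The comma-separated from clause is the syntax of the PDC 2005 preview compiler. In the C# 3.0 compiler that eventually shipped, a second source is introduced with its own from clause (or with a join ... on ... equals clause). A minimal sketch of the same query in that form follows; the Person type, the JoinDemo class and the sample data are assumptions, since the article does not show how children is declared.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; the article only implies that b.Father.Name exists.
public class Person
{
    public string Name { get; set; }
    public Person Father { get; set; }
}

public static class JoinDemo
{
    // Pairs every name with the children whose father has that name.
    public static List<string> FatherChildPairs(string[] names, List<Person> children)
    {
        var pairs = from a in names
                    from b in children
                    where b.Father.Name == a
                    select new { Name = a, Child = b };

        List<string> lines = new List<string>();
        foreach (var p in pairs)
            lines.Add(p.Name + " -> " + p.Child.Name);
        return lines;
    }

    public static void Main()
    {
        Person jan = new Person { Name = "Jan" };
        Person wim = new Person { Name = "Wim" };
        List<Person> children = new List<Person>
        {
            new Person { Name = "Lotte", Father = jan },
            new Person { Name = "Tom", Father = wim },
            new Person { Name = "Eva", Father = jan }
        };
        string[] names = { "Peter", "Jan", "Wim", "Patrick", "Ann" };

        foreach (string line in FatherChildPairs(names, children))
            Console.WriteLine(line);
    }
}
```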
Or we have a collection of customers and orders and we would like to retrieve each customer’s orders since 1998:
var orders = from c in customers,
                  o in c.Orders
             where o.OrderDate >= new DateTime(1998, 1, 1)
             select new {c.CustomerID, o.OrderID, o.OrderDate};
Or we would like to order our customers by city, then by name:
var sortedCustomers = from c in customers
                      orderby c.City, c.Name
                      select c;
We can also group, for example to create an index:
var wordGroups = from n in names
                 group n by n[0] into g
                 select new {FirstLetter = g.Key, Words = g.Group};
foreach (var g in wordGroups)
{
    Console.WriteLine("Words that start with the letter '{0}':",
        g.FirstLetter);
    foreach (var w in g.Words)
    {
        Console.WriteLine(w);
    }
}
DLINQ and XLINQ
The power of LINQ is that we will be able to use the same LINQ syntax for collections, XML, and relational databases, and more! With DLINQ, for example, we map a class to a database table using attributes:
[Table(Name="Customers")]
public class Customer {
    [Column]
    public string Name;
    [Column]
    public int Age;
    [Column]
    public bool IsVip;
}
And then write a query in C#:
Table<Customer> customers = ...
Table<Orders> orders = ...
var query = from c in customers, o in orders
            where o.Customer == c.Name
            select new { c.Name, o.OrderID, o.Amount, c.Age };
Actually, DLINQ uses its own version of the Where extension method (and others) that uses an expression tree to represent the query (of type Expression<T>). This tree can then be converted to a SQL query which is sent to the database:
SELECT [t0].[Age], [t1].[Amount], [t0].[Name], [t1].[OrderID]
FROM [Customers] AS [t0], [Orders] AS [t1]
WHERE [t1].[Customer] = [t0].[Name]
Conclusion
With C# 2.0 Microsoft started the trend towards language constructs to build simpler but still powerful code, using generics, the yield keyword, anonymous methods, etc. They continued this trend with C# 3.0, and now we can extend types with extension methods, write delegates with lambda-expressions, and write database queries in C# instead of SQL.
References
To learn more about LINQ and download the C# 3.0 compiler (VS2005 RTM), see the C# 3.0 web-site.