C# LINQ

LINQ

  • SQL-like syntax in C# and Visual Basic
  • Query any type of collection (IEnumerable<T>)
  • Query external data sources (xml, databases, JSON, CSV)

SQL Query vs LINQ Query Syntax

SQL LINQ
SELECT * FROM Products FROM prod IN Products SELECT prod
SELECT Name FROM Products FROM prod in Products SELECT prod.Name
SELECT * FROM Products WHERE ListPrice > 10 FROM prod in Products WHERE prod.ListPrice > 10 SELECT prod

Two LINQ Syntaxes

Query Method
FROM prod in Products SELECT prod Products.Select(prod => prod)
FROM prod in Products SELECT prod.Name Products.Select(prod => prod.Name)
FROM prod in Products WHERE prod.ListPrice > 10 SELECT prod Products.Where(prod => prod.ListPrice > 10).Select(prod => prod)

LINQ Operations

  • Select
  • Projection (Change shape, select only certain properties from an object)
  • Order (ascending/descending)
  • Get an Element (find, first, last, single)
  • Filter (where)
  • Iteration/Partioning (foreach, skip, take)
  • Quantify (any, all, contains)
  • Set Comparison (equal, except, intersection)
  • Set Operations (union, concat)
  • Joining (inner joins, outer joins)
  • Grouping (groupby, subquery, groupjoin)
  • Distinct Sets (distinct)
  • Aggregation (count, sum, min, max, average)

Select and Order Operations

Projection (only select specific columns from an object)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products
select new Product
{
ProductID = prod.ProductID,
Name = prod.Name,
Size = prod.Size,
}).ToList();
}
else
{
// Method Syntax
Products = Products.Select(prod => new Product
{
ProductID = prod.ProductID,
Name = prod.Name,
Size = prod.Size
}).ToList();
}

Projection with Anonymous Class

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
if (UseQuerySyntax)
{
// Query Syntax
var products = (from prod in Products
select new
{
Identifier = prod.ProductID,
ProductName = prod.Name,
ProductSize = prod.Size
});

// Loop through anonymous class
foreach (var prod in products)
{
sb.AppendLine($"Product ID: {prod.Identifier}");
sb.AppendLine($" Product Name: {prod.ProductName}");
sb.AppendLine($" Product Size: {prod.ProductSize}");
}
}
else
{
// Method Syntax
var products = Products.Select(prod => new
{
Identifier = prod.ProductID,
ProductName = prod.Name,
ProductSize = prod.Size
});

// Loop through anonymous class
foreach (var prod in products)
{
sb.AppendLine($"Product ID: {prod.Identifier}");
sb.AppendLine($" Product Name: {prod.ProductName}");
sb.AppendLine($" Product Size: {prod.ProductSize}");
}
}

Ordering Data

When using Method Syntax to order data, the .Select() method is optional when you are simply selecting the complete object as the return value

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public void OrderBy()
{
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products orderby prod.Name select prod).ToList();
}
else
{
// Method Syntax
Products = Products.OrderBy(prod => prod.Name).ToList();
}

ResultText = $"Total Products: {Products.Count}";
}
  • Order by Descending
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public void OrderByDescending()
{
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products orderby prod.Name descending select prod).ToList();
}
else
{
// Method Syntax
Products = Products.OrderByDescending(prod => prod.Name).ToList();
}

ResultText = $"Total Products: {Products.Count}";
}

Order by Two Fields

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public void OrderByTwoFields()
{
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products orderby prod.Color descending, prod.Name).ToList();
}
else
{
// Method Syntax
Products = Products.OrderByDescending(prod => prod.Color).ThenBy(prod => prod.Name).ToList();
}

ResultText = $"Total Products: {Products.Count}";
}

Extract Multiple or Single Elements

Where Expression

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
public void WhereExpression()
{
string search = "L";

if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products where prod.Name.StartsWith(search) select prod).ToList();

}
else
{
// Method Syntax
Products = Products.Where(prod => prod.Name.StartsWith(search)).ToList();
}

ResultText = $"Total Products: {Products.Count}";
}
  • Where with multiple fields
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
public void WhereTwoFields()
{
string search = "L";
decimal cost = 100;

if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products where prod.Name.StartsWith(search) && prod.StandardCost > cost).ToList();
}
else
{
// Method Syntax
Products = Products.Where(prod => prod.Name.StartsWith(search) && prod.StandardCost > cost).ToList();
}

ResultText = $"Total Products: {Products.Count}";
}
  • Using Custom Extension Method

Extension Method

Extension methods enable you to “add” methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type. Extension methods are static methods, but they’re called as if they were instance methods on the extended type.

The most common extension methods are the LINQ standard query operators that add query functionality to the existing System.Collections.IEnumerable and System.Collections.Generic.IEnumerable<T> types.

Extension methods are defined as static methods but are called by using instance method syntax. Their first parameter specifies which type the method operates on. The parameter is preceded by the this modifier.

Extension method is just a static method under the hood.

In the example, the result of (from prod in Products select prod) is an IEnumerable<Product> which is why ByColor() can be applied to this

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
public static IEnumerable<Product> ByColor(
this IEnumerable<Product> query, string color)
{
return query.Where(prod => prod.Color == color);
}

if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products select prod).ByColor(search).ToList();
}
else
{
// Method Syntax
Products = Products.ByColor(search).ToList();
}

Select a Single Item

  • First, will throw an Exception if item not found.
  • Last, same as First
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
try
{
if (UseQuerySyntax)
{
// Query Syntax
value = (from prod in Products select prod).First(prod => prod.Color == search);
}
else
{
// Method Syntax
value = Products.First(prod => prod.Color == search);
}

ResultText = $"Found: {value}";
}
catch
{
ResultText = "Not Found";
}
  • FirstOrDefault, value will be null if item not found, will not throw an Exception
  • LastOrDefault, same as FirstOrDefault
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
if (UseQuerySyntax)
{
// Query Syntax
value = (from prod in Products select prod).First(prod => prod.Color == search);
}
else
{
// Method Syntax
value = Products.First(prod => prod.Color == search);
}

if (value == null)
{
ResultText = "Not Found";
}
else
{
ResultText = $"Found: {value}";
}
  • Single, will throw an Exception if item not found or multiple items found.

Single is supposed to be used to found a unique item, like primary key. The Exception thrown if multiple items are found is InvalidOperationException.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
try
{
if (UseQuerySyntax)
{
// Query Syntax
value = (from prod in Products select prod).Single(prod => prod.ProductID == search);
}
else
{
// Method Syntax
value = Products.Single(prod => prod.ProductID == search);
}

ResultText = $"Found: {value}";
}
catch
{
ResultText = "Not Found, or multiple elements found";
}
  • SingleOrDefault, value will be NULL if no item found, but will still throw an Exception if multiple items found.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
try
{
if (UseQuerySyntax)
{
// Query Syntax
value = (from prod in Products select prod).SingleOrDefault(prod => prod.ProductID == search);
}
else
{
// Method Syntax
value = Products.SingleOrDefault(prod => prod.ProductID == search);
}

if (value == null)
{
ResultText = "Not Found";
}
else
{
ResultText = $"Found: {value}";
}
}
catch
{
ResultText = "Multiple elements found";
}

Extract Distinct Values, Assign Values and Partition Collections

Set Operations

  • Iterate over entire collection
  • Set a property value in collection (similar to a SQL UPDATE)

In this example, the object has a NameLength property and we need to assign the value prop.Name.Length to this prop.NameLength property.

For the Query approach, we need to declare a tmp variable because it has to be a statement, not an assignment. But the Method approach doesn’t have this issue.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
public void ForEach()
{
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products
let tmp = prod.NameLength = prod.Name.Length
select prod).ToList();

}
else
{
// Method Syntax
Products.ForEach(prod => prod.NameLength = prod.Name.Length);
}

ResultText = $"Total Products: {Products.Count}";
}

In this example we have a Sales object, and we need to calculate how many item we have sold for a certain product.

1
2
3
4
5
private decimal SalesForProduct(Product prod)
{
return Sales.Where(sale => sale.ProductID == prod.ProductID)
.Sum(sale => sale.LineTotal);
}

We could then use this to set the TotalSales property for each Product

1
2
3
4
5
6
7
8
9
10
11
12
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products
let tmp = prod.TotalSales = SalesForProduct(prod)
select prod).ToList();
}
else
{
// Method Syntax
Products.ForEach(prod => prod.TotalSales = SalesForProduct(prod));
}

Take Specific Amount of Elements

Take the first 5 elements from the list

1
2
3
4
5
6
7
8
9
10
11
12
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products
orderby prod.Name
select prod).Take(5).ToList();
}
else
{
// Method Syntax
Products = Products.OrderBy(prod => prod.Name).Take(5).ToList();
}
  • TakeWhile(): take elements while condition is true
1
2
3
4
5
6
7
8
9
10
11
12
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products
orderby prod.Name
select prod).TakeWhile(prod => prod.Name.StartsWith("A")).ToList();
}
else
{
// Method Syntax
Products = Products.OrderBy(prod => prod.Name).TakeWhile(prod => prod.Name.StartsWith("A")).ToList();
}

Skip specific amount of elements

1
2
3
4
5
6
7
8
9
10
11
12
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products
orderby prod.Name
select prod).Skip(20).ToList();
}
else
{
// Method Syntax
Products = Products.OrderBy(prod => prod.Name).Skip(20).ToList();
}
  • Skip elements while condition is true
1
2
3
4
5
6
7
8
9
10
11
12
if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in Products
orderby prod.Name
select prod).SkipWhile(prod => prod.Name.StartsWith("A")).ToList();
}
else
{
// Method Syntax
Products = Products.OrderBy(prod => prod.Name).SkipWhile(prod => prod.Name.StartsWith("A")).ToList();
}

Select Distinct Values

1
2
3
4
5
6
7
8
9
10
if (UseQuerySyntax)
{
// Query Syntax
colors = (from prod in Products select prod.Color).Distinct().ToList();
}
else
{
// Method Syntax
colors = Products.Select(prod => prod.Color).Distinct().ToList();
}

Identify What Kind of Data is Contained in Collections

All() will return a true or false value to see if all items meet the requirement.

1
2
3
4
5
6
7
8
9
10
if (UseQuerySyntax)
{
// Query Syntax
value = (from prod in Products select prod).All(prod => prod.Name.Contains(search));
}
else
{
// Method Syntax
value = Products.All(prod => prod.Name.Contains(search));
}

Any() will return true if any of the item meet the requirement. And will return false will all items doesn’t meet the requirement.

1
2
3
4
5
6
7
8
9
10
if (UseQuerySyntax)
{
// Query Syntax
value = (from prod in Products select prod).Any(prod => prod.Name.Contains(search));
}
else
{
// Method Syntax
value = Products.Any(prod => prod.Name.Contains(search));
}

Contains can be used in primitive types and objects

1
2
3
4
5
6
7
8
9
10
11
12
13
bool value = true;
List<int> numbers = new List<int> { 1, 2, 3, 4, 5 };

if (UseQuerySyntax)
{
// Query Syntax
value = (from num in numbers select num).Contains(3);
}
else
{
// Method Syntax
value = numbers.Contains(3);
}

When using Contains() on a collection of objects. We need to use EqualityComparer, because by default objects are compared by reference not value.

1
2
3
4
5
6
7
8
9
public class ProductIdComparer : EqualityComparer<Product> {
public override bool Equals(Product x, Product y) {
return (x.ProductID == y.ProductID);
}
}

public override int GetHashCode(Product obj) {
return obj.ProductID.GetHashCode();
}

Now when calling Contains() method, we pass in the Comparer object, so it will loop through all products and compare each one with our prodToFind Product. The Comparer will use prodToFind as the first parameter and each Product as the second parameter.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
int search = 744;
bool value = true;
ProductIdComparer pc = new ProductIdComparer();
Product prodToFind = new Product { ProductID = search };

if (UseQuerySyntax)
{
// Query Syntax
value = (from prod in Products select prod).Contains(prodToFind, pc);
}
else
{
// Method Syntax
value = Products.Contains(prodToFind, pc);
}

Compare and Union Two Collections

SequenceEqual()

Compares two collections for equlity.

  • For Simple data types (int, decimal, boolean…) it checks values
  • For object data types checks reference
  • If you want to compare values in objects, you need to create a comparer class to check the values inside each properties.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// Use SequenceEqual on primitives
bool value = true;
// Create a list of numbers
List<int> list1 = new List<int> { 1, 2, 3, 4, 5 };
// Create a list of numbers
List<int> list2 = new List<int> { 1, 2, 3, 4, 5 };

if (UseQuerySyntax)
{
// Query Syntax
value = (from num in list1 select num).SequenceEqual(list2);
}
else
{
// Method Syntax
value = list1.SequenceEqual(list2);
}

If we want to compare each object in a collection by value, we need to create a new Comparer override the Compare method.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
bool value = true;
ProductComparer pc = new ProductComparer();
// Load all Product Data
List<Product> list1 = ProductRepository.GetAll();
// Load all Product Data
List<Product> list2 = ProductRepository.GetAll();

// Remove an element from 'list1' to make the collections different
list1.RemoveAt(0);

if (UseQuerySyntax)
{
// Query Syntax
value = (from num in list1 select num).SequenceEqual(list2, pc);
}
else
{
// Method Syntax
value = list1.SequenceEqual(list2, pc);
}

Except

  • It finds all values in one list, but not the other, returns a collection of items.
  • Similar to Contains and SequenceEqual, if we are comparing primitive types, we can just use it, but if we are comparing objects values, we need to create a Comparer class and override the Compare method.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
List<int> exceptions = new List<int>();
// Create a list of numbers
List<int> list1 = new List<int> { 1, 2, 3, 4 };
// Create a list of numbers
List<int> list2 = new List<int> { 3, 4, 5 };

if (UseQuerySyntax)
{
// Query Syntax
exceptions = (from num in list1 select num).Except(list2).ToList();
}
else
{
// Method Syntax
exceptions = list1.Except(list2).ToList();
}

Using Except on a collection of objects

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
ProductComparer pc = new ProductComparer();
// Load all Product Data
List<Product> list1 = ProductRepository.GetAll();
// Load all Product Data
List<Product> list2 = ProductRepository.GetAll();

// Remove all products with color = "Black" from 'list2'
// to give us a difference in the two lists
list2.RemoveAll(prod => prod.Color == "Black");

if (UseQuerySyntax)
{
// Query Syntax
Products = (from prod in list1 select prod).Except(list2, pc).ToList();
}
else
{
// Method Syntax
Products = list1.Except(list2, pc).ToList();
}

Intersect

  • It finds all values in common between both lists
  • Similar to Contains, SequenceEqual and Except, it compares values for primitive types and references for objects. We need to create comparer class to check values in properties.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
ProductComparer pc = new ProductComparer();
// Load all Product Data
List<Product> list1 = ProductRepository.GetAll();
// Load all Product Data
List<Product> list2 = ProductRepository.GetAll();

// Remove 'black' products from 'list1'
list1.RemoveAll(prod => prod.Color == "Black");
// Remove 'red' products from 'list2'
list2.RemoveAll(prod => prod.Color == "Red");

if (UseQuerySyntax)
{
// Query Syntax
Products = (from num in list1 select num).Intersect(list2, pc).ToList();
}
else
{
// Method Syntax
Products = list1.Intersect(list2, pc).ToList();
}

Unions

  • It adds the contents of two lists together.
  • Union() checks for duplicates
  • Concat() does not check for duplicates
  • Use comparer class with objects

Union() need Comparer to eliminate duplicates

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
ProductComparer pc = new ProductComparer();
// Load all Product Data
List<Product> list1 = ProductRepository.GetAll();
// Load all Product Data
List<Product> list2 = ProductRepository.GetAll();

if (UseQuerySyntax)
{
// Query Syntax
Products = (from num in list1 select num).Union(list2, pc).OrderBy(prod => prod.Name).ToList();
}
else
{
// Method Syntax
Products = list1.Union(list2, pc).OrderBy(prod => prod.Name).ToList();
}

Concat()

  • Adds the contents of two collections with duplicates
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// Load all Product Data
List<Product> list1 = ProductRepository.GetAll();
// Load all Product Data
List<Product> list2 = ProductRepository.GetAll();

if (UseQuerySyntax)
{
// Query Syntax
Products = (from num in list1 select num).Concat(list2).OrderBy(prod => prod.Name).ToList();
}
else
{
// Method Syntax
Products = list1.Concat(list2).OrderBy(prod => prod.Name).ToList();
}

Joining Two Collections Together

Inner Join

1
2
3
4
5
6
7
8
9
10
11
12
13
var query = Products.Join(Sales, prod => prod.ProductID, sale => sale.ProductID, (prod, sale) => new
{
prod.ProductID,
prod.Name,
prod.Color,
prod.StandardCost,
prod.ListPrice,
prod.Size,
sale.SalesOrderID,
sale.OrderQty,
sale.UnitPrice,
sale.LineTotal,
});

Inner Join with two fields

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
short qty = 6;

var query = Products.Join(
Sales,
prod => new { prod.ProductID, Qty = qty },
sale => new { sale.ProductID, Qty = sale.OrderQty },
(prod, sale) => new
{
prod.ProductID,
prod.Name,
prod.Color,
prod.StandardCost,
prod.ListPrice,
prod.Size,
sale.SalesOrderID,
sale.OrderQty,
sale.UnitPrice,
sale.LineTotal
});

Aggregating Data in Collections

Count()

1
2
3
4
value = Products.Count(prod => prod.Color == "Yellow");

// Another way using Where
value = Products.Where(prod => prod.Color == "Yellow").Count();

Min() and Max()

1
value = Products.Min(prod => prod.ListPrice);
1
value = Products.Max(prod => prod.ListPrice);

Average() and Sum()

1
value = Products.Average(prod => prod.ListPrice);
1
value = Products.Sum(prod => prod.ListPrice);

Custom Calculation using Aggregate()

The first parameter initialize an internal variable, which setup the start value.

The second parameter is an anonymous function which you pass the initial value and loop through each item in the collection

  • Aggregate Sum
1
value = Products.Aggregate(0m, (sum, prod) => sum += prod.ListPrice);
  • Aggregate Multiply
1
value = Products.Aggregate(0m, (sum, prod => sum += prod.ListPrice * prod.Qty));
  • Aggregate with GroupBy and Having
1
2
3
4
5
6
7
8
9
10
11
var stats = Products.GroupBy(sale => sale.Size)
.Where(sizeGroup => sizeGroup.Count() > 0)
.Select(sizeGroup => new {
Size = sizeGroup.Key,
TotalProducts = sizeGroup.Count(),
Max = sizeGroup.Max(s => s.ListPrice),
Min = sizeGroup.Min(s => s.ListPrice),
Average = sizeGroup.Average(s => s.ListPrice)
})
.OrderBy(result => result.Size)
.Select(result => result);

Deferred Execution

  • A LINQ query is a data structure ready to execute
  • Query is not executed until a value is needed
  • The execution happens with one of the folloing functions (foreach(), Count(), ToList(), OrderBy()…)

Streaming Operators

  • Results can be returned prior to the entire collection is read
  • Examples: Distinct(), GroupBy(), Join(), Select(), Skip(), Take(), Union(), Where()

Non-Streaming Operators

  • All data in collection must be read before a result can be returned
  • Examples: Except(), GroupBy(), GroupJoin(), Intersect(), Join(), OrderBy(), ThenBy()

The yield keyword

When write our own Filter function, we could use yield to make the function to be Streaming. So it returns data while looping through the collection.

1
2
3
4
5
6
7
public static IEnumrable<T> Filter<T> (this IEnumrable<T> source, Func<T, bool> predicate) {
foreach(var item in source) {
if (predicate(item)) {
yield return item;
}
}
}

In the below example, Where and Take are both Streaming Operators, so this query will loop through the collection until the requirement is met. That is when it found the first item that has Color red. It doesn’t need to go through the entire collection.

1
Products = Products.Where(prod => prod.Color == "red").Take(1).ToList();

However, in this example, because OrderBy() is an non-streaming operator, so it will loop through the entire list first, order them by prod.Name, then apply the Where condition. Non-streaming operator will go before the Streaming operator.

1
Products = Products.Where(prod => prod.Color == "red").OrderBy(prod => prod.Name).ToList();