Parallel Query Execution with PLINQ

Parallel LINQ (PLINQ) is a parallel implementation of LINQ (Language-Integrated Query). PLINQ partitions the data source into segments and runs the query over those segments concurrently on separate worker threads, using the multiple cores of a CPU to improve performance. In this tutorial, we'll explore PLINQ and how it can reduce query execution time.

PLINQ Syntax

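PLINQ uses the same syntax as LINQ, with the addition of the AsParallel() extension method, which turns a LINQ to Objects query into a PLINQ query. AsParallel() can be called on any IEnumerable<T> source, including the result of another LINQ to Objects query. Every query operator that follows it in the chain then uses the parallel implementation.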

Here's an example that demonstrates PLINQ syntax:

var parallelResult = from num in Enumerable.Range(1, 1000000).AsParallel()
                     where num % 2 == 0
                     select num;

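In the above example, Enumerable.Range() produces the numbers from 1 to 1,000,000, and AsParallel() turns that sequence into a parallel query. The where clause discards the odd numbers, so the query yields only the even ones. Like any LINQ query, it does not run until its results are enumerated. Because no ordering was requested, the results may arrive in a different order than the source.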
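The same query can also be written in method syntax. The following is a minimal sketch, not part of the original example: the class and method names are hypothetical, and AsOrdered() is added on the assumption that the caller wants the results in source order, which PLINQ does not guarantee by default.

```csharp
using System;
using System.Linq;

public static class PlinqSyntaxDemo
{
    // Hypothetical helper: returns the even numbers in 1..count, in source order.
    public static int[] EvenNumbers(int count)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .AsOrdered()                    // ask PLINQ to preserve source order
            .Where(num => num % 2 == 0)     // keep only the even numbers
            .ToArray();                     // forces the query to execute
    }

    public static void Main()
    {
        int[] evens = EvenNumbers(1000000);
        Console.WriteLine(evens.Length);                        // 500000
        Console.WriteLine(string.Join(", ", evens.Take(5)));    // 2, 4, 6, 8, 10
    }
}
```

Preserving order costs some extra work, so AsOrdered() is worth adding only when the order of the results matters.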

Example

Let's see an example of how PLINQ can be used to calculate the sum of squares of numbers in parallel:

var numbers = Enumerable.Range(1, 100000);

// Cast to long before multiplying: the squares of numbers above 46,340 do not fit in an int.
var sumOfSquares = numbers.AsParallel().Select(number => (long)number * number).Sum();

In the above example, Enumerable.Range() creates a sequence of 100,000 numbers, and AsParallel() turns it into a parallel query. The Select() projection squares each number, and Sum() adds up the squares. The cast to long matters: with plain int arithmetic the larger squares would overflow, and the total itself is far beyond the range of an int.

Output

Printing sumOfSquares, for example with Console.WriteLine(sumOfSquares), gives:

  333338333350000
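This value can be cross-checked against the closed-form formula for the sum of the first n squares, n(n + 1)(2n + 1) / 6. The sketch below is only an illustration: the class and method names are hypothetical, and the squares are computed as long because they do not fit in an int for the larger inputs.

```csharp
using System;
using System.Linq;

public static class SumOfSquaresCheck
{
    // Sum of squares of 1..n computed with PLINQ, using 64-bit arithmetic.
    public static long ParallelSumOfSquares(int n)
    {
        return Enumerable.Range(1, n)
            .AsParallel()
            .Select(x => (long)x * x)
            .Sum();
    }

    // Closed-form value: n(n + 1)(2n + 1) / 6.
    public static long ClosedForm(long n)
    {
        return n * (n + 1) * (2 * n + 1) / 6;
    }

    public static void Main()
    {
        Console.WriteLine(ParallelSumOfSquares(100000));   // 333338333350000
        Console.WriteLine(ClosedForm(100000));             // 333338333350000
    }
}
```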

Explanation

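PLINQ relies on data partitioning: it splits the source sequence into segments and runs the query over each segment on a separate worker thread. By default, PLINQ uses up to as many worker threads as the machine has logical processors. How the source is split depends on its type. For example, an array or list can be divided into index ranges, while other sequences are handed out to the threads in chunks.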
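The number of worker threads can be capped with WithDegreeOfParallelism(). The following is a small sketch under assumptions: the class name, the method name, and the limit of two workers are chosen only for illustration.

```csharp
using System;
using System.Linq;

public static class DegreeOfParallelismDemo
{
    // Hypothetical helper: counts the even numbers in 1..count
    // using at most maxWorkers concurrent worker threads.
    public static int CountEvens(int count, int maxWorkers)
    {
        return Enumerable.Range(1, count)
            .AsParallel()
            .WithDegreeOfParallelism(maxWorkers)   // upper bound, not a guarantee
            .Count(n => n % 2 == 0);
    }

    public static void Main()
    {
        // Number of logical processors visible to this process.
        Console.WriteLine(Environment.ProcessorCount);

        Console.WriteLine(CountEvens(1000000, 2));   // 500000
    }
}
```

Capping the degree of parallelism can be useful when the machine has other work to do and the query should not occupy every core.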

In the above example, the input sequence of numbers is partitioned into multiple chunks, which are processed in parallel by different threads running on multiple processors. Each thread squares the numbers in its own chunk and accumulates a partial sum. Finally, the partial sums are combined to obtain the final output.
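The partition-then-combine pattern described above can be spelled out with the Aggregate() overload that takes a separate function for combining per-partition accumulators. This is a sketch meant to illustrate the idea, not a description of how Sum() is implemented internally; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

public static class PartialSumDemo
{
    public static long SumOfSquares(int n)
    {
        return Enumerable.Range(1, n)
            .AsParallel()
            .Aggregate(
                0L,                                      // starting value of each partition's accumulator
                (partial, x) => partial + (long)x * x,   // add one square to a partition's partial sum
                (left, right) => left + right,           // combine the partial sums of two partitions
                total => total);                         // final result, returned unchanged
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(100000));   // 333338333350000
    }
}
```

Because 0 is the identity for addition, the result is the same no matter how many partitions PLINQ creates.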

Use

PLINQ is useful when dealing with large datasets and computationally expensive queries. It can significantly improve query execution time by utilizing the available CPU cores for query processing.
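Whether PLINQ actually helps is best settled by measuring. The sketch below times the same CPU-bound query sequentially and in parallel. The prime-counting workload, the IsPrime helper, and the upper bound of 2,000,000 are assumptions chosen for illustration, and the timings will vary from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class PlinqTimingDemo
{
    // Hypothetical CPU-bound helper: primality test by trial division.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        if (n % 2 == 0) return n == 2;
        for (long i = 3; i * i <= n; i += 2)
        {
            if (n % i == 0) return false;
        }
        return true;
    }

    public static int CountPrimesSequential(int max)
    {
        return Enumerable.Range(1, max).Count(n => IsPrime(n));
    }

    public static int CountPrimesParallel(int max)
    {
        return Enumerable.Range(1, max).AsParallel().Count(n => IsPrime(n));
    }

    public static void Main()
    {
        const int max = 2000000;

        var sw = Stopwatch.StartNew();
        int sequential = CountPrimesSequential(max);
        Console.WriteLine("Sequential: " + sequential + " primes in " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        int parallel = CountPrimesParallel(max);
        Console.WriteLine("Parallel:   " + parallel + " primes in " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Both runs must report the same count; only the elapsed time should differ.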

Important Points

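  • PLINQ pays off mainly when the data set is large or the work done per element is expensive.
  • Overusing parallelism can degrade performance: for small inputs or cheap operations, the cost of partitioning the data and merging the results can outweigh the gain.
  • PLINQ needs a multi-core machine to deliver a speedup; on a single core it only adds overhead.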
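One simple way to avoid parallelizing work that is too small is to switch to PLINQ only above a size cutoff. The sketch below is an illustration only: the class name, the method name, and the cutoff of 10,000 elements are hypothetical, and a real cutoff should come from measuring the actual workload.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelThresholdDemo
{
    // Hypothetical cutoff; the right value depends on the workload and should be measured.
    private const int ParallelThreshold = 10000;

    public static long SumOfSquares(IReadOnlyList<int> values)
    {
        if (values.Count < ParallelThreshold)
        {
            // Small input: plain LINQ avoids the partitioning and merging overhead.
            return values.Select(v => (long)v * v).Sum();
        }

        // Large input: let PLINQ spread the work across the available cores.
        return values.AsParallel().Select(v => (long)v * v).Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(new[] { 1, 2, 3 }));                        // 14
        Console.WriteLine(SumOfSquares(Enumerable.Range(1, 100000).ToArray()));    // 333338333350000
    }
}
```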

Summary

PLINQ (Parallel Language-Integrated Query) is an extension of LINQ that allows queries to run in parallel across multiple processors. In this tutorial, we covered PLINQ's syntax, worked through an example and its output, and looked at how partitioning works, when PLINQ is useful, and what to watch out for. We saw how PLINQ can reduce query execution time by partitioning the data and processing the partitions in parallel on multiple logical processors.
