C#

Parallel.ForEach? & PLINQ?



When you need to optimize a program for multi-core machines, a great place to start is by asking if your program can be split up into parts that can execute in parallel. If your solution can be viewed as a compute-intensive operation performed on each element in a large data set in parallel, it is a prime candidate for taking advantage of new data-parallel programming capabilities in .NET Framework 4: Parallel.ForEach and Parallel LINQ (PLINQ). This document will familiarize you with Parallel.ForEach and PLINQ, discuss how to use these technologies, and explain the specific scenarios that lend themselves to each technology.


Parallel.ForEach


The Parallel class’s ForEach method is a multi-threaded implementation of a common loop construct in C#, the
foreach loop. Recall that a foreach loop allows you to iterate over an enumerable data set represented using
an IEnumerable<T>. Parallel.ForEach is similar to a foreach loop in that it iterates over an enumerable data
set, but unlike foreach, Parallel.ForEach uses multiple threads to evaluate different invocations of the loop
body. As it turns out, these characteristics make Parallel.ForEach a broadly useful mechanism for data-parallel
programming.

In order to evaluate a function over a sequence in parallel, an important thing to consider is how to break the
iteration space into smaller pieces that can be processed in parallel. This partitioning allows each thread to
evaluate the loop body over one partition.

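To make the idea of partitioning concrete, here is a minimal sketch that splits an index range into chunks explicitly and lets each thread process one chunk at a time. It uses Partitioner.Create from System.Collections.Concurrent, which is one way to partition and is not necessarily what Parallel.ForEach does internally for an arbitrary IEnumerable<T>. The class name RangePartitionSketch and the method SumOfSquares are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class RangePartitionSketch
{
    // Hypothetical example: sums the squares of all elements in data,
    // handing each worker thread a contiguous range of indices.
    public static long SumOfSquares(int[] data)
    {
        // Partitioner.Create(0, 0) would throw, so handle the empty case up front.
        if (data.Length == 0)
        {
            return 0;
        }

        long total = 0;

        // Splits [0, data.Length) into Tuple<int, int> ranges (from inclusive, to exclusive).
        var ranges = Partitioner.Create(0, data.Length);

        Parallel.ForEach(ranges, range =>
        {
            // Accumulate privately within the partition to avoid contention...
            long local = 0;
            for (int i = range.Item1; i < range.Item2; i++)
            {
                local += (long)data[i] * data[i];
            }

            // ...then publish the partial result once, atomically.
            Interlocked.Add(ref total, local);
        });

        return total;
    }
}
```

Calling RangePartitionSketch.SumOfSquares(new[] { 1, 2, 3 }) evaluates the loop body once per partition rather than once per element, which is the point of partitioning: the per-element work stays sequential inside a chunk, and only the chunks run in parallel.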
Parallel.ForEach has numerous overloads; the most commonly used has the following signature:

public static ParallelLoopResult ForEach<TSource>(
    IEnumerable<TSource> source,
    Action<TSource> body)

The IEnumerable<TSource> source specifies the sequence to iterate over, and the Action<TSource> body
specifies the delegate to invoke for each element. For the sake of simplicity, we won’t explain the details of the
other signatures of Parallel.ForEach.
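As a rough illustration of this overload, the sketch below runs a compute-heavy function over every element of a sequence and collects the results in a thread-safe dictionary. The names ParallelForEachSketch, CountDivisors, and ComputeAll are hypothetical, and CountDivisors merely stands in for whatever expensive per-element work your program does.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Hypothetical stand-in for a compute-intensive operation:
    // counts the positive divisors of n by brute force.
    public static int CountDivisors(int n)
    {
        int count = 0;
        for (int d = 1; d <= n; d++)
        {
            if (n % d == 0)
            {
                count++;
            }
        }
        return count;
    }

    // Invokes the body once per element of inputs, potentially on several threads.
    public static ConcurrentDictionary<int, int> ComputeAll(IEnumerable<int> inputs)
    {
        // The body runs concurrently, so shared state must be thread-safe.
        var results = new ConcurrentDictionary<int, int>();

        Parallel.ForEach(inputs, n =>
        {
            results[n] = CountDivisors(n);
        });

        return results;
    }
}
```

For example, ParallelForEachSketch.ComputeAll(new[] { 6, 7, 12 }) returns a dictionary mapping each input to its divisor count. Note that the loop body writes to a ConcurrentDictionary rather than a plain Dictionary, because different invocations of the body may run at the same time.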


PLINQ


Akin to Parallel.ForEach, PLINQ is also a programming model for executing data-parallel operations. The user defines a data-parallel operation by combining various predefined set-based operators such as projections, filters, aggregations, and so forth. Since LINQ is declarative, PLINQ is able to step in and handle parallelizing the query on the user’s behalf. Similarly to Parallel.ForEach, PLINQ achieves parallelism by partitioning the input sequence and processing different input elements on different threads.

While each of these tools deserves an article of its own, that level of detail is beyond the scope of this document. Rather, the document will focus on the interesting differences between the two approaches to data parallelism. Specifically, it will cover the scenarios in which it makes sense to use Parallel.ForEach instead of PLINQ, and vice versa.
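To show what combining predefined operators looks like in practice, here is a small PLINQ sketch built from a filter, a projection, and an aggregation. The names PlinqSketch, IsPrime, PrimesSquared, and SumOfPrimes are hypothetical; AsParallel, AsOrdered, Where, Select, and Sum are the standard PLINQ operators from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqSketch
{
    // Hypothetical per-element predicate: trial-division primality test.
    public static bool IsPrime(int n)
    {
        if (n < 2)
        {
            return false;
        }
        for (long i = 2; i * i <= n; i++)
        {
            if (n % i == 0)
            {
                return false;
            }
        }
        return true;
    }

    // Filter + projection. AsParallel opts the query into PLINQ;
    // AsOrdered asks PLINQ to preserve the source order in the output.
    public static int[] PrimesSquared(IEnumerable<int> source)
    {
        return source.AsParallel()
                     .AsOrdered()
                     .Where(n => IsPrime(n))
                     .Select(n => n * n)
                     .ToArray();
    }

    // Filter + aggregation. Order does not matter for a sum,
    // so AsOrdered is omitted here.
    public static long SumOfPrimes(IEnumerable<int> source)
    {
        return source.AsParallel()
                     .Where(n => IsPrime(n))
                     .Sum(n => (long)n);
    }
}
```

The query describes what to compute rather than how to loop, which is what lets PLINQ decide how to partition the input and schedule the work. A call such as PlinqSketch.PrimesSquared(Enumerable.Range(1, 10)) needs no explicit threads, locks, or shared collections in user code.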
