Introducing the Task Parallel Library in Microsoft Visual C# 2010

  • 4/15/2010

Implementing Multitasking in a Desktop Application

Multitasking is the ability to do more than one thing at the same time. It is one of those concepts that is easy to describe but that, until recently, has been difficult to implement.

In the optimal scenario, an application running on a multicore processor performs as many concurrent tasks as there are processor cores available, keeping each of the cores busy. However, there are many issues you have to consider to implement concurrency, including the following:

  • How can you divide an application into a set of concurrent operations?

  • How can you arrange for a set of operations to execute concurrently, on multiple processors?

  • How can you ensure that you attempt to perform only as many concurrent operations as there are processors available?

  • If an operation is blocked (such as while it is waiting for I/O to complete), how can you detect this and arrange for the processor to run a different operation rather than sit idle?

  • How can you determine when one or more concurrent operations have completed?

  • How can you synchronize access to shared data to ensure that two or more concurrent operations do not inadvertently corrupt each other’s data?

To an application developer, the first question is a matter of application design. The remaining questions depend on the programmatic infrastructure—Microsoft provides the Task Parallel Library (TPL) to help address these issues.

In Chapter 28, "Performing Parallel Data Access," you will see how some query-oriented problems have naturally parallel solutions, and how you can use the ParallelEnumerable type of PLINQ to parallelize query operations. However, sometimes you need a more imperative approach for more generalized situations. The TPL contains a series of types and operations that enable you to more explicitly specify how you want to divide an application into a set of parallel tasks.

Tasks, Threads, and the ThreadPool

The most important type in the TPL is the Task class. The Task class is an abstraction of a concurrent operation. You create a Task object to run a block of code. You can instantiate multiple Task objects and start them running in parallel if sufficient processors or processor cores are available.

Internally, the TPL implements tasks and schedules them for execution by using Thread objects and the ThreadPool class. Multithreading and thread pools have been available with the .NET Framework since version 1.0, and you can use the Thread class in the System.Threading namespace directly in your code. However, the TPL provides an additional degree of abstraction that enables you to easily distinguish between the degree of parallelization in an application (the tasks) and the units of parallelization (the threads). On a single-processor computer, these items are usually the same. However, on a computer with multiple processors or with a multicore processor, they are different. If you design a program based directly on threads, you will find that your application might not scale very well; the program will use the number of threads you explicitly create, and the operating system will schedule only that number of threads. This can lead to overloading and poor response time if the number of threads greatly exceeds the number of available processors, or to inefficiency and poor throughput if the number of threads is less than the number of processors.

The TPL optimizes the number of threads required to implement a set of concurrent tasks and schedules them efficiently according to the number of available processors. The TPL uses a set of threads provided by the .NET Framework, called the ThreadPool, and implements a queuing mechanism to distribute the workload across these threads. When a program creates a Task object, the task is added to a global queue. When a thread becomes available, the task is removed from the global queue and is executed by that thread. The ThreadPool implements a number of optimizations and uses a work-stealing algorithm to ensure that threads are scheduled efficiently.

You should note that the number of threads created by the .NET Framework to handle your tasks is not necessarily the same as the number of processors. Depending on the nature of the workload, one or more processors might be busy performing high-priority work for other applications and services. Consequently, the optimal number of threads for your application might be less than the number of processors in the machine. Alternatively, one or more threads in an application might be waiting for long-running memory access, I/O, or a network operation to complete, leaving the corresponding processors free. In this case, the optimal number of threads might be more than the number of available processors. The .NET Framework follows an iterative strategy, known as a hill-climbing algorithm, to dynamically determine the ideal number of threads for the current workload.

The important point is that all you have to do in your code is divide your application into tasks that can be run in parallel. The .NET Framework takes responsibility for creating the appropriate number of threads based on the processor architecture and workload of your computer, associating your tasks with these threads and arranging for them to be run efficiently. It does not matter if you divide your work into too many tasks because the .NET Framework will attempt to run only as many concurrent threads as is practical; in fact, you are encouraged to overpartition your work because this will help to ensure that your application scales if you move it onto a computer that has more processors available.
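
The following is a minimal sketch of this idea, not code from the GraphDemo application. It assumes a hypothetical class named OverpartitionSketch that starts many more tasks than a typical computer has processors and then counts how many distinct threads actually ran them. The exact number of threads depends on the computer and the workload, so the output varies from run to run.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class OverpartitionSketch
{
    // Starts taskCount small tasks and returns the number of distinct
    // threads that ended up running them.
    public static int CountThreadsUsed(int taskCount)
    {
        ConcurrentDictionary<int, bool> threadIds = new ConcurrentDictionary<int, bool>();
        Task[] tasks = new Task[taskCount];

        for (int i = 0; i < taskCount; i++)
        {
            tasks[i] = Task.Factory.StartNew(() =>
            {
                // Record which thread the scheduler chose for this task
                threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
                Thread.SpinWait(100000); // simulate a little processor-bound work
            });
        }

        Task.WaitAll(tasks);
        return threadIds.Count;
    }

    public static void Main()
    {
        int threadsUsed = CountThreadsUsed(200);
        Console.WriteLine("Processors available: {0}", Environment.ProcessorCount);
        Console.WriteLine("Tasks started: 200, distinct threads used: {0}", threadsUsed);
    }
}
```

On most computers the number of threads reported is far smaller than the number of tasks, which illustrates why dividing the work into many tasks does not by itself create many threads.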

Creating, Running, and Controlling Tasks

The Task object and the other types in the TPL reside in the System.Threading.Tasks namespace. You can create Task objects by using the Task constructor. The Task constructor is overloaded, but all versions expect you to provide an Action delegate as a parameter. Remember from Chapter 23 that an Action delegate references a method that does not return a value. A task object uses this delegate to run the method when it is scheduled. The following example creates a Task object that uses a delegate to run the method called doWork (you can also use an anonymous method or a lambda expression, as shown by the code in the comments):

Task task = new Task(new Action(doWork));
//Task task = new Task(delegate { this.doWork(); });
//Task task = new Task(() => { this.doWork(); });
...
private void doWork()
{
    // The task runs this code when it is started
    ...
}

The default Action type references a method that takes no parameters. Other overloads of the Task constructor take an Action<object> parameter representing a delegate that refers to a method that takes a single object parameter. These overloads enable you to pass data into the method run by the task. The following code shows an example:

Action<object> action;
action = doWorkWithObject;
object parameterData = ...;
Task task = new Task(action, parameterData);
...
private void doWorkWithObject(object o)
{
    ...
}
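
As a self-contained illustration of passing data through the Action<object> overload, the sketch below uses a hypothetical WorkItem class (not part of the TPL or the book's sample code) to carry an input value into the task and a result back out. It is one possible arrangement under those assumptions rather than a prescribed pattern.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskStateSketch
{
    // Hypothetical container for the data handed to the task
    public class WorkItem
    {
        public int Input;
        public int Result;
    }

    // Matches Action<object>: the task passes the state object to this method
    private static void SquareInput(object state)
    {
        WorkItem item = (WorkItem)state;
        item.Result = item.Input * item.Input;
    }

    public static int Square(int value)
    {
        WorkItem item = new WorkItem();
        item.Input = value;

        Action<object> action = SquareInput;
        Task task = new Task(action, item);
        task.Start();
        task.Wait(); // block until the task has written item.Result
        return item.Result;
    }

    public static void Main()
    {
        Console.WriteLine("7 squared is {0}", Square(7)); // prints "7 squared is 49"
    }
}
```

Because the parameter is typed as object, the method run by the task has to cast it back to the expected type before using it.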

After you create a Task object, you can set it running by using the Start method, like this:

Task task = new Task(...);
task.Start();

The Start method is also overloaded, and you can optionally specify a TaskScheduler object to control the degree of concurrency and other scheduling options. It is recommended that you use the default TaskScheduler object built into the .NET Framework, but you can define your own custom TaskScheduler class if you want to take more control over the way in which tasks are queued and scheduled. The details of how to do this are beyond the scope of this book, but if you require more information, look at the description of the TaskScheduler abstract class in the .NET Framework Class Library documentation provided with Visual Studio.

You can obtain a reference to the default TaskScheduler object by using the static Default property of the TaskScheduler class. The TaskScheduler class also provides the static Current property, which returns a reference to the TaskScheduler object currently used. (This TaskScheduler object is used if you do not explicitly specify a scheduler.) A task can provide hints to the default TaskScheduler about how to schedule and run the task if you specify a value from the TaskCreationOptions enumeration in the Task constructor. For more information about the TaskCreationOptions enumeration, consult the documentation describing the .NET Framework Class Library provided with Visual Studio.

When the method run by the task completes, the task finishes, and the thread used to run the task can be recycled to execute another task.

Normally, the scheduler arranges to perform tasks in parallel wherever possible, but you can also arrange for tasks to be scheduled serially by creating a continuation. You create a continuation by calling the ContinueWith method of a Task object. When the action performed by the Task object completes, the scheduler automatically creates a new Task object to run the action specified by the ContinueWith method. The method specified by the continuation expects a Task parameter, and the scheduler passes in a reference to the task that completed to the method. The value returned by ContinueWith is a reference to the new Task object. The following code example creates a Task object that runs the doWork method and specifies a continuation that runs the doMoreWork method in a new task when the first task completes:

Task task = new Task(doWork);
task.Start();
Task newTask = task.ContinueWith(doMoreWork);
...
private void doWork()
{
    // The task runs this code when it is started
    ...
}
...
private void doMoreWork(Task task)
{
    // The continuation runs this code when doWork completes
    ...
}

The ContinueWith method is heavily overloaded, and you can provide a number of parameters that specify additional items, such as the TaskScheduler to use and a TaskContinuationOptions value. The TaskContinuationOptions type is an enumeration that contains a superset of the values in the TaskCreationOptions enumeration. The additional values available include

  • NotOnCanceled and OnlyOnCanceled: The NotOnCanceled option specifies that the continuation should run only if the previous action completes and is not canceled, and the OnlyOnCanceled option specifies that the continuation should run only if the previous action is canceled. The section "Canceling Tasks and Handling Exceptions" later in this chapter describes how to cancel a task.

  • NotOnFaulted and OnlyOnFaulted: The NotOnFaulted option indicates that the continuation should run only if the previous action completes and does not throw an unhandled exception. The OnlyOnFaulted option causes the continuation to run only if the previous action throws an unhandled exception. The section "Canceling Tasks and Handling Exceptions" provides more information on how to manage exceptions in a task.

  • NotOnRanToCompletion and OnlyOnRanToCompletion: The NotOnRanToCompletion option specifies that the continuation should run only if the previous action does not complete successfully; it must either be canceled or throw an exception. OnlyOnRanToCompletion causes the continuation to run only if the previous action completes successfully.

The following code example shows how to add a continuation to a task that runs only if the initial action does not throw an unhandled exception:

Task task = new Task(doWork);
task.ContinueWith(doMoreWork, TaskContinuationOptions.NotOnFaulted);
task.Start();
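
To make the effect of these options concrete, here is a small runnable sketch. The class name ContinuationSketch and its Run method are hypothetical. It attaches one NotOnFaulted continuation and one OnlyOnFaulted continuation to the same task and reports the final status of each. A continuation whose condition is not met ends in the Canceled state, and waiting on a canceled task throws an AggregateException, which the sketch catches.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    // Runs a task that optionally fails, attaches two conditional continuations,
    // and returns their final states as "<NotOnFaulted>/<OnlyOnFaulted>".
    public static string Run(bool fail)
    {
        Task first = new Task(() =>
        {
            if (fail)
            {
                throw new InvalidOperationException("simulated failure");
            }
        });

        Task onSuccess = first.ContinueWith(
            t => Console.WriteLine("First task did not fault"),
            TaskContinuationOptions.NotOnFaulted);

        // Reading t.Exception here also marks the exception as observed
        Task onFault = first.ContinueWith(
            t => Console.WriteLine("First task faulted: {0}", t.Exception.InnerException.Message),
            TaskContinuationOptions.OnlyOnFaulted);

        first.Start();

        try
        {
            Task.WaitAll(onSuccess, onFault);
        }
        catch (AggregateException)
        {
            // Expected: the continuation that was not eligible to run is canceled
        }

        return onSuccess.Status + "/" + onFault.Status;
    }

    public static void Main()
    {
        Console.WriteLine(Run(false)); // RanToCompletion/Canceled
        Console.WriteLine(Run(true));  // Canceled/RanToCompletion
    }
}
```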

If you commonly use the same set of TaskCreationOptions values and the same TaskScheduler object, you can use a TaskFactory object to create and run a task in a single step. The constructor for the TaskFactory class enables you to specify the task scheduler, task creation options, and task continuation options that tasks constructed by this factory should use. The TaskFactory class provides the StartNew method to create and run a Task object. Like the Start method of the Task class, the StartNew method is overloaded, but every overload expects a reference to a method that the task should run.

The following code shows an example that creates and runs two tasks using the same task factory:

TaskScheduler scheduler = TaskScheduler.Current;
TaskFactory taskFactory = new TaskFactory(scheduler, TaskCreationOptions.None,
    TaskContinuationOptions.NotOnFaulted);
Task task = taskFactory.StartNew(doWork);
Task task2 = taskFactory.StartNew(doMoreWork);

Even if you do not currently specify any particular task creation options and you use the default task scheduler, you should still consider using a TaskFactory object; it ensures consistency, and you will have less code to modify to ensure that all tasks run in the same manner if you need to customize this process in the future. The Task class exposes the default TaskFactory used by the TPL through the static Factory property. You can use it like this:

Task task = Task.Factory.StartNew(doWork);

A common requirement of applications that invoke operations in parallel is to synchronize tasks. The Task class provides the Wait method, which implements a simple task coordination method. It enables you to suspend execution of the current thread until the specified task completes, like this:

task2.Wait(); // Wait at this point until task2 completes

You can wait for a set of tasks by using the static WaitAll and WaitAny methods of the Task class. Both methods take a params array containing a set of Task objects. The WaitAll method waits until all specified tasks have completed, and WaitAny waits until at least one of the specified tasks has finished. You use them like this:

Task.WaitAll(task, task2); // Wait for both task and task2 to complete
Task.WaitAny(task, task2); // Wait for either of task or task2 to complete
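
The sketch below shows these methods in a complete program. It assumes a hypothetical WaitSketch class and uses a ManualResetEventSlim object purely to hold one task back, so that the result of WaitAny is predictable. WaitAny returns the index of a completed task in the array of tasks passed to it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitSketch
{
    // Returns the index reported by WaitAny when one task is held back
    // and the other finishes immediately.
    public static int FirstFinished()
    {
        using (ManualResetEventSlim gate = new ManualResetEventSlim(false))
        {
            // The slow task blocks until the gate is opened. LongRunning hints
            // that it should not occupy an ordinary thread-pool thread.
            Task slow = Task.Factory.StartNew(() => gate.Wait(), TaskCreationOptions.LongRunning);
            Task quick = Task.Factory.StartNew(() => { });

            int index = Task.WaitAny(slow, quick); // only quick can have finished

            gate.Set();                 // release the slow task
            Task.WaitAll(slow, quick);  // now wait for both to complete
            return index;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Index of first task to finish: {0}", FirstFinished()); // prints 1
    }
}
```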

Using the Task Class to Implement Parallelism

In the next exercise, you will use the Task class to parallelize processor-intensive code in an application, and you will see how this parallelization reduces the time taken for the application to run by spreading the computations across multiple processor cores.

The application, called GraphDemo, comprises a WPF form that uses an Image control to display a graph. The application plots the points for the graph by performing a complex calculation.

Examine and run the GraphDemo single-threaded application

  1. Start Microsoft Visual Studio 2010 if it is not already running.

  2. Open the GraphDemo solution, located in the \Microsoft Press\Visual CSharp Step By Step\Chapter 27\GraphDemo folder in your Documents folder.

  3. In Solution Explorer, in the GraphDemo project, double-click the file GraphWindow.xaml to display the form in the Design View window.

    The form contains the following controls:

    • An Image control called graphImage. This image control displays the graph rendered by the application.

    • A Button control called plotButton. The user clicks this button to generate the data for the graph and display it in the graphImage control.

    • A Label control called duration. The application displays the time taken to generate and render the data for the graph in this label.

  4. In Solution Explorer, expand GraphWindow.xaml, and then double-click GraphWindow.xaml.cs to display the code for the form in the Code and Text Editor window.

    The form uses a System.Windows.Media.Imaging.WriteableBitmap object called graphBitmap to render the graph. The variables pixelWidth and pixelHeight specify the horizontal and vertical resolution, respectively, for the WriteableBitmap object; the variables dpiX and dpiY specify the horizontal and vertical density, respectively, of the image in dots per inch:

    public partial class GraphWindow : Window
    {
        private static long availableMemorySize = 0;
        private int pixelWidth = 0;
        private int pixelHeight = 0;
        private double dpiX = 96.0;
        private double dpiY = 96.0;
        private WriteableBitmap graphBitmap = null;
        ...
    }

  5. Examine the GraphWindow constructor. It looks like this:

    public GraphWindow()
    {
        InitializeComponent();
    
        PerformanceCounter memCounter = new PerformanceCounter("Memory", "Available Bytes");
        availableMemorySize = Convert.ToInt64(memCounter.NextValue());
    
        // The cast to int is applied before the division, so a very large
        // value can wrap around and produce a negative result
        this.pixelWidth = (int)availableMemorySize / 20000;
        if (this.pixelWidth < 0 || this.pixelWidth > 15000)
            this.pixelWidth = 15000;
        this.pixelHeight = (int)availableMemorySize / 40000;
        if (this.pixelHeight < 0 || this.pixelHeight > 7500)
            this.pixelHeight = 7500;
    }

    To avoid presenting you with code that exhausts the memory available on your computer and generates OutOfMemory exceptions, this application creates a PerformanceCounter object to query the amount of available physical memory on the computer. It then uses this information to determine appropriate values for the pixelWidth and pixelHeight variables. The more available memory you have on your computer, the bigger the values generated for pixelWidth and pixelHeight (subject to the limits of 15,000 and 7500 for each of these variables, respectively) and the more you will see the benefits of using the TPL as the exercises in this chapter proceed. However, if you find that the application still generates OutOfMemory exceptions, increase the divisors (20,000 and 40,000) used for generating the values of pixelWidth and pixelHeight.

    If you have a lot of memory, the values calculated for pixelWidth and pixelHeight might overflow. In this case, they will contain negative values and the application will fail with an exception later on. The code in the constructor checks this case and sets the pixelWidth and pixelHeight fields to a pair of useful values that enable the application to run correctly in this situation.

  6. Examine the code for the plotButton_Click method:

    private void plotButton_Click(object sender, RoutedEventArgs e)
    {
        if (graphBitmap == null)
        {
            graphBitmap = new WriteableBitmap(pixelWidth, pixelHeight, dpiX, dpiY,
                PixelFormats.Gray8, null);
        }
        int bytesPerPixel = (graphBitmap.Format.BitsPerPixel + 7) / 8;
        int stride = bytesPerPixel * graphBitmap.PixelWidth;
        int dataSize = stride * graphBitmap.PixelHeight;
        byte [] data = new byte[dataSize];
    
        Stopwatch watch = Stopwatch.StartNew();
        generateGraphData(data);
    
        duration.Content = string.Format("Duration (ms): {0}", watch.ElapsedMilliseconds);
        graphBitmap.WritePixels(
           new Int32Rect(0, 0, graphBitmap.PixelWidth, graphBitmap.PixelHeight),
           data, stride, 0);
        graphImage.Source = graphBitmap;
    }

    This method runs when the user clicks the plotButton button. The code instantiates the graphBitmap object if it has not already been created by the user clicking the plotButton button previously, and it specifies that each pixel represents a shade of gray, with 8 bits per pixel. This method uses the following variables and methods:

    • The bytesPerPixel variable calculates the number of bytes required to hold each pixel. (The WriteableBitmap type supports a range of pixel formats, with up to 128 bits per pixel for full-color images.)

    • The stride variable contains the number of bytes in each row of the bitmap; that is, the distance, in bytes, between vertically adjacent pixels in the WriteableBitmap object.

    • The dataSize variable calculates the number of bytes required to hold the data for the WriteableBitmap object. This variable is used to initialize the data array with the appropriate size.

    • The data byte array holds the data for the graph.

    • The watch variable is a System.Diagnostics.Stopwatch object. The Stopwatch type is useful for timing operations. The static StartNew method of the Stopwatch type creates a new instance of a Stopwatch object and starts it running. You can query the running time of a Stopwatch object by examining the ElapsedMilliseconds property.

    • The generateGraphData method populates the data array with the data for the graph to be displayed by the WriteableBitmap object. You will examine this method in the next step.

    • The WritePixels method of the WriteableBitmap class copies the data from a byte array to a bitmap for rendering. This method takes an Int32Rect parameter that specifies the area in the WriteableBitmap object to populate, the array containing the data to copy to the WriteableBitmap object, the stride (the number of bytes in each row of the data), and the offset into the array at which to start reading the data.

    • The Source property of an Image control specifies the data that the Image control should render. This example sets the Source property to the WriteableBitmap object.
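
The buffer arithmetic described in this list can be tried in isolation. The following sketch assumes a hypothetical BitmapBufferSketch class that repeats the same three calculations without any dependency on WPF. The 8 bits per pixel value corresponds to the Gray8 format used by the application; the 32 bits per pixel case is included only for comparison.

```csharp
using System;

public static class BitmapBufferSketch
{
    // Rounds the number of bits up to a whole number of bytes
    public static int BytesPerPixel(int bitsPerPixel)
    {
        return (bitsPerPixel + 7) / 8;
    }

    // Number of bytes in one row of pixels
    public static int Stride(int bitsPerPixel, int pixelWidth)
    {
        return BytesPerPixel(bitsPerPixel) * pixelWidth;
    }

    // Number of bytes needed for the whole image
    public static int DataSize(int bitsPerPixel, int pixelWidth, int pixelHeight)
    {
        return Stride(bitsPerPixel, pixelWidth) * pixelHeight;
    }

    public static void Main()
    {
        // 8 bits per pixel, 1000 x 500 pixels
        Console.WriteLine(BytesPerPixel(8));       // 1
        Console.WriteLine(Stride(8, 1000));        // 1000
        Console.WriteLine(DataSize(8, 1000, 500)); // 500000

        // 32 bits per pixel, same dimensions
        Console.WriteLine(DataSize(32, 1000, 500)); // 2000000
    }
}
```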

  7. Examine the code for the generateGraphData method:

    private void generateGraphData(byte[] data)
    {
        int a = pixelWidth / 2;
        int b = a * a;
        int c = pixelHeight / 2;
    
        for (int x = 0; x < a; x ++)
        {
            int s = x * x;
            double p = Math.Sqrt(b - s);
            for (double i = -p; i < p; i += 3)
            {
                double r = Math.Sqrt(s + i * i) / a;
                double q = (r - 1) * Math.Sin(24 * r);
                double y = i / 3 + (q * c);
                plotXY(data, (int)(-x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
                plotXY(data, (int)(x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
            }
        }
    }

    This method performs a series of calculations to plot the points for a rather complex graph. (The actual calculation is unimportant; it just generates a graph that looks attractive!) As it calculates each point, it calls the plotXY method to set the appropriate bytes in the data array that correspond to these points. The graph is symmetrical about the vertical (Y) axis, so the plotXY method is called twice for each calculation: once for the positive value of the X coordinate, and once for the negative value.

  8. Examine the plotXY method:

    private void plotXY(byte[] data, int x, int y)
    {
        data[x + y * pixelWidth] = 0xFF;
    }

    This is a simple method that sets the appropriate byte in the data array that corresponds to X and Y coordinates passed in as parameters. The value 0xFF indicates that the corresponding pixel should be set to white when the graph is rendered. Any pixels left unset are displayed as black.
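
The mapping from a pair of coordinates to a position in a one-dimensional array is worth checking with small numbers. The sketch below assumes a hypothetical PlotIndexSketch class. Its IndexOf method uses the same expression as plotXY, and its TryPlot method is an illustrative guarded variant that is not part of the GraphDemo application.

```csharp
using System;

public static class PlotIndexSketch
{
    // Same expression as plotXY: rows are stored one after another,
    // each row occupying pixelWidth bytes
    public static int IndexOf(int x, int y, int pixelWidth)
    {
        return x + y * pixelWidth;
    }

    // Hypothetical guarded variant: refuses coordinates outside the image
    public static bool TryPlot(byte[] data, int x, int y, int pixelWidth)
    {
        if (x < 0 || x >= pixelWidth || y < 0)
        {
            return false;
        }

        int index = IndexOf(x, y, pixelWidth);
        if (index >= data.Length)
        {
            return false;
        }

        data[index] = 0xFF;
        return true;
    }

    public static void Main()
    {
        byte[] data = new byte[10 * 4]; // 10 pixels wide, 4 pixels high

        Console.WriteLine(IndexOf(3, 2, 10));       // 23
        Console.WriteLine(TryPlot(data, 3, 2, 10)); // True
        Console.WriteLine(TryPlot(data, 3, 4, 10)); // False (row 4 is outside the image)
    }
}
```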

  9. On the Debug menu, click Start Without Debugging to build and run the application.

  10. When the Graph Demo window appears, click Plot Graph, and wait.

    Please be patient. The application takes several seconds to generate and display the graph. The following image shows the graph. Note the value in the Duration (ms) label in the following figure. In this case, the application took 4478 milliseconds (ms) to plot the graph.

  11. Click Plot Graph again, and take note of the time taken. Repeat this action several times to get an average value.

  12. On the desktop, right-click an empty area of the taskbar, and then in the pop-up menu click Start Task Manager.

  13. In the Windows Task Manager, click the Performance tab.

  14. Return to the Graph Demo window and then click Plot Graph.

  15. In the Windows Task Manager, note the maximum value for the CPU usage while the graph is being generated. Your results will vary, but on a dual-core processor the CPU utilization will probably be somewhere around 50–55 percent, as shown in the following image. On a quad-core machine, the CPU utilization will likely be below 30 percent.

    [Image: the Performance tab of Windows Task Manager, showing the CPU usage]

  16. Return to the Graph Demo window, and click Plot Graph again. Note the value for the CPU usage in the Windows Task Manager. Repeat this action several times to get an average value.

  17. Close the Graph Demo window, and minimize the Windows Task Manager.

You now have a baseline for the time the application takes to perform its calculations. However, it is clear from the CPU usage displayed by the Windows Task Manager that the application is not making full use of the processing resources available. On a dual-core machine, it is using just over half of the CPU power, and on a quad-core machine it is employing a little over a quarter of the CPU. This phenomenon occurs because the application is single-threaded, and in a Windows application, a single thread can occupy only a single core on a multicore processor. To spread the load over all the available cores, you need to divide the application into tasks and arrange for each task to be executed by a separate thread running on a different core.

Modify the GraphDemo application to use parallel threads

  1. Return to Visual Studio 2010, and display the GraphWindow.xaml.cs file in the Code and Text Editor window if it is not already open.

  2. Examine the generateGraphData method.

    If you think about it carefully, the purpose of this method is to populate the items in the data array. It iterates through the array by using the outer for loop based on the x loop control variable, highlighted in bold here:

    private void generateGraphData(byte[] data)
    {
        int a = pixelWidth / 2;
        int b = a * a;
        int c = pixelHeight / 2;
    
        for (int x = 0; x < a; x ++)
        {
            int s = x * x;
            double p = Math.Sqrt(b - s);
            for (double i = -p; i < p; i += 3)
            {
                double r = Math.Sqrt(s + i * i) / a;
                double q = (r - 1) * Math.Sin(24 * r);
                double y = i / 3 + (q * c);
                plotXY(data, (int)(-x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
                plotXY(data, (int)(x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
            }
        }
    }

    The calculation performed by one iteration of this loop is independent of the calculations performed by the other iterations. Therefore, it makes sense to partition the work performed by this loop and run different iterations on separate processors.

  3. Modify the definition of the generateGraphData method to take two additional int parameters called partitionStart and partitionEnd, as shown in bold here:

    private void generateGraphData(byte[] data, int partitionStart, int partitionEnd)
    {
        ...
    }

  4. In the generateGraphData method, change the outer for loop to iterate between the values of partitionStart and partitionEnd, as shown in bold here:

    private void generateGraphData(byte[] data, int partitionStart, int partitionEnd)
    {
        ...
    
        for (int x = partitionStart; x < partitionEnd; x ++)
        {
            ...
        }
    }

  5. In the Code and Text Editor window, add the following using statement to the list at the top of the GraphWindow.xaml.cs file:

    using System.Threading.Tasks;

  6. In the plotButton_Click method, comment out the statement that calls the generateGraphData method and add the statement shown next in bold that creates a Task object by using the default TaskFactory object and starts it running:

    ...
    Stopwatch watch = Stopwatch.StartNew();
    // generateGraphData(data);
    Task first = Task.Factory.StartNew(() => generateGraphData(data, 0, pixelWidth / 4));
    ...

    The task runs the code specified by the lambda expression. The values for the partitionStart and partitionEnd parameters indicate that the Task object calculates the data for the first half of the graph. (The data for the complete graph consists of points plotted for the values between 0 and pixelWidth / 2.)

  7. Add another statement that creates and runs a second Task object on another thread, as shown in bold here:

    ...
    
    Task first = Task.Factory.StartNew(() => generateGraphData(data, 0, pixelWidth / 4));
    Task second = Task.Factory.StartNew(() => generateGraphData(data, pixelWidth / 4,
        pixelWidth / 2));
    ...

    This Task object invokes the generateGraphData method and calculates the data for the values between pixelWidth / 4 and pixelWidth / 2.

  8. Add the following statement that waits for both Task objects to complete their work before continuing:

    Task.WaitAll(first, second);

  9. On the Debug menu, click Start Without Debugging to build and run the application.

  10. Display the Windows Task Manager, and click the Performance tab if it is not currently displayed.

  11. Return to the Graph Demo window, and click Plot Graph. In the Windows Task Manager, note the maximum value for the CPU usage while the graph is being generated. When the graph appears in the Graph Demo window, record the time taken to generate the graph. Repeat this action several times to get an average value.

  12. Close the Graph Demo window, and minimize the Windows Task Manager.

    This time you should see that the application runs significantly quicker than previously. On my computer, the time dropped to 2682 milliseconds—a reduction in time of about 40 percent. Additionally, you should see that the application uses more cores of the CPU. On a dual-core machine, the CPU usage peaked at 100 percent. If you have a quad-core computer, the CPU utilization will not be as high. This is because two of the cores will not be occupied. To rectify this and reduce the time further, add two further Task objects and divide the work into four chunks in the plotButton_Click method, as shown in bold here:

    ...
    Task first = Task.Factory.StartNew(() => generateGraphData(data, 0, pixelWidth / 8));
    Task second = Task.Factory.StartNew(() => generateGraphData(data, pixelWidth / 8,
        pixelWidth / 4));
    Task third = Task.Factory.StartNew(() => generateGraphData(data, pixelWidth / 4,
        pixelWidth * 3 / 8));
    Task fourth = Task.Factory.StartNew(() => generateGraphData(data, pixelWidth * 3 / 8,
        pixelWidth / 2));
    Task.WaitAll(first, second, third, fourth);
    ...

    If you have only a dual-core processor, you can still try this modification, and you should still notice a beneficial effect on the time. This is primarily because of efficiencies in the TPL and the algorithms in the .NET Framework optimizing the way in which the threads for each task are scheduled.
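
The hand-written partitions used in this exercise can be generalized to any number of chunks. The following sketch is one way to do so under stated assumptions: the PartitionSketch class is hypothetical, and its Fill method is a simple stand-in for generateGraphData that marks every element in the range it is given. Note that the bounds are copied into local variables inside the loop so that each lambda expression captures its own values.

```csharp
using System;
using System.Threading.Tasks;

public static class PartitionSketch
{
    // Stand-in for generateGraphData: marks every element in [start, end)
    private static void Fill(byte[] data, int start, int end)
    {
        for (int x = start; x < end; x++)
        {
            data[x] = 0xFF;
        }
    }

    // Splits the range [0, length) into contiguous partitions and
    // processes each partition in its own task.
    public static byte[] FillInParallel(int length, int partitions)
    {
        byte[] data = new byte[length];
        Task[] tasks = new Task[partitions];

        for (int p = 0; p < partitions; p++)
        {
            // The end of one partition is the start of the next, so the
            // partitions cover the range exactly once with no gaps
            int start = (int)((long)length * p / partitions);
            int end = (int)((long)length * (p + 1) / partitions);
            tasks[p] = Task.Factory.StartNew(() => Fill(data, start, end));
        }

        Task.WaitAll(tasks);
        return data;
    }

    public static void Main()
    {
        byte[] data = FillInParallel(1003, 4);
        int filled = 0;
        foreach (byte b in data)
        {
            if (b == 0xFF) filled++;
        }
        Console.WriteLine("Elements filled: {0} of {1}", filled, data.Length); // 1003 of 1003
    }
}
```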

Abstracting Tasks by Using the Parallel Class

By using the Task class, you have complete control over the number of tasks your application creates. However, you had to modify the design of the application to accommodate the use of Task objects. You also had to add code to synchronize operations; the application can render the graph only when all the tasks have completed. In a complex application, synchronization of tasks can become a nontrivial process and it is easy to make mistakes.

The Parallel class in the TPL enables you to parallelize some common programming constructs without requiring that you redesign an application. Internally, the Parallel class creates its own set of Task objects, and it synchronizes these tasks automatically when they have completed. The Parallel class is located in the System.Threading.Tasks namespace and provides a small set of static methods you can use to indicate that code should be run in parallel if possible. These methods are as follows:

  • Parallel.For: You can use this method in place of a C# for statement. It defines a loop in which iterations can run in parallel by using tasks. This method is heavily overloaded, but the general principle is the same for each overload: you specify a start value, an end value, and a reference to a method that takes an integer parameter. The method is executed for every value from the start value up to, but not including, the end value, and the parameter is populated with an integer that specifies the current value. For example, consider the following simple for loop that performs each iteration in sequence:

    for (int x = 0; x < 100; x++)
    {
        // Perform loop processing
    }

    Depending on the processing performed by the body of the loop, you might be able to replace this loop with a Parallel.For construct that can perform iterations in parallel, like this:

    Parallel.For(0, 100, performLoopProcessing);
    ...
    private void performLoopProcessing(int x)
    {
        // Perform loop processing
    }

    The overloads of the Parallel.For method enable you to provide local data that is private to each thread, specify various options for creating the tasks run by the For method, and create a ParallelLoopState object that can be used to pass state information to other concurrent iterations of the loop. (Using a ParallelLoopState object is described later in this chapter.)

  • Parallel.ForEach<T> You can use this method in place of a C# foreach statement. Like the For method, ForEach defines a loop in which iterations can run in parallel. You specify a collection that implements the IEnumerable<T> generic interface and a reference to a method that takes a single parameter of type T. The method is executed for each item in the collection, and the item is passed as the parameter to the method. Overloads are available that enable you to provide private local thread data and specify options for creating the tasks run by the ForEach method.

  • Parallel.Invoke You can use this method to execute a set of parameterless method calls as parallel tasks. You specify a list of delegated method calls (or lambda expressions) that take no parameters and do not return values. Each method call can be run on a separate thread, in any order. For example, the following code makes a series of method calls:

    doWork();
    doMoreWork();
    doYetMoreWork();

    You can replace these statements with the following code, which invokes these methods by using a series of tasks:

    Parallel.Invoke(
        doWork,
        doMoreWork,
        doYetMoreWork
    );
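The Parallel.For and Parallel.Invoke methods are illustrated above, so the following is a minimal sketch of how Parallel.ForEach might be used in the same way. The ForEachSketch class, the CountVowels method, and the word list are hypothetical names invented for this illustration. The results go into a ConcurrentDictionary because several threads may store values at the same time.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class ForEachSketch
{
    // Hypothetical per-item work: each call depends only on its own argument.
    public static int CountVowels(string word)
    {
        int count = 0;
        foreach (char ch in word.ToLowerInvariant())
        {
            if ("aeiou".IndexOf(ch) >= 0)
            {
                count++;
            }
        }
        return count;
    }

    static void Main()
    {
        List<string> words = new List<string> { "task", "parallel", "library", "thread" };

        // ConcurrentDictionary is safe to update from several threads at once.
        ConcurrentDictionary<string, int> vowels = new ConcurrentDictionary<string, int>();

        // The lambda runs once per item; the order of execution is not guaranteed.
        Parallel.ForEach(words, word => { vowels[word] = CountVowels(word); });

        // ForEach has returned, so every item has been processed.
        // Read the results back in the original order.
        foreach (string word in words)
        {
            Console.WriteLine("{0}: {1}", word, vowels[word]);
        }
    }
}
```

Because the output loop runs after Parallel.ForEach has returned, the lines are printed in list order even though the items may have been processed in a different order.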

You should bear in mind that the .NET Framework determines the actual degree of parallelism appropriate for the environment and workload of the computer. For example, if you use Parallel.For to implement a loop that performs 1000 iterations, the .NET Framework does not necessarily create 1000 concurrent tasks (unless you have an exceptionally powerful processor with 1000 cores). Instead, the .NET Framework creates what it considers to be the optimal number of tasks that balances the available resources against the requirement to keep the processors occupied. A single task might perform multiple iterations, and the tasks coordinate with each other to determine which iterations each task will perform. An important consequence of this is that you cannot guarantee the order in which the iterations are executed, so you must ensure there are no dependencies between iterations; otherwise, you might get unexpected results, as you will see later in this chapter.
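One way to observe this behavior is to record which thread runs each iteration. The following sketch is illustrative only; the DegreeOfParallelismSketch class and the CountThreadsUsed method are hypothetical names. The number reported depends on the computer and its current workload, so no particular output should be expected.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class DegreeOfParallelismSketch
{
    // Runs the loop and returns how many distinct threads executed iterations.
    public static int CountThreadsUsed(int iterations)
    {
        ConcurrentDictionary<int, bool> threadIds = new ConcurrentDictionary<int, bool>();

        Parallel.For(0, iterations, i =>
        {
            // Record the managed ID of whichever thread runs this iteration.
            // TryAdd does nothing if the ID has already been recorded.
            threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, true);
        });

        return threadIds.Count;
    }

    static void Main()
    {
        int iterations = 1000;
        int threads = CountThreadsUsed(iterations);
        Console.WriteLine("{0} iterations ran on {1} thread(s)", iterations, threads);
    }
}
```

On a typical desktop computer, the thread count is likely to be far smaller than the iteration count, which is consistent with each task performing many iterations.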

In the next exercise, you will return to the original version of the GraphData application and use the Parallel class to perform operations concurrently.

Use the Parallel class to parallelize operations in the GraphData application

  1. Using Visual Studio 2010, open the GraphDemo solution, located in the \Microsoft Press\Visual CSharp Step By Step\Chapter 27\GraphDemo Using the Parallel Class folder in your Documents folder.

    This is a copy of the original GraphDemo application. It does not use tasks yet.

  2. In Solution Explorer, in the GraphDemo project, expand the GraphWindow.xaml node, and then double-click GraphWindow.xaml.cs to display the code for the form in the Code and Text Editor window.

  3. Add the following using statement to the list at the top of the file:

    using System.Threading.Tasks;

  4. Locate the generateGraphData method. It looks like this:

    private void generateGraphData(byte[] data)
    {
        int a = pixelWidth / 2;
        int b = a * a;
        int c = pixelHeight / 2;
    
        for (int x = 0; x < a; x++)
        {
            int s = x * x;
            double p = Math.Sqrt(b - s);
            for (double i = -p; i < p; i += 3)
            {
                double r = Math.Sqrt(s + i * i) / a;
                double q = (r - 1) * Math.Sin(24 * r);
                double y = i / 3 + (q * c);
                plotXY(data, (int)(-x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
                plotXY(data, (int)(x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
            }
        }
    }

    The outer for loop that iterates through values of the integer variable x is a prime candidate for parallelization. You might also consider the inner loop based on the variable i, but this loop takes more effort to parallelize because of the type of i. (The methods in the Parallel class expect the control variable to be an integer.) Additionally, if you have nested loops such as those in this code, it is good practice to parallelize the outer loops first and then test to see whether the performance of the application is sufficient. If it is not, work your way through the nested loops and parallelize them from outer to inner, testing the performance after modifying each one. You will find that in many cases parallelizing outer loops has the most effect on performance, while the effects of modifying inner loops become more marginal.

  5. Move the code in the body of the for loop into a new private void method called calculateData. The calculateData method should take an integer parameter called x and a byte array called data. Also, move the statements that declare the local variables a, b, and c from the generateGraphData method to the start of the calculateData method. The following code shows the generateGraphData method with this code removed and the calculateData method (do not try to compile this code yet):

    private void generateGraphData(byte[] data)
    {
        for (int x = 0; x < a; x++)
        {
        }
    }
    
    private void calculateData(int x, byte[] data)
    {
        int a = pixelWidth / 2;
        int b = a * a;
        int c = pixelHeight / 2;
    
        int s = x * x;
        double p = Math.Sqrt(b - s);
        for (double i = -p; i < p; i += 3)
        {
            double r = Math.Sqrt(s + i * i) / a;
            double q = (r - 1) * Math.Sin(24 * r);
            double y = i / 3 + (q * c);
            plotXY(data, (int)(-x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
            plotXY(data, (int)(x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
        }
    }

  6. In the generateGraphData method, change the for loop to a statement that calls the static Parallel.For method, as shown in bold here:

    private void generateGraphData(byte[] data)
    {
        Parallel.For (0, pixelWidth / 2, (int x) => { calculateData(x, data); });
    }

    This code is the parallel equivalent of the original for loop. It iterates through the values from 0 to pixelWidth / 2 – 1 inclusive. Each invocation runs by using a task. (Each task might run more than one iteration.) The Parallel.For method finishes only when all the tasks it has created complete their work. Remember that the Parallel.For method expects the final parameter to be a method that takes a single integer parameter. It calls this method passing the current loop index as the parameter. In this example, the calculateData method does not match the required signature because it takes two parameters: an integer and a byte array. For this reason, the code uses a lambda expression to define an anonymous method that has the appropriate signature and that acts as an adapter that calls the calculateData method with the correct parameters.

  7. On the Debug menu, click Start Without Debugging to build and run the application.

  8. Display the Windows Task Manager, and click the Performance tab if it is not currently displayed.

  9. Return to the Graph Demo window, and click Plot Graph. In the Windows Task Manager, note the maximum value for the CPU usage while the graph is being generated. When the graph appears in the Graph Demo window, record the time taken to generate the graph. Repeat this action several times to get an average value.

  10. Close the Graph Demo window, and minimize the Windows Task Manager.

    You should notice that the application runs at a comparable speed to the previous version that used Task objects (and possibly slightly faster, depending on the number of CPUs you have available), and that the CPU usage peaks at 100 percent.
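If you want to reproduce this kind of comparison outside the GraphDemo application, the following console sketch shows one way it might be done. It is not part of the exercise: the TimingSketch class, the Work method, and the loop sizes are all hypothetical, and the timings it prints vary from computer to computer.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

class TimingSketch
{
    // Hypothetical compute-bound work for one loop index.
    public static double Work(int x)
    {
        double total = 0;
        for (int i = 1; i <= 20000; i++)
        {
            total += Math.Sqrt(x + i) * Math.Sin(i);
        }
        return total;
    }

    static void Main()
    {
        const int size = 2000;
        double[] sequential = new double[size];
        double[] parallel = new double[size];

        // Time the ordinary for loop.
        Stopwatch watch = Stopwatch.StartNew();
        for (int x = 0; x < size; x++)
        {
            sequential[x] = Work(x);
        }
        Console.WriteLine("Sequential (ms): {0}", watch.ElapsedMilliseconds);

        // Time the Parallel.For equivalent.
        watch = Stopwatch.StartNew();
        // Each iteration writes only to its own element, so iterations are independent.
        Parallel.For(0, size, x => { parallel[x] = Work(x); });
        Console.WriteLine("Parallel (ms): {0}", watch.ElapsedMilliseconds);
    }
}
```

As in the exercise, it is worth running the program several times and averaging the results, because a single measurement can be distorted by other activity on the computer.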

When Not to Use the Parallel Class

You should be aware that despite appearances and the best efforts of the Visual Studio development team at Microsoft, the Parallel class is not magic; you cannot use it without due consideration and just expect your applications to suddenly run significantly faster and produce the same results. The purpose of the Parallel class is to parallelize compute-bound, independent areas of your code.

The key phrases in the previous paragraph are compute-bound and independent. If your code is not compute-bound, parallelizing it might not improve performance. The next exercise shows you that you should be careful in how you determine when to use the Parallel.Invoke construct to perform method calls in parallel.

Determine when to use Parallel.Invoke

  1. Return to Visual Studio 2010, and display the GraphWindow.xaml.cs file in the Code and Text Editor window if it is not already open.

  2. Examine the calculateData method.

    The inner for loop contains the following statements:

    plotXY(data, (int)(-x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));
    plotXY(data, (int)(x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)));

    These two statements set the bytes in the data array that correspond to the points specified by the two parameters passed in. Remember that the points for the graph are reflected around the Y axis, so the plotXY method is called for the positive value of the X coordinate and also for the negative value. These two statements look like good candidates for parallelization because it does not matter which one runs first, and they set different bytes in the data array.

  3. Modify these two statements, and wrap them in a Parallel.Invoke method call, as shown next. Notice that both calls are now wrapped in lambda expressions. The semicolon at the end of the first call to plotXY is replaced with a comma, and the semicolon at the end of the second call to plotXY is removed, because these statements are now a list of parameters:

    Parallel.Invoke(
        () => plotXY(data, (int)(-x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2))),
        () => plotXY(data, (int)(x + (pixelWidth / 2)), (int)(y + (pixelHeight / 2)))
    );

  4. On the Debug menu, click Start Without Debugging to build and run the application.

  5. In the Graph Demo window, click Plot Graph. Record the time taken to generate the graph. Repeat this action several times to get an average value.

    You should find, possibly unexpectedly, that the application takes significantly longer to run. It might be up to 20 times slower than it was previously.

  6. Close the Graph Demo window.

The questions you are probably asking at this point are, “What went wrong? Why did the application slow down so much?” The answer lies in the plotXY method. If you take another look at this method, you will see that it is very simple:

private void plotXY(byte[] data, int x, int y)
{
    data[x + y * pixelWidth] = 0xFF;
}

There is very little in this method that takes any time to run, and it is definitely not a compute-bound piece of code. In fact, it is so simple that the overhead of creating a task, running this task on a separate thread, and waiting for the task to complete is much greater than the cost of running this method directly. The additional overhead might account for only a few milliseconds each time the method is called, but you should bear in mind the number of times that this method runs; the method call is located in a nested loop and is executed thousands of times, so all of these small overhead costs add up. The general rule is to use Parallel.Invoke only when it is worthwhile. Reserve Parallel.Invoke for operations that are computationally intensive.
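The effect can be seen in isolation with a small console program. The following is a rough sketch rather than a rigorous benchmark; the InvokeOverheadSketch class, the SetByte method, and the iteration count are hypothetical, and the timings depend on the computer.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

class InvokeOverheadSketch
{
    // Same shape as plotXY: a single array write, far too cheap to justify a task.
    public static void SetByte(byte[] data, int index)
    {
        data[index] = 0xFF;
    }

    static void Main()
    {
        const int count = 100000;
        byte[] data = new byte[2 * count];

        // Call the method directly, twice per iteration.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            SetByte(data, 2 * i);
            SetByte(data, 2 * i + 1);
        }
        Console.WriteLine("Direct calls (ms): {0}", watch.ElapsedMilliseconds);

        // Make the same two calls through Parallel.Invoke on every iteration.
        watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            int index = 2 * i;
            Parallel.Invoke(
                () => SetByte(data, index),
                () => SetByte(data, index + 1)
            );
        }
        Console.WriteLine("Parallel.Invoke (ms): {0}", watch.ElapsedMilliseconds);
    }
}
```

Both loops set exactly the same bytes, so any difference in the two timings comes from the cost of scheduling and waiting for the work rather than from the work itself.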

As mentioned earlier in this chapter, the other key consideration for using the Parallel class is that operations should be independent. For example, if you attempt to use Parallel.For to parallelize a loop in which iterations are not independent, the results will be unpredictable.

To see what I mean, look at the following program:

using System;
using System.Threading;
using System.Threading.Tasks;

namespace ParallelLoop
{
    class Program
    {
        private static int accumulator = 0;

        static void Main(string[] args)
        {
            for (int i = 0; i < 100; i++)
            {
                AddToAccumulator(i);
            }
            Console.WriteLine("Accumulator is {0}", accumulator);
        }

        private static void AddToAccumulator(int data)
        {
            if ((accumulator % 2) == 0)
            {
                accumulator += data;
            }
            else
            {
                accumulator -= data;
            }
        }
    }
}

This program iterates through the values from 0 to 99 and calls the AddToAccumulator method with each value in turn. The AddToAccumulator method examines the current value of the accumulator variable, and if it is even, it adds the value of the parameter to the accumulator variable; otherwise, it subtracts the value of the parameter. At the end of the program, the result is displayed. You can find this application in the ParallelLoop solution, located in the \Microsoft Press\Visual CSharp Step By Step\Chapter 27\ParallelLoop folder in your Documents folder. If you run this program, the value output should be –100.

To increase the degree of parallelism in this simple application, you might be tempted to replace the for loop in the Main method with Parallel.For, like this:

static void Main(string[] args)
{
    Parallel.For (0, 100, AddToAccumulator);
    Console.WriteLine("Accumulator is {0}", accumulator);
}

However, there is no guarantee that the tasks created to run the various invocations of the AddToAccumulator method will execute in any specific sequence. (The code is also not thread-safe because multiple threads running the tasks might attempt to modify the accumulator variable concurrently.) The value calculated by the AddToAccumulator method depends on the sequence being maintained, so the result of this modification is that the application might now generate different values each time it runs. In this simple case, you might not actually see any difference in the value calculated because the AddToAccumulator method runs very quickly and the .NET Framework might elect to run each invocation sequentially by using the same thread. However, if you make the following change shown in bold to the AddToAccumulator method, you will get different results:

private static void AddToAccumulator(int data)
{
    if ((accumulator % 2) == 0)
    {
        accumulator += data;
        Thread.Sleep(10); // wait for 10 milliseconds
    }
    else
    {
        accumulator -= data;
    }
}

The Thread.Sleep method simply causes the current thread to wait for the specified period of time. This modification simulates the thread performing additional processing, and it affects the way in which the .NET Framework schedules the tasks, which now run on different threads and therefore in a different sequence.

The general rule is to use Parallel.For and Parallel.ForEach only if you can guarantee that each iteration of the loop is independent, and test your code thoroughly. A similar consideration applies to Parallel.Invoke; use this construct to make method calls only if they are independent and the application does not depend on them being run in a particular sequence.
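For contrast, the following sketch shows a loop whose iterations are independent. Note that it performs a different calculation from the accumulator example: it simply adds up the values, which gives the same answer in any order. The even/odd rule in AddToAccumulator depends on the order of the calls and cannot be repaired this way. The IndependentSumSketch class and the SumRange method are hypothetical names. The sketch uses the overload of Parallel.For that provides local data private to each task, together with Interlocked.Add to merge the subtotals safely.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class IndependentSumSketch
{
    // Adds the integers from fromInclusive up to, but not including, toExclusive.
    public static int SumRange(int fromInclusive, int toExclusive)
    {
        int total = 0;

        Parallel.For<int>(fromInclusive, toExclusive,
            // Each task starts with its own private subtotal of zero.
            () => 0,
            // The body touches only its own subtotal, never shared data.
            (i, loopState, subtotal) => subtotal + i,
            // Each task merges its subtotal into the total once, atomically.
            subtotal => Interlocked.Add(ref total, subtotal));

        return total;
    }

    static void Main()
    {
        // 0 + 1 + ... + 99 = 4950, regardless of the order of the iterations.
        Console.WriteLine("Sum is {0}", SumRange(0, 100));
    }
}
```

The design choice here is to keep shared state out of the loop body entirely, so that the only synchronized operation is the final merge performed once per task.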

Returning a Value from a Task

So far, all the examples you have seen use a Task object to run code that performs a piece of work but does not return a value. However, you might also want to run a method that calculates a result. The TPL includes a generic variant of the Task class, Task<TResult>, that you can use for this purpose.

You create and run a Task<TResult> object in much the same way as a Task object. The main difference is that the method run by the Task<TResult> object returns a value, and you specify the type of this return value as the type parameter, TResult, of the Task<TResult> object. For example, the method calculateValue shown in the following code example returns an integer value. To invoke this method by using a task, you create a Task<int> object and then call the Start method. You obtain the value returned by the method by querying the Result property of the Task<int> object. If the task has not finished running the method and the result is not yet available, the Result property blocks the caller. This means that you don’t have to perform any synchronization yourself, and you know that when the Result property returns a value, the task has completed its work.

Task<int> calculateValueTask = new Task<int>(() => calculateValue(...));
calculateValueTask.Start(); // Invoke the calculateValue method
...
int calculatedData = calculateValueTask.Result; // Block until calculateValueTask completes
...
private int calculateValue(...)
{
    int someValue;
    // Perform calculation and populate someValue
    ...
    return someValue;
}

Of course, you can also use the StartNew method of a TaskFactory object to create a Task<TResult> object and start it running. The next code example shows how to use the default TaskFactory for a Task<int> object to create and run a task that invokes the calculateValue method:

Task<int> calculateValueTask = Task<int>.Factory.StartNew(() => calculateValue(...));
...

To simplify your code a little (and to support tasks that return anonymous types), the TaskFactory class provides generic overloads of the StartNew method and can infer the type returned by the method run by a task. Additionally, the Task<TResult> class inherits from the Task class. This means that you can rewrite the previous example like this:

Task<int> calculateValueTask = Task.Factory.StartNew(() => calculateValue(...));
...
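
The fragments above elide the details of calculateValue. The following is a complete console sketch of the same pattern, under the assumption that the calculation is a simple sum of squares; the TaskResultSketch class and the body of CalculateValue are hypothetical stand-ins chosen only so that the program can run.

```csharp
using System;
using System.Threading.Tasks;

class TaskResultSketch
{
    // Hypothetical stand-in for calculateValue: sums the squares from 1 to n.
    public static int CalculateValue(int n)
    {
        int someValue = 0;
        for (int i = 1; i <= n; i++)
        {
            someValue += i * i;
        }
        return someValue;
    }

    static void Main()
    {
        // StartNew infers the result type (int) from the lambda's return type.
        Task<int> calculateValueTask = Task.Factory.StartNew(() => CalculateValue(10));

        // Other work could run here while the task executes.

        // Result blocks until the task has finished, then yields the value.
        int calculatedData = calculateValueTask.Result;
        Console.WriteLine("Calculated value is {0}", calculatedData); // 385
    }
}
```

Declaring the variable as Task<int> rather than Task is what makes the Result property available without a cast.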

The next exercise gives a more detailed example. In this exercise, you will restructure the GraphDemo application to use a Task<TResult> object. Although this exercise seems a little academic, you might find the technique that it demonstrates useful in many real-world situations.

Modify the GraphDemo application to use a Task<TResult> object

  1. Using Visual Studio 2010, open the GraphDemo solution, located in the \Microsoft Press\Visual CSharp Step By Step\Chapter 27\GraphDemo Using Tasks that Return Results folder in your Documents folder.

    This is a copy of the GraphDemo application that creates a set of four tasks that you saw in an earlier exercise.

  2. In Solution Explorer, in the GraphDemo project, expand the GraphWindow.xaml node, and then double-click GraphWindow.xaml.cs to display the code for the form in the Code and Text Editor window.

  3. Locate the plotButton_Click method. This is the method that runs when the user clicks the Plot Graph button on the form. Currently, it creates a set of Task objects to perform the various calculations required and generate the data for the graph, and it waits for these Task objects to complete before displaying the results in the Image control on the form.

  4. Underneath the plotButton_Click method, add a new method called getDataForGraph. This method should take an integer parameter called dataSize and return a byte array, as shown in the following code:

    private byte[] getDataForGraph(int dataSize)
    {
    }

    You will add code to this method to generate the data for the graph in a byte array and return this array to the caller. The dataSize parameter specifies the size of the array.

  5. Move the statement that creates the data array from the plotButton_Click method to the getDataForGraph method as shown here in bold:

    private byte[] getDataForGraph(int dataSize)
    {
        byte[] data = new byte[dataSize];
    }

  6. Move the code that creates, runs, and waits for the Task objects that populate the data array from the plotButton_Click method to the getDataForGraph method, and add a return statement to the end of the method that passes the data array back to the caller. The completed code for the getDataForGraph method should look like this:

    private byte[] getDataForGraph(int dataSize)
    {
        byte[] data = new byte[dataSize];
        Task first = Task.Factory.StartNew(() => generateGraphData(data, 0, pixelWidth / 8));
        Task second = Task.Factory.StartNew(() => generateGraphData(data, pixelWidth / 8, pixelWidth / 4));
        Task third = Task.Factory.StartNew(() => generateGraphData(data, pixelWidth / 4, pixelWidth * 3 / 8));
        Task fourth = Task.Factory.StartNew(() => generateGraphData(data, pixelWidth * 3 / 8, pixelWidth / 2));
        Task.WaitAll(first, second, third, fourth);
        return data;
    }

  7. In the plotButton_Click method, after the statement that creates the Stopwatch variable used to time the tasks, add the statement shown next in bold that creates a Task<byte[]> object called getDataTask and uses this object to run the getDataForGraph method. This method returns a byte array, so the type of the task is Task<byte[]>. The StartNew method call references a lambda expression that invokes the getDataForGraph method and passes the dataSize variable as the parameter to this method.

    private void plotButton_Click(object sender, RoutedEventArgs e)
    {
        ...
        Stopwatch watch = Stopwatch.StartNew();
        Task<byte[]> getDataTask = Task<byte[]>.Factory.StartNew(() => getDataForGraph(dataSize));
        ...
    }

  8. After creating and starting the Task<byte[]> object, add the following statements shown in bold that examine the Result property to retrieve the data array returned by the getDataForGraph method into a local byte array variable called data. Remember that the Result property blocks the caller until the task has completed, so you do not need to explicitly wait for the task to finish.

    private void plotButton_Click(object sender, RoutedEventArgs e)
    {
        ...
        Task<byte[]> getDataTask = Task<byte[]>.Factory.StartNew(() => getDataForGraph(dataSize));
        byte[] data = getDataTask.Result;
        ...
    }

  9. Verify that the completed code for the plotButton_Click method looks like this:

    private void plotButton_Click(object sender, RoutedEventArgs e)
    {
        if (graphBitmap == null)
        {
            graphBitmap = new WriteableBitmap(pixelWidth, pixelHeight, dpiX, dpiY, PixelFormats.Gray8, null);
        }
        int bytesPerPixel = (graphBitmap.Format.BitsPerPixel + 7) / 8;
        int stride = bytesPerPixel * pixelWidth;
        int dataSize = stride * pixelHeight;

        Stopwatch watch = Stopwatch.StartNew();
        Task<byte[]> getDataTask = Task<byte[]>.Factory.StartNew(() => getDataForGraph(dataSize));
        byte[] data = getDataTask.Result;

        duration.Content = string.Format("Duration (ms): {0}", watch.ElapsedMilliseconds);
        graphBitmap.WritePixels(new Int32Rect(0, 0, pixelWidth, pixelHeight), data, stride, 0);
        graphImage.Source = graphBitmap;
        graphImage.Source = graphBitmap;
    }

  10. On the Debug menu, click Start Without Debugging to build and run the application.

  11. In the Graph Demo window, click Plot Graph. Verify that the graph is generated as before and that the time taken is similar to that seen previously. (The time reported might be marginally longer because the data array is now created by the task, whereas previously it was created before the task started running.)

  12. Close the Graph Demo window.
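
The structure used in this exercise, a Task<byte[]> whose method creates an array, fills it by using four further tasks, and returns it, can also be tried without the WPF application. The following console sketch assumes a trivial fill operation in place of generateGraphData; the ArrayTaskSketch class and the FillRange and GetData methods are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

class ArrayTaskSketch
{
    // Hypothetical stand-in for generateGraphData: fills one slice of the array.
    public static void FillRange(byte[] data, int start, int end)
    {
        for (int i = start; i < end; i++)
        {
            data[i] = 0xFF;
        }
    }

    // Same shape as getDataForGraph: create the array, fill it with four tasks, return it.
    public static byte[] GetData(int dataSize)
    {
        byte[] data = new byte[dataSize];
        int quarter = dataSize / 4;
        Task first = Task.Factory.StartNew(() => FillRange(data, 0, quarter));
        Task second = Task.Factory.StartNew(() => FillRange(data, quarter, quarter * 2));
        Task third = Task.Factory.StartNew(() => FillRange(data, quarter * 2, quarter * 3));
        Task fourth = Task.Factory.StartNew(() => FillRange(data, quarter * 3, dataSize));
        Task.WaitAll(first, second, third, fourth);
        return data;
    }

    static void Main()
    {
        Task<byte[]> getDataTask = Task<byte[]>.Factory.StartNew(() => GetData(1000));
        byte[] data = getDataTask.Result; // blocks until GetData has returned
        Console.WriteLine("Received {0} bytes", data.Length); // Received 1000 bytes
    }
}
```

Each of the four inner tasks writes to a separate slice of the array, so they are independent of one another, and the last slice runs to dataSize so that no elements are missed when the size is not a multiple of four.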