Introducing the Task Parallel Library in Microsoft Visual C# 2010

In this chapter from Microsoft Visual C# 2010 Step by Step, you will see how to improve concurrency in an application by using the Task Parallel Library.

After completing this chapter, you will be able to

  • Describe the benefits that implementing parallel operations in an application can bring.

  • Explain how the Task Parallel Library provides an optimal platform for implementing applications that can take advantage of multiple processor cores.

  • Use the Task class to create and run parallel operations in an application.

  • Use the Parallel class to parallelize some common programming constructs.

  • Use tasks with threads to improve responsiveness and throughput in graphical user interface (GUI) applications.

  • Cancel long-running tasks, and handle exceptions raised by parallel operations.

You have now seen how to use Microsoft Visual C# to build applications that provide a graphical user interface and that can manage data held in a database. These are common features of most modern systems. However, as technology has advanced, so have the requirements of users, and the applications that enable them to perform their day-to-day operations need to provide ever more sophisticated solutions. In the final part of this book, you will look at some of the advanced features introduced with the .NET Framework 4.0. In particular, in this chapter you will see how to improve concurrency in an application by using the Task Parallel Library. In the next chapter, you will see how the parallel extensions provided with the .NET Framework can be used in conjunction with Language Integrated Query (LINQ) to improve the throughput of data access operations. And in the final chapter, you will meet Windows Communication Foundation, which you can use to build distributed solutions that incorporate services running on multiple computers. As a bonus, the appendix (provided on the CD) describes how to use the Dynamic Language Runtime to build C# applications and components that can interoperate with services built by using other languages, such as Python and Ruby, that operate outside of the structure provided by the .NET Framework.

In the bulk of the preceding chapters in this book, you learned how to use C# to write programs that run in a single-threaded manner. By “single-threaded,” I mean that at any one point in time, a program is executing a single instruction. This might not always be the most efficient approach for an application to take. For example, you saw in Chapter 23, "Gathering User Input," that if your program is waiting for the user to click a button on a Windows Presentation Foundation (WPF) form, there might be other work that it can perform while it is waiting. However, if a single-threaded program has to perform a lengthy, processor-intensive calculation, it cannot respond to the user typing data into a form or clicking a menu item. To the user, the application appears to have frozen. Only when the calculation has completed does the user interface start responding again. Applications that can perform multiple tasks at the same time can make far better use of the resources available on a computer, can run more quickly, and can be more responsive. Additionally, some individual tasks might run more quickly if you can divide them into parallel paths of execution that run concurrently. In Chapter 23, you saw how WPF can take advantage of threads to improve responsiveness in a graphical user interface. In this chapter, you will learn how to use the Task Parallel Library to implement a more generic form of multitasking in your programs, one that applies to computationally intensive applications and not just those concerned with managing user interfaces.
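To give you an early feel for what this looks like in code, here is a minimal sketch, not the pattern used in the exercises later in this chapter. It starts a lengthy calculation on a separate task by using the Task class, leaving the main thread free to carry on with other work. The SumOfSquares method is a hypothetical stand-in for any processor-intensive operation:

    using System;
    using System.Threading.Tasks;

    class Program
    {
        static void Main()
        {
            // Run the lengthy calculation on a separate task so that
            // the main thread is free to do other work in the meantime.
            Task<long> calculation =
                Task.Factory.StartNew(() => SumOfSquares(1000000));

            Console.WriteLine("Calculation started; the main thread is still free...");

            // Reading Result blocks only if the task has not yet finished.
            Console.WriteLine("Result: {0}", calculation.Result);
        }

        // A hypothetical stand-in for a processor-intensive operation.
        static long SumOfSquares(int n)
        {
            long total = 0;
            for (int i = 1; i <= n; i++)
            {
                total += (long)i * i;
            }
            return total;
        }
    }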

Why Perform Multitasking by Using Parallel Processing?

As mentioned in the introduction, there are two principal reasons why you might want to perform multitasking in an application:

  • To improve responsiveness You can give the user of an application the impression that the program is performing more than one task at a time by dividing the program up into concurrent threads of execution and allowing each thread to run in turn for a short period of time. This is the conventional cooperative model that many experienced Windows developers are familiar with. However, it is not true multitasking because the processor is shared between threads, and the cooperative nature of this approach requires that the code executed by each thread behaves in an appropriate manner. If one thread dominates the CPU and resources available at the expense of other threads, the advantages of this approach are lost. It is sometimes difficult to write well-behaved applications that follow this model consistently.

  • To improve scalability You can improve scalability by making efficient use of the processing resources available and using these resources to reduce the time required to execute parts of an application. A developer can determine which parts of an application can be performed in parallel and arrange for them to be run concurrently. As more computing resources are added, more tasks can be run in parallel. Until recently, this model was suitable only for systems that either had multiple CPUs or were able to spread the processing across different computers networked together. In both cases, you had to use a model that arranged for coordination between parallel tasks. Microsoft provides a specialized version of Windows called Windows HPC (High Performance Computing) Server 2008, which enables an organization to build clusters of servers that can distribute and execute tasks in parallel. Developers can use the Microsoft implementation of the Message Passing Interface (MPI), a well-known language-independent communications protocol, to build applications based on parallel tasks that coordinate and cooperate with each other by sending messages. Solutions based on Windows HPC Server 2008 and MPI are ideal for large-scale, compute-bound engineering and scientific applications, but they are expensive for smaller-scale, desktop systems.

From these descriptions, you might be tempted to conclude that the most cost-effective way to build multitasking solutions for desktop applications is to use the cooperative multithreaded approach. However, the multithreaded approach was intended simply as a mechanism for providing responsiveness—to enable computers with a single processor to ensure that each task got a fair share of that processor. It is not well suited to multiprocessor machines because it was not designed to distribute the load across processors and, consequently, does not scale well. As long as desktop machines with multiple processors were expensive (and consequently relatively rare), this was not an issue. However, this situation is changing, as I will briefly explain.

The Rise of the Multicore Processor

Ten years ago, the cost of a decent personal computer was in the range of $500 to $1000. Today, a decent personal computer still costs about the same, even after ten years of price inflation. The specification of a typical PC these days is likely to include a processor running at a speed of between 2 GHz and 3 GHz, 500 GB of hard disk storage, 4 GB of RAM, high-speed and high-resolution graphics, and a rewritable DVD drive. Ten years ago, the processor speed for a typical machine was between 500 MHz and 1 GHz, 80 GB was a big hard drive, Windows ran quite happily with 256 MB or less of RAM, and rewritable CD drives cost well over $100. (Rewritable DVD drives were rare and extremely expensive.) This is the joy of technological progress: ever faster and more powerful hardware at cheaper and cheaper prices.

This is not a new trend. In 1965, Gordon E. Moore, co-founder of Intel, wrote a paper titled “Cramming more components onto integrated circuits,” which discussed how the increasing miniaturization of components enabled more transistors to be embedded on a silicon chip, and how falling production costs as the technology became more accessible meant that economics would dictate squeezing as many as 65,000 components onto a single chip by 1975. Moore’s observations led to the dictum frequently referred to as “Moore’s Law,” which basically states that the number of transistors that can be placed inexpensively on an integrated circuit will increase exponentially, doubling approximately every two years. (Actually, Gordon Moore was more optimistic than this initially, postulating that the number of transistors was likely to double every year, but he later modified his calculations.) The ability to pack transistors closer together enabled data to pass between them more quickly. This meant we could expect to see chip manufacturers produce faster and more powerful microprocessors at an almost unrelenting pace, enabling software developers to write ever more complicated software that would run more quickly.

Moore’s Law concerning the miniaturization of electronic components still holds, even after more than 40 years. However, physics has started to intervene. There comes a point at which it is not possible to transmit signals between transistors on a single chip any more quickly, no matter how small or densely packed they are. To a software developer, the most noticeable result of this limitation is that processors have stopped getting faster. Six years ago, a fast processor ran at 3 GHz. Today, a fast processor still runs at 3 GHz.

The limit to the speed at which processors can transmit data between components has caused chip companies to look at alternative mechanisms for increasing the amount of work a processor can do. The result is that most modern processors now have two or more processor cores. Effectively, chip manufacturers have put multiple processors on the same chip and added the necessary logic to enable them to communicate and coordinate with each other. Dual-core processors (two cores) and quad-core processors (four cores) are now common. Chips with 8, 16, 32, and 64 cores are available, and the price of these is expected to fall sharply in the near future. So, although processors have stopped speeding up, you can now expect to get more of them on a single chip.
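You can see how many logical processors your own computer exposes with a one-line check. This is just an illustrative sketch; broadly speaking, the Task Parallel Library takes this figure into account when it decides how to distribute work:

    using System;

    class CoreCount
    {
        static void Main()
        {
            // Environment.ProcessorCount reports the number of logical
            // processors that the operating system makes available.
            Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
        }
    }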

What does this mean to a developer writing C# applications?

In the days before multicore processors, a single-threaded application could be sped up simply by running it on a faster processor. With multicore processors, this is no longer the case. A single-threaded application will run at the same speed on a single-core, dual-core, or quad-core processor, provided they all have the same clock frequency. The difference is that on a dual-core processor, one of the processor cores will be sitting around idle, and on a quad-core processor, three of the cores will be simply ticking over waiting for work. To make the best use of multicore processors, you need to write your applications to take advantage of multitasking.
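As a taste of what is to come, the following sketch uses the Parallel class (described later in this chapter) to spread the iterations of a loop across the available cores. The arithmetic in the loop body is hypothetical filler work; the important point is that each iteration is independent of the others, which is what makes it safe to run them in parallel:

    using System;
    using System.Diagnostics;
    using System.Threading.Tasks;

    class ParallelLoopDemo
    {
        static void Main()
        {
            double[] results = new double[10000000];

            Stopwatch timer = Stopwatch.StartNew();

            // Parallel.For partitions the range of iteration values
            // across the available processor cores and runs the
            // partitions concurrently.
            Parallel.For(0, results.Length, i =>
            {
                results[i] = Math.Sqrt(i) * Math.Sin(i);
            });

            timer.Stop();
            Console.WriteLine("Parallel loop took {0} ms", timer.ElapsedMilliseconds);
        }
    }

On a quad-core machine, you would expect a loop such as this to complete in roughly a quarter of the time taken by its sequential equivalent, subject to the overhead of partitioning the work.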