Design Principles and Patterns for Software Engineering with Microsoft .NET

10/15/2008

This chapter from Microsoft .NET - Architecting Applications for the Enterprise offers a quick tutorial about software engineering. It first outlines some basic principles that should always inspire the design of a modern software system and then discusses principles of object-oriented design.

Experienced designers evidently know something inexperienced others don’t. What is it?
— Erich Gamma

In Chapter 1, we focused on the true meaning of architecture and the steps through which architects get a set of specifications for the development team. We focused more on the process than the principles and patterns of actual design. In Chapter 2, we filled a gap by serving up a refresher (or a primer, depending on the reader’s skills) of Unified Modeling Language (UML). UML is the most popular modeling language through which design is expressed and communicated within development teams.

When examining the bundle of requirements, the architect at first gets a relatively blurred picture of the system. As the team progresses through iterations, the contours of the picture sharpen. In the end, the interior of the system unveils a web of interrelated classes applying design patterns and fulfilling design principles.

Designing a software system is challenging because it requires you to focus on today’s requested features while ensuring that the resulting system is flexible enough to support changes and the addition of new features in the future.

Especially in the past two decades, a lot has been done in the Information Technology (IT) industry to make a systematic approach to software development possible. Methodologies, design principles, and finally patterns have been developed to help guide architects to envision and build systems of any complexity in a disciplined way.

This chapter aims to provide you with a quick tutorial about software engineering. It first outlines some basic principles that should always inspire the design of a modern software system. The chapter then moves on to discuss principles of object-oriented design. Along the way, we introduce patterns, idioms, and aspect-orientation, as well as pearls of wisdom regarding requirement-driven design that affect key areas such as testability, security, and performance.

Basic Design Principles

It is one thing to write code that just works. It is quite another to write good code that works. Adopting the attitude of “writing good code that works” springs from the ability to view the system from a broad perspective. In the end, a top-notch system is not just a product of writing instructions and hacks that make it all work. There’s much more, actually. And it relates, directly or indirectly, to design.

The attitude of “writing good code that works” leads you, for example, to value the maintainability of the code base over any other quality characteristics, such as those defined by International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) standard 9126. (See Chapter 1, “Architects and Architecture Today.”) You adopt this preference not so much because other aspects (such as extensibility or perhaps scalability) are less important than maintainability; it’s just that maintenance is expensive and can be highly frustrating for the developers involved.

A code base that can be easily searched for bugs, and in which fixing bugs is not problematic for anyone, is open to any sort of improvements at any time, including extensibility and scalability. Thus, maintainability is the quality characteristic you should give the highest priority when you design a system.

Why is software maintenance so expensive?

Maintenance becomes expensive if essentially you have produced unsatisfactory (should we say, sloppy?) software, you haven’t tested the software enough, or both. Which attributes make software easier to maintain and evolve? Structured design in the first place, which is best applied through proper coding techniques. Code readability is another fundamental asset, which is best achieved if the code is combined with a bunch of internal documentation and a change-tracking system—but this might occur only in a perfect world.

Before we proceed any further with the basic principles of structured design, let’s arrange a brief cheat-sheet to help us catch clear and unambiguous symptoms of bad code design.

NOTE

Unsatisfactory software mostly springs from a poor design. But what causes a poor design? A poor design typically has two causes that are not mutually exclusive: the architect’s insufficient skills, and imprecise or contradictory requirements. So what about the requirements problem, then? Contradictory requirements usually result from bad communication. Communication is king, and it is one of the most important skills for an architect to cultivate and improve.

Not surprisingly, fixing this communication problem drives us again straight to agile methodologies. What many people still miss about the agile movement is that the primary benefit you get is not so much the iterative method itself. Instead, the major benefit comes from the continuous communication that the methodology promotes within the team and between the team and the customers. Whatever you get wrong in the first iteration will be fixed quite soon in the next (or close to the next) iteration because the communication that is necessary to move forward will clarify misunderstood requirements and fix bad ones. And it will do so quite early in the process and on a timely basis. This iterative approach simply reduces the entry point for the major cause of costly software maintenance: poor communication. And this is the primary reason why, one day, a group of (perfectly sane) developers and architects decided to found the agile movement. It was pragmatism that motivated them, not caprice.

This said, you should also keep in mind that agile methodologies tend to increase development costs and run the risk of scope and requirements creep. You also must make sure everyone in the process is on board with it. If the stakeholders don’t understand their role or are not responsive, or can’t review the work between iterations, the agile approach fails. So the bottom line is that the agile approach isn’t a magic wand that works for everyone. But when it works, it usually works well.

For What the Alarm Bell Should Ring

Even with the best intentions of everyone involved and regardless of their efforts, the design of a system at some point can head down a slippery slope. The deterioration of a good design is generally a slow process that occurs over a relatively long period of time. It happens by continually studding your classes with hacks and workarounds, making a large share of the code harder and harder to maintain and evolve. At a certain point, you find yourself in serious trouble.

Managers might be tempted to call for a complete redesign, but redesigning an evolving system is like trying to catch a runaway chicken. You need to be in very good shape to do it. But is the team really in shape at that point?

Let’s identify a few general signs that would make the alarm bell ring to warn of a problematic design.

Rigid, Therefore Fragile

Can you bend a piece of wood? What do you risk if you insist on doing it? A piece of wood is typically a stiff and rigid object characterized by some resistance to deformation. When enough force is applied, the deformation becomes permanent and the wood breaks.

What about rigid software?

Rigid software is characterized by some resistance to changes. Resistance is measured in terms of regression. You make a change in one module, but the effects of your change cascade down the list of dependent modules. As a result, it’s really hard to predict how long making a change—any change, even the simplest—will actually take.

If you pummel glass or any other fragile material, you manage only to break it into several pieces. Likewise, when you enter a change in software and break it in various places, it becomes quite apparent that the software is fragile.

As in other areas of life, in the software world fragility and rigidity go hand in hand. When a change in a software module breaks (many) other modules because of (hidden) dependencies, you have a clear symptom of a bad design that needs to be remedied as soon as possible.

Easier to Use Than to Reuse

Imagine you have a piece of software that works in one project; you would like to reuse it in another project. However, copying the class or linking the assembly in the new project just doesn’t work.

Why is it so?

If the same code doesn’t work when moved to another project, it’s because of dependencies. The real problem isn’t just dependencies, but the number and depth of dependencies. The risk is that to reuse a piece of functionality in another project, you have to import a much larger set of functions. Ultimately, no reuse is ever attempted and code is rewritten from scratch.

This is not a good sign for your design. This negative aspect of a design is often referred to as immobility.

Easier to Work Around Than to Fix

When applying a change to a software module, it is not unusual that you figure out two or more ways to do it. Most of the time, one way of doing things is nifty, elegant, coherent with the design, but terribly laborious to implement. The other way is, conversely, much smoother, quick to code, but sort of a hack.

What should you do?

Actually, you can solve it either way, depending on the given deadlines and your manager’s direction about it.

In summary, it is not an ideal situation when a workaround is much easier and faster to apply than the right solution. And it doesn’t make a great statement about your overall design, either. It is a sign that too many unnecessary dependencies exist between classes and that your classes do not form a particularly cohesive mass of code.

This aspect of a design—that it invites or accommodates workarounds more or less than fixes—is often referred to as viscosity. High viscosity is bad, meaning that the software resists modification just as highly viscous fluids resist flow.

Structured Design

When the two of us started programming, which was long before we started making a living from it, the old BASIC language was still around with its set of GOTO statements. Like many others, we wrote toy programs jumping from one instruction to the next within the same monolithic block of code. They worked just fine, but they were only toy programs in the end.

It was about the late 1960s when the complexity of the average program crossed the significant threshold that marked the need for a more systematic approach to software development. That signaled the official beginning of software engineering.

From Spaghetti Code to Lasagna Code

Made of a messy tangle of jumps and returns, GOTO-based code was soon belittled and infamously labeled as spaghetti code. And we all learned the first of a long list of revolutionary concepts: structured programming. In particular, we learned to use subroutines to break our code into cohesive and more reusable pieces. In food terms, we evolved from spaghetti to lasagna. If you look at Figure 3-1, you will spot the difference quite soon. Lasagna forms a layered block of noodles and toppings that can be easily cut into pieces and just exudes the concept of structure. Lasagna is also easier to serve, which is the food analogy for reusability.

Figure 3-1. From a messy tangle to a layered and ordered block

NOTE

A small note (and some credits) about the figure is in order. First, as Italians we would have used the term lasagne, which is how we spell it, but we went for the international spelling of lasagna. However, we eat it regardless of the spelling. Second, Dino personally ate all the food in the figure in a sort of manual testing procedure for the book’s graphics. Dino, however, didn’t cook anything. Dino’s mother-in-law cooked the spaghetti; Dino’s mom cooked the lasagna. Great stuff—if you’re in Italy, and want to give it a try, send Dino an e-mail.

What software engineering really has been trying to convey since its inception is the need for some design to take place before coding begins and, subsequently, the need for some basic design principles. Still, today, when someone says “structured programming,” many people immediately think of subroutines. This association is correct, but it oversimplifies the matter and misses the principal point of the structured approach.

Behind structured programming, there is structured design with two core principles. And these principles are as valid today as they were 30 and more years ago. Subroutines and Pascal-like programming are gone; the principles of cohesion and coupling, instead, still maintain their effectiveness in an object-oriented world.

These two principles of structured design, cohesion and coupling, were first introduced by Larry Constantine and Edward Yourdon in their book Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design (Yourdon Press, 1976).

Cohesion

Cohesion indicates that a given software module—be it a subroutine, class, or library—features a set of responsibilities that are strongly related. Put another way, cohesion measures the distance between the logic expressed by the various methods on a class, the various functions in a library, and the various actions accomplished by a method.

If you look for a moment at the definition of cohesion in another field—chemistry—you should be able to see a clearer picture of software cohesion. In chemistry, cohesion is a physical property of a substance that indicates the attraction existing between like molecules within a body.

Cohesion measurement ranges from low to high and is preferably in the highest range possible.

Highly cohesive modules favor maintenance and reusability because they tend to have few dependencies. Low cohesion, on the other hand, makes it much harder to understand the purpose of a class and creates a natural habitat for rigidity and fragility in the software. Modules with low cohesion also propagate dependencies through other modules, thus contributing to the immobility and high viscosity of the design.

Decreasing cohesion leads to creating modules (for example, classes) where responsibilities (for example, methods) have very little in common and refer to distinct and unrelated activities. Translated in a practical guideline, the principle of cohesion recommends creating extremely specialized classes with few methods, which refer to logically related operations. If the logical distance between methods grows, you just create a new class.
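The guideline can be illustrated with a minimal sketch. All class and method names below are hypothetical and chosen only for illustration. The first class mixes three unrelated activities; the three classes that follow each group logically related operations.

```csharp
using System.Collections.Generic;
using System.Globalization;

// Low cohesion: pricing, formatting, and e-mail validation have little in
// common, yet they share one class. (All names here are hypothetical.)
public class OrderManager
{
    public decimal CalculateTotal(IEnumerable<decimal> prices)
    {
        decimal total = 0m;
        foreach (decimal price in prices) total += price;
        return total;
    }

    public string FormatReceipt(decimal total)
    {
        return "Total due: " + total.ToString("0.00", CultureInfo.InvariantCulture);
    }

    public bool IsValidEmail(string address)
    {
        return !string.IsNullOrEmpty(address) && address.IndexOf('@') > 0;
    }
}

// Higher cohesion: one specialized class per group of related operations.
public class OrderCalculator
{
    public decimal CalculateTotal(IEnumerable<decimal> prices)
    {
        decimal total = 0m;
        foreach (decimal price in prices) total += price;
        return total;
    }
}

public class ReceiptFormatter
{
    public string Format(decimal total)
    {
        return "Total due: " + total.ToString("0.00", CultureInfo.InvariantCulture);
    }
}

public class EmailValidator
{
    // Deliberately naive check, good enough for the illustration only.
    public bool IsValid(string address)
    {
        return !string.IsNullOrEmpty(address) && address.IndexOf('@') > 0;
    }
}
```

With the split in place, a change to the receipt layout touches only ReceiptFormatter, and each class can be reused without dragging the other two along.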

Ward Cunningham—a pioneer of Extreme Programming—offers a concise and pragmatic definition of cohesion in his wiki at http://c2.com/cgi/wiki?CouplingAndCohesion. He basically says that two modules, A and B, are cohesive when a change to A has no repercussion for B so that both modules can add new value to the system.

There’s another quote we’d like to use from Ward Cunningham’s wiki to reinforce a concept we expressed a moment ago about cohesion. Cunningham suggests that we define cohesion as inversely proportional to the number of responsibilities a module (for example, a class) has. We definitely like this definition.

IMPORTANT

Strongly related to cohesion is the Single Responsibility Principle (SRP). In the formulation provided by Robert Martin (which you can see at http://www.objectmentor.com/resources/articles/srp.pdf), SRP indicates that each class should always have just one reason to change. In other words, each class should be given a single responsibility, where a responsibility is defined as “a reason to change.” A class with multiple responsibilities has more reasons to change and, consequently, a less cohesive interface. A correct application of SRP entails breaking the methods of a class into logical subsets that configure distinct responsibilities. In the real world, however, this is much harder to do than the opposite, that is, aggregating distinct responsibilities in the same class.

Coupling

Coupling measures the level of dependency existing between two software modules, such as classes, functions, or libraries. An excellent description of coupling comes, again, from Cunningham’s wiki at http://c2.com/cgi/wiki?CouplingAndCohesion. Two modules, A and B, are said to be coupled when it turns out that you have to make changes to B every time you make any change to A.

In other words, B is not directly and logically involved in the change being made to module A. However, because of the underlying dependency, B is forced to change; otherwise, the code won’t compile any longer.

Coupling measurement ranges from low to high and the lowest possible range is preferable.

Low coupling doesn’t mean that your modules are to be completely isolated from one another. They are definitely allowed to communicate, but they should do that through a set of well-defined and stable interfaces. Each module should be able to work without intimate knowledge of another module’s internal implementation.

Conversely, high coupling hinders testing and reusing code and makes understanding it nontrivial. It is also one of the primary causes of a rigid and fragile design.
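As a rough sketch of communication through a well-defined interface, consider the following code. The INotifier interface and the classes around it are hypothetical names invented for this example. OrderProcessor knows only the interface, so any implementation can be swapped in without touching it.

```csharp
using System;
using System.Collections.Generic;

// Stable contract: the only thing OrderProcessor knows about notification.
// (Hypothetical interface, introduced for illustration.)
public interface INotifier
{
    void Notify(string recipient, string message);
}

// One possible implementation; its internals can change freely.
public class ConsoleNotifier : INotifier
{
    public void Notify(string recipient, string message)
    {
        Console.WriteLine("To " + recipient + ": " + message);
    }
}

// A second implementation that records messages, which is handy in tests.
public class RecordingNotifier : INotifier
{
    private readonly List<string> _messages = new List<string>();

    public IList<string> Messages
    {
        get { return _messages; }
    }

    public void Notify(string recipient, string message)
    {
        _messages.Add(recipient + ": " + message);
    }
}

// Depends on the INotifier contract, never on a concrete class.
public class OrderProcessor
{
    private readonly INotifier _notifier;

    public OrderProcessor(INotifier notifier)
    {
        if (notifier == null) throw new ArgumentNullException("notifier");
        _notifier = notifier;
    }

    public void Complete(string customer, int orderId)
    {
        _notifier.Notify(customer, "Order " + orderId + " completed");
    }
}
```

Passing a RecordingNotifier to the constructor lets you test OrderProcessor in isolation, which is exactly what a highly coupled design makes difficult.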

Low coupling and high cohesion are strongly correlated. A system designed to achieve low coupling and high cohesion generally meets the requirements of high readability, maintainability, easy testing, and good reuse.

Separation of Concerns

So you know you need to cook up two key ingredients in your system’s recipe. But is there a supermarket where you can get both? How do you achieve high cohesion and low coupling in the design of a software system?

A principle that is helpful for achieving high cohesion and low coupling is separation of concerns (SoC), introduced in 1974 by Edsger W. Dijkstra in his paper “On the Role of Scientific Thought.” If you’re interested, you can download the full paper from http://www.cs.utexas.edu/users/EWD/ewd04xx/EWD447.PDF.

Identifying the Concerns

SoC is all about breaking the system into distinct and possibly nonoverlapping features. Each feature you want in the system represents a concern and an aspect of the system. Terms such as feature, concern, and aspect are generally considered synonyms. Concerns are mapped to software modules and, to the extent that it is possible, there’s no duplication of functionalities.

SoC suggests that you focus on one particular concern at a time. It doesn’t mean, of course, that you ignore all other concerns of the system. More simply, after you’ve assigned a concern to a software module, you focus on building that module. From the perspective of that module, any other concerns are irrelevant.

NOTE

If you read Dijkstra’s original text, you’ll see that he uses the expression “Separation of Concerns” to indicate the general principle, but switches to the word “aspect” to indicate individual concerns that relate to a software system. For quite a few years, the word “aspect” didn’t mean anything special to software engineers. Things changed in the late 1990s when aspect-oriented programming (AOP) entered the industry. We’ll return to AOP later in this chapter, but we make the forward reference here to show Dijkstra’s great farsightedness.

Modularity

SoC is concretely achieved through using modular code and making heavy use of information hiding.

Modular programming encourages the use of separate modules for each significant feature. Modules are given their own public interface to communicate with other modules and can contain internal chunks of information for private use.

Only members in the public interface are visible to other modules. Internal data is either not exposed or it is encapsulated and exposed in a filtered manner. The implementation of the interface contains the behavior of the module, whose details are not known or accessible to other modules.
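A small sketch may help picture a module with a public interface and private internal data. The ShoppingCart class below is a hypothetical example, not code from any real library. The dictionary is never handed out; callers see it only in a filtered form, as a read-only list of keys.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical module used only to illustrate the idea.
public class ShoppingCart
{
    // Internal data for private use: not visible to other modules.
    private readonly Dictionary<string, int> _quantities =
        new Dictionary<string, int>();

    // Public interface: the only way other modules can change the cart.
    public void Add(string sku, int quantity)
    {
        if (sku == null) throw new ArgumentNullException("sku");
        if (quantity <= 0) throw new ArgumentOutOfRangeException("quantity");

        int current;
        _quantities.TryGetValue(sku, out current); // current is 0 if sku is new
        _quantities[sku] = current + quantity;
    }

    // Public interface: a value computed from the hidden data.
    public int TotalItems
    {
        get
        {
            int total = 0;
            foreach (int quantity in _quantities.Values) total += quantity;
            return total;
        }
    }

    // Filtered exposure: a read-only copy of the keys, not the dictionary.
    public IList<string> Skus
    {
        get { return new List<string>(_quantities.Keys).AsReadOnly(); }
    }
}
```

Because callers never receive the dictionary itself, the class could later switch to a different internal structure without any effect on them.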

Information Hiding

Information hiding (IH) is a general design principle that refers to hiding behind a stable interface some implementation details of a software module that are subject to change. In this way, connected modules continue to see the same fixed interface and are unaffected by changes.

A typical application of the information-hiding principle is the implementation of properties in C# or Microsoft Visual Basic .NET classes. (See the following code sample.) The property name represents the stable interface through which callers refer to an internal value. The class can obtain the value in various ways (for example, from a private field, a control property, a cache, the view state in ASP.NET) and can even change this implementation detail without breaking external code.

// Software module where information hiding is applied
public class Customer
{
    // Implementation detail being hidden
    private string _name;

    // Public and stable interface
    public string CustomerName
    {
        // Implementation detail being hidden
        get { return _name; }
    }
}
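To show how the implementation can change while the interface stays fixed, here is a hedged variation on the same class. It assumes, purely for illustration, that the name comes from a loader delegate supplied by the caller (standing in for a cache or a data store) and is fetched only on first access. The constructor and the delegate are our assumptions and do not appear in the original sample.

```csharp
using System;

// Variation on the Customer class: CustomerName keeps the same signature,
// but the value is now loaded lazily instead of read from a plain field.
public class Customer
{
    // Implementation details being hidden (both are assumptions).
    private readonly Func<string> _loadName;
    private string _cachedName;

    public Customer(Func<string> loadName)
    {
        if (loadName == null) throw new ArgumentNullException("loadName");
        _loadName = loadName;
    }

    // Public and stable interface: unchanged for callers.
    public string CustomerName
    {
        get
        {
            // Fetch the value once, then serve it from the private field.
            if (_cachedName == null)
            {
                _cachedName = _loadName();
            }
            return _cachedName;
        }
    }
}
```

Code that reads CustomerName compiles and behaves the same with either version; only the hidden part of the class differs.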

Information hiding is often referred to as encapsulation. We like to distinguish between the principle and its practical applications. In the realm of object-oriented programming, encapsulation is definitely an application of IH.

Generally, though, the principle of SoC manifests itself in different ways in different programming paradigms, and so it is for modularity and information hiding.

SoC and Programming Paradigms

The first programming paradigm that historically supported SoC was Procedural Programming (PP), which we find expressed in languages such as Pascal and C. In PP, you separate concerns using functions and procedures.

Next—with the advent of object-oriented programming (OOP) in languages such as Java, C++, and more recently C# and Visual Basic .NET—you separate concerns using classes.

However, the concept isn’t limited to programming languages. It also transcends the realm of pure programming and is central in many approaches to software architecture. In a service-oriented architecture (SOA), for example, you use services to represent concerns. Layered architectures are based on SoC, and within a middle tier you can use an Object/Relational Mapping tool (O/RM) to separate persistence from the domain model.

NOTE

In the preceding section, we basically went back over 40 years of computer science, and the entire sector of software engineering. We’ve seen how PP, OOP, and SOA are all direct or indirect emanations of the SoC principle. (Later in this chapter, we’ll see how AOP also fits this principle. In Chapter 7, we’ll see how fundamental design patterns for the presentation layer, such as Model-View-Controller and Model-View-Presenter, also adhere to the SoC principle.)

You really understand the meaning of the word principle if you look at how SoC influenced, and still influences, the development of software. And we owe this principle to a great man who passed away in 2002: Edsger W. Dijkstra. We mention this out of respect for this man.

For more information about Dijkstra’s contributions to the field, pay a visit to http://www.cs.utexas.edu/users/ewd.

Naming Conventions and Code Readability

When the implementation of a line-of-business application is expected to take several months to complete and the final application is expected to remain up and running for a few years, it is quite reasonable to expect that many different people will work on the project over time.

With such significant personnel turnover in sight, you must pay a lot of attention to system characteristics such as readability and maintainability. To ensure that the code base is manageable as well as easily shared and understood, a set of common programming rules and conventions should be used. Applied all the way through, common naming conventions, for example, make the whole code base look like it has been written by a single programmer rather than a very large group of people.

The most popular naming convention is Hungarian Notation (HN). You can read more about it at http://en.wikipedia.org/wiki/Hungarian_Notation. Not specifically bound to a programming language, HN became quite popular in the mid-1990s, as it was largely used in many Microsoft Windows applications, especially those written directly against the Windows Software Development Kit (SDK).

HN puts the accent on the type of the variable, and it prefixes the variable name with a mnemonic of the type. For example, szUserName would be used for a zero-terminated string that contains a user name, and iPageCount would be used for an integer that indicates the number of pages. Created to make each variable self-explanatory, HN lost most of its appeal with the advent of object-oriented languages.

In object-oriented languages, everything is an object, and putting the accent on the value, rather than the type, makes much more sense. So you choose variable names regardless of the type and look only at the value they are expected to contain. The choice of the variable name happens in a purely evocative way. Therefore, valid names are, for example, customer, customerID, and lowestPrice.

Finally, an argument against using HN is that a variable name should be changed every time the type of the variable changes during development. In practice, this is often difficult or overlooked, leading developers to make incorrect assumptions about the values contained within the variables. This often leads directly to bugs.
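The contrast between the two styles, and the stale-prefix problem, can be seen in a short sketch. The NamingExamples class and its methods are hypothetical and exist only to hold the variables being compared.

```csharp
// Hypothetical container for the naming comparison.
public static class NamingExamples
{
    // Hungarian Notation: the prefix encodes the type of the variable.
    // (The "sz" prefix is kept from the text; .NET strings are not
    // zero-terminated, so the prefix carries little meaning here.)
    public static string HungarianStyle()
    {
        string szUserName = "dino";
        int iPageCount = 12;
        return szUserName + " read " + iPageCount + " pages";
    }

    // Value-based naming: the name says what the variable contains.
    public static string ValueBasedStyle()
    {
        string userName = "dino";
        int pageCount = 12;
        return userName + " read " + pageCount + " pages";
    }

    // Stale prefix: the type was widened from int to long during
    // development, but the name still claims the variable is an int.
    public static long StalePrefix()
    {
        long iPageCount = 5000000000L;
        return iPageCount;
    }
}
```

Both styles produce the same result, but in StalePrefix the name now misleads: a reader who trusts the prefix would assume the value fits in an int.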

You can find detailed design guidelines for the .NET Framework classes and applications at http://msdn.microsoft.com/en-us/library/ms229042.aspx.
