Querying Information in a Microsoft SQL Server Database

  • 12/19/2007

Querying a Database by Using DLINQ

In Chapter 20, “Querying In-Memory Data by Using Query Expressions” Chapter 20, you saw how to use LINQ to examine the contents of enumerable collections held in memory. LINQ provides query expressions, which use SQL-like syntax for performing queries and generating a result set that you can then step through. It should come as no surprise that you can use an extended form of LINQ, called DLINQ, for querying and manipulating the contents of a database. DLINQ is built on top of ADO.NET. DLINQ provides a high level of abstraction, removing the need for you to worry about the details of constructing an ADO.NET Command object, iterating through a result set returned by a DataReader object, or fetching data column by column by using the various GetXXX methods.

Defining an Entity Class

You saw in Chapter 20 that using LINQ requires the objects that you are querying be enumerable; they must be collections that implement the IEnumerable interface. DLINQ can create its own enumerable collections of objects based on classes you define and that map directly to tables in a database. These classes are called entity classes. When you connect to a database and perform a query, DLINQ can retrieve the data identified by your query and create an instance of an entity class for each row fetched.

The best way to explain DLINQ is to see an example. The Products table in the Northwind database contains columns that contain information about the different aspects of the various products that Northwind Traders sells. The part of the instnwnd.sql script that you ran in the first exercise in this chapter contains a CREATE TABLE statement that looks similar to this (some of the columns, constraints, and other details have been omitted):

CREATE TABLE "Products" (
    "ProductID" "int" NOT NULL ,
    "ProductName" nvarchar (40) NOT NULL ,
    "SupplierID" "int" NULL ,
    "UnitPrice" "money" NULL,
    CONSTRAINT "PK_Products" PRIMARY KEY CLUSTERED ("ProductID"),
    CONSTRAINT "FK_Products_Suppliers" FOREIGN KEY ("SupplierID")
        REFERENCES "dbo"."Suppliers" ("SupplierID")
)

You can define an entity class that corresponds to the Products table like this:

[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true, CanBeNull = false)]
    public int ProductID { get; set; }
    [Column(CanBeNull = false)]
    public string ProductName { get; set; }
    [Column]
    public int? SupplierID { get; set; }
    [Column(DbType = "money")]
    public decimal? UnitPrice { get; set; }
}

The Product class contains a property for each of the columns in which you are interested in the Products table. You don’t have to specify every column from the underlying table, but any columns that you omit will not be retrieved when you execute a query based on this entity class. The important points to note are the Table and Column attributes.

The Table attribute identifies this class as an entity class. The Name parameter specifies the name of the corresponding table in the database. If you omit the Name parameter, DLINQ assumes that the entity class name is the same as the name of the corresponding table in the database.

The Column attribute describes how a column in the Products table maps to a property in the Product class. The Column attribute can take a number of parameters. The ones shown in this example and described in the following list are the most common:

  • The IsPrimaryKey parameter specifies that the property makes up part of the primary key. (If the table has a composite primary key spanning multiple columns, you should specify the IsPrimaryKey parameter for each corresponding property in the entity class.)

  • The DbType parameter specifies the type of the underlying column in the database. In many cases, DLINQ can detect and convert data in a column in the database to the type of the corresponding property in the entity class, but in some situations you need to specify the data type mapping yourself. For example, the UnitPrice column in the Products table uses the SQL Server money type. The entity class specifies the corresponding property as a decimal value.

  • The CanBeNull parameter indicates whether the column in the database can contain a null value. The default value for the CanBeNull parameter is true. Notice that the two properties in the Product class that correspond to columns that permit null values in the database (SupplierID and UnitPrice) are defined as nullable types in the entity class.

    Like the Table attribute, the Column attribute provides a Name parameter that you can use to specify the name of the underlying column in the database. If you omit this parameter, DLINQ assumes that the name of the column is the same as the name of the property in the entity class.

Creating and Running a DLINQ Query

Having defined an entity class, you can use it to fetch and display data from the Products table. The following code shows the basic steps for doing this:

DataContext db = new DataContext("Integrated Security=true;" +
    "Initial Catalog=Northwind;Data Source=YourComputer\\SQLExpress");
Table<Product> products = db.GetTable<Product>();
var productsQuery = from p in products
                    select p;
foreach (var product in productsQuery)
{
    Console.WriteLine("ID: {0}, Name: {1}, Supplier: {2}, Price: {3:C}",
                      product.ProductID, product.ProductName,
                      product.SupplierID, product.UnitPrice);
}

The DataContext class is responsible for managing the relationship between your entity classes and the tables in the database. You use it to establish a connection to the database and create collections of the entity classes. The DataContext constructor expects a connection string as a parameter, specifying the database that you want to use. This connection string is exactly the same as the connection string that you would use when connecting through an ADO.NET Connection object. (The DataContext class actually creates an ADO.NET connection behind the scenes.)

The generic GetTable<TEntity> method of the DataContext class expects an entity class as its TEntity type parameter. This method constructs an enumerable collection based on this type and returns the collection as a Table<TEntity> type. You can perform DLINQ queries over this collection. The query shown in this example simply retrieves every object from the Products table.

The foreach statement iterates through the results of this query and displays the details of each product. The following image shows the results of running this code. (The prices shown are per case, not per individual item.)

httpatomoreillycomsourcemspimages732999.png

The DataContext object controls the database connection automatically; it opens the connection immediately prior to fetching the first row of data in the foreach statement and then closes the connection after the last row has been retrieved.

The DLINQ query shown in the preceding example retrieves every column for every row in the Products table. In this case, you can actually iterate through the products collection directly, like this:

Table<Product> products = db.GetTable<Product>();

foreach (Product product in products)
{
      ...
}

When the foreach statement runs, the DataContext object constructs a SQL SELECT statement that simply retrieves all the data from the Products table. If you want to retrieve a single row in the Products table, you can call the Single method of the Products entity class. Single is an extension method that itself takes a method that identifies the row you want to find and returns this row as an instance of the entity class (as opposed to a collection of rows in a Table collection). You can specify the method parameter as a lambda expression. If the lambda expression does not identify exactly one row, the Single method returns an InvalidOperationException. The following code example queries the Northwind database for the product with the ProductID value of 27. The value returned is an instance of the Product class, and the Console.WriteLine statement prints the name of the product. As before, the database connection is opened and closed automatically by the DataContext object.

Product singleProduct = products.Single(p => p.ProductID == 27);
Console.WriteLine("Name: {0}", singleProduct.ProductName);

Deferred and Immediate Fetching

An important point to emphasize is that by default, DLINQ retrieves the data from the database only when you request it and not when you define a DLINQ query or create a Table collection. This is known as deferred fetching. In the example shown earlier that displays all of the products from the Products table, the productsQuery collection is populated only when the foreach loop runs. This mode of operation matches that of LINQ when querying in-memory objects; you will always see the most up-to-date version of the data, even if the data changes after you have run the statement that creates the productsQuery enumerable collection.

When the foreach loop starts, DLINQ creates and runs a SQL SELECT statement derived from the DLINQ query to create an ADO.NET DataReader object. Each iteration of the foreach loop performs the necessary GetXXX methods to fetch the data for that row. After the final row has been fetched and processed by the foreach loop, DLINQ closes the database connection.

Deferred fetching ensures that only the data an application actually uses is retrieved from the database. However, if you are accessing a database running on a remote instance of SQL Server, fetching data row by row does not make the best use of network bandwidth. In this scenario, you can fetch and cache all the data in a single network request by forcing immediate evaluation of the DLINQ query. You can do this by calling the ToList or ToArray extension methods, which fetch the data into a list or array when you define the DLINQ query, like this:

var productsQuery = from p in products.ToList()
                    select p;

In this code example, productsQuery is now an enumerable list, populated with information from the Products table. When you iterate over the data, DLINQ retrieves it from this list rather than sending fetch requests to the database.

Joining Tables and Creating Relationships

DLINQ supports the join query operator for combining and retrieving related data held in multiple tables. For example, the Products table in the Northwind database holds the ID of the supplier for each product. If you want to know the name of each supplier, you have to query the Suppliers table. The Suppliers table contains the CompanyName column, which specifies the name of the supplier company, and the ContactName column, which contains the name of the person in the supplier company that handles orders from Northwind Traders. You can define an entity class containing the relevant supplier information like this (the SupplierName column in the database is mandatory, but the ContactName allows null values):

[Table(Name = "Suppliers")]
public class Supplier
{
    [Column(IsPrimaryKey = true, CanBeNull = false)]
    public int SupplierID { get; set; }
    [Column(CanBeNull = false)]
    public string CompanyName { get; set; }
    [Column]
    public string ContactName { get; set; }
}

You can then instantiate Table<Product> and Table<Supplier> collections and define a DLINQ query to join these tables together, like this:

DataContext db = new DataContext(...);
Table<Product> products = db.GetTable<Product>();
Table<Supplier> suppliers = db.GetTable<Supplier>();
var productsAndSuppliers = from p in products
                           join s in suppliers
                           on p.SupplierID equals s.SupplierID
                           select new { p.ProductName, s.CompanyName, s.ContactName };

When you iterate through the productsAndSuppliers collection, DLINQ will execute a SQL SELECT statement that joins the Products and Suppliers tables in the database over the SupplierID column in both tables and fetches the data.

However, with DLINQ you can specify the relationships between tables as part of the definition of the entity classes. DLINQ can then fetch the supplier information for each product automatically without requiring that you code a potentially complex and errorprone join statement. Returning to the products and suppliers example, these tables have a many-to-one relationship in the Northwind database; each product is supplied by a single supplier, but a single supplier can supply several products. Phrasing this relationship slightly differently, a row in the Product table can reference a single row in the Suppliers table through the SupplierID columns in both tables, but a row in the Suppliers table can reference a whole set of rows in the Products table. DLINQ provides the EntityRef<TEntity> and EntitySet<TEntity> generic types to model this type of relationship. Taking the Product entity class first, you can define the “one” side of the relationship with the Supplier entity class by using the EntityRef<Supplier> type, as shown here in bold type:

[Table(Name = "Products")]
public class Product
{
    [Column(IsPrimaryKey = true, CanBeNull = false)]
    public int ProductID { get; set; }
    ...
    [Column]
    public int? SupplierID { get; set; }
    ...
    private EntityRef<Supplier> supplier;
    [Association(Storage = "supplier", ThisKey = "SupplierID", OtherKey = "SupplierID")]
    public Supplier Supplier
    {
        get { return this.supplier.Entity; }
        set { this.supplier.Entity = value; }
    }
}

The private supplier field is a reference to an instance of the Supplier entity class. The public Supplier property provides access to this reference. The Association attribute specifies how DLINQ locates and populates the data for this property. The Storage parameter identifies the private field used to store the reference to the Supplier object. The ThisKey parameter indicates which property in the Product entity class DLINQ should use to locate the Supplier to reference for this product, and the OtherKey parameter specifies which property in the Supplier table DLINQ should match against the value for the ThisKey parameter. In this example, The Product and Supplier tables are joined across the SupplierID property in both entities.

The get accessor in the Supplier property returns a reference to the Supplier entity by using the Entity property of the EntityRef<Supplier> type. The set accessor populates this property with a reference to a Supplier entity.

You can define the “many” side of the relationship in the Supplier class with the EntitySet<Product> type, like this:

[Table(Name = "Suppliers")]
public class Supplier
{
     [Column(IsPrimaryKey = true, CanBeNull = false)]
     public int SupplierID { get; set; }
     ...
     private EntitySet<Product> products = null;
     [Association(Storage = "products", OtherKey = "SupplierID", ThisKey = "SupplierID")]
     public EntitySet<Product> Products
     {
         get { return this.products; }
         set { this.products.Assign(value); }
     }
}

This time, notice that the Storage parameter of the Association attribute specifies the private EntitySet<Product> field. An EntitySet<TEntity> object holds a collection of references to entities. The get accessor of the public Products property returns this collection. The set accessor uses the Assign method of the EntitySet<Product> class to populate this collection.

So, by using the EntityRef<TEntity> and EntitySet<TEntity> types you can define properties that can model a one-to-many relationship, but how do you actually fill these properties with data? The answer is that DLINQ fills them for you when it fetches the data. The following code creates an instance of the Table<Product> class and issues a DLINQ query to fetch the details of all products. This code is similar to the first DLINQ example you saw earlier. The difference is in the foreach loop that displays the data.

DataContext db = new DataContext(...);
Table<Product> products = db.GetTable<Product>();
var productsAndSuppliers = from p in products
                           select p;
foreach (var product in productsAndSuppliers)
{
    Console.WriteLine("Product {0} supplied by {1}",
        product.ProductName, product.Supplier.CompanyName);
}

The Console.WriteLine statement reads the value in the ProductName property of the product entity as before, but it also accesses the Supplier entity and displays the CompanyName property from this entity. If you run this code, the output looks like this:

httpatomoreillycomsourcemspimages733001.png

As the code fetches each Product entity, DLINQ executes a second, deferred, query to retrieve the details of the supplier for that product so that it can populate the Supplier property, based on the relationship specified by the Association attribute of this property in the Product entity class.

When you have defined the Product and Supplier entities as having a one-to-many relationship, similar logic applies if you execute a DLINQ query over the Table<Supplier> collection, like this:

DataContext db = new DataContext(...);
Table<Supplier> suppliers = db.GetTable<Supplier>();
var suppliersAndProducts = from s in suppliers
                           select s;
foreach (var supplier in suppliersAndProducts)
{
    Console.WriteLine("Supplier name: {0}", supplier.CompanyName);
    Console.WriteLine("Products supplied");
    foreach (var product in supplier.Products)
    {
        Console.WriteLine("\t{0}", product.ProductName);
    }
    Console.WriteLine();
}

In this case, when the foreach loop fetches a supplier, it runs a second query (again deferred) to retrieve all the products for that supplier and populate the Products property. This time, however, the property is a collection (an EntitySet<Product>), so you can code a nested foreach statement to iterate through the set, displaying the name of each product. The output of this code looks like this:

httpatomoreillycomsourcemspimages733003.png

Deferred and Immediate Fetching Revisited

Earlier in this chapter, I mentioned that DLINQ defers fetching data until the data is actually requested but that you could apply the ToList or ToArray extension method to retrieve data immediately. This technique does not apply to data referenced as EntitySet<TEntity> or EntityRef<TEntity> properties; even if you use ToList or ToArray, the data will still be fetched only when accessed. If you want to force DLINQ to query and fetch referenced data immediately, you can set the LoadOptions property of the DataContext object as follows:

DataContext db = new DataContext(...);
Table<Supplier> suppliers = db.GetTable<Supplier>();
DataLoadOptions loadOptions = new DataLoadOptions();
loadOptions.LoadWith<Supplier>(s => s.Products);
db.LoadOptions = loadOptions;
var suppliersAndProducts = from s in suppliers
                           select s;

The DataLoadOptions class provides the generic LoadWith method. By using this method, you can specify whether an EntitySet<TEntity> property in an instance should be loaded when the instance is populated. The parameter to the LoadWith method is another method, which you can supply as a lambda expression. The example shown here causes the Products property of each Supplier entity to be populated as soon as the data for each Product entity is fetched rather than being deferred. If you specify the LoadOptions property of the DataContext object together with the ToList or ToArray extension method of a Table collection, DLINQ will load the entire collection as well as the data for the referenced properties for the entities in that collection into memory as soon as the DLINQ query is evaluated.

Defining a Custom DataContext Class

The DataContext class provides functionality for managing databases and database connections, creating entity classes, and executing commands to retrieve and update data in a database. Although you can use the raw DataContext class provided with the .NET Framework, it is better practice to use inheritance and define your own specialized version that declares the various Table<TEntity> collections as public members. For example, here is a specialized DataContext class that exposes the Products and Suppliers Table collections as public members:

public class Northwind : DataContext
{
    public Table<Product> Products;
    public Table<Supplier> Suppliers;
    public Northwind(string connectionInfo) : base(connectionInfo)
    {
    }
}

Notice that the Northwind class also provides a constructor that takes a connection string as a parameter. You can create a new instance of the Northwind class and then define and run DLINQ queries over the Table collection classes it exposes like this:

Northwind nwindDB = new Northwind(...);
var suppliersQuery = from s in nwindDB.Suppliers
                     select s;
foreach (var supplier in suppliersQuery)
{
    ...
}

This practice makes your code easier to maintain, especially if you are retrieving data from multiple databases. Using an ordinary DataContext object, you can instantiate any entity class by using the GetTable method, regardless of the database to which the DataContext object connects. You find out that you have used the wrong DataContext object and have connected to the wrong database only at run time, when you try to retrieve data. With a custom DataContext class, you reference the Table collections through the DataContext object. (The base DataContext constructor uses a mechanism called reflection to examine its members, and it automatically instantiates any members that are Table collections—the details of how reflection works are outside the scope of this book.) It is obvious to which database you need to connect to retrieve data for a specific table; if IntelliSense does not display your table when you define the DLINQ query, you have picked the wrong DataContext class, and your code will not compile.

Using DLINQ to Query Order Information

In the following exercise, you will write a version of the console application that you developed in the preceding exercise that prompts the user for a customer ID and displays the details of any orders placed by that customer. You will use DLINQ to retrieve the data. You will then be able to compare DLINQ with the equivalent code written by using ADO.NET.

Define the Order entity class

  1. Using Visual Studio 2008, create a new project called DLINQOrders by using the Console Application template. Save it in the \Microsoft Press\Visual CSharp Step By Step\Chapter 25 folder under your Documents folder, and then click OK.

  2. In Solution Explorer, change the name of the file Program.cs to DLINQReport.cs. In the Microsoft Visual Studio message, click Yes to change all references of the Program class to DLINQReport.

  3. On the Project menu, click Add Reference. In the Add Reference dialog box, click the .NET tab, select the System.Data.Linq assembly, and then click OK.

    This assembly holds the DLINQ types and attributes.

  4. In the Code and Text Editor window, add the following using statements to the list at the top of the file:

    using System.Data.Linq;
    using System.Data.Linq.Mapping;
    using System.Data.SqlClient;
  5. Add the Order entity class to the DLINQReport.cs file after the DLINQReport class, as follows:

    [Table(Name = "Orders")]
    public class Order
    {
    }

    The table is called Orders in the Northwind database. Remember that it is common practice to use the singular noun for the name of an entity class because an entity object represents one row from the database.

  6. Add the property shown here in bold type to the Order class:

    [Table(Name = "Orders")]
    public class Order
    {
        [Column(IsPrimaryKey = true, CanBeNull = false)]
        public int OrderID { get; set; }
    }

    The OrderID column is the primary key for this table in the Northwind database.

  7. Add the following properties shown in bold type to the Order class:

    [Table(Name = "Orders")]
    public class Order
    {
        ...
        [Column]
        public string CustomerID { get; set; }
    
        [Column]
        public DateTime? OrderDate { get; set; }
    
        [Column]
        public DateTime? ShippedDate { get; set; }
    
        [Column]
        public string ShipName { get; set; }
    
        [Column]
        public string ShipAddress { get; set; }
    
        [Column]
        public string ShipCity { get; set; }
    
        [Column]
        public string ShipCountry { get; set; }
    }

    These properties hold the customer ID, order date, and shipping information for an order. In the database, all of these columns allow null values, so it is important to use the nullable version of the DateTime type for the OrderDate and ShippedDate properties (string is a reference type that automatically allows null values). Notice that DLINQ automatically maps the SQL Server NVarChar type to the .NET Framework string type and the SQL Server DateTime type to the .NET Framework DateTime type.

  8. Add the following Northwind class to the DLINQReport.cs file after the Order entity class:

    public class Northwind : DataContext
    {
        public Table<Order> Orders;
        public Northwind(string connectionlnfo) : base (connectionlnfo)
        {
        }
    }

    The Northwind class is a DataContext class that exposes a Table property based on the Order entity class. In the next exercise, you will use this specialized version of the DataContext class to access the Orders table in the database.

Retrieve order information by using a DLINQ query

  1. In the Main method of the DLINQReport class, add the statement shown here in bold type, which creates a Northwind object. Be sure to replace YourComputer with the name of your computer:

    static void Main(string[] args)
    {
        Northwind northwindDB = new Northwind("Integrated Security=true;" +
            "Initial Catalog=Northwind;Data Source=YourComputer\\SQLExpress");
    }

    The connection string specified here is exactly the same as in the earlier exercise. The northwindDB object uses this string to connect to the Northwind database.

  2. After the variable declaration, add a try/catch block to the Main method:

    static void Main(string[] args)
    {
        ...
        try
        {
            // You will add your code here in a moment
        }
        catch(SqlException e)
        {
            Console.WriteLine("Error accessing the database: {0}", e.Message);
        }
    }

    As when using ordinary ADO.NET code, DLINQ raises a SqlException if an error occurs when accessing a SQL Server database.

  3. Replace the comment in the try block with the following code shown in bold type:

    try
    {
        Console.Write("Please enter a customer ID (5 characters): ");
        string customerId = Console.ReadLine();
    }

    These statements prompt the user for a customer ID and save the user’s response in the string variable customerId.

  4. Type the statement shown here in bold type after the code you just entered:

    try
    {
          ...
          var ordersQuery = from o in northwindDB.Orders
                            where String.Equals(o.CustomerID, customerId)
                            select o;
    }

    This statement defines the DLINQ query that will retrieve the orders for the specified customer.

  5. Add the foreach statement and ifelse block shown here in bold type after the code you added in the preceding step:

    try
    {
        ...
        foreach (var order in ordersQuery)
        {
            if (order.ShippedDate == null)
            {
                Console.WriteLine("Order {0} not yet shipped\n\n", order.OrderID);
             }
             else
             {
                 // Display the order details
             }
        }
    }

    The foreach statement iterates through the orders for the customer. If the value in the ShippedDate column in the database is null, the corresponding property in the Order entity object is also null, and then the if statement outputs a suitable message.

  6. Replace the comment in the else part of the if statement you added in the preceding step with the code shown here in bold type:

    if (order.ShippedDate == null)
    {
        ...
    }
    else
    {
        Console.WriteLine("Order: {0}\nPlaced: {1}\nShipped: {2}\n" +
                          "To Address: {3}\n{4}\n{5}\n{6}\n\n", order.OrderID,
                           order.OrderDate, order.ShippedDate, order.ShipName,
                           order.ShipAddress, order.ShipCity,
                           order.ShipCountry);
    }
  7. On the Debug menu, click Start Without Debugging to build and run the application.

  8. In the console window displaying the message “Please enter a customer ID (5 characters):”, type VINET.

    The application should display a list of orders for this customer. When the application has finished, press Enter to return to Visual Studio 2008.

  9. Run the application again. This time type BONAP when prompted for a customer ID.

    The final order for this customer has not yet shipped and contains a null value for the ShippedDate column. Verify that the application detects and handles this null value. When the application has finished, press Enter to return to Visual Studio 2008.

You have now seen the basic elements that DLINQ provides for querying information from a database. DLINQ has many more features that you can employ in your applications, including the ability to modify data and update a database. You will look briefly at some of these aspects of DLINQ in the next chapter.

  • If you want to continue to the next chapter

    Keep Visual Studio 2008 running, and turn to Chapter 26.

  • If you want to exit Visual Studio 2008 now

    On the File menu, click Exit. If you see a Save dialog box, click Yes (if you are using Visual Studio 2008) or Save (if you are using Visual C# 2008 Express Edition) and save the project.