Implement data access
- 10/11/2018
- Skill 4.1: Perform I/O operations
- Skill 4.2: Consume data
- Skill 4.3: Query and manipulate data and objects by using LINQ
- Skill 4.4: Serialize and deserialize data by using binary serialization, custom serialization, XML Serializer, JSON Serializer, and Data Contract Serializer
- Skill 4.5: Store data in and retrieve data from collections
- Thought experiments
- Thought experiment answers
- Chapter summary
Chapter summary
A stream is an object that represents a connection to a data source. A stream allows a program to read and write sequences of bytes and set the position of the next stream operation.
The Stream class is the abstract parent class that defines fundamental stream behaviors. A range of different child classes extend this base class to provide stream interaction with different data sources.
The FileStream class provides a stream interface to file storage.
A file contains a sequence of 8-bit values (bytes) that can be encoded into text using a particular character mapping. The Encoding class provides methods for different character mappings.
The TextWriter and TextReader classes are abstract classes that define operations that can be performed with text in files. The StreamWriter and StreamReader classes are implementations of these classes that can be used to work with text files in C#.
Stream classes have constructors that can accept other streams, allowing a program to create a “pipeline” of data processing behaviors that are ultimately connected to a storage device.
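The pipeline idea can be sketched as follows. This is a minimal illustration, not the chapter's own listing: the class and method names are hypothetical, and a MemoryStream stands in for the storage device so that the example needs no file on disk. Text passes through a StreamWriter (character encoding), then a GZipStream (compression), and finally reaches the underlying stream.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: StreamWriter -> GZipStream -> MemoryStream.
public static class StreamPipelineSketch
{
    // Encodes the text as UTF-8, compresses it, and returns the stored bytes.
    public static byte[] Compress(string text)
    {
        using (var storage = new MemoryStream())
        {
            using (var gzip = new GZipStream(storage, CompressionMode.Compress))
            using (var writer = new StreamWriter(gzip, Encoding.UTF8))
            {
                writer.Write(text);
            }   // disposing the writer flushes and closes every stage of the pipeline

            // ToArray can still be called after the MemoryStream has been closed.
            return storage.ToArray();
        }
    }

    // Reverses the pipeline: MemoryStream -> GZipStream -> StreamReader.
    public static string Decompress(byte[] data)
    {
        using (var storage = new MemoryStream(data))
        using (var gzip = new GZipStream(storage, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Replacing the MemoryStream with a FileStream would connect the same pipeline to a file.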
The File class is a “helper” class that contains static methods that can be used to write, read, append, open, copy, rename, and delete files.
It is important that a program using files deals with any exceptions that are thrown when the files are used. File operations are prone to throwing exceptions.
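A short sketch of the File helper methods combined with exception handling is given here. The class and method names are hypothetical, and returning null on failure is only one possible policy, chosen to keep the example small.

```csharp
using System;
using System.IO;

// Hypothetical helper that wraps two static File methods.
public static class FileHelperSketch
{
    // Appends a line to the file, creating the file if it does not exist.
    public static void AppendLine(string path, string line)
    {
        File.AppendAllText(path, line + Environment.NewLine);
    }

    // Returns the text in the file, or null if the file cannot be read.
    public static string TryReadText(string path)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (FileNotFoundException)
        {
            // The file does not exist.
            return null;
        }
        catch (IOException)
        {
            // Other I/O failures, such as a missing directory or a locked file.
            return null;
        }
    }
}
```

The more specific FileNotFoundException handler is listed first because it derives from IOException.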
The actual file storage on a computer is managed by a file system that interacts with a partition on a disk drive. The file system maintains information on files and directories which can be manipulated by C# programs.
The FileInfo class holds information about a particular file in a filesystem. It duplicates some functions provided by the File class but is useful if you are working through a large number of files. The File class is to be preferred when working with individual files.
The DirectoryInfo class holds information about a particular directory in a filesystem. This includes a list of FileInfo items describing the files held in that directory.
A path describes a file on a filesystem. Paths can be absolute, thus starting at the drive letter, or relative. The Path class provides a set of methods that can be used to work with path strings, including extracting elements of the path and concatenating path strings.
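The Path methods mentioned above can be illustrated with a small sketch. The helper name and the file names used with it are hypothetical; the example extracts the directory and file name from a path and then concatenates them into a new path with a different extension.

```csharp
using System.IO;

// Hypothetical helper built on the static Path methods.
public static class PathSketch
{
    // Returns a path in the same directory with the same file name
    // but a different extension, for example "report.txt" -> "report.bak".
    public static string SiblingWithExtension(string path, string newExtension)
    {
        // GetDirectoryName returns null for a root path, so fall back to an empty string.
        string directory = Path.GetDirectoryName(path) ?? string.Empty;
        string name = Path.GetFileNameWithoutExtension(path);

        // Combine inserts the directory separator used by the current platform.
        return Path.Combine(directory, name + newExtension);
    }
}
```

Using Path.Combine rather than string concatenation avoids hard-coding a separator character.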
A C# program can use the HttpWebRequest, WebClient, and HttpClient classes to communicate with an HTTP server via the Internet. HttpWebRequest provides the most flexibility when assembling HTTP messages. WebClient is simpler to use and can be used with async and await to perform requests asynchronously. HttpClient only supports asynchronous use and must be used when writing Universal Windows Applications.
Programs can (and should) perform file operations asynchronously. The FileStream class provides asynchronous methods. To catch exceptions thrown by asynchronous file operations, make sure that the awaited methods return Task or Task<T> rather than void, because an exception thrown inside an async void method cannot be caught by the caller.
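A minimal sketch of asynchronous file access is shown here. The class and method names are hypothetical, and the buffer size of 4096 is an arbitrary choice. Both methods return a Task, so a caller that awaits them can catch any exception they throw.

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper that uses the asynchronous FileStream methods.
public static class AsyncFileSketch
{
    // Returns Task (not void) so that the caller can await it and catch exceptions.
    public static async Task WriteTextAsync(string path, string text)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write,
                                           FileShare.None, 4096, useAsync: true))
        {
            await stream.WriteAsync(data, 0, data.Length);
        }
    }

    // Returns Task<string>; a missing file surfaces as a FileNotFoundException
    // at the point where the caller awaits the task.
    public static async Task<string> ReadTextAsync(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```

A caller would typically write `string text = await AsyncFileSketch.ReadTextAsync(path);` inside a try block in its own async method.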
A database provides data storage for applications in the form of tables. A row in a table equates to a class instance. Each row can have a unique ID (called a primary key) which allows other objects to refer to that row.
Programs interact with a database server by creating an instance of a connection object. The connection string is used to configure this connection, to identify the location of the server and to provide authentication details. ASP.NET applications can be configured with different environments for development and production, including the contents of the connection string.
A database responds to commands expressed in Structured Query Language (SQL). SQL is plain text that contains commands and data elements. Care must be taken when incorporating user entered data elements in SQL queries because a malicious user can inject additional SQL commands into the data.
When creating ASP.NET applications, the SQL commands to update the database are performed by methods that act on objects in the application.
A program can download data from a web server in the form of a JSON or XML document that describes the elements in an object.
A program can download data from a web server in the form of an XML document. XML documents can be parsed element by element or used to create a Document Object Model (DOM) instance, which provides programmatic access to the elements in the data.
A web service takes the form of a server and a client. The server exposes a description of the service in the form of method calls that are implemented by a proxy object created by the client. The method calls in the client proxy object are translated into requests sent to the server. The server performs the requested action and then sends the response back to the client, which receives the response in the form of the result from the method call.
LINQ allows programmers to express SQL-like queries using “query comprehension syntax.”
LINQ queries can be performed against C# collections, database connections, and XML documents.
A LINQ query generates an iteration. The execution of the query is deferred until the iteration is enumerated, although it is possible to force the execution of a query by requesting the query to generate a List, array, or dictionary as a result.
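Deferred execution can be seen in a small sketch. The names here are hypothetical. The query is defined once, a snapshot is forced with ToList, and then the source collection is changed before the query is enumerated again.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demonstration of deferred versus forced execution.
public static class DeferredExecutionSketch
{
    // Returns { count seen by the deferred query, count held by the snapshot }.
    public static int[] Run()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Nothing is evaluated here; the query only describes the work to be done.
        IEnumerable<int> deferred = from n in numbers
                                    where n > 1
                                    select n;

        // ToList enumerates the query immediately and stores the results (2 and 3).
        List<int> snapshot = deferred.ToList();

        // This change is visible to the deferred query but not to the snapshot.
        numbers.Add(4);

        // Count enumerates the deferred query now, so it sees 2, 3 and 4.
        return new[] { deferred.Count(), snapshot.Count };   // { 3, 2 }
    }
}
```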
LINQ queries are compiled into C# method calls. A programmer can express a query as methods if required.
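The two forms can be compared side by side. In this sketch (hypothetical names, sample data supplied by the caller) the first method uses query comprehension syntax and the second expresses the same query directly as method calls.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical pair of equivalent queries.
public static class QueryFormsSketch
{
    // Query comprehension syntax.
    public static IEnumerable<string> QuerySyntax(IEnumerable<string> names)
    {
        return from name in names
               where name.Length > 3
               orderby name
               select name.ToUpperInvariant();
    }

    // The same query written as extension method calls with lambda expressions.
    public static IEnumerable<string> MethodSyntax(IEnumerable<string> names)
    {
        return names.Where(name => name.Length > 3)
                    .OrderBy(name => name)
                    .Select(name => name.ToUpperInvariant());
    }
}
```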
A LINQ query generates an iteration as a result. This may be an iteration of data objects or an iteration of anonymous types. The compiler defines each anonymous type from the select clause of the query, and instances of that type are created when the query runs.
A program can work with anonymous types by using the var keyword, which requests that the compiler infer the type of a variable from the context in which it is used. Using var does not result in any relaxation of type safety because the compiler will ensure that the inferred type is not used incorrectly.
The output from one query can be joined with the output from another, to allow data in different sources (C# collections or database tables) to be combined.
The output from a query can be grouped on a particular property of the incoming data, allowing a query to create summary information that can be evaluated by the aggregate commands, which are sum, average, min, max, and count.
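Joining, anonymous types, grouping, and aggregation can be combined in one sketch. The Artist and Track classes and their members are hypothetical stand-ins for application data. The join produces items of an anonymous type, which are held in a var variable, and the grouped query sums the track lengths for each artist.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical data classes.
public class Artist
{
    public int Id;
    public string Name;
}

public class Track
{
    public int ArtistId;
    public string Title;
    public int Length;   // length in seconds
}

public static class JoinGroupSketch
{
    public static Dictionary<string, int> TotalLengthByArtist(
        IEnumerable<Artist> artists, IEnumerable<Track> tracks)
    {
        // Join the two sources; each result is an instance of an anonymous type.
        var joined = from track in tracks
                     join artist in artists on track.ArtistId equals artist.Id
                     select new { ArtistName = artist.Name, track.Length };

        // Group on the artist name and aggregate each group with Sum.
        var summary = from item in joined
                      group item by item.ArtistName into artistGroup
                      select new
                      {
                          Name = artistGroup.Key,
                          Total = artistGroup.Sum(item => item.Length)
                      };

        // ToDictionary forces the query to execute.
        return summary.ToDictionary(entry => entry.Name, entry => entry.Total);
    }
}
```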
LINQ to XML can be used to perform LINQ queries against XML documents held in the XDocument and XElement objects. These objects also provide behaviors that make it easy to create new XML documents and edit existing ones.
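A brief LINQ to XML sketch follows. The element names (MusicTracks, MusicTrack, Artist, Title) and the sample values are assumptions made for illustration. The example creates a document from XElement objects, adds an element to it, and queries it.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper working on an assumed document layout.
public static class LinqToXmlSketch
{
    // Creates a new XML document by nesting XElement constructors.
    public static XDocument BuildCatalog()
    {
        return new XDocument(
            new XElement("MusicTracks",
                new XElement("MusicTrack",
                    new XElement("Artist", "Rob Miles"),
                    new XElement("Title", "My Way")),
                new XElement("MusicTrack",
                    new XElement("Artist", "Fred Bloggs"),
                    new XElement("Title", "His Way"))));
    }

    // Edits an existing document by adding another track to the root element.
    public static void AddTrack(XDocument document, string artist, string title)
    {
        document.Root.Add(
            new XElement("MusicTrack",
                new XElement("Artist", artist),
                new XElement("Title", title)));
    }

    // Runs a LINQ query over the elements of the document.
    public static List<string> TitlesBy(XDocument document, string artist)
    {
        return (from track in document.Descendants("MusicTrack")
                where (string)track.Element("Artist") == artist
                select (string)track.Element("Title")).ToList();
    }
}
```

Casting an XElement to string returns its text value, or null if the element is missing, which avoids a null reference when a track has no Artist element.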
Serialization involves sending (serializing) the contents of an object into a stream. The stream can be deserialized to create a copy of the object with all the data intact. The code content (the methods) in an object is not transferred by serialization.
Classes that are to be serialized by the binary serializer must be marked using the [Serializable] attribute, which will request that all the data items in the class be serialized. It is possible to mark fields in a class with the [NonSerialized] attribute if it is not meaningful for them to be serialized.
Binary serialization encodes the data into a binary file. Binary serialization serializes public and private data elements and preserves references. Sensitive data should not be serialized without paying attention to the security issues, because a binary serialized file containing private data can be compromised.
A programmer can write their own serialization behaviors in a class, which save and restore the data items using the serialization stream. Note that customized serialization behaviors may be used to illicitly obtain the contents of private data in a class, and so must be managed in a secure way.
A programmer can add methods that can modify the contents of a class during the serialization and deserialization process. This allows you to create classes that can create default values for missing attributes when old versions of serialized data are deserialized.
The XML serializer serializes public elements of a class into XML text. The value of each element is stored in the file. References in objects that are serialized are converted into copies of the value at the end of the reference. There is no need for the [Serializable] attribute to be added to classes to be serialized using XML.
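The behavior of the XML serializer can be sketched with a small class. The Song class and its members are hypothetical. Its public properties are written to the XML text, while the private play counter is not, so the counter is back at zero in the deserialized copy.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical class; XmlSerializer needs a public type with a
// public parameterless constructor (the default one is used here).
public class Song
{
    public string Title { get; set; }
    public int Length { get; set; }

    // Private data is ignored by the XML serializer.
    private int playCount;

    public void Play() { playCount++; }
    public int GetPlayCount() { return playCount; }
}

public static class XmlSerializerSketch
{
    // Serializes the public properties of the song into XML text.
    public static string ToXml(Song song)
    {
        var serializer = new XmlSerializer(typeof(Song));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, song);
            return writer.ToString();
        }
    }

    // Creates a new Song from XML text produced by ToXml.
    public static Song FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(Song));
        using (var reader = new StringReader(xml))
        {
            return (Song)serializer.Deserialize(reader);
        }
    }
}
```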
The JSON serializer uses the JavaScript Object Notation to serialize data into a stream.
The DataContract serializer can serialize public and private data elements into XML files. Classes to be serialized must be given the [DataContract] attribute and data elements to be serialized must be given the [DataMember] attribute.
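A minimal DataContract sketch is given here. The Account class and its members are hypothetical. The private balance field carries [DataMember] and is therefore serialized, while the Note property has no attribute and is left out.

```csharp
using System.IO;
using System.Runtime.Serialization;

// Hypothetical class marked for the DataContract serializer.
[DataContract]
public class Account
{
    [DataMember]
    public string Owner { get; set; }

    // Private members are serialized when they carry [DataMember].
    [DataMember]
    private decimal balance;

    // No [DataMember] attribute, so this value is not serialized.
    public string Note { get; set; }

    public Account(string owner, decimal openingBalance)
    {
        Owner = owner;
        balance = openingBalance;
    }

    public decimal Balance { get { return balance; } }
}

public static class DataContractSketch
{
    // Writes the account as XML into a byte array.
    public static byte[] Save(Account account)
    {
        var serializer = new DataContractSerializer(typeof(Account));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, account);
            return stream.ToArray();
        }
    }

    // Reads an account back from the XML bytes produced by Save.
    public static Account Load(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(Account));
        using (var stream = new MemoryStream(data))
        {
            return (Account)serializer.ReadObject(stream);
        }
    }
}
```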
The C# language allows a program to create arrays. An array can contain value or reference types. The size of the array (the number of elements in it) is fixed when the array is created and cannot be changed. Array elements are accessed by the use of a subscript/index value which is 0 for the element at the start of the array. Arrays can have multiple dimensions.
The ArrayList is a collection class that provides dynamic storage of elements. A program can add and remove elements. Elements in an ArrayList are managed in terms of references to the object type, which is the base type of all types in C#. This means that a program can store any type in an ArrayList.
The List type uses generics to allow developers to create lists of a particular type. The list stores elements of the given type. It is used in exactly the same way as an ArrayList, with the difference that there is no requirement to cast elements removed from the List to their proper type.
Dictionaries provide storage organized on a key value of a particular type. The key value must be unique for each item in the dictionary.
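As an illustration, a Dictionary can be used to count how often each word appears in a sequence. The class and method names are hypothetical. Each word is a key, and because keys must be unique the code checks for an existing key before adding a new one.

```csharp
using System.Collections.Generic;

// Hypothetical word counter built on Dictionary<string, int>.
public static class DictionarySketch
{
    public static Dictionary<string, int> CountWords(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (string word in words)
        {
            if (counts.ContainsKey(word))
            {
                // The key already exists, so update the stored value.
                counts[word]++;
            }
            else
            {
                // Add throws an ArgumentException if the key is already present.
                counts.Add(word, 1);
            }
        }
        return counts;
    }
}
```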
Sets store a collection of unique values. They are useful because of the set functions that they provide. Sets are useful for storing tag values and other kinds of unstructured properties of an item.
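The tag scenario can be sketched with HashSet. The class and method names, and the tag values used to exercise them, are hypothetical. The example uses two of the set functions, IsSupersetOf and IntersectWith.

```csharp
using System.Collections.Generic;

// Hypothetical tag helper built on HashSet<string>.
public static class TagSetSketch
{
    // True if the item carries every one of the required tags.
    public static bool HasAllTags(IEnumerable<string> itemTags, IEnumerable<string> required)
    {
        // Duplicate tags are discarded when the set is built.
        var tags = new HashSet<string>(itemTags);
        return tags.IsSupersetOf(required);
    }

    // Returns the tags that two items have in common.
    public static HashSet<string> CommonTags(IEnumerable<string> first, IEnumerable<string> second)
    {
        var common = new HashSet<string>(first);
        common.IntersectWith(second);   // modifies the set in place
        return common;
    }
}
```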
A Queue is a First-In-First-Out (FIFO) storage device that provides methods that can be used to Enqueue and Dequeue items.
A program can customize a collection by extending the base collection type and adding additional behaviors.
Programmers can create their own collection types by creating types that implement the ICollection interface.
A Stack is a Last-In-First-Out (LIFO) storage device that provides methods that can be used to Push items onto the stack and Pop them off.
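The difference between the FIFO behavior of a Queue and the LIFO behavior of a Stack can be shown by passing the same items through each. The class and method names in this sketch are hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical comparison of Queue<T> and Stack<T> ordering.
public static class QueueStackSketch
{
    // Items leave a queue in the order in which they were added.
    public static List<string> ThroughQueue(IEnumerable<string> items)
    {
        var queue = new Queue<string>();
        foreach (string item in items)
        {
            queue.Enqueue(item);
        }

        var output = new List<string>();
        while (queue.Count > 0)
        {
            output.Add(queue.Dequeue());
        }
        return output;
    }

    // Items leave a stack in the reverse of the order in which they were added.
    public static List<string> ThroughStack(IEnumerable<string> items)
    {
        var stack = new Stack<string>();
        foreach (string item in items)
        {
            stack.Push(item);
        }

        var output = new List<string>();
        while (stack.Count > 0)
        {
            output.Add(stack.Pop());
        }
        return output;
    }
}
```

Given the items a, b, c, ThroughQueue returns them as a, b, c and ThroughStack returns them as c, b, a.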