Software Development in Windows

  • 5/15/2012

Windows Architecture

The fundamental design of the Windows operating system, with an executive that runs in kernel mode and a complementary set of user-mode system support processes (smss.exe, csrss.exe, winlogon.exe, and so on) to help manage additional system facilities, has for the most part remained unchanged since the inception of the Windows NT operating system back in the late ’80s. Each new version of Windows naturally brings about a number of new components and APIs, but understanding how they fit in the architectural stack often starts with knowing how they interact with these core components of the operating system.

Kernel Mode vs. User Mode

Kernel mode is an execution mode in the processor that grants access to all system memory (including user-mode memory) and unrestricted use of all CPU instructions. By confining applications to the less privileged user mode, the Windows operating system can prevent them from causing system instability by accessing protected memory or I/O ports.

Application software usually runs in user mode and is allowed to execute code in kernel mode only via a controlled mechanism called a system call. When the application wants to call a system service exposed by code in the OS that runs in kernel mode, it issues a special CPU instruction to switch the calling thread to kernel mode. When the service call completes its execution in kernel mode, the operating system switches the thread context back to user mode, and the calling application is able to continue its execution in user mode.

Third-party vendors can get their code to run directly in kernel mode by implementing and installing signed drivers. Note that Windows is a monolithic system in the sense that the OS kernel and drivers share the same address space, so any code executing in kernel mode gets the same unrestricted access to memory and hardware that the core of the Windows operating system would have. In fact, several parts of the operating system (the NT file system, the TCP/IP networking stack, and so on) are also implemented as drivers rather than being provided by the kernel binary itself.

The Windows operating system uses the following layering structure for its kernel-mode operations:

  • Kernel Implements core low-level OS services such as thread scheduling, multiprocessor synchronization, and interrupt/exception dispatching. The kernel also contains a set of routines that are used by the executive to expose higher-level semantics to user-mode applications.
  • Executive Also hosted by the same “kernel” module in Windows (NTOSKRNL), and performs base services such as process/thread management and I/O dispatching. The executive exposes documented functions that can be called from kernel-mode components (such as drivers). It also exposes functions that are callable from user mode, known as system services. The typical entry point to these executive system services in user mode is the ntdll.dll module. (This is the module that has the system call CPU instruction!) During these system service calls, the executive allows user-mode processes to reference the objects (process, thread, event, and so on) it implements via indirect abstractions called object handles, which the executive keeps track of using a per-process handle table.
  • Hardware Abstraction Layer The HAL (hal.dll) is a loadable kernel-mode module that isolates the kernel, executive, and drivers from hardware-specific differences. This layer sits at the very bottom of kernel layers and handles key hardware differences so that higher-level components (such as third-party device drivers) can be written in a platform-agnostic way.
  • Windows and Graphics Subsystem The Win32 UI and graphics services are implemented by an extension to the kernel (win32k.sys module) and expose system services for UI applications. The typical entry point to these services in user mode is the user32.dll module.

    Figure 1-2 illustrates this high-level architecture.

    Figure 1-2 Kernel-mode layers and services in the Windows operating system.
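The per-process handle table described in the executive bullet above can be pictured with a small model. The sketch below is only a toy, written in portable C++ under stated assumptions: `KernelObject`, `HandleTable`, and its `Open`/`Reference`/`Close` methods are hypothetical names invented for illustration, and nothing here reflects how the executive actually lays out its tables. It shows only the indirection itself: user-mode code holds an opaque number, and the table resolves it to the object.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Toy stand-in for an executive object (process, thread, event, and so on).
struct KernelObject {
    std::string type;
    std::string name;
};

// Hypothetical per-process handle table: maps opaque handle values to objects.
class HandleTable {
public:
    using Handle = std::uint32_t;

    // Stores a reference to the object and returns a new handle value.
    Handle Open(std::shared_ptr<KernelObject> object) {
        Handle handle = next_;
        next_ += 4;  // Illustrative choice: handle values advance in steps of 4.
        entries_[handle] = std::move(object);
        return handle;
    }

    // Resolves a handle; returns nullptr for an invalid or closed handle.
    std::shared_ptr<KernelObject> Reference(Handle handle) const {
        auto it = entries_.find(handle);
        return it == entries_.end() ? nullptr : it->second;
    }

    // Drops the table's reference; returns false if the handle was not valid.
    bool Close(Handle handle) { return entries_.erase(handle) == 1; }

private:
    Handle next_ = 4;
    std::unordered_map<Handle, std::shared_ptr<KernelObject>> entries_;
};
```

A caller would `Open` an object once, pass the returned number around, and call `Reference` whenever it needs the object. After `Close`, the same number no longer resolves, which is the property that lets the owner of the table control object lifetime.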

User-Mode System Processes

Several core facilities (logon, logoff, user authentication, and so on) of the Windows operating system are primarily implemented in user mode rather than in kernel mode. A fixed set of user-mode system processes exists to complement the OS functionality exposed from kernel mode. Here are a few important processes that fall in this category:

  • Smss.exe User sessions in Windows represent resource and security boundaries and offer a virtualized view of the keyboard, mouse, and physical display to support concurrent user logons on the same OS. The state that backs these sessions is tracked in a kernel-mode virtual memory space usually referred to as the session space. In user mode, the session manager subsystem process (smss.exe) is used to start and manage these user sessions.

    A “leader” smss.exe instance that's not associated with any sessions gets created as part of the Windows boot process. This leader smss.exe creates a transient copy of itself for each new session, which then starts the winlogon.exe and csrss.exe instances corresponding to that user session. Although having the leader session manager use copies of itself to initialize new sessions doesn't provide any practical advantages on client systems, having multiple smss.exe copies running concurrently can provide faster logon of multiple users on Windows Server systems acting as Terminal Servers.

  • Winlogon.exe The Windows logon process is responsible for managing user logon and logoff. In particular, this process starts the logon UI process that displays the logon screen when the user presses the Ctrl+Alt+Del keyboard combination and also creates the processes responsible for displaying the familiar Windows desktop after the user is authenticated. Each session has its own instance of the winlogon.exe process.
  • Csrss.exe The client/server runtime subsystem process is responsible for the user-mode portion of the Win32 subsystem (win32k.sys being the kernel-mode portion) and was also used to host the UI message loop of console applications prior to Windows 7. Each user session has its own instance of this process.
  • Lsass.exe The local security authority subsystem process is used by winlogon.exe to authenticate user accounts during the logon sequence. After successful authentication, LSASS generates a security access token object representing the user's security rights, which is then used to create the new explorer process for the user session. New child processes created from that shell then inherit their access tokens from the initial explorer process security token. There is only a single instance of this process, which runs in the noninteractive session (known as session 0).
  • Services.exe This system process is called the NT service control manager (SCM for short) and runs in session 0 (noninteractive session). It's responsible for starting a special category of user-mode processes called Windows services. These processes are generally used by the OS or third-party applications to carry out background tasks that do not require user interaction. Examples of Windows services include the spooler print service (spooler); the task scheduler service (schedule); the COM activation services, also known as the COM SCM (RpcSs and DComLaunch); and the Windows time service (w32time).

    These processes can choose to run with the highest level of user-mode privileges in Windows (LocalSystem account), so they are often used to perform privileged tasks on behalf of user-mode applications. Also, because these special processes are always started and stopped by the SCM process, they can be started on demand and are guaranteed to have at most one active instance running at any time.

All of the aforementioned system-support processes run under the LocalSystem account, which is the highest privileged account in Windows. Processes that run with this special account identity are said to be a part of the trusted computing base (TCB) because once user code is able to run with that level of privilege, it is also able to bypass any checks by the security subsystem in the OS.

User-Mode Application Processes

Every user-mode process (except for the leader smss.exe process mentioned earlier) is associated with a user session. These user-mode processes are boundaries for a memory address space. As far as scheduling in Windows is concerned, however, the most fundamental scheduling units remain the threads of execution, and processes are merely containers for those threads. It's also important to realize that user-mode processes (more specifically, the threads they host) often run plenty of code in kernel mode. Although your application code might indeed run in user mode, it often calls into system services (through API layers that call down to NTDLL or USER32 for the system call transitions) that end up transitioning to kernel mode on your behalf. This is why it makes sense to always think of your software (whether it's user-mode software or kernel drivers) as an extension of the Windows operating system, and why you should understand how it interacts with the “services” provided by the OS.

Processes, in turn, can be placed in containers called job objects. These executive objects can be very useful to manage a group of processes as a single unit. Unlike threads and processes, job objects are often overlooked when studying the Windows architecture despite their unique advantages and the useful semantics they provide. Figure 1-3 illustrates the relationship between these fundamental objects.

Figure 1-3 Threads, processes, and jobs in Windows.

Job objects can be used to provide common execution settings for a set of processes and, among other things, to control the resources used by member processes (such as the amount of memory consumed by the job and the processors used for its execution) or their UI capabilities.

One particularly useful feature of job objects is that they can be configured to terminate their processes when their user-mode job handle is closed (either using an explicit kernel32!CloseHandle API call, or implicitly when the kernel runs down the handles in the process handle table when the process kernel object is destroyed). To provide a practical illustration, the following C++ program shows how a user-mode application can use the job-object construct exposed by the Windows executive to start a child (“worker”) process and synchronize its lifetime with that of its parent process. This is often useful in the case of worker processes whose sole purpose is to serve requests in the context of their parent process, in which case it becomes critical not to “leak” those worker instances should the parent process die unexpectedly. (The reverse is more straightforward because the parent process can easily monitor when the child dies by simply waiting on the worker process handle to become signaled using the kernel32!WaitForSingleObject Win32 API.)

To follow this experiment, remember to refer back to the Introduction of this book, which contains step-by-step instructions for how to build the companion source code.

//
// C:\book\code\chapter_01\WorkerProcess>main.cpp
//
class CMainApp
{
public:
    static
    HRESULT
    MainHR()
    {
        HANDLE hProcess, hPrimaryThread;
        CHandle shProcess, shPrimaryThread;
        CHandle shWorkerJob;
        DWORD dwExitCode;
        JOBOBJECT_EXTENDED_LIMIT_INFORMATION exLimitInfo = {0};
        CStringW shCommandLine = L"notepad.exe";

        ChkProlog();

        //
        // Create the job object, set its processes to terminate on
        // handle close (similar to an explicit call to TerminateJobObject),
        // and then add the current process to the job.
        //
        shWorkerJob.Attach(CreateJobObject(NULL, NULL));
        ChkWin32(shWorkerJob);

        exLimitInfo.BasicLimitInformation.LimitFlags =
            JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE;
        ChkWin32(SetInformationJobObject(
            shWorkerJob,
            JobObjectExtendedLimitInformation,
            &exLimitInfo,
            sizeof(exLimitInfo)));

        ChkWin32(AssignProcessToJobObject(
            shWorkerJob,
            ::GetCurrentProcess()));

        //
        // Now launch the new child process (job membership is inherited by default)
        //
        wprintf(L"Launching child process (notepad.exe) ...\n");
        ChkHr(LaunchProcess(
            shCommandLine.GetBuffer(),
            0,
            &hProcess,
            &hPrimaryThread));
        shProcess.Attach(hProcess);
        shPrimaryThread.Attach(hPrimaryThread);

        //
        // Wait for the worker process to exit
        //
        switch (WaitForSingleObject(shProcess, INFINITE))
        {
            case WAIT_OBJECT_0:
                ChkWin32(::GetExitCodeProcess(shProcess, &dwExitCode));
                wprintf(L"Child process exited with exit code %d.\n", dwExitCode);
                break;
            default:
                ChkReturn(E_FAIL);
        }

        ChkNoCleanup();
    }
};

One key observation here is that the parent process is assigned to the new job object before the new child process is created, which allows the worker process to automatically inherit this job membership. This means in particular that there is no time window in which the new process would exist without being a part of the job object. If you kill the parent process (using the Ctrl+C signal, for example), you will notice that the worker process (notepad.exe in this case) is also terminated at the same time, which was precisely the desired behavior.

C:\book\code\chapter_01\WorkerProcess>objfre_win7_x86\i386\workerprocess.exe
Launching child process (notepad.exe) ...
^C

Low-Level Windows Communication Mechanisms

With code executing in kernel and user modes, and also inside the boundaries of per-process address spaces in user mode, the Windows operating system supports several mechanisms for allowing components to communicate with each other.

Calling Kernel-Mode Code from User Mode

The most basic way to call kernel-mode code from user-mode components is the system call mechanism mentioned earlier in this chapter. This mechanism relies on native support in the CPU to implement the transition in a controlled and secure manner.

One inherent drawback to the system call mechanism is that it relies on a hard-coded table of well-known executive service routines to dispatch the request from the client code in user mode to its intended target service routine in kernel mode. This doesn't extend well to kernel extensions implemented in the form of drivers, however. For those cases, another mechanism—called I/O control commands (IOCTL)—is supported by Windows to enable user-mode code to communicate with kernel-mode drivers. This is done through the generic kernel32!DeviceIoControl API, which takes the user-defined IOCTL identifier as one of its parameters and also a handle to the device object to which to dispatch the request. The transition to kernel mode is still performed in the NTDLL layer (ntdll!NtDeviceIoControlFile) and internally also uses the system call mechanism. So, you can think of the IOCTL method as a higher-level user/kernel communication protocol built on top of the raw system call services provided by the OS and CPU.
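The text above does not say how an IOCTL identifier is formed. As a hedged illustration, the sketch below reproduces in portable C++ the bit layout used by the `CTL_CODE` macro from the Windows SDK headers: device type, required access, function number, and transfer method packed into one 32-bit value. `MakeIoctl`, the decoder helpers, and `kIoctlExamplePing` are hypothetical names chosen for this example. The numeric constants mirror `FILE_DEVICE_UNKNOWN`, `METHOD_BUFFERED`, and `FILE_ANY_ACCESS`. A real driver project would use the SDK macro directly rather than this stand-in.

```cpp
#include <cassert>
#include <cstdint>

// Same field layout as the CTL_CODE macro: bits 31..16 device type,
// bits 15..14 required access, bits 13..2 function, bits 1..0 method.
constexpr std::uint32_t MakeIoctl(std::uint32_t deviceType,
                                  std::uint32_t function,
                                  std::uint32_t method,
                                  std::uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

constexpr std::uint32_t kFileDeviceUnknown = 0x22;  // FILE_DEVICE_UNKNOWN
constexpr std::uint32_t kMethodBuffered = 0;        // METHOD_BUFFERED
constexpr std::uint32_t kFileAnyAccess = 0;         // FILE_ANY_ACCESS

// Hypothetical command for an imaginary driver. Function numbers from
// 0x800 upward are the range left to driver vendors.
constexpr std::uint32_t kIoctlExamplePing =
    MakeIoctl(kFileDeviceUnknown, 0x800, kMethodBuffered, kFileAnyAccess);

// Decoders that recover the individual fields from a control code.
constexpr std::uint32_t DeviceTypeOf(std::uint32_t code) { return code >> 16; }
constexpr std::uint32_t FunctionOf(std::uint32_t code) { return (code >> 2) & 0xFFF; }
constexpr std::uint32_t MethodOf(std::uint32_t code) { return code & 0x3; }
```

With these inputs the packed value is 0x00222000. A user-mode caller would pass that number to kernel32!DeviceIoControl along with the device handle, and the driver would compare the control code it receives against the same constant.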

Internally, I/O control commands are processed by the I/O manager component of the Windows executive, which builds what is called an I/O request packet (IRP for short) that it then routes to the device object requested by the caller from user mode. IRP processing in the Windows executive uses a layered model where devices have an associated driver stack that handles their requests. When an IRP is sent to a top-level device object, it travels through its device stack starting at the top, passing through each driver in the corresponding device stack and giving it a chance to either process or ignore the command. In fact, IRPs are also used in kernel mode to send commands to other drivers so that the same IRP model is used for interdriver communication in the kernel. Figure 1-4 depicts this architecture.

Figure 1-4 User-mode to kernel-mode communication mechanisms.
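The layered IRP routing described above can be mimicked with a short model. This is a minimal sketch in portable C++, not the I/O manager's real data structures: `Irp`, `DriverLayer`, `SendIrp`, and `MakeExampleStack` are hypothetical names, and the control code is made up. It shows only the top-down traversal in which each driver either completes the request or lets it continue to the next layer.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy request packet; a real IRP carries far more state.
struct Irp {
    explicit Irp(unsigned code) : controlCode(code) {}
    unsigned controlCode;
    std::vector<std::string> visited;  // Drivers that saw the packet, in order.
    std::string completedBy;           // Empty until some driver completes it.
};

// One layer of a device stack. dispatch returns true when the driver
// completes the request and false when it passes the request down.
struct DriverLayer {
    std::string name;
    std::function<bool(Irp&)> dispatch;
};

// Sends the packet from the top of the stack toward the bottom.
bool SendIrp(const std::vector<DriverLayer>& stack, Irp& irp) {
    for (const DriverLayer& layer : stack) {
        irp.visited.push_back(layer.name);
        if (layer.dispatch(irp)) {
            irp.completedBy = layer.name;
            return true;
        }
    }
    return false;  // No driver in the stack handled the command.
}

// Hypothetical three-layer stack: a filter on top, then a function
// driver, then a bus driver that completes whatever reaches it.
std::vector<DriverLayer> MakeExampleStack() {
    return {
        {"UpperFilter", [](Irp&) { return false; }},
        {"FunctionDriver", [](Irp& irp) { return irp.controlCode == 0x222000; }},
        {"BusDriver", [](Irp&) { return true; }},
    };
}
```

Sending a packet with control code 0x222000 through this stack stops at the function driver after visiting two layers. Any other code falls through to the bus driver at the bottom.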

Calling User-Mode Code from Kernel Mode

Code that runs in kernel mode has unrestricted access to the entire virtual address space (both the user and kernel portions), so kernel mode in theory could invoke any code running in user mode. However, doing so requires first picking a thread to run the code in, transitioning the CPU mode back to user mode, and setting up the user-mode context of the thread to reflect the call parameters. Fortunately, only the system code written by Microsoft really needs to communicate with arbitrary threads in user mode. The drivers you write, on the other hand, need to call back to user mode only in the context of a device IOCTL initiated by a user-mode thread, so they do not need a more generic kernel-mode to user-mode communication mechanism.

A standard way for the system to execute code in the context of a given user-mode thread is to send an asynchronous procedure call (APC) to that thread. For example, this is exactly how thread suspension works in Windows: the kernel simply sends an APC to the target thread and asks it to execute a function to wait on its internal thread semaphore object, causing it to become suspended. APCs are also used by the system in many other scenarios, such as in I/O completion and thread pool callback routines, just to cite a couple.
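The key property of an APC, that the queued routine runs in the context of the target thread rather than the sender, can be sketched with a simple queue. The model below is a single-threaded toy in portable C++ and makes no claim about how the kernel implements APC delivery. `ToyThread`, `QueueApc`, and `DeliverApcs` are hypothetical names, and "suspension" is reduced to setting a flag instead of waiting on a semaphore.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy thread: queued routines run only when the thread itself drains them.
class ToyThread {
public:
    using Routine = std::function<void(ToyThread&)>;

    // Called by other code (standing in for "the system") to queue work.
    void QueueApc(Routine routine) { pending_.push_back(std::move(routine)); }

    // Called by the thread at a delivery point. Runs the queued routines
    // in FIFO order and returns how many were delivered.
    std::size_t DeliverApcs() {
        std::size_t delivered = 0;
        while (!pending_.empty()) {
            Routine routine = std::move(pending_.front());
            pending_.pop_front();
            routine(*this);  // Executes with the target thread as its context.
            ++delivered;
        }
        return delivered;
    }

    bool suspended = false;        // Set by the toy "suspend" routine.
    std::vector<std::string> log;  // Records which routines ran, in order.

private:
    std::deque<Routine> pending_;
};
```

Queuing a routine has no visible effect on the target by itself. State such as `suspended` changes only once the thread calls `DeliverApcs`, which mirrors the idea that the sender asks and the target thread executes.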

Interprocess Communication

Another way to communicate between user-mode processes and code in kernel mode, as well as between user-mode processes themselves, is to use the advanced local procedure call (ALPC) mechanism. ALPC was introduced in the Windows Vista timeframe and is a major revision of the LPC mechanism, a feature that has in many ways been the lifeblood of low-level intercomponent communication in Windows since its early releases.

ALPC is based on a simple idea: a server process first opens a kernel port object to receive messages. Clients can then connect to the port if allowed by the server owning the port and start sending messages to the server. They are also able to wait until the server has fetched and processed the message from the internal queue that's associated with the ALPC port object.

In the case of user/user ALPC, this provides a basic low-level interprocess communication channel. In the case of kernel/user ALPC channels, this essentially provides another (indirect) way for user-mode applications to call code in kernel mode (whether it's in a driver or in the kernel module itself) and vice versa. An example of this communication is the channel that's established between the lsass.exe user-mode system process and the security reference monitor (SRM) executive component in kernel mode, which is used, for example, to send audit messages from the executive to lsass.exe. Figure 1-5 illustrates this architecture.

Figure 1-5 ALPC communication in Windows.
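The port-based flow described above (a server owns a port, clients connect if the server allows it, and messages wait in a queue until the server fetches them) can be modeled in a few lines. The sketch below is a single-process toy in portable C++ and does not use the native ALPC interfaces. `ToyPort`, `PortMessage`, and the method names are hypothetical, and the client-side wait for the server's reply is deliberately left out.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <utility>

// One queued message: who sent it and what it says.
struct PortMessage {
    std::string sender;
    std::string text;
};

// Toy port object owned by a server.
class ToyPort {
public:
    // The server supplies the policy that decides which clients may connect.
    explicit ToyPort(std::function<bool(const std::string&)> acceptClient)
        : acceptClient_(std::move(acceptClient)) {}

    // Client side: returns true if the server accepts the connection.
    bool Connect(const std::string& client) {
        if (!acceptClient_(client)) return false;
        connected_.insert(client);
        return true;
    }

    // Client side: queues a message; fails for clients that never connected.
    bool Send(const std::string& client, const std::string& text) {
        if (connected_.count(client) == 0) return false;
        queue_.push_back(PortMessage{client, text});
        return true;
    }

    // Server side: fetches the oldest queued message, if there is one.
    bool Receive(PortMessage& out) {
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::function<bool(const std::string&)> acceptClient_;
    std::set<std::string> connected_;
    std::deque<PortMessage> queue_;
};
```

A server would construct the port with its acceptance policy and loop on `Receive`. Clients call `Connect` once and then `Send`. Rejected clients cannot enqueue anything, which reflects the access check the server performs at connection time.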

ALPC-style communication is used extensively in the operating system itself. Most notably for this book, it implements the low-level communication protocol that native user-mode debuggers employ to receive various debug events from the process they debug. ALPC is also used as a building block in higher-level communication protocols such as local RPC, which in turn is used as the transport protocol in the COM model to implement interprocess method invocations with proper parameter marshaling.