How Windows Debuggers Work

  • 5/15/2012

Managed-Code Debugging

As previously mentioned in Chapter 2, one of the unfortunate limitations of the Windows debuggers is that they don’t support source-level debugging of .NET applications. This doesn’t mean that you can’t use WinDbg to debug managed code; it simply means that you won’t have the convenience of source-level debugging, such as single-stepping and source line breakpoints, when doing so. It’s as if you were debugging system code without the source code; only it’s worse because many important commands that work for native system debugging, such as displaying call stacks using the k command, don’t even work for managed code. Fortunately, there is at least a workaround in the form of a WinDbg extension called SOS, which Microsoft ships with the .NET Framework. This useful extension is covered in more detail later in this section.

Because of this limitation, the Microsoft Visual Studio environment remains the debugger of choice for .NET debugging. To better understand why the Windows debuggers are lacking in this regard, it’s useful to first discuss the architecture used by Visual Studio and the .NET Common Language Runtime (CLR) environment to implement their support for managed-code debugging and understand the way they collaborate to present a seamless native/managed debugging experience. Just like other .NET-related discussions in this book, the coverage centers on the architecture in version 4.0 of the .NET Framework.

Architecture Overview

The first challenge when designing an architecture that enables debugging of Microsoft Intermediate Language (MSIL) .NET code is that such code gets translated into machine instructions on the fly by the CLR’s Just-in-Time (JIT) compiler. For performance reasons, this run-time code generation is performed lazily, the first time a method is actually invoked. In particular, this means that to insert a code breakpoint, the debugger needs to wait until the method in question has been compiled so that it can edit the generated code in memory and insert the debug break instruction at the appropriate location. The native debug events generated by the OS aren’t sufficient by themselves to support this type of MSIL debugging because only the CLR knows when the .NET methods are compiled or how the managed class objects are represented in memory.
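The consequence for breakpoints can be illustrated with a toy model. The Python sketch below is not the CLR or any debugger API: the ToyRuntime class, its method names, and the fake code addresses are all hypothetical. It only shows the idea that a breakpoint requested before a method has been compiled must stay pending until the first invocation produces code that can be patched.

```python
class ToyRuntime:
    """Toy stand-in for a runtime that compiles methods lazily (not the CLR API)."""

    def __init__(self):
        self.compiled = {}            # method name -> fake code address
        self.pending = set()          # breakpoints requested before compilation
        self.active = {}              # method name -> address where a break was planted
        self._next_address = 0x1000   # hypothetical address of the next compiled method

    def request_breakpoint(self, method):
        """Plant the breakpoint now if code exists, otherwise remember it."""
        if method in self.compiled:
            self.active[method] = self.compiled[method]
        else:
            self.pending.add(method)

    def invoke(self, method):
        """Compile on first call, resolve pending breakpoints, report a hit."""
        if method not in self.compiled:
            self.compiled[method] = self._next_address
            self._next_address += 0x100
            if method in self.pending:
                # Only the runtime knows that code now exists for this method.
                self.pending.discard(method)
                self.active[method] = self.compiled[method]
        return method in self.active
```

In this model, a request made before the first call to a method lands in pending, and it is the runtime, not the operating system, that knows when the request can finally be honored.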

For those reasons, the CLR provides an infrastructure for debuggers to inspect and control managed targets with the help of a dedicated thread that runs as part of every .NET process and has intimate knowledge of its internal CLR data structures. This thread is known as the debugger runtime controller thread, and it runs in a continuous loop waiting for messages from the debugger process. Even in the break-in state, the managed target process isn’t entirely frozen because this thread must still run to service the debugger commands. Any .NET application will have this extra debugger thread even when it isn’t being actively debugged with a managed-code debugger. To confirm this fact, you can run the following “Hello World!” C# sample from the companion source code.

C:\book\code\chapter_03\HelloWorld>test.exe
Hello World!
Press any key to continue...

You can now use the steps described in Chapter 2 to start a live kernel-debugging session and noninvasively observe the threads that the managed process contains when it’s active. Notice the presence of the debugger runtime controller thread (the clr!DebuggerRCThread::ThreadProc thread routine in the following listing) even though the .NET process isn’t being debugged with a user-mode debugger.

lkd> .symfix
lkd> .reload
lkd> !process 0 0 test.exe
PROCESS 85520c88  SessionId: 1  Cid: 07b8    Peb: 7ffdf000  ParentCid: 0e5c
    Image: test.exe
lkd> .process /r /p 85520c88
lkd> !process 85520c88 7
PROCESS 85520c88  SessionId: 1  Cid: 07b8    Peb: 7ffdf000  ParentCid: 0e5c
    Image: test.exe
...
    THREAD 9532ed20  Cid 07b8.1e5c  ...
        828d74b0  SynchronizationEvent
        885f5cb8  SynchronizationEvent
        86a0e808  SynchronizationEvent
...
    0116f7fc 5d6bb4d8 00000003 0116f824 00000000 KERNEL32!WaitForMultipleObjects+0x18
    0116f860 5d6bb416 d6ab8654 00000000 00000000 clr!DebuggerRCThread::MainLoop+0xd9
    0116f890 5d6bb351 d6ab8678 00000000 00000000 clr!DebuggerRCThread::ThreadProc+0xca
    0116f8bc 76f9ed6c 00000000 0116f908 779c377b clr!DebuggerRCThread::ThreadProcStatic+0x83
    0116f8c8 779c377b 00000000 6cb23a74 00000000 KERNEL32!BaseThreadInitThunk+0xe
    0116f908 779c374e 5d6bb30c 00000000 00000000 ntdll!__RtlUserThreadStart+0x70
    0116f920 00000000 5d6bb30c 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b
lkd> q

Because of its reliance on this helper thread, the managed-code debugging paradigm is often referred to as in-process debugging, in contrast to the out-of-process debugging architecture used by native code user-mode debuggers, which requires no active collaboration from the target process. The contract defined by the CLR for managed-code debuggers to interact with the runtime controller thread is represented by a set of COM interfaces implemented in the mscordbi.dll .NET Framework DLL. Because this contract is published as a set of COM interfaces, you can write a managed-code debugger in C/C++ and also in any .NET language, where the COM Interop facilities can be used to consume the CLR debugging objects implemented in this DLL.

The Visual Studio debugger is based on this same CLR debugging infrastructure, which it also uses to implement its support for managed-code debugging. The components used to service the user actions in the debugger are represented, at a high level, in Figure 3-6. The debugger front-end UI processes any commands entered by the user and forwards them to the debugger’s back-end engine, which in turn internally uses the CLR debugging COM objects from mscordbi.dll to communicate with the runtime controller thread in the managed target process. These COM objects take care of all the internal details related to the private interprocess communication channel between the debugger and target processes.

Figure 3-6 In-process managed debugging architecture in Visual Studio and the CLR.

This architecture has one big advantage: it insulates the debuggers from the intricate details of the internal CLR execution engine data structures by placing a higher-level contract and communication channel between the managed-code debuggers and the debugger runtime controller thread. This means the layouts of those data structures can change without breaking the functionality of those debuggers.

Unfortunately, this architecture also has several drawbacks. First, this model doesn’t work for debugging of crash dumps because the target isn’t running in that case, so the debuggers can’t rely on an active debugger helper thread to perform their actions when debugging a memory crash dump file.

Second, the operating system is unaware that the application is being debugged using this private interprocess communication channel. Up until .NET version 4.0, Visual Studio debugging of managed applications didn’t work at all on machines that also had a host kernel debugger attached to them. Because the OS didn’t know that the managed process was being debugged, exceptions raised for the purpose of managed debugging were being incorrectly caught by the kernel debugger. The official workaround to this problem was documented in the Knowledge Base (KB) article at http://support.microsoft.com/kb/303067, but it’s hardly satisfactory because it recommends disabling the kernel debugger entirely. Fortunately, this problem is now fixed in Visual Studio 2010—at least for managed applications compiled for .NET 4.0—because the debugger now also attaches to the target process debug port as a regular native user-mode debugger. However, the in-process managed-debugging architecture is still otherwise being used in that release as the main live, managed-code debugging channel.

Table 3-3 contains a comparative analysis of the in-process and out-of-process debugging architectures.

Table 3-3 In-Process and Out-of-Process Managed Debugging Paradigms

In-process debugging

Advantages:

  • Easy access to CLR data structures

  • Faster single-stepping

Drawbacks:

  • Poor integration with kernel-mode debugging

  • Doesn’t work for crash dump debugging

Out-of-process debugging

Advantages:

  • Supports crash dump debugging

  • Natural integration with native debugging

  • No side effects preventing kernel-mode debugging

Drawbacks:

  • More difficult for the debugger to stay in-sync with the CLR execution engine’s data structures
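The contrast summarized in Table 3-3 can be made concrete with a toy model. The Python sketch below is purely illustrative and assumes nothing about the real CLR protocol: ToyTarget, ask_in_process, and read_out_of_process are hypothetical names, and a dictionary stands in for the CLR data structures. It shows why an in-process request goes unanswered when no helper thread is running (as with a crash dump), while an out-of-process reader that understands the layout can still inspect the same memory.

```python
import queue
import threading

class ToyTarget:
    """Toy target process; 'state' stands in for internal runtime structures."""

    def __init__(self):
        self.state = {"Test.Main": "running"}
        self.requests = queue.Queue()

    def start_helper(self):
        # The helper thread loops forever, servicing debugger requests.
        helper = threading.Thread(target=self._helper_loop, daemon=True)
        helper.start()

    def _helper_loop(self):
        while True:
            question, reply_box = self.requests.get()
            reply_box.put(self.state.get(question))

def ask_in_process(target, question, timeout=0.5):
    """In-process model: send a message and wait for the helper thread's reply."""
    reply_box = queue.Queue()
    target.requests.put((question, reply_box))
    try:
        return reply_box.get(timeout=timeout)
    except queue.Empty:
        return None  # nobody is running inside the target to answer

def read_out_of_process(target, question):
    """Out-of-process model: read the target's memory directly."""
    return target.state.get(question)
```

A target created without calling start_helper plays the role of a crash dump here: ask_in_process returns None after the timeout, whereas read_out_of_process still finds the value, at the cost of depending on the exact layout of the data it reads.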

Given the benefits of out-of-process debugging, the CLR and Visual Studio probably will continue to move toward that architecture for managed-code debugging in the future. That trend has already begun in .NET 4.0 and Visual Studio 2010, where the out-of-process architecture is now used to support crash dump debugging of managed processes.

The SOS Windows Debuggers Extension

Many WinDbg commands don’t work natively when debugging a .NET target program. For instance, the k command cannot display the names of managed functions in a call stack and the dv command cannot display the values of local variables from those functions, either. To understand why, remember that MSIL images are compiled on the fly, so the dynamically generated code addresses are completely unknown to the symbols that the Windows debugger relies on to map the addresses to their friendly symbolic names. Even when an MSIL image is precompiled into a native one—a process known as NGEN’ing the assembly—the generated native image is actually machine-specific and won’t have a corresponding symbol file, either. The .NET Framework DLL assemblies fall into this second category because they are usually NGEN’ed on the machines where they’re installed to improve the performance of all the applications that use them.
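To make the symbol problem concrete, here is a minimal Python sketch of address-to-name resolution. It is an illustration of the general idea only, not WinDbg’s actual algorithm: the resolve function, the symbol table format, and the module base addresses are assumptions chosen for the example. When a module has a symbol that starts at or before the address, the result is a name plus a small offset; when no symbol information exists, the only thing left to print is the module name plus a large offset from its base.

```python
import bisect

def resolve(address, module_name, module_base, symbols):
    """Map an address to 'module!symbol+offset', or 'module+offset' without symbols.

    symbols is a list of (start_address, name) pairs sorted by start address.
    """
    starts = [start for start, _ in symbols]
    # Find the last symbol that starts at or before the address.
    index = bisect.bisect_right(starts, address) - 1
    if index >= 0:
        start, name = symbols[index]
        return f"{module_name}!{name}+{address - start:#x}"
    # No usable symbol: fall back to an offset from the module base.
    return f"{module_name}+{address - module_base:#x}"

# Illustrative values only; the base addresses here are assumptions.
native = resolve(0x75EBB974, "KERNEL32", 0x75E90000,
                 [(0x75EBB8FF, "ReadFileImplementation")])
managed = resolve(0x5EFC1C8B, "mscorlib_ni", 0x5ED00000, [])
print(native)   # KERNEL32!ReadFileImplementation+0x75
print(managed)  # mscorlib_ni+0x2c1c8b
```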

How SOS Works

To work around the lack of native support for managed-code debugging in the Windows debuggers, the .NET Framework ships the sos.dll debugger extension module. This extension was doubly useful in earlier releases of the .NET Framework because it was also the only supported way to perform crash dump debugging of .NET code, given that Visual Studio started supporting out-of-process debugging of managed code only in its 2010 release.

This debugger extension is built as part of the CLR code base, so it has intimate knowledge of the internal layouts of the CLR data structures, allowing it to read the virtual address space of the target process directly and parse the CLR execution engine structures that it needs. These capabilities enable it to support out-of-process managed-code debugging. When using SOS, you’ll at least be able to display managed call stacks, set breakpoints in managed code, find the values of local variables, dump the arguments to method calls, and perform most of the inspection and control debugging actions that you can use in native-code debugging—only without the convenience of source-level debugging.

Symbols for .NET modules are used by managed-code debuggers only to enable source-level debugging (source lines, names of local function variables, and so on). Even without symbol files for managed assemblies, you still can do a lot of things you aren’t able to do in native-code debugging, where the symbols are absolutely crucial. This is because MSIL images also carry metadata describing the type information for the classes they host, allowing any component with internal knowledge of how to parse that information to display function names in a call stack, dump the values of local variables (though without their names), or find the parameters to function calls. This is precisely how the SOS Windows debugger extension enables out-of-process managed-code debugging, even without symbol files or any additional help from the CLR debugger runtime controller thread.

Debugging Your First .NET Program Using SOS

To provide a practical illustration for how to use SOS to debug .NET programs in WinDbg, you’ll now use it to debug the following C# program from the companion source code, which you should compile to target CLR version 4.0, as described in the procedure provided in the Introduction of this book.

//
// C:\book\code\chapter_03\HelloWorld>main.cs
//
using System;

public class Test
{
    public static void Main()
    {
        Console.WriteLine("Hello World!");
        Console.WriteLine("Press any key to continue...");
        // Block until the user presses Enter.
        Console.ReadLine();
        Console.WriteLine("Exiting...");
    }
}

Every version of the CLR has its own copy of the SOS extension DLL that understands its internal data structures and is able to decode them. For this reason, you must always load the version of the extension that comes with the CLR version that’s used by the target process you’re trying to debug. In addition, the SOS commands work only after the CLR execution engine DLL has been loaded, so you need to wait for its module load event to occur. This happens early during the startup of the .NET target as the CLR shim DLL (mscoree.dll) hands the reins over to the CLR execution engine DLL, which is clr.dll in the case of CLR version 4 (.NET 4.x), and mscorwks.dll in the case of CLR version 2 (.NET 2.x and .NET 3.x). You can get notified of this module load event in the debugger by using the sxe ld command, as shown in the following listing.

0:000> vercommand
command line: '"c:\Program Files\Debugging Tools for Windows (x86)\windbg.exe"
c:\book\code\chapter_03\HelloWorld\test.exe'
0:000> .symfix
0:000> .reload
0:000> sxe ld clr.dll
0:000> g
ModLoad: 5fad0000 6013e000   C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll
ntdll!KiFastSystemCallRet:
779970b4 c3              ret
0:000> .lastevent
Last event: 1e30.c20: Load module C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll at
5fad0000

After the execution engine DLL is loaded, you can load the SOS extension module before any managed code has a chance to run inside the target process. A command you’ll find useful when loading the SOS extension DLL is the .loadby debugger command. This command works just like the more basic .load command, but it looks for the extension module in the directory from which the module named by its second parameter was loaded. By specifying the CLR execution engine DLL module name, you will be sure to load the sos.dll extension from the same location so that it matches the precise CLR version of the target. One of the useful SOS commands is the !eeversion command, which displays the current version of the CLR in the target process.

0:000> .loadby sos clr
0:000> !eeversion
4.0.30319.239 retail
0:000> g
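The two rules just described (pick the execution engine module that matches the CLR version, then load sos.dll from that module’s directory) can be sketched in a few lines of Python. This is only a model of the behavior described in the text, not debugger code: sos_setup_commands and loadby_target are hypothetical helper names, and the version-to-module table simply restates the clr.dll and mscorwks.dll mapping given earlier.

```python
import ntpath

# Execution engine module per CLR major version, as named in the text above.
ENGINE_MODULE = {4: "clr", 2: "mscorwks"}

def sos_setup_commands(clr_major):
    """Return the debugger commands used to stop at engine load and load SOS."""
    engine = ENGINE_MODULE[clr_major]
    return [f"sxe ld {engine}.dll", "g", f".loadby sos {engine}"]

def loadby_target(extension, module_image_path):
    """Model of .loadby: look for the extension next to the named module."""
    return ntpath.join(ntpath.dirname(module_image_path), extension + ".dll")

print(sos_setup_commands(4))
print(loadby_target("sos", r"C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll"))
```

For a CLR version 2 target, the same helper yields the mscorwks variants of the commands, which is the only difference in the setup sequence.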

The program now waits for user input in the ReadLine method. If you break into the debugger at this point by using the Debug\Break menu action, you’ll see that the k command isn’t able to properly display the function names in the managed code frames from the main thread in the .NET process. (Notice the very large offsets in the frames from the mscorlib_ni native image of the mscorlib.dll .NET Framework assembly, which is indicative of missing or unresolved symbols.) The unmanaged frames are still decoded correctly.

0:004> ~0s
0:000> k
ChildEBP RetAddr
0017e998 77996464 ntdll!KiFastSystemCallRet
0017e99c 75ea4b6e ntdll!ZwRequestWaitReplyPort+0xc
0017e9bc 75eb2833 KERNEL32!ConsoleClientCallServer+0x88
0017eab8 75efc978 KERNEL32!ReadConsoleInternal+0x1ac
0017eb40 75ebb974 KERNEL32!ReadConsoleA+0x40
0017eb88 5efc1c8b KERNEL32!ReadFileImplementation+0x75
0017ec08 5f637cc8 mscorlib_ni+0x2c1c8b
0017ec30 5f637f60 mscorlib_ni+0x937cc8
0017ec58 5ef78bfb mscorlib_ni+0x937f60
0017ec74 5ef5560a mscorlib_ni+0x278bfb
0017ec94 5f63e6f5 mscorlib_ni+0x25560a
0017eca4 5f52a7aa mscorlib_ni+0x93e6f5
0017ecb4 5fad21bb mscorlib_ni+0x82a7aa
0017ecc4 5faf4be2 clr!CallDescrWorker+0x33
0017ed40 5faf4d84 clr!CallDescrWorkerWithHandler+0x8e
0017ee7c 5faf4db9 clr!MethodDesc::CallDescr+0x194
0017ee98 5faf4dd9 clr!MethodDesc::CallTargetWorker+0x21
0017eeb0 5fc273c2 clr!MethodDescCallSite::Call_RetArgSlot+0x1c
0017f014 5fc274d0 clr!ClassLoader::RunMain+0x24c
0017f27c 5fc272e4 clr!Assembly::ExecuteMainMethod+0xc1
0017f760 5fc276d9 clr!SystemDomain::ExecuteMainMethod+0x4ec
0017f7b4 5fc275da clr!ExecuteEXE+0x58
...

Fortunately, the !clrstack command from the SOS debugger extension allows you to see the managed frames in the thread’s call stack.

0:000> !clrstack
OS Thread Id: 0xe48 (0)
Child SP IP       Call Site
0017eba8 779970b4 [InlinedCallFrame: 0017eba8]
0017eba4 5efc1c8b DomainNeutralILStubClass.IL_STUB_PInvoke(Microsoft.Win32.SafeHandles.
SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
0017eba8 5f637cc8 [InlinedCallFrame: 0017eba8] System.IO.__ConsoleStream.ReadFile(Microsoft.
Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
0017ec1c 5f637cc8 System.IO.__ConsoleStream.ReadFileNative(Microsoft.Win32.SafeHandles.
SafeFileHandle, Byte[], Int32, Int32, Int32, Int32 ByRef)
0017ec48 5f637f60 System.IO.__ConsoleStream.Read(Byte[], Int32, Int32)
0017ec68 5ef78bfb System.IO.StreamReader.ReadBuffer()
0017ec7c 5ef5560a System.IO.StreamReader.ReadLine()
0017ec9c 5f63e6f5 System.IO.TextReader+SyncTextReader.ReadLine()
0017ecac 5f52a7aa System.Console.ReadLine()
0017ecb4 0043009f Test.Main() [c:\book\code\chapter_03\HelloWorld\main.cs @ 9]
0017eee4 5fad21bb [GCFrame: 0017eee4]

The mscorlib_ni.dll module shown in the stack trace output of the k command is the NGEN image (“ni”) corresponding to the mscorlib.dll MSIL image. You can treat these modules just like their MSIL sources for the purpose of SOS debugging. In particular, you can set breakpoints at managed code functions from either MSIL or NGEN images by using the !bpmd SOS extension command.

For example, you can set a breakpoint at the WriteLine method that would be executed by the next line of source code. This .NET method is defined in the System.Console class of the mscorlib.dll .NET assembly (or in this case, its mscorlib_ni.dll NGEN version). The !bpmd command takes the target module name as its first argument (without the extension!) and the fully qualified name of the .NET method as its second argument, as shown in the following listing.

0:004> !bpmd mscorlib_ni System.Console.WriteLine
Found 19 methods in module 5ed01000...
MethodDesc = 5ed885a4
Setting breakpoint: bp 5EFAD4FC [System.Console.WriteLine()]
MethodDesc = 5ed885b0
Setting breakpoint: bp 5F52A770 [System.Console.WriteLine(Boolean)]
MethodDesc = 5ed885bc
...
Adding pending breakpoints...
0:004> g

This command adds breakpoints to all overloads of the WriteLine method (19 of them in the previous case). If you now press Enter in the active command prompt window from the target process, you’ll notice that the debugger hits your breakpoint next.

Breakpoint 13 hit
mscorlib_ni+0x2570ac:
5ef570ac 55              push    ebp

You can again use the !clrstack command to see the current stack trace at the time of this breakpoint. The -a option of this command also allows you to view the arguments to the managed frames on the stack.

0:000> !clrstack -a
OS Thread Id: 0x18e0 (0)
Child SP IP       Call Site
0018f260 5ef570ac System.Console.WriteLine(System.String)
    PARAMETERS:
        value (<CLR reg>) = 0x01fdb24c
0018f264 004600ab Test.Main()
*** WARNING: Unable to verify checksum for test.exe
 [c:\book\code\chapter_03\HelloWorld\main.cs @ 10]
0018f490 5fad21bb [GCFrame: 0018f490]

Notice how this command also displays the address of the .NET string object that was passed to the WriteLine method, which you can dump using the !do (“dump object”) SOS debugger extension command.

0:000> !do 0x01fdb24c
Name:        System.String
MethodTable: 5f01f92c
EEClass:     5ed58ba0
Size:        34(0x22) bytes
String:      Exiting...
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
5f0228f8  4000103        4         System.Int32  1 instance       10 m_stringLength
5f021d48  4000104        8          System.Char  1 instance       45 m_firstChar
5f01f92c  4000105        8        System.String  0   shared   static Empty
    >> Domain:Value  002a1270:01fd1228 <<
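The Size and Fields values in this listing can be cross-checked with a little arithmetic. The Python sketch below assumes the 32-bit string layout suggested by the field offsets above (a 4-byte object header, a 4-byte method table pointer at offset 0, the 4-byte m_stringLength at offset 4, then 2-byte characters starting at offset 8, followed by a 2-byte terminator). It also assumes that SOS prints System.Char values in hexadecimal. Both are assumptions made for illustration, not something the listing states.

```python
# Assumed x86 layout of a CLR 4 System.String (illustrative constants).
OBJECT_HEADER = 4      # header word that precedes the object
METHOD_TABLE_PTR = 4   # offset 0: pointer to the MethodTable
LENGTH_FIELD = 4       # offset 4: m_stringLength
CHAR_SIZE = 2          # each character is a 2-byte UTF-16 code unit
TERMINATOR = 2         # trailing null character

def string_object_size(text):
    """Total object size in bytes under the assumed layout."""
    return (OBJECT_HEADER + METHOD_TABLE_PTR + LENGTH_FIELD
            + CHAR_SIZE * len(text) + TERMINATOR)

def first_char_from_sos(hex_value):
    """Decode the m_firstChar value, assuming it is printed in hex."""
    return chr(int(hex_value, 16))

print(len("Exiting..."))                 # 10
print(string_object_size("Exiting..."))  # 34
print(first_char_from_sos("45"))         # E
```

Under these assumptions, the numbers agree with the listing: 10 characters, 34 (0x22) bytes, and a first character of E.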

Notice that the !clrstack command doesn’t display the unmanaged functions on the call stack. It’s usually easy, however, to see where the managed calls fit in the overall stack trace by combining the output of !clrstack with that of the regular k back-trace command, which together should give you everything you need to know about what code the current thread is executing. Note that SOS also has a !dumpstack command that attempts to do this merge, but its output can be rather noisy.
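One way to do this combination by hand is to match code addresses between the two outputs. The Python sketch below is a rough illustration, not an SOS feature: parse_k and annotate are hypothetical helpers, the sample frames are copied from the earlier k and !clrstack listings, and two managed method signatures are abbreviated with (...). It assumes that the return address printed on one k line is the code address of the frame printed on the next line, which is consistent with those listings.

```python
def parse_k(lines):
    """Return (child_ebp, return_address, symbol) tuples from k output lines."""
    frames = []
    for line in lines:
        child_ebp, ret_addr, symbol = line.split(None, 2)
        frames.append((int(child_ebp, 16), int(ret_addr, 16), symbol))
    return frames

def annotate(k_frames, managed_names):
    """Swap in a managed name wherever a frame's code address is known.

    A frame's code address is taken to be the return address recorded
    by the frame listed just above it.
    """
    result = []
    previous_ret = None
    for _, ret_addr, symbol in k_frames:
        result.append(managed_names.get(previous_ret, symbol))
        previous_ret = ret_addr
    return result

K_OUTPUT = [
    "0017eb88 5efc1c8b KERNEL32!ReadFileImplementation+0x75",
    "0017ec08 5f637cc8 mscorlib_ni+0x2c1c8b",
    "0017ec30 5f637f60 mscorlib_ni+0x937cc8",
    "0017ec58 5ef78bfb mscorlib_ni+0x937f60",
]

# IP -> call site, taken from the !clrstack listing (signatures abbreviated).
MANAGED = {
    0x5EFC1C8B: "DomainNeutralILStubClass.IL_STUB_PInvoke(...)",
    0x5F637CC8: "System.IO.__ConsoleStream.ReadFileNative(...)",
    0x5F637F60: "System.IO.__ConsoleStream.Read(Byte[], Int32, Int32)",
}

for name in annotate(parse_k(K_OUTPUT), MANAGED):
    print(name)
```

The unmanaged frame keeps the symbol that k already resolved, while each mscorlib_ni frame picks up the managed name reported for the same address.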

The SOS extension also has several other useful commands that you can use to inspect .NET programs, including a variant of the u (“un-assemble”) command that’s also able to decode the addresses of managed function calls in addition to unmanaged addresses. For example, you could use this command to obtain the disassembly of the current function at the time of the breakpoint in the previous case (the WriteLine method).

0:000> !u .
preJIT generated code
System.Console.WriteLine(System.String)
Begin 5ef570ac, size 1a
>>> 5ef570ac 55              push    ebp
5ef570ad 8bec            mov     ebp,esp
5ef570af 56              push    esi
5ef570b0 8bf1            mov     esi,ecx
5ef570b2 e819000000      call    mscorlib_ni+0x2570d0 (5ef570d0) (System.Console.get_Out(),
mdToken: 060008fd)
...

Notice how the regular u command, by contrast, doesn’t display the friendly name of the function itself or of the call to get_Out (a managed method too) that’s made inside the same function.

0:000> u .
mscorlib_ni+0x2570ac:
5ef570ac 55              push    ebp
5ef570ad 8bec            mov     ebp,esp
5ef570af 56              push    esi
5ef570b0 8bf1            mov     esi,ecx
5ef570b2 e819000000      call    mscorlib_ni+0x2570d0 (5ef570d0)

If you would like to experiment with more SOS debugger commands, you can find a listing of those commands and a brief summary of what they do by using the !help command in the WinDbg debugger.

0:000> !help
-------------------------------------------------------------------------------
SOS is a debugger extension DLL designed to aid in the debugging of managed
programs. Functions are listed by category, then roughly in order of
importance. Shortcut names for popular functions are listed in parenthesis.
Type "!help <functionname>" for detailed info on that function.
...
0:000> $ Terminate this debugging session now...
0:000> q

Table 3-4 recaps the basic SOS commands introduced during this experiment.

Table 3-4 Basic SOS Extension Commands

Command: Purpose

  • !eeversion: Display the target CLR (execution engine) version.

  • !bpmd: Set a breakpoint using a managed .NET method.

  • !do (or !dumpobj): Dump the fields of a managed object.

  • !clrstack, !clrstack -a: Display the managed frames in the current thread’s call stack. The optional -a option also displays the arguments to the functions on the call stack. These values are the extension’s best guess, however, so they’re not always accurate.

  • !u: Display the disassembly of a managed function.

Although you can accomplish many critical debugging tasks using the SOS extension, the managed-code debugging experience in the Windows debuggers still leaves a lot to be desired. The Windows debuggers clearly are not your first choice when debugging the managed code you write yourself, but SOS can still be a good option, especially if you can’t get Visual Studio installed on the target machine or if you are debugging without source code, in which case you don’t lose much by using WinDbg anyway.