The Common Language Runtime – Overview of the Runtime Environment

 

At a high level, the CLR is simply an engine that takes in IL instructions, translates them into machine instructions, and executes them. This does not mean that the CLR is interpreting the instructions; it means only that the CLR forms an environment in which IL code can be executed. For this to work, the execution engine must provide a runtime environment that is both efficient and portable. Efficiency is key; if the code does not run quickly enough, all of the other features of the system become moot.

Portability is important because of the number of processors and devices on which the CLR is slated to run. For a long time, Microsoft and Intel seemed to be close partners. Microsoft more or less picked the Intel line of processors to run the software that the company produced. This allowed Microsoft to build and develop software without worrying about supporting multiple CPU architectures and instruction sets. The company didn't have to worry about shipping a Motorola 68XXX version of the software because it was not supported. Limiting the scope of processor support became a problem as Win16 gave way to Win32. (No APIs were called Win16, but this is the name I will give the APIs that existed before Win32.) Building software that took advantage of the features of a 32-bit CPU yet remained somewhat backward compatible with the older Win16 APIs proved to be a major undertaking. With Win64 on the horizon, Microsoft must realize that it cannot continue to "port" all of its software with each new CPU that is released if it wants to stay alive as a company. Microsoft is trying to penetrate the mobile phone, hand-held, and tablet markets, which are powered by a myriad of different processors and architectures. Too much software is produced at Microsoft for it to continue to produce a CPU-bound version of each product.

The answer to the problem of base address and data size (Win32 versus Win64) and to the problem of providing general portability to other processors came in the form of the runtime environment, or the Common Language Runtime. Without going into the details of the specific instructions that the CLR supports (this is done in Chapter 5, "Intermediate Language Basics"), this chapter details the architecture of the runtime that goes into making a managed application run.

Introduction to the Runtime

Before .NET, an executable (usually a file with an .exe suffix) was the application. In other words, the application was contained within one file. To make the overall system run more efficiently, the application would elect to use code that was shared (usually a file with a .dll suffix). If the program elected to use shared code, you could either use an import library (a file that points function references to the DLL that is associated with the import library), or you could load the DLL explicitly at runtime (using LoadLibrary, LoadLibraryEx, and GetProcAddress). With .NET, the unit of execution and deployment is the assembly. Execution usually begins with an assembly that has an .exe suffix. The application can use shared code by importing the assembly that contains the shared code with an explicit reference. (You can add the reference via the "Add References" node in Visual Studio .NET or include it via the command-line switch /r.) The application can also explicitly load an assembly with Assembly.Load or Assembly.LoadFrom.
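As a minimal sketch of the explicit-load path, the following C# fragment loads an assembly by name with Assembly.Load. It is written with top-level statements, which are a much later C# feature than the tools described in this chapter. Loading the core library is purely an illustrative assumption, chosen because that assembly is always present; Assembly.LoadFrom would take a file path instead.

```csharp
using System;
using System.Reflection;

// Assumption for illustration: reuse the identity of the assembly that
// defines System.Object, because it is guaranteed to be available.
AssemblyName coreName = typeof(object).Assembly.GetName();

// Assembly.Load resolves an assembly from its name rather than a file path.
Assembly loaded = Assembly.Load(coreName);

// Once loaded, the assembly's metadata can be queried for types.
Type stringType = loaded.GetType("System.String");

Console.WriteLine(loaded.FullName);
Console.WriteLine(stringType == typeof(string));
```

Loading by name lets the runtime apply its normal probing rules, whereas a path-based load ties the application to a file location.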


Note

Before going further, you need to learn definitions of some of the terms:

  • Assembly—The assembly is the primary unit of deployment within the .NET Framework. Within the base class libraries is a class, appropriately named Assembly, that encapsulates a physical assembly. When this book refers to the class or an instance of the class, it will be denoted as Assembly. This class exists in the System.Reflection namespace. An assembly can contain references to other assemblies and modules. Chapter 4, "The Assembly," contains more detailed information about assemblies.

  • Module—A module is a single file that contains executable content. An assembly can encapsulate one or more modules; a module does not stand alone without an assembly referring to it. As with the assembly, a class called Module exists in the base class library that encapsulates most of the features of a module. When this book refers to Module, it is referring to the class in the base class library. This class exists in the System.Reflection namespace.

  • AppDomain—An application domain has been referred to as a lightweight process. Before .NET, isolation was achieved through separate processes, with assistance from the OS and the supporting hardware. If one process ran amok, it would not bring down the whole system, just that process. Because types are so tightly controlled within the .NET Framework, it is possible to have a mechanism whereby this same level of isolation can occur within a process. This mechanism is called the application domain, or AppDomain. As with modules and assemblies, a class called AppDomain in the base class library encapsulates many of the features and functionality of an application domain. This class exists in the System namespace. When this book refers to the class, it will be called AppDomain.

  • IL or MSIL—IL stands for Intermediate Language, and MSIL stands for Microsoft Intermediate Language. IL is the language in which assemblies are written. It is a set of instructions that represent the code of the application. It is intermediate because it is not turned into native code until needed. When the code that describes a method is required to run, it is compiled into native code with the JIT compiler. Chapter 5 contains information about individual IL instructions.

  • JIT—JIT stands for Just-In-Time. This term refers to the compiler that is run against IL code on an as-needed basis.


After the code is "loaded," execution of the code can begin. This is where the old (pre-.NET) and the new (.NET) start to diverge significantly. In the case of unmanaged code, the compiler and linker have already turned the source into native instructions, so those instructions can begin to execute immediately. Of course, this means that you will have to compile a separate version of the code for every different native environment. In some cases, because it is undesirable to ship and maintain a separate version for every possible native environment, only a compatible version is compiled and shipped. This leads to a lowest common denominator approach as companies want to ship software that can be run on as wide a range of environments as possible. Currently, few companies ship programs that target environments that have an accelerated graphics engine. Not only would the manufacturer need to ship a different program for each graphics accelerator card, but a different program also would need to be developed for those cases where a graphics accelerator was lacking. Other examples of hardware environments in which specific optimizations could be taken advantage of would be disk cache, memory cache, high-speed networks, multiple CPUs, specialized hardware for processing images, accelerated math functions, and so forth. In numerous other examples, compiling a program ahead of time results in either a highly optimized yet very specific program or an unoptimized and general program.

One of the first steps that the CLR takes in running a program is checking the method that is about to be run to see whether it has been turned into native code. If the method has not been turned into native code, then the code in the method is Just-In-Time compiled (JITd). Delaying the compilation of a method yields two immediate benefits. First, it is possible for a company to ship one version of the software and have the CLR on the CPU where the program is installed take care of the specific optimizations that are appropriate for the hardware environment. Second, it is possible for the JIT compiler to take advantage of specific optimizations that allow the program to run more quickly than a general-purpose, unmanaged version of the program. Systems built with a 64-bit processor will have a "compatibility" mode that allows 32-bit programs to run unmodified on the 64-bit CPU. This compatibility mode will not result in the most efficient or fastest possible throughput, however. If an application is compiled into IL, it can take advantage of the 64-bit processing as long as a JIT engine can target the new 64-bit processor.

The process of loading a method and compiling it if necessary is repeated until either all of the methods in the application have been compiled or the application terminates. The rest of this chapter explores the environment in which the CLR encloses each class method.

Starting a Method

The CLR requires the following information about each method. All of this data is available to the CLR through metadata in each assembly.

  • Instructions—The CLR requires a list of MSIL instructions. As you will see in the next chapter, each method has a pointer to the instruction set as part of the metadata that is associated with it.

  • Signature—Each method has a signature, and the CLR requires that a signature be available for each method. The signature describes the calling convention, return type, parameter count, and parameter types.

  • Exception Handling Array—No specific IL instructions mark out exception handlers. There are directives, but no IL instructions. Instead of exception-handling instructions, the assembly encloses a list of exception entries. Each entry contains the type of the handler, the offset of the first instruction of the try block, and the length of the try block. It also includes the offset to the handler code, the length of the handler code, and a token describing the class that is used to encapsulate the exception.

  • The size of the evaluation stack—This data is available through the metadata of the assembly, and you will typically see it as .maxstack x in ILDASM listings, where x is the size of the evaluation stack. This logical size x represents the maximum number of items that will need to be pushed onto the stack. The physical size of the items and the stack is left up to the CLR to determine at runtime when the method is JITd.

  • A description of the locals array—Every method needs to declare up front the number of items of local storage that the method requires. Like the evaluation stack, this is a logical array of items, although each item's type is also declared in the array. In addition, a flag is stored in the metadata to indicate whether the local variables should be initialized to zero at the beginning of the method call.
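Most of the items in this list can be inspected through reflection. The following C# fragment is a sketch only: it assumes a runtime that keeps IL available (GetMethodBody can return null otherwise), and the lambda addTwice is a hypothetical method invented for the illustration. The exact numbers printed depend on the compiler and its optimization settings.

```csharp
using System;
using System.Reflection;

// Hypothetical method to inspect; any method that has an IL body would do.
Func<int, int, int> addTwice = (a, b) =>
{
    int sum = a + b;
    return sum + sum;
};

MethodInfo info = addTwice.Method;
var body = info.GetMethodBody();   // null when no IL body is available

// Signature: return type and parameter types.
Console.WriteLine($"returns {info.ReturnType}, takes {info.GetParameters().Length} parameter(s)");

if (body != null)
{
    // Instructions: the raw IL bytes of the method.
    Console.WriteLine($"IL size:     {body.GetILAsByteArray()?.Length} bytes");
    // Evaluation stack size, as shown by .maxstack in ILDASM.
    Console.WriteLine($".maxstack:   {body.MaxStackSize}");
    // Locals array and the zero-initialization flag.
    Console.WriteLine($"init locals: {body.InitLocals}");
    foreach (LocalVariableInfo local in body.LocalVariables)
        Console.WriteLine($"  local {local.LocalIndex}: {local.LocalType}");
    // Exception handling array (empty for this method).
    Console.WriteLine($"EH rows:     {body.ExceptionHandlingClauses.Count}");
}
```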

With this information, the CLR is able to form an abstraction of what normally would be the native stack frame. Typically, each CPU or machine forms a stack frame that contains the arguments (parameters) or references to arguments to the method. Similarly, the return variables are placed on the stack frame based on calling conventions that are specific to a particular CPU or machine. The order of both the input and output parameters, as well as the way that the number of parameters is specified, is specific to a particular machine. Because all of the required information is available for each method, the CLR can make the determination at runtime of what the stack frame should look like.

The call to the method is made in such a way as to allow the CLR to have marginal control of the execution of the method and its state. When the CLR calls or invokes a method, the method and its state are put under the control of the CLR in what is known as the Thread of Control.

IL Supported Types

At the IL level, a simple set of types is supported. These types can be directly manipulated with IL instructions:

  • int8—8-bit 2's complement signed value.

  • unsigned int8 (byte)—8-bit unsigned binary value.

  • int16 (short)—16-bit 2's complement signed value.

  • unsigned int16 (ushort)—16-bit unsigned binary value.

  • int32 (int)—32-bit 2's complement signed value.

  • unsigned int32 (uint)—32-bit unsigned binary value.

  • int64 (long)—64-bit 2's complement signed value.

  • unsigned int64 (ulong)—64-bit unsigned binary value.

  • float32 (float)—32-bit IEC 60559:1989 floating point value.

  • float64 (double)—64-bit IEC 60559:1989 floating point value.

  • native int—Native size 2's complement signed value.

  • native unsigned int—Native size unsigned binary value.

  • F—Native size floating point variable. This variable is internal to the CLR and is not visible to the user.

  • O—Native size object reference to managed memory.

  • &—Native size managed pointer.

These are the types that can be represented in memory, but some restrictions exist in processing these data items. As discussed in the next section, the CLR processes these items on an evaluation stack that is part of the state data for each method. The evaluation stack can represent an item of any size, but the only operations that are allowed on user-defined value types are copying to and from memory and computing the addresses of user-defined value types. All operations that involve floating point values use an internal representation of the floating point value that is implementation specific (an F value).

The other data types (other than the floating point value just discussed) that have a native size are native int, native unsigned int, native size object reference (O), and native size managed pointer (&). These data types are a mechanism for the CLR to defer the choice of the value size. For example, this mechanism allows for a native int to be 64 bits on an IA64 processor and 32 bits on a Pentium processor.
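The deferred size choice is visible from C#, where IntPtr corresponds to native int. This short sketch simply reports what the runtime chose for the current process; the variable name nativeBytes is only for illustration.

```csharp
using System;

// IntPtr is the C# view of IL's native int. Its size is fixed by the runtime
// when the code executes, not by the compiler that produced the IL.
int nativeBytes = IntPtr.Size;

Console.WriteLine($"native int is {nativeBytes * 8} bits in this process");
Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}");
```

The same compiled assembly can report different sizes when run on different processors, which is exactly the portability the native size types are meant to provide.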

Two of these native size types might seem similar: the O (native size object reference) and the & (native size managed pointer). An O typed variable points to a managed object, but its use is restricted to instructions that explicitly indicate an operation on a managed type or to instructions whose metadata indicates that managed object references are allowed. The O type is said to point "outside" the object or to the object as a whole. The & type is also a reference to a managed object, but it can be used to refer to a field of an object or an element of an array. Both O and & types are tracked by the CLR and can change based on the results of a garbage collection.

One particular use of the native size type is for unmanaged pointers. Although unmanaged pointers can be strongly typed with metadata, they are represented as native unsigned int in the IL code. This gives the CLR the flexibility to assign an unmanaged pointer to a larger address space on a processor that supports it without unnecessarily tying up memory in storing these values on processors that do not have the capability to address such a large address space.

Some IL instructions require that an address be on the stack, such as the following (each line shows the stack before → after the instruction):

  • calli: …, arg1, arg2 … argn, ftn → …, retVal

  • cpblk: …, destaddr, srcaddr, size → …

  • initblk: …, addr, value, size → …

  • ldind.*: …, addr → …, value

  • stind.*: …, addr, val → …

Using a native type guarantees that the operations that involve that type are portable. If the address is specified as a 64-bit integer, then it can be portable if appropriate steps are taken to ensure that the value is converted appropriately to an address. If an address is specified as 32 bits or smaller, the code is never portable even though it might work for most 32-bit machines. For most cases, this is an IL generator or compiler issue and you should not need to worry about it. You should, however, be aware that you can make code non-portable by improperly using these instructions.

Short numeric values (those values less than 4 bytes) are widened to 4 bytes when loaded (copied from memory to the stack) and narrowed when stored (copied from stack to memory). Any operation that involves a short numeric value is really handled as a 4-byte operation. Specific IL instructions deal with short numeric types:

  • Load and store operations to/from memory—ldelem, ldind, stind, and stelem

  • Data conversion—conv, conv.ovf

  • Array creation—newarr
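C# arithmetic follows the same widening rule, so the effect can be observed without reading IL. The sketch below adds two byte values; the variable names are only for illustration.

```csharp
using System;

byte a = 200;
byte b = 100;

// Both operands are widened to 32 bits before the addition, so the result
// is an int and does not wrap around at 255.
var wide = a + b;

// Narrowing happens only when the value is converted back to a byte.
byte narrow = unchecked((byte)(a + b));

Console.WriteLine(wide.GetType());  // System.Int32
Console.WriteLine(wide);            // 300
Console.WriteLine(narrow);          // 44 (300 modulo 256)
```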

Strictly speaking, IL only supports signed operations. The difference between signed and unsigned operations is how the value is interpreted. For operations in which it would matter how the value is interpreted, the operation has both a signed and an unsigned version. For example, both cgt and cgt.un test whether one value is greater than another, but cgt interprets the values as signed, whereas cgt.un interprets them as unsigned.
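The following sketch shows the same 32 bits compared under both interpretations. It is an assumption about typical compiler behavior, not something this chapter states, that a C# compiler lowers the first comparison to a signed form such as cgt and the second to an unsigned form such as cgt.un.

```csharp
using System;

int signedValue = -1;
// The same 32 bits (0xFFFFFFFF), now interpreted as an unsigned value.
uint unsignedValue = unchecked((uint)signedValue);

bool signedGreater = signedValue > 1;        // signed comparison
bool unsignedGreater = unsignedValue > 1u;   // unsigned comparison

Console.WriteLine(signedGreater);    // False: -1 is less than 1
Console.WriteLine(unsignedGreater);  // True: 4294967295 is greater than 1
```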

Homes for Values

To track objects, the CLR introduces the concept of a home for an object. An object's home is where the value of the object is stored. The home of an object must have a mechanism in place for the JIT engine to determine the type of the object. When an object is passed by reference, it must have a home because the address of the home is passed as a reference. Two types of data are "homeless" and cannot be passed by reference: constants and intermediate values on the evaluation stack from IL instructions or return values from methods. The CLR supports the following homes for objects:

  • Incoming argument—The ldarg and ldarga instructions access an argument home; ldarga determines its address. The method signature determines the type.

  • Local variable—The ldloc and ldloca instructions access a local variable home; ldloca determines its address. The local variable declarations in the metadata determine the type of the local variable.

  • Field (instance or static)—The use of ldflda for an instance field and ldsflda for a static field determines the address of a field. The metadata that is associated with the class, interface, or module determines the type of the field.

  • Array element—The use of ldelema determines the address of an array element. The element type of the array determines the type of the element.
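The C# ref keyword makes the home requirement easy to see. In this sketch the helper Increment is hypothetical; a local variable and an array element can both be passed by reference, whereas the commented-out calls are rejected by the compiler because a constant or an intermediate value has no home.

```csharp
using System;

int local = 1;                 // home: local variable
int[] array = { 10, 20, 30 };  // home: array elements

Increment(ref local);      // the address of the local is passed
Increment(ref array[1]);   // the address of the array element is passed

// Increment(ref 5);          // does not compile: a constant has no home
// Increment(ref local + 1);  // does not compile: an intermediate value has no home

Console.WriteLine(local);     // 2
Console.WriteLine(array[1]);  // 21

// Hypothetical helper that modifies the caller's value through its home.
static void Increment(ref int value) => value++;
```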

The Runtime Thread of Control

The CLR Thread of Control does not necessarily correspond with the native OS threads. The base class library class System.Threading.Thread provides the logical encapsulation of a thread of control.


Note

For more information on threading, see Chapter 11, "Threading."


Each time a method is called, the normal procedure of checking whether the method has been JITd must take place.


Figure 3.1 shows a loose representation of what the CLR state looks like. This is loose in that it shows a simple link from one method to the other. This representation does not correctly portray situations that involve control flow that is exceptional, such as with jump instructions, exceptions, and tail calls.

Figure 3.1 Machine state under the CLR.

The managed heap referenced in this diagram refers to the memory that the CLR manages. Details about the managed heaps and specifically garbage collection can be found in Chapter 10, "Memory/Resource Management." Each time a method is invoked, a method state is created. The method state includes the following:

  • Instruction pointer—This points to the next IL instruction that the current method is to execute.

  • Evaluation stack—This is the stack that the .maxstack directive specifies. The compiler determines at compile time how many slots are required on the stack.

  • Local variable array—This is the same array that is declared and perhaps initialized in the metadata. Every time these variables are accessed from the IL, they are accessed as an index to this array. In the IL code, you see references to this array via instructions like the following: ldloc.0 ("loads" or pushes local array variable 0 onto the stack) or stloc.1 ("stores" the value on the stack in local variable 1).

  • Argument array—This is an array of arguments that is passed to the method. The arguments are manipulated with IL instructions such as ldarg.0 ("loads" argument zero onto the stack) or starg.1 ("stores" the value on the stack to argument 1).

  • MethodInfo handle—This composite of information is available in the assembly metadata. This handle points to information about the signature of the method (types of arguments, numbers of arguments, and return types), types of local variables, and exception information.

  • Local memory pool—IL has instructions that can allocate memory that is addressable by the current method (localloc). When the method returns, the memory is reclaimed.

  • Return state handle—This is a handle used to return the state after the method returns.

  • Security descriptor—The CLR uses this descriptor to record security overrides either programmatically or with custom attributes.

The evaluation stack does not directly equate to a physical representation. The physical representation of the evaluation stack is left up to the CLR and the CPU for which the CLR is targeted. Logically, the evaluation stack is made up of slots that can hold any data type. The size of the evaluation stack cannot be indeterminate. For example, code that causes a variable to be pushed onto the stack an infinite or indeterminate number of times is disallowed.

Instructions that involve operations on the evaluation stack are not typed. For example, an add instruction adds two numbers, and a mul instruction multiplies two numbers. The CLR tracks data types and uses them when the method is JITd.
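To watch the evaluation stack at work, IL can be emitted at runtime with System.Reflection.Emit. This is a sketch under the assumption that the runtime permits dynamic code generation; the method name "Add" is arbitrary. Note that the add opcode itself carries no type, and the same opcode would serve for 64-bit or floating point operands.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical dynamic method with the signature int Add(int, int).
var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
ILGenerator il = method.GetILGenerator();

il.Emit(OpCodes.Ldarg_0);  // push argument 0        stack: a
il.Emit(OpCodes.Ldarg_1);  // push argument 1        stack: a, b
il.Emit(OpCodes.Add);      // pop two, push the sum  stack: a + b
il.Emit(OpCodes.Ret);      // return the value on top of the stack

// The IL is JIT compiled when the delegate is created or first invoked.
var add = (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));

Console.WriteLine(add(2, 3));  // 5
```

At no point does this method need more than two stack slots, which is the figure a .maxstack directive would record.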

Method Flow Control

The CLR provides support for a rich set of flow control instructions:

  • Conditional or unconditional branch—Control can be transferred anywhere within a method as long as the transfer does not cross a protected region boundary. A protected region is defined in the metadata as a region that is associated with an exception handler. In C#, this region is known as a try block, and the associated catch block is known as a handler. The CLR supports the execution of many different kinds of exception handlers to be detailed later. The important point here is that a conditional or unconditional branch cannot specify a destination that crosses an exception boundary. In the case of C#, you cannot branch into or out of a try or a catch block. This is not a limitation of the C# language; rather, it is a restriction of the IL code for which C# acts as a code generator.

  • Method call—Several instructions allow methods to call other methods, thus creating other method states, as explained earlier.

  • Tail call—This is a special prefix that immediately precedes a method call. It instructs the calling method to discard its stack frame before calling the method. This causes the called method to return to the point at which the calling method would have returned.

  • Return—This is a simple return from a method.

  • Method jump—This is an optimization of the tail call that transfers the arguments and control of a method to another method with the same signature, essentially "deleting" the current method. The following snippet shows a simple jump:


    // Function A
    .method static public void A()
    {
        // Output from A
        ret
    }
    // Function B
    .method static public void B()
    {
        jmp void A()
        // Output from B
        ret
    }

    The instructions represented by the comment Output from B will never be executed because the return from B is replaced by a return from A.

  • Exception—This includes a set of instructions that generates an exception and transfers control out of a protected region.

The CLR enforces several rules when control is transferred within a method. First, control cannot be transferred to within an exception handler (catch, finally, and so on) except as the result of an exception. This restriction reinforces the rule that the destination of a branch cannot cross a protected region. Second, after you are in a handler for a protected region, it is illegal to transfer out of that handler by any means other than the restricted set of exception instructions (leave, endfinally, endfilter). Again, you will notice that if you try to return from a method from within a finally block in C#, the compiler generates an error. This is not a C# limitation, but a restriction that is placed on the IL code. Third, each slot in the evaluation stack must maintain its type throughout the lifetime of the evaluation stack (hence the lifetime of the method). In other words, you cannot change the type of a slot (variable) on the evaluation stack. This is typically not a problem because the evaluation stack is not accessible to the user anyway. Finally, control is not allowed to simply "fall through." All paths of execution must terminate in a return (ret), a method jump (jmp), a tail call (tail.), or a thrown exception (throw).

Method Call

The CLR can call methods in three different ways. These three kinds of call differ only in the way that the call site descriptor is specified. A call site descriptor gives the CLR and the JIT engine enough information about the method call so that a native method call can be generated, the appropriate arguments can be made accessible to the method, and provision can be made for the return value if one exists.

The calli instruction is the simplest of the method calls. This instruction is used when the destination address is computed at runtime. The instruction takes an additional function pointer argument, which is expected on the stack along with the arguments of the call. This function pointer is computed with either the ldftn or ldvirtftn instruction. The call site is specified in the StandAloneSig table of the metadata (see Chapter 4).

The call instruction is used when the address of the function is known at compile time, such as with a static method. The call site descriptor is derived from the MethodDef or MethodRef token that is part of the instruction. (See Chapter 4 for a description of these two tables.)

The callvirt instruction calls a method on a particular instance of an object. The instruction includes a MethodDef or MethodRef token, as with the call instruction, but the callvirt instruction takes an additional argument, which refers to the particular instance on which this method is to be called.

Method Calling Convention

The CLR uses a single calling convention throughout all IL code. If the method being called is a method on an instance, a reference to the object instance is pushed on the stack first, followed by each of the arguments to the method in left-to-right order. The result is that the called method receives the this pointer first, followed by each of the arguments starting with argument zero and proceeding to argument n. If the method call is to a static method, then no associated instance pointer exists and the stack contains only the arguments. For the calli instruction, the arguments are pushed on the stack in left-to-right order, followed by the function pointer, which is pushed on the stack last. The CLR and the JIT must translate this to the most efficient native calling convention.
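The push order for an instance call can be sketched by emitting the IL directly. The example assumes a runtime that allows dynamic code generation, and the wrapper name "Slice" is hypothetical; it forwards its three arguments to String.Substring(int, int) with a callvirt.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

MethodInfo substring =
    typeof(string).GetMethod("Substring", new[] { typeof(int), typeof(int) })
    ?? throw new InvalidOperationException("String.Substring(int, int) not found");

// Hypothetical wrapper: string Slice(string text, int start, int length).
var method = new DynamicMethod("Slice", typeof(string),
    new[] { typeof(string), typeof(int), typeof(int) });
ILGenerator il = method.GetILGenerator();

il.Emit(OpCodes.Ldarg_0);              // the instance reference is pushed first
il.Emit(OpCodes.Ldarg_1);              // then the first argument (start index)
il.Emit(OpCodes.Ldarg_2);              // then the second argument (length)
il.Emit(OpCodes.Callvirt, substring);  // call the method on that instance
il.Emit(OpCodes.Ret);                  // the result is left on the stack

var slice = (Func<string, int, int, string>)
    method.CreateDelegate(typeof(Func<string, int, int, string>));

Console.WriteLine(slice("hello", 1, 3));  // ell
```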

Method Parameter Passing

The CLR supports three types of parameter-passing mechanisms:

  • By value—The value of the object is placed on the stack. For built-in types such as integers, floats, and so on, this simply means that the value is pushed onto the stack. For objects, an O type reference to the object is placed on the stack. For managed and unmanaged pointers, the address is placed on the stack. For user-defined value types, you can place a value on the evaluation stack that precedes a method call in two ways. First, the value can be directly put on the stack with ldarg, ldloc, ldfld, or ldsfld. Second, the address of the value can be computed and the value can be loaded onto the stack with the ldobj instruction.

  • By reference—Using this convention, the address of the parameter is passed to the method rather than the value. This allows a method to potentially modify such a parameter. Only values that have homes can be passed by reference because it is the address of the home that is passed. For code to be verifiable (type safety that can be verified), parameters that are passed by reference should only be read and written via the ldind.* and stind.* instructions, respectively.

  • Typed reference—A typed reference is similar to a "normal" by reference parameter with the addition of a static data type that is passed along with the data reference. This allows IL to support languages such as VB that can have methods that are not statically restricted to the types of data that they can accept, yet require an unboxed, by reference value. To call such a method, one would either copy an existing typed reference or use the mkrefany instruction to create a typed reference. Using this typed reference, the address is computed using the refanyval instruction. A typed reference parameter must refer to data that has a home.
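The difference between the first two mechanisms can be sketched in C# with a value type. Here a tuple stands in for a user-defined value type, and the two helper methods are hypothetical. Typed references are not shown.

```csharp
using System;

// A value tuple is a value type, so it is copied when passed by value.
(int X, int Y) point = (1, 2);

MoveByValue(point);          // the callee works on its own copy
Console.WriteLine(point.X);  // 1

MoveByReference(ref point);  // the callee works on the caller's home
Console.WriteLine(point.X);  // 11

// Hypothetical helpers: the same body, but different passing mechanisms.
static void MoveByValue((int X, int Y) p) => p.X += 10;
static void MoveByReference(ref (int X, int Y) p) => p.X += 10;
```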

Exception Handling

The CLR supports exceptional conditions or error handling by using exception objects and protected blocks of code. A C# try block is an example of a protected block of code. The CLR supports four different kinds of exception handlers:

  • Finally—This block will be executed when the method exits no matter how the method exits, whether by normal control (either implicitly or by an explicit ret) or by unhandled exception.

  • Fault—This block will be executed if an exception occurs, but not if the method normally exits.

  • Type-filtered—This block of code will be executed when a match is detected between the type of the exception for this block and the exception that is thrown. This corresponds to the C# catch block.

  • User-filtered—The determination whether this block should handle the exception is made as the result of a set of IL instructions that can specify that the exception should be ignored, that this handler should handle the exception, or that the exception should be handled by the next exception handler. For the reader who is familiar with Structured Exception Handling (SEH), this is much like the __except handler.

Not every language that generates compliant IL code necessarily supports all of the types of exception handling. For instance, C# as originally released does not support user-filtered exception handlers, whereas VB does.

When an exception occurs, the CLR searches the exception-handling array that is part of the metadata with each method. This array defines a protected region and a block of code that is to handle a specific exception. The exception-handling array specifies an offset from the beginning of the method and a size of the block of code. Each row in the exception-handling array specifies a protected region (offset and size), the type of handler (from the four types of exception handlers listed in the previous paragraph), and the handler block (offset and size). In addition, a type-filtered exception handler row contains information regarding the exception type for which this handler is targeted. The user-filtered exception handler contains a label that starts a block of code to be executed to determine at runtime whether the handler block should be executed, in addition to the specification of the handler region. Listing 3.1 shows some C# pseudo-code for handling an exception.

Listing 3.1 C# Exception-Handling Pseudo-Code

try
{
// Protected block
. . .
}
catch(ExceptionOne e)
{
// Type-filtered handler
. . .
}
finally
{
// Finally handler
. . .
}

For the code in Listing 3.1, you would see two rows in the exception handler array: one for the type-filtered handler and one for the finally block. Both rows would refer to the same protected block of code, namely, the code in the try block.
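The rows for a method shaped like Listing 3.1 can be listed through reflection. This is a sketch under the assumption that the runtime exposes method bodies; the lambda parse is hypothetical. The offsets printed depend on the compiler, and a compiler may make the protected region of the finally row span the catch handler as well as the try block.

```csharp
using System;
using System.Reflection;

// Hypothetical method shaped like Listing 3.1: try, catch, finally.
Func<string, int> parse = text =>
{
    try { return int.Parse(text); }
    catch (FormatException) { return -1; }
    finally { Console.WriteLine("finally ran"); }
};

var body = parse.Method.GetMethodBody();   // null when no IL body is available
if (body != null)
{
    foreach (ExceptionHandlingClause clause in body.ExceptionHandlingClauses)
    {
        // Each row: handler kind, protected region, and handler block.
        Console.WriteLine(
            $"{clause.Flags}: try at {clause.TryOffset} (length {clause.TryLength}), " +
            $"handler at {clause.HandlerOffset} (length {clause.HandlerLength})");

        // CatchType is meaningful only for type-filtered rows.
        if (clause.Flags == ExceptionHandlingClauseOptions.Clause)
            Console.WriteLine($"  catches {clause.CatchType}");
    }
}
```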

Listing 3.2 shows one more example of an exception-handling scheme, this time in Visual Basic.

Listing 3.2 VB Exception-Handling Pseudo-Code

Try
'Protected region of code
. . .
Catch e As ExceptionOne When i = 0
'User filtered exception handler
. . .
Catch e As ExceptionTwo
'Type filtered exception handler
. . .
Finally
'Finally handler
. . .
End Try

The pseudo-code in Listing 3.2 would result in three rows in theexception-handling array. The first Catch is a user-filtered exception handler,which would be turned into the first row in the exception-handling array. Thesecond Catch block is a type-exception handler, which is the same as thetyped-exception handler in the C# case. The third and last row in theexception-handling array would be the Finally handler.

When an exception is generated, the CLR looks for the first match in the exception-handling array. A match would mean that the exception was thrown while the managed code was in the protected block that was specified by the particular row. In addition, for a match to exist, the particular handler must "want" to handle the exception (the user filter is true; the type matches the exception type thrown; the code is leaving the method, as in finally; and so forth). The first row in the exception-handling array that the CLR matches becomes the exception handler to be executed. If an appropriate handler is not found for the current method, then the current method's caller is examined. This continues until either an acceptable handler is found, or the top of the stack is reached and the exception is declared unhandled.
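The search just described can be modeled in a few lines of C#. This is only a sketch of the simplified description above, not the CLR's actual implementation or its real metadata layout. The HandlerRow class, its fields, the HandlerKind enumeration, and FindHandler are all hypothetical names introduced for the illustration.

```csharp
using System;
using System.Collections.Generic;

enum HandlerKind { TypeFiltered, UserFiltered, Finally, Fault }

// One row of a simplified, hypothetical exception-handling array.
class HandlerRow
{
    public int TryOffset, TryLength;       // protected region (offset and size)
    public HandlerKind Kind;               // type of handler
    public int HandlerOffset;              // start of the handler block
    public Type CatchType;                 // used by TypeFiltered rows only
    public Func<Exception, bool> Filter;   // used by UserFiltered rows only
}

static class HandlerSearch
{
    // Returns the first row whose protected region contains 'ilOffset' and
    // whose handler "wants" the exception, or null if no row matches.
    public static HandlerRow FindHandler(IList<HandlerRow> rows, int ilOffset, Exception ex)
    {
        foreach (HandlerRow row in rows)
        {
            bool inRegion = ilOffset >= row.TryOffset
                         && ilOffset < row.TryOffset + row.TryLength;
            if (!inRegion) continue;

            switch (row.Kind)
            {
                case HandlerKind.TypeFiltered:
                    if (row.CatchType.IsInstanceOfType(ex)) return row;
                    break;
                case HandlerKind.UserFiltered:
                    if (row.Filter(ex)) return row;
                    break;
                default:
                    // In this model, finally and fault rows always accept.
                    return row;
            }
        }
        return null; // the caller's array would be examined next
    }

    static void Main()
    {
        var rows = new List<HandlerRow>
        {
            new HandlerRow { TryOffset = 0, TryLength = 10, Kind = HandlerKind.TypeFiltered,
                             HandlerOffset = 10, CatchType = typeof(ArgumentException) },
            new HandlerRow { TryOffset = 0, TryLength = 10, Kind = HandlerKind.Finally,
                             HandlerOffset = 20 },
        };
        HandlerRow hit = FindHandler(rows, 4, new InvalidOperationException());
        Console.WriteLine(hit == null ? "search the caller" : hit.Kind.ToString());
    }
}
```

The two rows in Main correspond to the shape of Listing 3.1: one type-filtered row and one finally row that share the same protected region.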

Exception Control Flow

Several rules govern the flow of control within protected regions and the associated handlers. These rules are enforced either by the compiler (the IL code generator) or by the CLR when the method is JIT-compiled. Remember that a protected region and the associated handler are overlaid on top of an existing block of IL code. You cannot determine the structure of an exception framework from the IL code alone; that structure is specified in the metadata. The CLR enforces a set of rules when transferring control to or from exception control blocks. These rules are as follows:

  • Control can only pass into an exception handler block through the exception mechanism.

  • There are two ways in which control can pass to a protected region (the try block). First, control can simply branch or fall into the first instruction of a protected region. Second, from within a type-filtered handler, a leave instruction can specify the offset to any instruction within a protected region (not necessarily the first instruction).

  • The evaluation stack on entering a protected region must be empty. This means that one cannot push values onto the evaluation stack prior to entering a protected region.

  • Once control is in a protected region or any of the associated handler blocks, exiting such a block is strictly controlled.

  • One can exit any of the exception blocks by throwing another exception.

  • From within a protected region or a handler block (but not a finally or fault block), a leave instruction may be executed. It is similar to an unconditional branch but has the side effect of emptying the evaluation stack, and its destination can be any instruction in a protected region.

  • A user-filtered handler block must be terminated by an endfilter instruction. This instruction takes a single argument from the evaluation stack to determine how exception handling should proceed.

  • A finally or fault block is terminated with an endfinally instruction. This instruction empties the evaluation stack and ends the enclosing finally or fault block; it does not by itself return from the method.

  • Control can pass outside of a type-filtered handler block by rethrowing the exception. This is just a specialized case of throwing an exception in which the exception thrown is simply the exception that is currently being handled.

  • None of the handler blocks or protected regions can execute a ret instruction to return from the enclosing method.

  • No local allocation can be done from within any of the exception handler blocks. Specifically, the localloc instruction is not allowed from any handler.
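A high-level language hides most of these rules. As a sketch of the rule that a protected region cannot execute ret, consider a C# return statement inside a try block. A C# compiler typically compiles it to a leave instruction that branches to a ret placed outside the protected region, so the finally handler runs before the method returns. The LeaveDemo class and its members are hypothetical names, and the comment about the generated IL describes typical compiler output rather than anything stated in this chapter.

```csharp
using System;

static class LeaveDemo
{
    public static string Log = "";

    public static int Compute()
    {
        try
        {
            Log += "try;";
            return 42;          // typically: store the result, then leave the region
        }
        finally
        {
            Log += "finally;";  // runs before the method actually returns
        }
    }

    static void Main()
    {
        int result = Compute();
        Console.WriteLine(result + " " + Log);
    }
}
```

Inspecting the compiled method with an IL disassembler is a reasonable way to confirm what a particular compiler version emits.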

Exception Types

The documentation indicates the exceptions that an individual instruction can generate, but in general, the CLR can generate the following exceptions as a result of executing specific IL instructions:

  • ArithmeticException

  • DivideByZeroException

  • ExecutionEngineException

  • InvalidAddressException

  • OverflowException

  • SecurityException

  • StackOverflowException

In addition, the following exceptions are generated as a result of object model inconsistencies and errors:

  • TypeLoadException

  • IndexOutOfRangeException

  • InvalidAddressException

  • InvalidCastException

  • MissingFieldException

  • MissingMethodException

  • NullReferenceException

  • OutOfMemoryException

  • SecurityException

  • StackOverflowException

The ExecutionEngineException can be thrown by any instruction, and it indicates that the CLR has detected an unexpected inconsistency. If the code has been verified, this exception will never be thrown.

Many exceptions are thrown because of a failed resolution. That is, a method was not found, or the method was found but it had the wrong signature, and so forth. The following is a list of exceptions that are considered to be resolution exceptions:

  • BadImageFormatException

  • EntryPointNotFoundException

  • MissingFieldException

  • MissingMemberException

  • MissingMethodException

  • NotSupportedException

  • TypeLoadException

  • TypeUnloadedException

A few of the exceptions might be thrown early, before the code that caused the exception is actually run. This is usually because an error was detected during the conversion of the IL code to native code (JIT compile time). The following exceptions might be thrown early:

  • MissingFieldException

  • MissingMethodException

  • SecurityException

  • TypeLoadException

Exceptions are covered in more detail in Chapter 15, "Using Managed Exceptions to Effectively Handle Errors."

Remote Execution

If it is determined that an object's identity cannot be shared, then a remoting boundary is put in place. A remoting boundary is implemented by the CLR using proxies. A proxy represents an object on one side of the remoting boundary, and all instance field and method references are forwarded to the other side of the remoting boundary. A proxy is automatically created for objects that derive from System.MarshalByRefObject.


Note

Remoting is covered in more detail in Chapter 13, "Building Distributed Applications with .NET Remoting."


The CLR has a mechanism that allows applications running within the same operating system process to be isolated from one another. This mechanism is known as the application domain. The AppDomain class in the base class library encapsulates the features of an application domain. A remoting boundary is required to effectively communicate between two isolated objects. Because each application domain is isolated from every other application domain, a remoting boundary is required to communicate between application domains.
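The following sketch shows a proxy being created across an application domain boundary. It assumes the .NET Framework, because creating additional application domains is not supported on .NET Core and later runtimes. Greeter, WhereAmI, and the domain name "Isolated" are hypothetical names chosen for the illustration.

```csharp
using System;
using System.Runtime.Remoting;

// Hypothetical type. Deriving from MarshalByRefObject means that a reference
// obtained from another application domain is a proxy, not a copy.
public class Greeter : MarshalByRefObject
{
    public string WhereAmI()
    {
        return AppDomain.CurrentDomain.FriendlyName;
    }
}

static class AppDomainDemo
{
    static void Main()
    {
        // Assumption: .NET Framework only.
        AppDomain other = AppDomain.CreateDomain("Isolated");
        try
        {
            var greeter = (Greeter)other.CreateInstanceAndUnwrap(
                typeof(Greeter).Assembly.FullName, typeof(Greeter).FullName);

            // Reports whether the reference held here is a proxy.
            Console.WriteLine(RemotingServices.IsTransparentProxy(greeter));

            // The call is forwarded across the remoting boundary.
            Console.WriteLine(greeter.WhereAmI());
        }
        finally
        {
            AppDomain.Unload(other);
        }
    }
}
```

The method call on greeter executes in the second application domain, so the name it reports is that of the domain that owns the object rather than the domain that made the call.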

Memory Access

All memory access from within the runtime environment must be properly aligned. This means that access to int16 or unsigned int16 values (short or ushort; 2-byte values) must occur on even boundaries. Access to int32, unsigned int32, and float32 (int, uint, and float; 4-byte values) must occur at an address that is evenly divisible by 4. Access to int64, unsigned int64, and float64 (long, ulong, and double; 8-byte values) must occur at an address that is evenly divisible by 4 or 8, depending on the architecture. Access to any of the native types (native int, native unsigned int, &) must occur on an address that is evenly divisible by 4 or 8, depending on the native environment.

A side effect of properly aligned data is that read and write access to a value that is no larger than a native int is guaranteed to be atomic. That is, the read or write operation is guaranteed to be indivisible.
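The divisibility rules above reduce to a simple check. The sketch below uses a hypothetical helper, IsAligned, and made-up addresses. It assumes the size passed in is a power of two, which holds for the 2-, 4-, and 8-byte values discussed here.

```csharp
using System;

static class Alignment
{
    // True when 'address' is evenly divisible by 'size'.
    // Assumes 'size' is a power of two.
    public static bool IsAligned(long address, int size)
    {
        return (address & (size - 1)) == 0;
    }

    static void Main()
    {
        Console.WriteLine(IsAligned(0x1002, sizeof(short)));  // 2-byte value on an even address
        Console.WriteLine(IsAligned(0x1002, sizeof(int)));    // 4-byte value, address not divisible by 4
        Console.WriteLine(IsAligned(0x1008, IntPtr.Size));    // native int is 4 or 8 bytes
    }
}
```

IntPtr.Size reports the size of a native int in the current process, which is why the last check depends on the native environment.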

Volatile Memory Access

Certain memory access IL instructions can be prefixed with the volatile prefix. Marking a memory access as volatile does not necessarily guarantee atomicity, but it does guarantee that every read of the variable actually reads from memory. A volatile write simply means that the write to memory is guaranteed to happen before any other access is given to the variable in memory.

The volatile prefix is meant to simulate a hardware CPU register. If this is kept in mind, volatile is easier to understand.
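In C#, the volatile keyword on a field causes the compiler to emit the volatile prefix on loads and stores of that field. The following sketch uses a volatile flag to stop a worker thread. VolatileDemo, stopRequested, iterations, and the 50-millisecond delay are hypothetical choices made for the illustration.

```csharp
using System;
using System.Threading;

static class VolatileDemo
{
    // Loads and stores of this field carry the volatile prefix.
    static volatile bool stopRequested;
    static int iterations;

    public static int RunWorker()
    {
        stopRequested = false;
        iterations = 0;

        var worker = new Thread(() =>
        {
            while (!stopRequested)   // the flag is read from memory on every pass
            {
                iterations++;
            }
        });

        worker.Start();
        Thread.Sleep(50);            // let the worker spin briefly
        stopRequested = true;        // volatile write seen by the worker
        worker.Join();               // wait until the worker has finished
        return iterations;
    }

    static void Main()
    {
        Console.WriteLine("Worker stopped after " + RunWorker() + " iterations");
    }
}
```

The iteration count varies from run to run. The point of the sketch is only that the loop terminates once the flag is written.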

CLR Threads and Locks

The CLR provides support for many different mechanisms to guarantee synchronized access to data. Thread synchronization is covered in more detail in Chapter 11. Some of the locks that are part of the CLR execution model are as follows:

  • Synchronized methods: The method locks that the CLR provides lock either on a particular instance (the this pointer) or, in the case of static methods, on the type in which the method is defined. Once held, a method lock allows access any number of times from the same thread (recursion, other method calls, and so forth); access to the lock from another thread will block until the lock is released.

  • Explicit locks: These locks are provided by the base class library.

  • Volatile reads and writes: As stated previously, marking access to a variable volatile does not guarantee atomicity except in the case where the size of the value is less than or equal to that of a native int and it is properly aligned.

  • Atomic operations: The base class library provides a number of atomic operations through the System.Threading.Interlocked class.
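Three of the mechanisms in the list can be sketched side by side in C#. Counters, LockDemo, the gate field, and the thread and iteration counts are hypothetical names and values. The lock statement stands in for the explicit locks of the base class library, since it is built on System.Threading.Monitor.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

class Counters
{
    private readonly object gate = new object();
    public int SyncCount, LockCount, AtomicCount;

    // Synchronized method: the lock is taken on 'this' for the whole call.
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void AddSynchronized() { SyncCount++; }

    // Explicit lock from the base class library, through the C# lock statement.
    public void AddLocked() { lock (gate) { LockCount++; } }

    // Atomic operation through System.Threading.Interlocked.
    public void AddAtomic() { Interlocked.Increment(ref AtomicCount); }
}

static class LockDemo
{
    // Runs 'threads' threads, each incrementing every counter 'perThread' times.
    public static Counters Run(int threads, int perThread)
    {
        var c = new Counters();
        var pool = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            pool[t] = new Thread(() =>
            {
                for (int i = 0; i < perThread; i++)
                {
                    c.AddSynchronized();
                    c.AddLocked();
                    c.AddAtomic();
                }
            });
            pool[t].Start();
        }
        foreach (Thread th in pool) th.Join();
        return c;
    }

    static void Main()
    {
        Counters c = Run(4, 10000);
        Console.WriteLine(c.SyncCount + " " + c.LockCount + " " + c.AtomicCount);
    }
}
```

Because every increment is protected by one of the three mechanisms, each counter should end at the number of threads multiplied by the number of iterations per thread.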

Summary

This chapter provided a brief overview of the framework under which managed code runs. If you keep in mind that at the lowest level, the CLR is an engine that allows the execution of IL instructions, you will have an easier time understanding both IL and how your code runs with the CLR.

This chapter detailed the rules for loading an assembly and starting execution of a method. It also supplied detailed information about control flow from within a method call. It explored in depth the built-in mechanisms for handling errors and exceptions from within this runtime environment. In addition, it discussed the runtime support for remoting that is built into the CLR. Finally, it revealed how the code that is running under the CLR accesses memory and synchronizes access to methods when multiple threads could potentially have access to the memory store.

From the book: .NET Common Language Runtime Unleashed
