By Kevin Burton
At a high level, the CLR is simply an engine that takes in IL instructions,
translates them into machine instructions, and executes them. This does not mean
that the CLR interprets the instructions; it means only that the CLR forms an
environment in which IL code can be executed. That runtime environment must be
both efficient and portable. Efficiency is key; if the code does not run
quickly enough, all of the other features of the system become moot.
Portability is important because of the number of processors and devices on
which the CLR is slated to run. For a long time, Microsoft and Intel seemed to
be close partners. Microsoft more or less picked the Intel line of processors to
run the software that the company produced. This allowed Microsoft to build and
develop software without worrying about supporting multiple CPU architectures
and instructions. The company didn't have to worry about shipping a
Motorola 68XXX version of the software because it was not supported. Limiting
the scope of processor support became a problem as Win16 gave way to Win32. (No
APIs were called Win16, but this is the name I will give the APIs that existed
before Win32.) Building software that took advantage of the features of a 32-bit
CPU yet remained somewhat backward compatible with the older Win16 APIs proved to
be a major undertaking. With Win64 on the horizon, Microsoft must realize that it
cannot continue to "port" all of its software with each new CPU that
is released if it wants to stay alive as a company. Microsoft is trying to
penetrate the mobile phone, hand-held, and tablet markets that are powered by a
myriad of different processors and architectures. Too much software is produced
at Microsoft for it to continue to produce a CPU-bound version.
The answer to the problem of base address and data size (Win32 versus Win64)
and to the problem of providing general portability to other processors came in
the form of the runtime environment, or the Common Language Runtime. Without
going into the details of the specific instructions that the CLR supports (this
is done in Chapter 5, "Intermediate Language Basics"), this chapter
details the architecture of the runtime that goes into making a managed
application run.
Introduction to the Runtime
Before .NET, an executable (usually a file with an .exe suffix) was the
application. In other words, the application was contained within one file. To
make the overall system run more efficiently, the application would elect to use
code that was shared (usually a file with a .dll suffix). If the program elected
to use shared code, you could either use an import library (a file that points
function references to the DLL that is associated with the import library), or
you could load the DLL explicitly at runtime (using LoadLibrary,
LoadLibraryEx, and GetProcAddress). With .NET, the unit of
execution and deployment is the assembly. Execution usually begins with an
assembly that has an .exe suffix. The application can use shared code by
importing the assembly that contains the shared code with an explicit reference.
(You can add the reference via the "Add References" node in Visual
Studio .NET or include it via a command-line switch /r). The application can
also explicitly load an assembly with Assembly.Load or Assembly.LoadFrom.
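As a rough illustration of explicit loading, the following C# sketch loads an assembly by display name and then by file path. To stay self-contained, it simply reloads the core library that defines System.Object. The class name LoadDemo is made up for this example, and a real application would supply the name or path of its own shared assembly.

```csharp
using System;
using System.Reflection;

class LoadDemo
{
    static void Main()
    {
        // Assumption for illustration: reuse the core library's own name and
        // path so that the example does not depend on any external file.
        string displayName = typeof(object).Assembly.FullName;
        string filePath = typeof(object).Assembly.Location;

        // Load by display name; the runtime locates the file.
        Assembly byName = Assembly.Load(displayName);
        Console.WriteLine("Loaded by name: " + byName.FullName);

        // Load by explicit path to the file.
        Assembly byPath = Assembly.LoadFrom(filePath);
        Console.WriteLine("Loaded by path: " + byPath.Location);
    }
}
```

Assembly.Load takes the display name of the assembly and leaves it to the runtime to find the file, whereas Assembly.LoadFrom takes the path of the file itself.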
Note
Before going further, you need to learn definitions of some of the terms:
Assembly: The assembly is the primary unit of deployment
within the .NET Framework. Within the base class libraries is a class,
appropriately named Assembly, that encapsulates a physical assembly. When this
book refers to the class or an instance of the class, it will be denoted as
Assembly. This class exists in the System.Reflection namespace. An assembly can
contain references to other assemblies and modules. Chapter 4, "The Assembly,"
contains more detailed information about assemblies.
Module: A module is a single file that contains executable
content. An assembly can encapsulate one or more modules; a module does not
stand alone without an assembly referring to it. As with the assembly, a class
called Module exists in the base class library and encapsulates most of the
features of a module. When this book refers to Module, it is referring to the
class in the base class library. This class exists in the System.Reflection
namespace.
AppDomain: An application domain has been referred to as a
lightweight process. Before .NET, isolation was achieved through separate
processes with assistance from the OS and the supporting hardware. If one
process ran amok, then it would not bring down the whole system, just that
process. Because types are so tightly controlled with the .NET Framework, it is
possible to have a mechanism whereby this same level of isolation can occur
within a process. This mechanism is called the application domain, or
AppDomain. As with modules and assemblies, a class in the base class
library, called AppDomain, encapsulates many of the features and functionality
of an application domain. This class exists in the System namespace. When this
book refers to the class, it will be called AppDomain.
IL or MSIL: IL stands for Intermediate Language, and MSIL stands for
Microsoft Intermediate Language. IL is the language in which assemblies are
written. It is a set of instructions that represent the code of the application.
It is intermediate because it is not turned into native code until needed. When
the code that describes a method is required to run, it is compiled into native
code with the JIT compiler. Chapter 5 contains information about individual IL
instructions.
JIT: JIT stands for Just-In-Time. This term refers to the compiler
that is run against IL code on an as-needed basis.
After the code is "loaded," execution of the code can begin. This
is where the old (pre-.NET) and the new (.NET) start to diverge significantly.
In the case of unmanaged code, the compiler and linker have already turned the
source into native instructions, so those instructions can begin to execute
immediately. Of course, this means that you will have to compile a separate
version of the code for every different native environment. In some cases,
because it is undesirable to ship and maintain a separate version for every
possible native environment, only a compatible version is compiled and shipped.
This leads to a lowest common denominator approach as companies want to ship
software that can be run on as wide a range of environments as possible.
Currently, few companies ship programs that target environments that have an
accelerated graphics engine. Not only would the manufacturer need to ship a
different program for each graphics accelerator card, but a different program
also would need to be developed for those cases where a graphics accelerator was
lacking. Other examples of hardware environments in which specific optimizations
could be taken advantage of would be disk cache, memory cache, high-speed
networks, multiple CPUs, specialized hardware for processing images, accelerated
math functions, and so forth. In numerous other examples, compiling a program
ahead of time either results in a highly optimized yet very specific program, or
an unoptimized and general program.
One of the first steps that the CLR takes in running a program is checking
the method that is about to be run to see whether it has been turned into native
code. If the method has not been turned into native code, then the code in the
method is Just-In-Time compiled (JITd). Delaying the compilation of a method
yields two immediate benefits. First, it is possible for a company to ship one
version of the software and have the CLR on the CPU where the program is
installed take care of the specific optimizations that are appropriate for the
hardware environment. Second, it is possible for the JIT compiler to take
advantage of specific optimizations that allow the program to run more quickly
than a general-purpose, unmanaged version of the program. Systems built with a
64-bit processor will have a "compatibility" mode that allows 32-bit
programs to run unmodified on the 64-bit CPU. This compatibility mode will not
result in the most efficient or fastest possible throughput, however. If an
application is compiled into IL, it can take advantage of the 64-bit processing
as long as a JIT engine can target the new 64-bit processor.
The process of loading a method and compiling it if necessary is repeated
until either all of the methods in the application have been compiled or the
application terminates. The rest of this chapter explores the environment in
which the CLR encloses each class method.
Starting a Method
The CLR requires the following information about each method. All of this
data is available to the CLR through metadata in each assembly.
Instructions: The CLR requires a list of MSIL instructions. As you
will see in the next chapter, each method has a pointer to the instruction set
as part of the metadata that is associated with it.
Signature: Each method has a signature, and the CLR requires that a
signature be available for each method. The signature describes the calling
convention, return type, parameter count, and parameter types.
Exception-handling array: No specific IL instructions deal with
exceptions. There are directives, but no IL instructions. Instead of
exception-handling instructions, the assembly encloses a list of exceptions. The
exceptions list contains the type of the exception, the offset of the first
instruction of the try block, and the length of the try
block. It also includes the offset to the handler code, the length of the
handler code, and a token describing the class that is used to encapsulate the
exception.
The size of the evaluation stack: This data is available through the
metadata of the assembly, and you will typically see it as .maxstack x in ILDASM
listings, where x is the size of the evaluation stack. This logical size of the
stack, x, represents the maximum number of items that will need to be pushed
onto the stack. The physical size of the items and the stack is left up to the
CLR to determine at runtime when the method is JITd.
A description of the locals array: Every method needs to declare up
front the number of items of local storage that the method requires. Like the
evaluation stack, this is a logical array of items, although each item's
type is also declared in the array. In addition, a flag is stored in the
metadata to indicate whether the local variables should be initialized to zero
at the beginning of the method call.
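Much of this per-method information can be inspected from managed code through reflection. The following C# sketch assumes a runtime whose reflection API exposes the MethodBody class, which was added after the first release of the .NET Framework. The class MethodFactsDemo and the method Sample are invented for illustration, and the numbers printed depend on the compiler and its optimization settings.

```csharp
using System;
using System.Reflection;

class MethodFactsDemo
{
    // A small method with a local variable and a protected region to inspect.
    static int Sample(int a, int b)
    {
        int sum = 0;
        try { sum = checked(a + b); }
        catch (OverflowException) { sum = -1; }
        return sum;
    }

    static void Main()
    {
        MethodInfo method = typeof(MethodFactsDemo).GetMethod(
            "Sample", BindingFlags.NonPublic | BindingFlags.Static);
        MethodBody body = method.GetMethodBody();

        Console.WriteLine("Signature:  " + method);
        Console.WriteLine("IL bytes:   " + body.GetILAsByteArray().Length);
        Console.WriteLine("Max stack:  " + body.MaxStackSize);
        Console.WriteLine("InitLocals: " + body.InitLocals);

        // The locals array: each entry has an index and a declared type.
        foreach (LocalVariableInfo local in body.LocalVariables)
            Console.WriteLine("Local " + local.LocalIndex + ": " + local.LocalType);

        // The exception-handling array: one row per handler.
        foreach (ExceptionHandlingClause clause in body.ExceptionHandlingClauses)
            Console.WriteLine(clause.Flags
                + " try offset " + clause.TryOffset
                + " length " + clause.TryLength
                + ", handler offset " + clause.HandlerOffset
                + " length " + clause.HandlerLength);
    }
}
```

Each printed value corresponds to one of the items in the list: the signature, the instruction stream, the evaluation stack size, the zero-initialization flag, the locals array, and the exception-handling array.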
With this information, the CLR is able to form an abstraction of what
normally would be the native stack frame. Typically, each CPU or machine forms a
stack frame that contains the arguments (parameters) or references to arguments
to the method. Similarly, the return variables are placed on the stack frame
based on calling conventions that are specific to a particular CPU or machine.
The order of both the input and output parameters, as well as the way that the
number of parameters is specified, is specific to a particular machine. Because
all of the required information is available for each method, the CLR can make
the determination at runtime of what the stack frame should look like.
The call to the method is made in such a way as to allow the CLR to have
marginal control of the execution of the method and its state. When the CLR
calls or invokes a method, the method and its state are put under the control of
the CLR in what is known as the Thread of Control.
IL Supported Types
At the IL level, a simple set of types is supported. These types can be
directly manipulated with IL instructions:
int8 (sbyte): 8-bit 2's complement signed value.
unsigned int8 (byte): 8-bit unsigned binary value.
int16 (short): 16-bit 2's complement signed value.
unsigned int16 (ushort): 16-bit unsigned binary value.
int32 (int): 32-bit 2's complement signed value.
unsigned int32 (uint): 32-bit unsigned binary value.
int64 (long): 64-bit 2's complement signed value.
unsigned int64 (ulong): 64-bit unsigned binary value.
float32 (float): 32-bit IEC 60559:1989 floating point value.
float64 (double): 64-bit IEC 60559:1989 floating point value.
native int: Native size 2's complement signed value.
native unsigned int: Native size unsigned binary value.
F: Native size floating point variable. This variable is internal to
the CLR and is not visible to the user.
O: Native size object reference to managed memory.
&: Native size managed pointer.
These are the types that can be represented in memory, but some restrictions
exist in processing these data items. As discussed in the next section, the CLR
processes these items on an evaluation stack that is part of the state data for
each method. The evaluation stack can represent an item of any size, but the
only operations that are allowed on user-defined value types are copying to and
from memory and computing the addresses of user-defined value types. All
operations that involve floating point values use an internal representation of
the floating point value that is implementation specific (an F
value).
The other data types (other than the floating point value just discussed)
that have a native size are native int, native unsigned int,
native size object reference (O), and native size managed pointer
(&). These data types are a mechanism for the CLR to defer the
choice of the value size. For example, this mechanism allows for a native
int to be 64 bits on an IA64 processor and 32 bits on a Pentium
processor.
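In C#, native int surfaces as System.IntPtr, so a program can observe the size that the CLR chose for the current process. A minimal sketch (the class name NativeSizeDemo is illustrative only):

```csharp
using System;

class NativeSizeDemo
{
    static void Main()
    {
        // IntPtr.Size is the size in bytes of a native int in this process:
        // 4 in a 32-bit process, 8 in a 64-bit process.
        Console.WriteLine("native int is " + (IntPtr.Size * 8) + " bits wide");
    }
}
```

The same compiled assembly can report different sizes on different machines, because the choice is deferred until the code runs.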
Two of these native size types might seem similar: the O (native
size object reference) and the & (native size managed pointer). An
O typed variable points to a managed object, but its use is restricted
to instructions that explicitly indicate an operation on a managed type or to
instructions whose metadata indicates that managed object references are
allowed. The O type is said to point "outside" the object or
to the object as a whole. The & type is also a reference to a
managed object, but it can be used to refer to a field of an object or an
element of an array. Both O and & types are tracked by the
CLR and can change based on the results of a garbage collection.
One particular use of the native size type is for unmanaged pointers.
Although unmanaged pointers can be strongly typed with metadata, they are
represented as native unsigned int in the IL code. This gives the CLR
the flexibility to assign an unmanaged pointer to a larger address space on a
processor that supports it without unnecessarily tying up memory in storing
these values on processors that do not have the capability to address such a
large address space.
Some IL instructions require that an address be on the stack, such as the
following:
calli: ..., arg1, arg2 ... argn, ftn -> ... retVal
cpblk: ..., destaddr, srcaddr, size -> ...
initblk: ..., addr, value, size -> ...
ldind.*: ..., addr -> ..., value
stind.*: ..., addr, val -> ...
Using a native type guarantees that the operations that involve that type are
portable. If the address is specified as a 64-bit integer, then it can be
portable if appropriate steps are taken to ensure that the value is converted
appropriately to an address. If an address is specified as 32 bits or smaller,
the code is never portable even though it might work for most 32-bit machines.
For most cases, this is an IL generator or compiler issue and you should not
need to worry about it. You should, however, be aware that you can make code
non-portable by improperly using these instructions.
Short numeric values (those values less than 4 bytes) are widened to 4 bytes
when loaded (copied from memory to the stack) and narrowed when stored (copied
from stack to memory). Any operation that involves a short numeric value is
really handled as a 4-byte operation. Specific IL instructions deal with short
numeric types:
Load and store operations to/from memory: ldelem, ldind, stind, and stelem
Data conversion: conv, conv.ovf
Array creation: newarr
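The widening and narrowing of short numeric values is visible from C# as well. The following sketch (the class name WideningDemo is illustrative) adds two bytes; the arithmetic is carried out on 4-byte values, and the result is narrowed only when it is explicitly stored back into a byte.

```csharp
using System;

class WideningDemo
{
    static void Main()
    {
        byte a = 200;
        byte b = 100;

        // Both bytes are widened to 32 bits when loaded, so the addition
        // is a 4-byte operation and does not wrap at 255.
        int wide = a + b;
        Console.WriteLine(wide);    // prints 300

        // Narrowing back to a byte keeps only the low 8 bits.
        byte narrow = unchecked((byte)(a + b));
        Console.WriteLine(narrow);  // prints 44
    }
}
```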
Strictly speaking, IL only supports signed operations. The difference between
signed and unsigned operations is how the value is interpreted. For operations
in which it would matter how the value is interpreted, the operation has both a
signed and an unsigned version. For example, the cgt and cgt.un instructions
both test whether the first of two values is greater than the second; cgt
interprets the values as signed, and cgt.un interprets them as unsigned.
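The effect of interpretation can be sketched in C#. Here the same 32 bits are compared against 1, once as a signed value and once as an unsigned value; a C# compiler would typically emit a signed comparison for the first test and an unsigned comparison for the second. The class and variable names are illustrative.

```csharp
using System;

class SignedUnsignedDemo
{
    static void Main()
    {
        int signedValue = -1;

        // Same 32 bits (0xFFFFFFFF), reinterpreted as an unsigned value.
        uint unsignedValue = unchecked((uint)signedValue);

        Console.WriteLine(signedValue > 1);     // prints False
        Console.WriteLine(unsignedValue > 1u);  // prints True
    }
}
```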
Homes for Values
To track objects, the CLR introduces the concept of a home for an object. An
object's home is where the value of the object is stored. The home of an
object must have a mechanism in place for the JIT engine to determine the type
of the object. When an object is passed by reference, it must have a home
because the address of the home is passed as a reference. Two types of data are
"homeless" and cannot be passed by reference: constants and
intermediate values on the evaluation stack from IL instructions or return
values from methods. The CLR supports the following homes for objects:
Incoming argument: The ldarga instruction determines the address of an
argument home (ldarg loads its value). The method signature determines the type.
Local variable: The ldloca instruction determines the address of a local
variable (ldloc loads its value). The local variable signature in the metadata
determines the type of the local variable.
Field (instance or static): The use of ldflda for an instance field
and ldsflda for a static field determines the address of a field. The metadata
that is associated with the class, interface, or module determines the type of
the field.
Array element: The use of ldelema determines the address of an array
element. The element type of the array determines the type of the element.
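The idea of a home can be sketched with C# ref parameters. In the following example (all names are illustrative), a local variable, a static field, and an array element can each be passed by reference because each has a home; the comments name the address-taking instruction that a compiler would be expected to use. A constant has no home, so the commented-out call does not compile.

```csharp
using System;

class HomesDemo
{
    static int counterField;

    static void Increment(ref int value)
    {
        value++;
    }

    static void Main()
    {
        int local = 1;
        int[] array = new int[] { 10 };

        Increment(ref local);         // home: local variable (ldloca)
        Increment(ref counterField);  // home: static field (ldsflda)
        Increment(ref array[0]);      // home: array element (ldelema)

        // Increment(ref 5);          // would not compile: a constant is homeless

        Console.WriteLine(local + " " + counterField + " " + array[0]);  // prints 2 1 11
    }
}
```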
The Runtime Thread of Control
The CLR Thread of Control does not necessarily correspond with the native OS
threads. The base class library class System.Threading.Thread provides the
logical encapsulation of a thread of control.
Note
For more information on
threading, see Chapter 11, "Threading."
Each time a method is called, the normal procedure of checking whether the
method has been JITd must take place.

Figure 3.1 shows a loose representation of what the CLR state looks like. This
is loose in that it shows a simple link from one method to the other. This
representation does not correctly portray situations that involve control flow
that is exceptional, such as with jump instructions, exceptions, and tail calls.
Figure 3.1 Machine state under the CLR.
The managed heap referenced in this diagram refers to the memory that the CLR
manages. Details about the managed heaps and specifically garbage collection can
be found in Chapter 10, "Memory/Resource Management." Each time a
method is invoked, a method state is created. The method state includes the
following:
Instruction pointer: This points to the next IL instruction that the
current method is to execute.
Evaluation stack: This is the stack that the .maxstack directive
specifies. The compiler determines at compile time how many slots are required
on the stack.
Local variable array: This is the same array that is declared and
perhaps initialized in the metadata. Every time these variables are accessed
from the IL, they are accessed as an index to this array. In the IL code, you
see references to this array via instructions like the following: ldloc.0
("loads" or pushes local array variable 0 onto the stack) or
stloc.1 ("stores" the value on the stack in the local variable
1).
Argument array: This is an array of arguments that is passed to the
method. The arguments are manipulated with IL instructions such as ldarg.0
("loads" argument zero onto the stack) or starg.1 ("stores"
the value on the stack to argument 1).
MethodInfo handle: This composite of information is available in the
assembly metadata. This handle points to information about the signature of the
method (types of arguments, numbers of arguments, and return types), types of
local variables, and exception information.
Local memory pool: IL has instructions that can allocate memory that
is addressable by the current method (localloc). When the method returns, the
memory is reclaimed.
Return state handle: This is a handle used to restore the state of the
caller when the method returns.
Security descriptor: The CLR uses this descriptor to record security
overrides made either programmatically or with custom attributes.
The evaluation stack does not directly equate to a physical representation.
The physical representation of the evaluation stack is left up to the CLR and
the CPU for which the CLR is targeted. Logically, the evaluation stack is made
up of slots that can hold any data type. The size of the evaluation stack cannot
be indeterminate. For example, code that causes a variable to be pushed onto the
stack an infinite or indeterminate number of times is disallowed.
Instructions that involve operations on the evaluation stack are not typed.
For example, an add instruction adds two numbers, and a mul instruction
multiplies two numbers. The CLR tracks data types and uses them when the method
is JITd.
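To make the untyped add instruction concrete, the following C# sketch emits a four-instruction method body at runtime and calls it. It assumes that System.Reflection.Emit and DynamicMethod are available, which is not the case on every platform or in the first release of the .NET Framework; the names EvalStackDemo and Add are illustrative.

```csharp
using System;
using System.Reflection.Emit;

class EvalStackDemo
{
    static void Main()
    {
        // Describe a method: int Add(int, int).
        DynamicMethod method = new DynamicMethod(
            "Add", typeof(int), new Type[] { typeof(int), typeof(int) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);  // push argument 0 onto the evaluation stack
        il.Emit(OpCodes.Ldarg_1);  // push argument 1
        il.Emit(OpCodes.Add);      // pop two values, push their sum
        il.Emit(OpCodes.Ret);      // return the value left on the stack

        Func<int, int, int> add = (Func<int, int, int>)
            method.CreateDelegate(typeof(Func<int, int, int>));

        Console.WriteLine(add(2, 3));  // prints 5
    }
}
```

The add opcode itself says nothing about int32; the types come from the method signature, and the JIT compiler uses them to pick the native instruction.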
Method Flow Control
The CLR provides support for a rich set of flow control instructions:
Conditional or unconditional branch: Control can be transferred
anywhere within a method as long as the transfer does not cross a protected
region boundary. A protected region is defined in the metadata as a
region that is associated with an exception handler. In C#, this region is known
as a try block, and the associated catch block is known as a handler. The
CLR supports the execution of many different kinds of exception handlers to be
detailed later. The important point here is that a conditional or unconditional
branch cannot specify a destination that crosses an exception boundary. In the
case of C#, you cannot branch into or out of a try or a catch block. This is not
a limitation of the C# language; rather, it is a restriction of the IL code for
which C# acts as a code generator.
Method call: Several instructions allow methods to call other
methods, thus creating other method states, as explained earlier.
Tail call: This is a special prefix that immediately precedes a
method call. It instructs the calling method to discard its stack frame before
calling the method. This causes the called method to return to the point at
which the calling method would have returned.
Return: This is a simple return from a method.
Method jump: This is an optimization of the tail call that transfers
the arguments and control of a method to another method with the same signature,
essentially "deleting" the current method. The following snippet shows
a simple jump:
// Function A
.method static public void A()
{
    // Output from A
    ret
}
// Function B
.method static public void B()
{
    jmp void A()
    // Output from B
    ret
}
The instructions represented by the comment Output from B will
never be executed because the return from B is replaced by a return from
A.
Exception: This includes a set of instructions that generates an
exception and transfers control out of a protected region.
The CLR enforces several rules when control is transferred within a method.
First, control cannot be transferred to within an exception handler (catch,
finally, and so on) except as the result of an exception. This restriction
reinforces the rule that the destination of a branch cannot cross a protected
region. Second, after you are in a handler for a protected region, it is illegal
to transfer out of that handler by any means other than the restricted set
of exception instructions (such as leave, endfinally, and endfilter). Again,
you will notice that if you try to return from a method from within a finally
block in C#, the compiler generates an error. This is not a C# limitation, but a
restriction that is placed on the IL code. Third, each slot in the evaluation
stack must maintain its type throughout the lifetime of the evaluation stack
(hence the lifetime of the method). In other words, you cannot change the type
of a slot (variable) on the evaluation stack. This is typically not a problem
because the evaluation stack is not accessible to the user anyway. Finally,
control is not allowed to simply "fall through." All paths of
execution must terminate in either a return (ret), a method jump (jmp) or tail
call (tail.*), or a thrown exception (throw).
Method Call
The CLR can call methods in three different ways. Each of these call methods
only differs in the way that the call site descriptor is specified. A call site
descriptor gives the CLR and the JIT engine enough information about the method
call so that a native method call can be generated, the appropriate arguments
can be made accessible to the method, and provision can be made for the return
if one exists.
The calli instruction is the simplest of the method calls. This instruction
is used when the destination address is computed at runtime. The instruction
takes an additional function pointer argument that is known to exist on the call
site as an argument. This function pointer is computed with either the ldftn or
ldvirtftn instructions. The call site is specified in the StandAloneSig
table of the metadata (see Chapter 4).
The call instruction is used when the address of the function is known at
compile time, such as with a static method. The call site descriptor is derived
from the MethodDef or MethodRef token that is part of the
instruction. (See Chapter 4 for a description of these two tables.)
The callvirt instruction calls a method on a particular instance of an
object. The instruction includes a MethodDef or MethodRef
token like with the call instruction, but the callvirt instruction takes an
additional argument, which refers to a particular instance on which this method
is to be called.
Method Calling Convention
The CLR uses a single calling convention throughout all IL code. If the
method being called is a method on an instance, a reference to the object
instance is pushed on the stack first, followed by each of the arguments to the
method in left-to-right order. The result is that the this pointer is popped off
of the stack first by the called method, followed by each of the arguments
starting with argument zero and proceeding to argument n. If the method call is
to a static method, then no associated instance pointer exists and the stack
contains only the arguments. For the calli instruction, the arguments are pushed
on the stack in a left-to-right order followed by the function pointer that is
pushed on the stack last. The CLR and the JIT must translate this to the most
efficient native calling convention.
Method Parameter Passing
The CLR supports three types of parameter-passing mechanisms:
By value: The value of the object is placed on the stack. For
built-in types such as integers, floats, and so on, this simply means that the
value is pushed onto the stack. For objects, an O type reference to the
object is placed on the stack. For managed and unmanaged pointers, the address
is placed on the stack. For user-defined value types, you can place a value on
the evaluation stack before a method call in two ways. First, the value
can be directly put on the stack with ldarg, ldloc, ldfld, or ldsfld. Second,
the address of the value can be computed and the value can be loaded onto the
stack with the ldobj instruction.
By reference: Using this convention, the address of the parameter is
passed to the method rather than the value. This allows a method to potentially
modify such a parameter. Only values that have homes can be passed by reference
because it is the address of the home that is passed. For code to be verifiable
(type safety that can be verified), parameters that are passed by reference
should only be read and written via the ldind.* and stind.* instructions,
respectively.
Typed reference: A typed reference is similar to a "normal"
by reference parameter with the addition of a static data type that is passed
along with the data reference. This allows IL to support languages such as VB
that can have methods that are not statically restricted to the types of data
that they can accept, yet require an unboxed, by reference value. To call such a
method, one would either copy an existing typed reference or use the mkrefany
instruction to create a typed reference. Using this typed reference, the
address is computed using the refanyval instruction. A typed reference parameter
must refer to data that has a home.
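The difference between the first two mechanisms can be sketched in C# with a user-defined value type. The names Point, ByValue, and ByReference are illustrative only. Passing by value copies the struct onto the stack, so the callee's change is lost; passing by reference hands over the address of the home, so the change is visible to the caller.

```csharp
using System;

struct Point
{
    public int X;
}

class PassingDemo
{
    // Receives a copy of the value; the caller's Point is untouched.
    static void ByValue(Point p)
    {
        p.X = 99;
    }

    // Receives the address of the caller's Point and writes through it.
    static void ByReference(ref Point p)
    {
        p.X = 99;
    }

    static void Main()
    {
        Point point = new Point();
        point.X = 1;

        ByValue(point);
        Console.WriteLine(point.X);  // prints 1

        ByReference(ref point);
        Console.WriteLine(point.X);  // prints 99
    }
}
```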
Exception Handling
The CLR supports exceptional conditions or error handling by using exception
objects and protected blocks of code. A C# try block is an example of a
protected block of code. The CLR supports four different kinds of exception
handlers:
Finally: This block will be executed when the method exits no matter
how the method exits, whether by normal control (either implicitly or by an
explicit ret) or by unhandled exception.
Fault: This block will be executed if an exception occurs, but not if
the method normally exits.
Type-filtered: This block of code will be executed when a match is
detected between the type of the exception for this block and the exception that
is thrown. This corresponds to the C# catch block.
User-filtered: The determination whether this block should handle the
exception is made as the result of a set of IL instructions that can specify
that the exception should be ignored, that this handler should handle the
exception, or that the exception should be handled by the next exception
handler. For the reader who is familiar with Structured Exception Handling
(SEH), this is much like the __except handler.
Not every language that generates compliant IL code necessarily supports all
of the types of exception handling. For instance, C# does not support
user-filtered exception handlers, whereas VB does.
When an exception occurs, the CLR searches the exception handling array that
is part of the metadata with each method. This array defines a protected region
and a block of code that is to handle a specific exception. The
exception-handling array specifies an offset from the beginning of the method
and a size of the block of code. Each row in the exception-handling array
specifies a protected region (offset and size), the type of handler (from the
four types of exception handlers listed in the previous paragraph), and the
handler block (offset and size). In addition, a type-filtered exception handler
row contains information regarding the exception type for which this handler is
targeted. The user-filtered exception handler contains a label that starts a
block of code to be executed to determine at runtime whether the handler block
should be executed in addition to the specification of the handler region.
Listing 3.1 shows some C# pseudo-code for handling an exception.
Listing 3.1 C# Exception-Handling Pseudo-Code
try
{
    // Protected block
    . . .
}
catch(ExceptionOne e)
{
    // Type-filtered handler
    . . .
}
finally
{
    // Finally handler
    . . .
}
For the code in Listing 3.1, you would see two rows in the exception handler
array: one for the type-filtered handler and one for the finally block. Both
rows would refer to the same protected block of code, namely, the code in
the try block.
Listing 3.2 shows one more example of an exception-handling scheme, this time
in Visual Basic.
Listing 3.2 VB Exception-Handling Pseudo-Code
Try
    'Protected region of code
    . . .
Catch e As ExceptionOne When i = 0
    'User-filtered exception handler
    . . .
Catch e As ExceptionTwo
    'Type-filtered exception handler
    . . .
Finally
    'Finally handler
    . . .
End Try
The pseudo-code in Listing 3.2 would result in three rows in the
exception-handling array. The first Catch is a user-filtered exception handler,
which would be turned into the first row in the exception-handling array. The
second Catch block is a type-filtered exception handler, which is the same as
the type-filtered exception handler in the C# case. The third and last row in the
exception-handling array would be the Finally handler.
When an exception is generated, the CLR looks for the first match in the
exception-handling array. A match would mean that the exception was thrown
while the managed code was in the protected block that was specified by the
particular row. In addition, for a match to exist, the particular handler must
"want" to handle the exception (the user filter is true; the type
matches the exception type thrown; the code is leaving the method, as in
finally; and so forth). The first row in the exception-handling array that the
CLR matches becomes the exception handler to be executed. If an appropriate
handler is not found for the current method, then the current method's
caller is examined. This continues until either an acceptable handler is found,
or the top of the stack is reached and the exception is declared unhandled.
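The search order described above can be observed from C#. The following is a minimal sketch, not taken from the original text: the class and method names are hypothetical, and the when clause assumes a compiler that supports C# exception filters (C# 6 or later). Inner has a handler whose type does not match, so the search moves on to the caller, where the first row that "wants" the exception is chosen.

```csharp
using System;

public static class HandlerSearchDemo
{
    // Throws from a protected region whose only catch does not match,
    // so the search for a handler continues in the caller.
    static void Inner(int code)
    {
        try
        {
            throw new InvalidOperationException("code " + code);
        }
        catch (ArgumentException)
        {
            // Type-filtered handler: the type does not match, so it is skipped.
            Console.WriteLine("Inner: not reached");
        }
        finally
        {
            // Runs while the exception leaves Inner.
            Console.WriteLine("Inner: finally");
        }
    }

    // Returns the name of the handler that was selected in the caller.
    public static string Run(int code)
    {
        try
        {
            Inner(code);
            return "no exception";
        }
        catch (InvalidOperationException) when (code == 0)
        {
            // User-filtered handler: chosen only when the filter is true.
            return "user-filtered handler";
        }
        catch (InvalidOperationException)
        {
            // Type-filtered handler: the first remaining row that matches.
            return "type-filtered handler";
        }
        catch (Exception)
        {
            // Also matches, but appears later, so it is never chosen here.
            return "general handler";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run(0));
        Console.WriteLine(Run(1));
    }
}
```

Run(0) selects the user-filtered handler because its filter evaluates to true; Run(1) falls through to the next row whose type matches.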
Exception Control Flow
Several rules govern the flow of control within protected regions and the
associated handlers. These rules are enforced either by the compiler (the IL
code generator) or by the CLR as the method is JIT-compiled. Remember that a
protected region and the associated handler are overlaid on top of an existing
block of IL code. You cannot determine the structure of an exception framework
from the IL code that is specified in the metadata. The CLR enforces a set of
rules when transferring control to or from exception control blocks. These rules
are as follows:
Control can only pass into an exception handler block through the
exception mechanism.
There are two ways in which control can pass to a protected region (the
try block). First, control can simply branch or fall into the first instruction
of a protected region. Second, from within a type-filtered handler a leave
instruction can specify the offset to any instruction within a protected region
(not necessarily the first instruction).
The evaluation stack on entering a protected region must be empty. This
means that values cannot be pushed onto the evaluation stack prior to
entering a protected region.
Once control is in a protected region or any of the associated handler
blocks, exiting such a block is strictly controlled.
One can exit any of the exception blocks by throwing another
exception.
From within a protected region or a handler block (other than
finally or fault), a leave instruction may be executed. It is similar to an
unconditional branch but has the side effect of emptying the evaluation stack.
The destination of a leave instruction can be any instruction in a protected
region.
A user-filtered handler block must be terminated by an endfilter
instruction. This instruction takes a single argument from the evaluation stack
to determine how exception handling should proceed.
A finally or fault block is terminated with an
endfinally instruction. This instruction empties the evaluation stack and
returns from the enclosing handler block (not from the enclosing method).
Control can pass outside of a type-filtered handler block by rethrowing
the exception. This is just a specialized case for throwing an exception in
which the exception thrown is simply the exception that is currently being
handled.
None of the handler blocks or protected regions can execute a ret
instruction to return from the enclosing method.
No local allocation can be done from within any of the exception handler
blocks. Specifically, the localloc instruction is not allowed from any
handler.
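Two of these rules show up directly in C# source. The sketch below is illustrative only, with hypothetical names. A return written inside a try block cannot become a ret inside the protected region, so compilers typically emit a leave to code outside the region, and the finally handler runs on the way out. The throw; statement is the C# form of rethrowing the exception currently being handled.

```csharp
using System;
using System.Collections.Generic;

public static class ControlFlowDemo
{
    // Records the order in which blocks execute.
    public static readonly List<string> Log = new List<string>();

    // A return inside try: the finally handler still runs before the
    // caller receives the value.
    public static int ReturnFromTry()
    {
        try
        {
            Log.Add("try");
            return 1;
        }
        finally
        {
            Log.Add("finally");
        }
    }

    // 'throw;' leaves the type-filtered handler by rethrowing the
    // exception that is currently being handled.
    public static string Rethrow()
    {
        try
        {
            try
            {
                throw new InvalidOperationException("original");
            }
            catch (InvalidOperationException)
            {
                Log.Add("inner catch");
                throw;
            }
        }
        catch (InvalidOperationException e)
        {
            return e.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReturnFromTry());
        Console.WriteLine(Rethrow());
        Console.WriteLine(string.Join(", ", Log));
    }
}
```

The log ends up as try, finally, inner catch, and the outer handler sees the same exception object that the inner handler rethrew.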
Exception Types
The documentation indicates the exceptions that an individual instruction can
generate, but in general, the CLR can generate the following exceptions as a
result of executing specific IL instructions:
ArithmeticException
DivideByZeroException
ExecutionEngineException
InvalidAddressException
OverflowException
SecurityException
StackOverflowException
In addition, the following exceptions are generated as a result of object
model inconsistencies and errors:
TypeLoadException
IndexOutOfRangeException
InvalidAddressException
InvalidCastException
MissingFieldException
MissingMethodException
NullReferenceException
OutOfMemoryException
SecurityException
StackOverflowException
The ExecutionEngineException can be thrown by any instruction, and it
indicates that the CLR has detected an unexpected inconsistency. If the code has
been verified, this exception will never be thrown.
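Several of the exceptions listed above are easy to provoke from C#. The following sketch uses hypothetical helper names and assumes a C# 6 or later compiler for the expression-bodied members. Each helper performs one operation whose underlying IL instruction raises the exception (integer division, checked addition, a cast, a member access on a null reference, and an array element load) and reports the type name of whatever was thrown.

```csharp
using System;

public static class InstructionExceptionDemo
{
    // Runs the operation and returns the name of the exception type raised,
    // or "none" if it completed normally.
    public static string Probe(Func<object> operation)
    {
        try
        {
            operation();
            return "none";
        }
        catch (Exception e)
        {
            return e.GetType().Name;
        }
    }

    // Integer division by zero.
    public static string Divide(int divisor) => Probe(() => 10 / divisor);

    // Overflow-checked addition.
    public static string Add(int a, int b) => Probe(() => checked(a + b));

    // Cast to an unrelated reference type.
    public static string Cast(object value) => Probe(() => (Exception)value);

    // Member access through a null reference.
    public static string Length(string text) => Probe(() => text.Length);

    // Array access outside the bounds of the array.
    public static string Element(int[] items, int index) => Probe(() => items[index]);

    public static void Main()
    {
        Console.WriteLine(Divide(0));
        Console.WriteLine(Add(int.MaxValue, 1));
        Console.WriteLine(Cast("text"));
        Console.WriteLine(Length(null));
        Console.WriteLine(Element(new int[1], 5));
    }
}
```

The values are passed in as parameters so that the compiler does not reject or fold the failing operations at compile time.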
Many exceptions are thrown because of a failed resolution. That is, a method
was not found, or the method was found but it had the wrong signature, and so
forth. The following is a list of exceptions that are considered to be
resolution exceptions:
A few of the exceptions might be thrown early, before the code that caused
the exception is actually run. This is usually because an error was detected
during the conversion of the IL code to native code (JIT compile time). The
following exceptions might be thrown early:
MissingFieldException
MissingMethodException
SecurityException
TypeLoadException
Exceptions are covered in more detail in Chapter 15, "Using Managed
Exceptions to Effectively Handle Errors."
Remote Execution
If it is determined that an object's identity cannot be shared, then a
remoting boundary is put in place. A remoting boundary is implemented by the CLR
using proxies. A proxy represents an object on one side of the remoting
boundary, and all instance field and method references are forwarded to the
other side of the remoting boundary. A proxy is automatically created for
objects that derive from System.MarshalByRefObject.
Note
Remoting is covered in more detail in Chapter 13, "Building Distributed
Applications with .NET Remoting."
The CLR has a mechanism that allows applications running from within the same
operating system process to be isolated from one another. This mechanism is
known as the application domain. A class in the base class library, AppDomain,
encapsulates the features of an application domain. A remoting boundary is
required to effectively communicate between two isolated objects. Because each
application domain is isolated from every other application domain, a remoting
boundary is required to communicate between application domains.
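As a rough illustration of a remoting boundary between application domains, the sketch below creates a second domain and calls into an object that lives there. The Greeter type and the domain name are hypothetical. The example assumes the .NET Framework: on newer .NET implementations, AppDomain.CreateDomain is not supported and throws PlatformNotSupportedException.

```csharp
using System;

// Deriving from MarshalByRefObject means that callers in another
// application domain receive a proxy rather than a copy of the object.
public class Greeter : MarshalByRefObject
{
    // Reports the application domain in which this call actually executes.
    public string WhereAmI()
    {
        return AppDomain.CurrentDomain.FriendlyName;
    }
}

public static class AppDomainDemo
{
    public static void Main()
    {
        // Assumption: running on the .NET Framework, where creating
        // additional application domains is supported.
        AppDomain other = AppDomain.CreateDomain("SecondDomain");
        try
        {
            // The object is created in the other domain; the reference
            // returned here is a proxy that forwards calls across the boundary.
            Greeter greeter = (Greeter)other.CreateInstanceAndUnwrap(
                typeof(Greeter).Assembly.FullName,
                typeof(Greeter).FullName);

            Console.WriteLine(greeter.WhereAmI());
        }
        finally
        {
            AppDomain.Unload(other);
        }
    }
}
```

Because the call is forwarded through the proxy, WhereAmI reports the name of the second domain rather than the domain that made the call.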
Memory Access
All memory access from within the runtime environment must be properly
aligned. This means that access to int16 or unsigned int16 values
(short or ushort; 2-byte values) must occur on even boundaries. Access to
int32, unsigned int32, and float32 (int, uint, and
float; 4-byte values) must occur at an address that is evenly divisible by 4.
Access to int64, unsigned int64, and float64 (long,
ulong, and double; 8-byte values) must occur at an address that is evenly
divisible by 4 or 8 depending on the architecture. Access to any of the native
types (native int, native unsigned int, &) must
occur on an address that is evenly divisible by 4 or 8, depending on that native
environment.
A side effect of properly aligned data is that any read or write of it
that is no larger than the size of a native int is guaranteed to be
atomic. That is, the read or write operation is guaranteed to be
indivisible.
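The alignment rules above reduce to a divisibility check on the address. The helper below is a hypothetical sketch (it is not a CLR API) that tests whether an address satisfies a given power-of-two alignment; IntPtr.Size reports the size of a native int on the current platform.

```csharp
using System;

public static class AlignmentDemo
{
    // True when 'address' is evenly divisible by 'alignment'.
    // 'alignment' is assumed to be a power of two (2, 4, or 8 here).
    public static bool IsAligned(long address, int alignment)
    {
        return (address & (alignment - 1)) == 0;
    }

    public static void Main()
    {
        Console.WriteLine(IsAligned(0x1002, 2)); // True: an even address
        Console.WriteLine(IsAligned(0x1002, 4)); // False: not divisible by 4
        Console.WriteLine(IsAligned(0x1008, 8)); // True: divisible by 8
        Console.WriteLine("native int size: " + IntPtr.Size);
    }
}
```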
Volatile Memory Access
Certain memory access IL instructions can be prefixed with the volatile
prefix. Marking a memory access as volatile does not necessarily guarantee
atomicity, but it does guarantee that each read of the variable actually reads
the value from memory. A volatile write simply means that the write to memory is
guaranteed to happen before any other access is given to the variable in
memory.
The volatile prefix is meant to simulate a hardware CPU register. If this is
kept in mind, volatile is easier to understand.
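In C#, the volatile keyword on a field causes the compiler to emit the volatile prefix on accesses to that field. The following is a minimal sketch with hypothetical names: a worker thread polls a volatile flag, and because each iteration rereads the flag from memory, the worker observes the write made by another thread.

```csharp
using System;
using System.Threading;

public static class VolatileDemo
{
    // Accesses to this field are volatile: every read in the loop
    // below goes to memory instead of reusing an earlier value.
    private static volatile bool stopRequested;

    private static void Worker()
    {
        while (!stopRequested)
        {
            Thread.SpinWait(1);
        }
    }

    // Returns true if the worker saw the flag and exited within the timeout.
    public static bool RunOnce()
    {
        stopRequested = false;
        Thread worker = new Thread(Worker);
        worker.Start();

        Thread.Sleep(50);      // let the worker spin for a moment
        stopRequested = true;  // volatile write, visible to the worker

        return worker.Join(5000);
    }

    public static void Main()
    {
        Console.WriteLine(RunOnce());
    }
}
```

The flag is a bool, which is smaller than a native int, so the reads and writes are also atomic; volatile by itself would not provide that for larger values.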
CLR Threads and Locks
The CLR provides support for many different mechanisms to guarantee
synchronized access to data. Thread synchronization is covered in more detail in
Chapter 11. Some of the locks that are part of the CLR execution model are as
follows:
Synchronized methods: Synchronized method locks that the CLR provides
either lock on a particular instance (locks on the this pointer) or, in the case
of static locks, lock on the type on which the method is defined.
Once held, a method lock allows access any number of times from the same thread
(recursion, other method calls, and so forth); access to the lock from another
thread will block until the lock is released.
Explicit locks: These locks are provided by the base class
library.
Volatile reads and writes: As stated previously, marking access to a
variable volatile does not guarantee atomicity except in the case where the size
of the value is less than or equal to that of a native int and it is
properly aligned.
Atomic operations: The base class library provides for a number of
atomic operations through the use of the System.Threading.Interlocked
class.
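Three of these mechanisms can be compared side by side. The sketch below uses hypothetical class and member names: one counter is protected by a synchronized method (MethodImplOptions.Synchronized), one by an explicit lock from the base class library (the C# lock statement, which uses System.Threading.Monitor), and one by an atomic operation (Interlocked.Increment).

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class Counters
{
    private readonly object gate = new object();
    private int synchronizedCount;
    private int lockedCount;
    private int interlockedCount;

    // Synchronized method: the CLR locks on this instance for the
    // duration of the call.
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void IncrementSynchronized() { synchronizedCount++; }

    // Explicit lock on a private object.
    public void IncrementLocked() { lock (gate) { lockedCount++; } }

    // Atomic operation: no lock is taken.
    public void IncrementInterlocked() { Interlocked.Increment(ref interlockedCount); }

    public int SynchronizedCount { get { return synchronizedCount; } }
    public int LockedCount { get { return lockedCount; } }
    public int InterlockedCount { get { return interlockedCount; } }
}

public static class LockDemo
{
    // Increments every counter perThread times on each of threadCount threads.
    public static Counters Run(int threadCount, int perThread)
    {
        Counters counters = new Counters();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < perThread; n++)
                {
                    counters.IncrementSynchronized();
                    counters.IncrementLocked();
                    counters.IncrementInterlocked();
                }
            });
            threads[i].Start();
        }
        foreach (Thread thread in threads) thread.Join();
        return counters;
    }

    public static void Main()
    {
        Counters c = Run(4, 10000);
        Console.WriteLine(c.SynchronizedCount + " " + c.LockedCount + " " + c.InterlockedCount);
    }
}
```

With four threads and 10,000 increments each, all three counters finish at 40000; an unprotected count++ shared between threads would not be guaranteed to.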
Summary
This chapter provided a brief overview of the framework under which managed
code runs. If you keep in mind that at the lowest level, the CLR is an engine
that allows the execution of IL instructions, you will have an easier time
understanding both IL and how your code runs with the CLR.
This chapter detailed the rules for loading an assembly and starting
execution of a method. It also supplied detailed information about control flow
from within a method call. It explored in depth the built-in mechanisms for
handling errors and exceptions from within this runtime environment. In
addition, it discussed the runtime support for remoting that is built into the
CLR. Finally, it revealed how the code that is running under the CLR accesses
memory and synchronizes access to methods when multiple threads could
potentially have access to the memory store.
From the book: .NET Common Language Runtime Unleashed
© Copyright Pearson Education. All rights reserved.