Monday, August 13, 2012

Threads, Threads Overhead, Kernel Internals, Kernel Architecture and Asynchronous Methods in C# - Part 1

Before we get into asynchronous methods and how to use them, let us find answers to the following questions:

What are threads? What is a kernel object, and what is a thread kernel object? How are threads created and destroyed? How does Windows handle threads? What does a thread consume? What is context switching? Why do we use multiple threads? What does asynchronous mean in terms of the CLR? Why do we need asynchronous programming, and how is it different from traditional threaded programming? What is the ThreadPool, how can we achieve asynchronous behavior with it, and how are the ThreadPool and the CLR related?

I assure you I will not get into multithreading or thread synchronization concepts while explaining threads and asynchronous programming.

Threads:


A thread is the basic unit of execution. Threads can be thought of as logical CPUs: a thread's job is to virtualize the CPU. Every process gets at least one thread of its own, so application code in one process executes independently of code in other processes. Threading also allows us to branch off from the current (main) thread and run a computation or method on a different thread. We mostly use multithreading when we need to keep the UI responsive regardless of what is going on underneath: computing, parsing or otherwise working with large amounts of information that is not of immediate concern, backup and restore activities that should not keep the user waiting, and waiting for keystrokes, other processes or events.

The aim of any application is to avoid keeping the user idle while waiting for something to happen. Threading lets us create a separate execution context for these event-based, compute-bound or I/O-wait methods while the user keeps working with the application.
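To make this concrete, here is a minimal sketch of moving a computation off the calling thread. The class name BackgroundWorkSketch and the method SumUpTo are hypothetical stand-ins for whatever long-running work your application has; only Thread, Start and Join come from the .NET Framework.

```csharp
using System;
using System.Threading;

static class BackgroundWorkSketch
{
    // Hypothetical stand-in for a long computation (parsing, backup, ...).
    public static long SumUpTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    // Runs SumUpTo on a dedicated thread and returns its result.
    public static long RunOnSeparateThread(int n)
    {
        long result = 0;
        var worker = new Thread(() => result = SumUpTo(n));
        worker.IsBackground = true; // do not keep the process alive for this thread
        worker.Start();

        // The calling thread is free to do other work here.

        worker.Join(); // wait for the worker before reading its result
        return result;
    }

    static void Main()
    {
        Console.WriteLine(RunOnSeparateThread(1000)); // prints 500500
    }
}
```

Note that this creates a brand new thread for a single piece of work, which is exactly the cost the rest of this article examines.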

Everything sounds good, and the concept of threading seems to meet all our parallel or background execution needs. But don't jump into parallel execution right away! We need to understand many other things first.

Quick Notes here:

a. All threads of a process share its virtual address space and system resources.
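A small sketch of what sharing the address space means in practice: two threads write into the same array, and the main thread reads both values afterwards. The names SharedMemorySketch and FillFromTwoThreads are made up for illustration. Each thread writes to its own slot, so no synchronization is needed beyond Join.

```csharp
using System;
using System.Threading;

static class SharedMemorySketch
{
    // Two threads write into one array that lives in the process's address space.
    public static int[] FillFromTwoThreads()
    {
        var slots = new int[2]; // one allocation, visible to every thread of the process

        var first = new Thread(() => slots[0] = 10);
        var second = new Thread(() => slots[1] = 20);

        first.Start();
        second.Start();
        first.Join();
        second.Join();

        return slots; // the main thread sees what both threads wrote
    }

    static void Main()
    {
        int[] slots = FillFromTwoThreads();
        Console.WriteLine(slots[0] + slots[1]); // prints 30
    }
}
```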

Threads Overhead: 


A thread is an OS concept rather than a language or framework feature. Threads carry both space (memory) and time (performance) overhead. The two major types of overhead are:

1. Memory & Resources that are associated with Thread.
2. Context Switching between Threads

The OS allocates memory every time a thread is created, and this memory serves several different purposes. We will look at each of them briefly. Every thread has each of the following:

1. Thread Kernel Object:


A kernel object is a block of memory allocated by the kernel. This memory block is a data structure that contains and maintains information about the object. Typical kernel objects are file objects, pipe objects, mutex objects, file-mapping objects etc. In general, a kernel (OS) function called by your application that returns a handle may have created a kernel object, but not always. An application keeps track of the kernel objects its process creates through these handles, and it later uses the handles to release the resources and objects it created. Consider a process as an example. A process object is itself a kernel object. Whenever a process is created, the kernel also creates a handle table for it, in which every kernel object used by the process is recorded. Any kernel object created by your process gets an entry in this table.

A handle is nothing but the index of the kernel object's entry in this per-process table. The kernel takes care of destroying a kernel object when the usage count of that object drops to zero.

Quick points here:

a. The usage count is a property common to all kernel object data structures.
b. CloseHandle decreases the usage count of a kernel object by 1. The object is destroyed only when its usage count reaches 0. Hence a kernel object may remain in memory even after the process that created it dies, as long as another process still holds an open handle to it.
Doesn't this remind you of "disposing" unmanaged resources in our managed code? Yes, you got it!
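Here is a hedged sketch of that connection. ManualResetEvent is a managed type that, on Windows, wraps an event kernel object. Disposing it closes the underlying handle, which is the managed counterpart of calling CloseHandle. The class and method names below are hypothetical, and the sketch only observes the handle state through SafeWaitHandle.IsClosed.

```csharp
using System;
using System.Threading;
using Microsoft.Win32.SafeHandles;

static class HandleDisposeSketch
{
    // Returns { handle closed before Dispose?, handle closed after Dispose? }.
    public static bool[] OpenAndClose()
    {
        SafeWaitHandle handle;
        bool closedBefore;

        using (var evt = new ManualResetEvent(false))
        {
            handle = evt.SafeWaitHandle;    // managed wrapper around the OS handle
            closedBefore = handle.IsClosed; // the handle is still open inside the using block
        } // Dispose runs here and releases the handle

        return new[] { closedBefore, handle.IsClosed };
    }

    static void Main()
    {
        bool[] state = OpenAndClose();
        Console.WriteLine("Closed before Dispose: " + state[0]);
        Console.WriteLine("Closed after Dispose: " + state[1]);
    }
}
```

Forgetting the using block would leave the handle open until the finalizer runs, so the kernel object would stay alive longer than necessary.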

Now back to threads. The OS creates and allocates a kernel object data structure for each thread that is created. This data structure contains information about the thread, including the "thread context", a memory block that holds the set of CPU registers. When Windows is running on a machine with an x86 CPU, the thread's context uses about 700 bytes of memory. For x64 and IA64 CPUs, the context is about 1,240 and 2,500 bytes of memory, respectively [CLR Via C#]. Each thread also maintains exception handlers, a scheduling priority, thread-local storage, a unique thread identifier, and a set of structures that the system requires. All of this information is part of the kernel object data structure.

Besides the kernel object with its saved set of CPU registers, a thread has a kernel stack, a thread environment block and a user stack in the address space of the owning process. We will look at each of these one by one.

2. Thread Environment Block (TEB or TIB):


The TEB is also called the Thread Information Block (TIB). It is the user-mode portion of the Windows thread control structures. Structure of TEB can be seen here. The TEB structure should not be accessed directly. It contains the head of the thread's exception-handling chain.
It also contains storage for data that is local to the thread, along with some data structures used by user interface modules such as OpenGL and GDI. This structure consumes 1 page of memory (4 KB on x86 and x64 CPUs).

More information about TEB and viewing TEB information can be found here.

3. User Stack:


a. This stack is used to store local variables and the parameters passed to methods.
b. It also contains the return address, which tells the thread where to continue executing after the current method returns.

By default, Windows allocates 1 MB for a thread's user stack.
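The 1 MB default is not fixed. As a small sketch, the Thread constructor has an overload that takes a maxStackSize argument in bytes. The class name StackSizeSketch, the helper Recurse and the 256 KB figure below are illustrative choices, not recommendations, and the runtime or OS may adjust the requested size.

```csharp
using System;
using System.Threading;

static class StackSizeSketch
{
    // Every call adds one frame (locals, parameters, return address) to the user stack.
    static int Recurse(int remaining)
    {
        return remaining == 0 ? 0 : 1 + Recurse(remaining - 1);
    }

    // Runs a recursion of the given depth on a thread created with the requested stack size.
    public static int RunWithStackSize(int maxStackSizeBytes, int depth)
    {
        int reached = 0;
        var worker = new Thread(() => reached = Recurse(depth), maxStackSizeBytes);
        worker.Start();
        worker.Join();
        return reached;
    }

    static void Main()
    {
        // Ask for a 256 KB stack instead of the default.
        Console.WriteLine(RunWithStackSize(256 * 1024, 500)); // prints 500
    }
}
```

A smaller stack lowers the per-thread memory cost, but a deep enough recursion will then overflow it sooner.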

4. Kernel Stack: 


In simple terms, this is the stack used by the kernel. Any parameters passed from user-mode code to the kernel are copied by the OS from the thread's user stack to the thread's kernel stack. The kernel also uses this stack to store the arguments it passes to other kernel methods, the local data of those methods, and their return addresses.

All arguments passed to a kernel method are validated before the OS starts operating on them. Because the arguments were copied to the kernel stack, the application cannot modify their values once the OS has started operating on them. The kernel-mode stack is 12 KB on a 32-bit Windows system and 24 KB on a 64-bit Windows system.

5. DLL Thread ATTACH and Thread DETACH notifications:


Whenever a thread is created in a process, Windows calls the DllMain function of every DLL loaded in the process, passing the DLL_THREAD_ATTACH flag. Likewise, when a thread dies, Windows calls the DllMain function of every loaded DLL with the DLL_THREAD_DETACH flag. Some DLLs need these notifications to set up, or clean up, the per-thread state that the library expects.

Imagine a process with numerous DLLs loaded, where Windows has to call DllMain on every one of them for each thread creation and destruction. That sure is an overhead, isn't it?

Taken together, the items above show the memory (space) and performance (time) overhead involved in creating a thread, destroying it and keeping it alive.

But this is not all. Context switching between threads also adds to the performance overhead. Let us see this in detail in my next article.
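To get a rough feel for the space overhead, here is a back-of-the-envelope sketch that simply adds up the x86 figures quoted above. It assumes a 4 KB page for the TEB, and it ignores the rest of the thread kernel object, so treat the result as a lower bound rather than a measurement. The names in ThreadMemorySketch are made up for illustration.

```csharp
using System;

static class ThreadMemorySketch
{
    // Figures quoted in this article for 32-bit (x86) Windows.
    const long ContextBytes = 700;            // thread context inside the kernel object
    const long TebBytes = 4096;               // TEB: 1 page, assumed to be 4 KB
    const long UserStackBytes = 1024 * 1024;  // default user stack: 1 MB
    const long KernelStackBytes = 12 * 1024;  // kernel stack: 12 KB

    public static long BytesPerThread()
    {
        return ContextBytes + TebBytes + UserStackBytes + KernelStackBytes;
    }

    public static long BytesForThreads(int threadCount)
    {
        return threadCount * BytesPerThread();
    }

    static void Main()
    {
        Console.WriteLine(BytesPerThread());     // prints 1065660
        Console.WriteLine(BytesForThreads(100)); // prints 106566000
    }
}
```

So a hundred threads account for roughly 100 MB of memory under these assumptions, before any of them has done useful work.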

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
In my next article I will continue with how context switching adds to thread overhead, go further into kernel and OS architecture, and explain the need for asynchronous methods.

Note: This article pools information from the CLR Via C# book, MSDN articles, university websites, StackOverflow answers, Albahari's C# 4.0 in a Nutshell etc. If you have any problem with the content, please let me know.
