Thursday, August 16, 2012

Threads, Thread Overhead, Kernel Internals, Kernel Architecture and Asynchronous Methods in C# - Part 2

Thread Context Switching:


A single CPU core can execute only one thread at a time. Windows keeps switching the CPU between threads to give the user a robust, responsive system and a better overall experience. The Windows scheduler gives each thread a slice of CPU time called a quantum (its length varies from architecture to architecture). When a thread's quantum elapses, the scheduler switches the CPU to another ready thread. This is called thread context switching. Switching enables multiple threads to share the same CPU and hardware resources, providing "multitasking" support.

The next thread that gets the CPU might belong to a different process altogether. In that case, Windows must also switch the virtual address space seen by the CPU to that of the new process before executing any of its code or touching any of its data.

Each context switch goes through the steps below:
  1. Save the context of the thread that just finished executing.
  2. Place the thread that just finished executing at the end of the queue for its priority.
  3. Find the highest priority queue that contains ready threads.
  4. Remove the thread at the head of the queue, load its context, and execute it.
Saving the context means the thread's context must be stored somewhere so that it can be restored the next time the scheduler switches back to that thread. Storing and restoring a thread's context structure includes:
  • Saving the values of the CPU registers the current thread was using into the context structure inside that thread's kernel object.
  • Changing the virtual address space, if required, as explained before.
  • Loading the next thread's context into the CPU registers.
This is a performance overhead, paid in exchange for giving the user a responsive system. Context switching allows:
  • Avoiding CPU starvation: ready threads preempt threads that are waiting for input or a resource.
  • Ready threads to be scheduled according to their priority.
  • CPU time for all processes, allowing each one to make progress.
  • The system to stay responsive even when a few threads or processes deadlock or enter an infinite loop.
Hence a multi-threaded approach adds to the number of context switches, which in turn affects performance.
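The four dispatch steps listed above can be sketched as a toy model in C#. This is only an illustration, not how Windows is implemented: `SimThread` and `Scheduler` are names invented for this sketch, while the real dispatcher lives in the kernel and operates on KTHREAD objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative stand-in for a thread's kernel object and saved context.
public class SimThread
{
    public string Name;
    public int Priority;       // higher number = higher priority
    public int SavedContext;   // stands in for the saved CPU register state
}

public class Scheduler
{
    // One FIFO queue of ready threads per priority level.
    private readonly Dictionary<int, Queue<SimThread>> _ready =
        new Dictionary<int, Queue<SimThread>>();

    public void MakeReady(SimThread t)
    {
        if (!_ready.TryGetValue(t.Priority, out Queue<SimThread> q))
            _ready[t.Priority] = q = new Queue<SimThread>();
        q.Enqueue(t);                            // step 2: tail of its priority queue
    }

    // Returns the thread that should run next, or null if nothing is ready.
    public SimThread ContextSwitch(SimThread current, int cpuRegisters)
    {
        if (current != null)
        {
            current.SavedContext = cpuRegisters; // step 1: save the old context
            MakeReady(current);                  // step 2: requeue at the tail
        }
        foreach (int pri in _ready.Keys.OrderByDescending(p => p)) // step 3
        {
            if (_ready[pri].Count > 0)
                return _ready[pri].Dequeue();    // step 4: head of that queue runs
        }
        return null;                             // nothing ready (CPU idles)
    }
}

public static class SchedulerDemo
{
    public static void Main()
    {
        var s = new Scheduler();
        s.MakeReady(new SimThread { Name = "A", Priority = 8 });
        s.MakeReady(new SimThread { Name = "B", Priority = 8 });
        s.MakeReady(new SimThread { Name = "C", Priority = 10 });

        SimThread running = s.ContextSwitch(null, 0);
        Console.WriteLine(running.Name);         // "C": highest priority runs first
    }
}
```

Note that the sketch omits what the real scheduler must also do: priority boosts, waiting (non-ready) threads, and per-processor ready queues.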

Now let's see how creating new threads adds to Windows' performance overhead:

When Windows context-switches the CPU from one thread to another, the previously executing thread's code and data reside in the CPU cache, so the CPU does not have to access RAM; this avoids the latency of fetching information from main memory. A newly created thread, however, is likely to access different data and execute entirely different code. Since none of it is in the CPU cache yet, the cache must be populated from RAM before processing can proceed at full speed. This can happen every time a new thread is added, causing a performance overhead. [CLR via C#]
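One way to feel this cost is to compare creating a dedicated thread per work item against reusing the CLR thread pool. This is a rough timing sketch, not a benchmark; absolute numbers vary widely by machine, and the counts chosen here are arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadCostDemo
{
    // Creates one dedicated thread per work item. Every Start() pays for a new
    // stack, a kernel thread object, and the cold-cache effects described above.
    public static long TimeDedicatedThreads(int count)
    {
        var sw = Stopwatch.StartNew();
        var threads = new Thread[count];
        for (int i = 0; i < count; i++)
        {
            threads[i] = new Thread(() => Thread.SpinWait(1000));
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        return sw.ElapsedMilliseconds;
    }

    // Queues the same work to the thread pool, which reuses a small set of
    // already-created threads instead of making a new one per item.
    public static long TimePooledWork(int count)
    {
        var sw = Stopwatch.StartNew();
        var tasks = new Task[count];
        for (int i = 0; i < count; i++)
            tasks[i] = Task.Run(() => Thread.SpinWait(1000));
        Task.WaitAll(tasks);
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine($"200 dedicated threads: {TimeDedicatedThreads(200)} ms");
        Console.WriteLine($"200 pooled work items: {TimePooledWork(200)} ms");
    }
}
```

The pooled version is normally much cheaper precisely because no threads are created or destroyed per work item.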

Now that we understand why thread creation, destruction and maintenance affect performance and memory efficiency, let us see how the number of threads affects GC performance.

The number of threads also affects the performance of the garbage collector. Before a collection cycle, the GC must suspend all managed threads. It then walks each thread's stack to mark the roots of heap objects, and walks the stacks again to update the roots of objects that were moved during compaction. The GC resumes the threads only after the collection cycle completes. The more threads there are, the more stacks must be walked, so collections become slower.
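A blocking collection pause can be observed directly by forcing one and timing it. The sketch below uses the standard `GC.Collect(Int32, GCCollectionMode)` overload; the object count is an arbitrary value chosen just to give the GC some work.

```csharp
using System;
using System.Diagnostics;

public static class GcPauseDemo
{
    // Times one forced, blocking gen-2 collection. While it runs, the CLR
    // suspends all managed threads so the GC can walk (and later fix up)
    // every thread's stack; more threads mean more stacks to walk.
    public static double MeasureForcedCollectionMs(int objectCount)
    {
        var roots = new object[objectCount];
        for (int i = 0; i < objectCount; i++)
            roots[i] = new byte[64];            // small live objects for the GC to trace

        var sw = Stopwatch.StartNew();
        GC.Collect(2, GCCollectionMode.Forced); // blocking full collection
        sw.Stop();

        GC.KeepAlive(roots);                    // keep the array rooted until here
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine($"Forced gen-2 collection took {MeasureForcedCollectionMs(100_000):F2} ms");
        Console.WriteLine($"Gen-2 collections so far: {GC.CollectionCount(2)}");
    }
}
```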

Multiple threads keep an application responsive, but they also carry the overheads explained above. When designing a multithreaded application, we need to understand the real intent of each thread, and whether it is necessary, before creating one.
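For CPU-bound work this trade-off can be demonstrated by oversubscription: splitting a fixed amount of work across far more threads than cores buys no extra parallelism, only extra creation cost and context switches. Again a rough sketch, with arbitrary iteration counts.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class OversubscriptionDemo
{
    // Splits a fixed amount of CPU-bound work across a given number of threads.
    // Beyond Environment.ProcessorCount threads there is nothing left to
    // parallelize; the surplus threads only add overhead.
    public static long RunWith(int threadCount, long totalIterations)
    {
        long perThread = totalIterations / threadCount;
        var threads = new Thread[threadCount];
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                double x = 0;
                for (long n = 0; n < perThread; n++)
                    x += Math.Sqrt(n);          // pure CPU-bound work
            });
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        int cores = Environment.ProcessorCount;
        Console.WriteLine($"{cores} threads: {RunWith(cores, 50_000_000)} ms");
        Console.WriteLine($"{cores * 32} threads: {RunWith(cores * 32, 50_000_000)} ms");
    }
}
```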

I will end the thread overhead concepts here. In my next article, I will go briefly into the kernel and its architecture.
