Tuesday, March 11, 2008
Shared Memory, Multi/Many-Core and OpenMP
A typical hardware architecture these days can be outlined as follows: n cores or processors are connected to memory segments that can be shared, and each processor has the same access conditions (with the focus on performance) to each memory segment. For this kind of hardware OpenMP is a good choice (I'm not talking about the program code or the arithmetic inside). Parallelization with OpenMP is an explicit process: the developer marks parallel sections (typically loops) using directives, and the compiling system (using the OpenMP libraries) does the actual parallelization. This means that data decomposition and communication are implicit.

Parallel sections are called parallel regions. A parallel region comprises a master thread and a team of threads, which is basically a fork-join model: when the parallel region is completed, only the master thread continues. Variables can be shared among all threads or duplicated for each thread; shared variables are what the threads use to communicate. All this must be done carefully, because OpenMP does not stop you from creating race conditions and deadlocks! More about parallel regions and important environment variables in my next post.