
Given that we have multithreaded capabilities and multiprocessors, we must still convince the threads to work together to accomplish some overall goal. Often we need ways for the threads to coordinate and cooperate. There are several important techniques that are used while a program is running with multiple threads, including:

  • Fork-join (or create-join) programming
  • Synchronization using a critical section with a lock, semaphore, or mutex
  • Barriers

Each of these techniques has an overhead associated with it. Because these overheads are the price of going parallel, we must make sure that we have sufficient work to make the benefit of parallel operation worth the cost.

Fork-join programming

This approach is the simplest method of coordinating your threads. As in the earlier examples in this chapter, a master thread sets up some global data structures that describe the tasks each thread is to perform and then uses the pthread_create() function to activate the proper number of threads. Each thread checks the global data structure, using its thread ID as an index, to find its task. The thread performs the task and completes. The master thread waits at a pthread_join() point; when a thread completes, the master updates the global data structure and creates a new thread. These steps are repeated for each major iteration (such as a time step) for the duration of the program:


for (ts = 0; ts < 10000; ts++) { /* Time Step Loop */
    /* Setup tasks */
    for (ith = 0; ith < NUM_THREADS; ith++)
        pthread_create(.., work_routine, ..);
    for (ith = 0; ith < NUM_THREADS; ith++)
        pthread_join(...);
}

work_routine() {
    /* Perform Task */
    return;
}
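
For concreteness, here is a minimal compilable sketch of the same pattern (compile with cc -pthread). The task_data array, the value of NUM_THREADS, and the body of work_routine are placeholders invented for illustration:

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4

double task_data[NUM_THREADS];   /* hypothetical per-thread task description */

void *work_routine(void *arg)
{
    long tid = (long) arg;       /* thread ID passed at creation */
    task_data[tid] += 1.0;       /* stand-in for the real task */
    return NULL;
}

int main()
{
    pthread_t tid[NUM_THREADS];
    int ts, ith;

    for (ts = 0; ts < 10000; ts++) {            /* time-step loop */
        for (ith = 0; ith < NUM_THREADS; ith++) /* fork */
            pthread_create(&tid[ith], NULL, work_routine, (void *)(long) ith);
        for (ith = 0; ith < NUM_THREADS; ith++) /* join */
            pthread_join(tid[ith], NULL);
    }
    printf("task_data[0] = %f\n", task_data[0]);
    return 0;
}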

The shortcoming of this approach is the overhead cost associated with creating and destroying an operating system thread for a potentially very short task.

The other approach is to have the threads created at the beginning of the program and to have them communicate amongst themselves throughout the duration of the application. To do this, they use such techniques as critical sections or barriers.
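
As a sketch of this long-lived-thread style, here is a minimal example that creates the workers once and uses a POSIX barrier (an optional POSIX feature; compile with cc -pthread) to keep them in step across time steps. The worker body and NUM_THREADS are illustrative:

#include <pthread.h>

#define NUM_THREADS 4

pthread_barrier_t tsbar;   /* shared barrier, one wait per time step */

void *worker(void *arg)
{
    int ts;
    for (ts = 0; ts < 10000; ts++) {
        /* ... perform this thread's share of the time step ... */
        pthread_barrier_wait(&tsbar);  /* no thread proceeds until all arrive */
    }
    return NULL;
}

int main()
{
    pthread_t tid[NUM_THREADS];
    long ith;

    pthread_barrier_init(&tsbar, NULL, NUM_THREADS);
    for (ith = 0; ith < NUM_THREADS; ith++)   /* create the threads only once */
        pthread_create(&tid[ith], NULL, worker, (void *) ith);
    for (ith = 0; ith < NUM_THREADS; ith++)
        pthread_join(tid[ith], NULL);
    pthread_barrier_destroy(&tsbar);
    return 0;
}

Because the threads persist for all 10,000 time steps, the thread create/destroy overhead is paid only once.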

Synchronization

Synchronization is needed when an operation on a shared variable can be performed by only one processor at a time. For example, in the earlier SpinFunc() examples, consider the line:

globvar++;

In assembly language, this takes at least three instructions:


LOAD  R1,globvar
ADD   R1,1
STORE R1,globvar

What if globvar contained 0, Thread 1 was running, and, at the precise moment it completed the LOAD into Register R1 and before it had completed the ADD or STORE instructions, the operating system interrupted the thread and switched to Thread 2? Thread 2 catches up and executes all three instructions using its registers: loading 0, adding 1, and storing the 1 back into globvar. Now Thread 2 goes to sleep and Thread 1 is restarted at the ADD instruction. Register R1 for Thread 1 contains the previously loaded value of 0; Thread 1 adds 1 and then stores 1 into globvar. What is wrong with this picture? We meant to use this code to count the number of threads that have passed this point. Two threads passed the point, but because of a bad case of bad timing, our variable indicates only that one thread passed. This is because the increment of a variable in memory is not atomic. That is, halfway through the increment, something else can happen.
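
The usual remedy, previewed in the list of techniques above, is to make the increment a critical section that only one thread can execute at a time. A minimal sketch using a POSIX mutex (the lock variable and function name here are illustrative, not from the original example):

#include <pthread.h>

int globvar = 0;
pthread_mutex_t globvar_lock = PTHREAD_MUTEX_INITIALIZER;

void count_arrival(void)
{
    pthread_mutex_lock(&globvar_lock);   /* enter critical section */
    globvar++;                           /* LOAD/ADD/STORE cannot interleave */
    pthread_mutex_unlock(&globvar_lock); /* leave critical section */
}

While one thread holds the lock, any other thread calling pthread_mutex_lock() blocks, so the three-instruction sequence completes without interference.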

