A single process can contain multiple threads, all of which are executing the same program. These threads share the same global memory (data and heap segments), but each thread has its own stack (automatic variables).
Each of the threads in a process has a unique thread identifier (stored in the
type pthread_t
). This identifier is returned to the caller of
pthread_create()
, and a thread can obtain its own thread identifier using
pthread_self()
. Thread IDs are guaranteed to be unique only within a process.
A thread ID may be reused after a terminated thread has been joined, or a
detached thread has terminated. In all pthreads function that accept a thread ID
as an argument, that ID by definition refers to a thread in the same process as
the caller.
A thread-safe function is one that can be safely (i.e., it will deliver the same results regardless of whether it is) called from multiple threads at the same time. POSIX.1-2001 and POSIX.1-2008 require that all functions specified in the standard shall be thread-safe, except for the ones listed in the pthreads(7) manpage.
A thread is like a process, except for one difference: a thread shares the same address space as other threads in a process and thus can access the same data. The state of a thread is very similar to that of a process. It has a program counter/instruction pointer (PC/IP) that tracks where the program is fetching instructions from. Each thread has its own set of registers; thus, if there are two threads running on a single processor, when switching from one to the other, a context switch must take place. Another major difference between threads and processes concerns the stack. In a single-thread process, there is a single stack, usually residing at the bottom of the address space.
However, in a multi-threaded process, each thread runs independently and of course may call into various routines to do whatever work it is doing. Instead of a single stack in the address space, there will be one per thread. Thus, any stack-allocated variables, parameters, return values, and other things that we put on the stack will be placed in what is sometimes called thread-local storage, i.e., the stack of the relevant thread.
Let’s imagine a simple example where two threads wish to update a global shared variable. This is the code:
Each worker (thread) is adding a number to the shared variable counter
10
million times (1e7) in a loop. Thus, if we have 2 threads, the final counter
value should be 20,000,000.
If we run the program, we’ll see that we don’t necessarily get the desired
result.
To understand why this happens, we must understand the code sequence that the
compiler generates for the update to counter. Here’s a section of the
disassembly (objdump -d prog) of the generated executable:
In this case, the variable counter is located at memory address 0x60209c. First, the mov
instruction is used to put the value at 0x60209c into the eax
register. Then,
the add is performed, adding 1 ($0x1) to eax, and finally, the contents of eax
are stored back into memory, therefore updating the counter variable.
Let’s say that two threads are trying to update the counter with an initial value of 50. If thread 2 runs only after thread 1 has completed, everything will be fine and our counter will be set to 52. But here’s what will happen if thread 1 runs the first two instructions and then the OS switches to thread 2, executing all three instructions:
OS | Thread 1 | Thread 2 | PC | eax | counter |
mov 0x60209c,%eax | 400c1f | 50 | 50 | ||
add $0x1,%eax | 400c26 | 51 | 50 | ||
interrupt | |||||
save T1’s state | |||||
restore T2’s state | |||||
mov 0x60209c,%eax | 400c1f | 50 | 50 | ||
add $0x1,%eax | 400c26 | 51 | 50 | ||
mov %eax,0x60209c | 400c2b | 51 | 51 | ||
interrupt | |||||
save T2’s state | |||||
restore T1’s state | |||||
mov %eax,0x60209c | 400c2b | 51 | 51 |
As you can see, the counter will be 52 at the end. This is a race condition. Because multiple threads executing this code can cause a race condition, we call this code a critical section. A critical section is a piece of code that acesses a shared resource and must not be concurrently executed by more than one thread. What we want for this code is what we call mutual exclusion. This property guarantees that if one thread is executing within the critical section, the others will be prevented from doing so.
See also Edsger Dijkstra’s 1968 paper “Cooperating Sequential Processes” for an amazingly clear description of the problem.
#include <pthread.h>
int pthread_create(pthread_t * thread,
const pthread_attr_t attr,
void * (*start_routine)(void*),
void * arg);
There are four arguments: thread, attr, start_routine and arg. The first,
thread
, is a pointer to a structure of type pthread_t
; we’ll use this
structure to interact with this thread, and thus we need to pass it to
pthread_create()
in order to initialize it.
The second argument, attr
, is used to specify any attributes this thread might
have. Some examples include setting the stack size or perhaps information about
the scheduling priority of the thread. An attribute is initialized with a
separate call to pthread_attr_init()
. However, in most cases, the defaults will
be fine; so we simply pass the value NULL in.
The third argument is the most complex. It is a function pointer, and this one
tells us the following is expected: a function name (start_routine
), which is
passed a single argument of type void *
(as indicated in the parentheses after
start_routine
), and which returns a value of type void *
.
Finally, the fourth argument, arg
, is the argument to be passed to the
function where the thread begins execution. You might ask: why do we need these
void pointers? Well, the answer is quite simple: having a void pointer as an
argument to the function start_routine
allows us to pass in any type of
argument; having it as a return value allows the thread to return any type of
result.
Use pthread_join()
to wait for a thread’s completion:
int pthread_join(pthread_t thread, void **value_ptr);
The first argument of type pthread_t
is used to specify which thread to wait
for. This variable is initialized by pthread_create()
.
The second argument is a pointer to the return value you expect to get back.
Because the routine can return anything, it is defined to return a pointer to
void; because the pthread_join()
routine changes the value of the passed in
argument, you need to pass in a pointer to that value, not just the value itself.
Use locks for providing mutual exclusion to critical sections of code. The POSIX threads library provides several functions for this purpose:
int pthread_mutex_lock(pthread_mutex_t *mutex);
int pthread_mutex_unlock(pthread_mutex_t *mutex);
When you have a critical section that needs to be protected by locks, you can use this pair of functions:
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock(&lock);
x = x + 1; // Or whatever your critical section is
pthread_mutex_unlock(&lock);
If no other thread holds the lock when pthread_mutex_lock()
is called, the
thread will acquire the lock and enter the critical section. If another thread
does hold a lock, the calling thread will not return from the call until it has
acquired the lock (implying that the thread holding the lock has released it
through pthread_mutex_unlock()
). All locks must be properly initialized to
guarantee that they have correct values and thus work as desired when lock and
unlock are used. One way to initialize is using PTHREAD_MUTEX_INITIALIZER
. The
dynamic way is using pthread_mutex_init()
:
int rc = pthread_mutex_init(&lock, NULL);
assert(rc == 0);
A corresponding call to pthread_mutex_destroy()
should also be made when
you’re done with the lock. See manpage.
There are other routines to interact with locks. These might be useful:
int pthread_mutex_trylock(pthread_mutex_t *mutex);
int pthread_mutex_timedlock(pthread_mutex_t *mutex, struct timespec *abs_timeout);
The trylock
version returns failure if the lock is already held (instead of
the normal version which waits until it is released to acquire it); the
timedlock
version returns after a timeout or after acquiring the lock,
whichever happens first. Both should be generally avoided, but they might be
useful in some situations (e.g. deadlocks).
Condition variables are used for signaling between threads, so a thread can wait for another before it continues.
int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex);
int pthread_cond_signal(pthread_cond_t *cond);
You initialize ~pthread_cond_t~s with PTHREAD_COND_INITIALIZER
To use a condition variable, one has to have a lock that is associated with this condition. When calling either of the above routines, this lock should be held.
The first routine, pthread_cond_wait
, puts the calling thread to sleep, and
thus waits for some other thread to signal it, usually when something in the
program has changed that the now-sleeping thread might care about.
man -k pthreads
- Keep it simple
- Above all else, any code to lock or signal between threads should be as simple as possible. Tricky thread interactions lead to bugs.
- Minimize thread interactions
- Try to keep the number of ways in which threads interact to a minimum. Each interaction should be carefully thought out and constructeed with tried and true approaches.
- Initialize locks and condition variables
- Failure to do so will lead to code that sometimes works and sometimes doesn’t.
- Check your return codes
- Even an assert() is better than nothing.
- Be careful with how you pass arguments to, and return values from, threads
- In particular, any time you are passing a reference to a variable allocated on the stack, you’re probably doing something wrong.
- Each thread has its own stack
- As related to the point above, remember each thread has its own stack. Thus, if you have a locally-allocated inside of some function a thread is executing, it is essentially private to that thread; no other thread can (easily) access it. To share data between threads, remember that the values must be in the heap.
- Always use condition variables to signal between threads
- While it is often tempting to use a simple flag, don’t do it.
- Use the manpages
- On Linux, in particular, the pthread manpages are highly informative and discuss much of the nuances presented here, often in even more detail. Read them carefully!