One problem that arises in multiprogrammed systems is deadlock. A process or thread is in a state of deadlock (or is deadlocked) if the process or thread is waiting for a particular event that will not occur. In a system deadlock, one or more processes are deadlocked. Most deadlocks develop because of the normal contention for dedicated resources (i.e., resources that may be used by only one user at a time). Circular wait is characteristic of deadlocked systems.
One example of a system that is prone to deadlock is a spooling system. A common solution is to restrain the input spoolers so that, when the spooling files begin to reach some saturation threshold, they do not read in more print jobs. Today's systems allow printing to begin before the job is completed so that a full, or nearly full, spooling file can be emptied or partially cleared even while a job is still executing. This concept has been applied to streaming audio and video clips, where the audio and video begin to play before the clips are fully downloaded.
In any system that keeps processes waiting while it makes resource-allocation and process scheduling decisions, it is possible to delay indefinitely the scheduling of a process while other processes receive the system's attention. This situation, variously called indefinite postponement, indefinite blocking, or starvation, can be as devastating as deadlock. Indefinite postponement may occur because of biases in a system's resource scheduling policies. Some systems prevent indefinite postponement by increasing a process's priority as it waits for a resourcethis technique is called aging.
Resources can be preemptible (e.g., processors and main memory), meaning that they can be removed from a process without loss of work, or nonpreemptible meaning that they (e.g., tape drives and optical scanners), cannot be removed from the processes to which they are assigned. Data and programs certainly are resources that the operating system must control and allocate. Code that cannot be changed while in use is said to be reentrant. Code that may be changed but is reinitialized each time it is used is said to be serially reusable. Reentrant code may be shared by several processes simultaneously, whereas serially reusable code may be used by only one process at a time. When we call particular resources shared, we must be careful to state whether they may be used by several processes simultaneously or by only one of several processes at a time. The latter kindserially reusable resourcesare the ones that tend to become involved in deadlocks.
The four necessary conditions for deadlock are: a resource may be acquired exclusively by only one process at a time (mutual exclusion condition); a process that has acquired an exclusive resource may hold it while waiting to obtain other resources (wait-for condition, also called the hold-and-wait condition); once a process has obtained a resource, the system cannot remove the resource from the process's control until the process has finished using the resource (no-preemption condition); and two or more processes are locked in a "circular chain" in which each process in the chain is waiting for one or more resources that the next process in the chain is holding (circular-wait condition). Because these are necessary conditions for a deadlock to exist, the existence of a deadlock implies that each of them must be in effect. Taken together, all four conditions are necessary and sufficient for deadlock to exist (i.e., if all these conditions are in place, the system is deadlocked).
The four major areas of interest in deadlock research are deadlock prevention, deadlock avoidance, deadlock detection, and deadlock recovery. In deadlock prevention our concern is to condition a system to remove any possibility of deadlocks occurring. Havender observed that a deadlock cannot occur if a system denies any of the four necessary conditions. The first necessary condition, namely that processes claim exclusive use of the resources they require, is not one that we want to break, because we specifically want to allow dedicated (i.e., serially reusable) resources. Denying the "wait-for" condition requires that all of the resources a process needs to complete its task be requested at once, which can result in substantial resource underutilization and raises concerns over how to charge for resources. Denying the "no-preemption" condition can be costly, because processes lose work when their resources are preempted. Denying the "circular-wait" condition uses a linear ordering of resources to prevent deadlock. This strategy can increase efficiency over the other strategies, but not without difficulties.
In deadlock avoidance the goal is to impose less stringent conditions than in deadlock prevention in an attempt to get better resource utilization. Avoidance methods allow the possibility of deadlock to loom, but whenever a deadlock is approached, it is carefully sidestepped. Dijkstra's Banker's Algorithm is an example of a deadlock avoidance algorithm. In the Banker's Algorithm, the system ensures that a process's maximum resource need does not exceed the number of available resources. The system is said to be in a safe state if the operating system can guarantee that all current processes can complete their work within a finite time. If not, then the system is said to be in an unsafe state. Dijkstra's Banker's Algorithm requires that resources be allocated to processes only when the allocations result in safe states. It has a number of weaknesses (such as requiring a fixed number of processes and resources) that prevent it from being implemented in real systems.
Deadlock detection methods are used in systems in which deadlocks can occur. The goal is to determine if a deadlock has occurred, and to identify those processes and resources involved in the deadlock. Deadlock detection algorithms can incur significant runtime overhead. To facilitate the detection of deadlocks, a directed graph indicates resource allocations and requests. Deadlock can be detected using graph reductions. If a process's resource requests may be granted, then we say that a graph may be reduced by that process. If a graph can be reduced by all its processes, then there is no deadlock. If a graph cannot be reduced by all its processes, then the irreducible processes constitute the set of deadlocked processes in the graph.
Deadlock recovery methods are used to clear deadlocks from a system so that it may operate free of the deadlocks, and so that the deadlocked processes may complete their execution and free their resources. Recovery typically requires that one or more of the deadlocked processes be flushed from the system. The suspend/resume mechanism allows the system to put a temporary hold on a process (temporarily preempting its resources), and, when it is safe to do so, resume the held process without loss of work. Checkpoint/rollback facilitates suspend/resume capabilities by limiting the loss of work to the time at which the last checkpoint (i.e., saved state of the system) was taken. When a process in a system terminates (by accident or intentionally as the result of a deadlock recovery algorithm), the system performs a rollback by undoing every operation related to the terminated process that occurred since the last checkpoint. To ensure that data in the database remains in a consistent state when deadlocked processes are terminated, database systems typically perform resource allocations using transactions.
In personal computer systems and workstations, deadlock has generally been viewed as a limited annoyance. Some systems implement the basic deadlock prevention methods suggested by Havender, while others ignore the problemthese methods seem to be satisfactory. While ignoring deadlocks may seem dangerous, this approach can actually be rather efficient. If deadlock is rare, then the processor time devoted to checking for deadlocks significantly reduces system performance. However, given current trends, deadlock will continue to be an important area of research as the number of concurrent operations and number of resources become large, increasing the likelihood of deadlock in multiprocessor and distributed systems. Also, many real-time systems, which are becoming increasingly prevalent, require deadlock-free resource allocation.