Process States
This is a unix-oriented description, but most real operating systems
are fundamentally the same in this area.
- fork
A process is brought into existence by some other process executing
the fork() function. This is the only way a process is ever created, except
for the very first process, which is built by hand by the O.S.
when it starts up.
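As a small illustration of fork(), here is a sketch in Python, whose os.fork()
is a thin wrapper around the unix call. The function name fork_demo and the
numbers used are made up for the example. It shows that the child starts as a
copy of the parent: the child changes its own copy of a variable, and the
parent's copy is unaffected.

```python
import os

def fork_demo():
    """Fork once; return (value seen by the parent, value reported by the child)."""
    value = 41
    read_end, write_end = os.pipe()   # lets the child report back to the parent
    pid = os.fork()                   # one process calls fork(), two return from it
    if pid == 0:
        # Child: fork() returned 0. It has its own copy of "value".
        os.close(read_end)
        value += 1
        os.write(write_end, str(value).encode())
        os._exit(0)                   # leave at once, without the parent's cleanup
    # Parent: fork() returned the PID of the new child.
    os.close(write_end)
    child_value = int(os.read(read_end, 16).decode())
    os.close(read_end)
    os.waitpid(pid, 0)                # acknowledge the child's termination
    return value, child_value

if __name__ == "__main__":
    print(fork_demo())
```

The parent still sees 41 while the child reports 42, because after the fork
each process works on its own copy of the data.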
- init
Creating a process isn't instantaneous. There is a period when the process
occupies space in the process table and has resources allocated to
it (memory mostly) but is not yet ready to run. This is the "init" state.
- ready
As soon as a process has been fully created it is ready to run. Processes
can only be created ready to run, as fork() creates an almost exact duplicate
of the calling process, and that process must have been runnable in order
to call the fork function in the first place.
A new process is put at the end of the scheduler's queue with the appropriate
priority.
- runnable
"runnable" is the state of a process that is ready to run (it has instructions
to execute and is not waiting for I/O to be completed), but isn't actually running
because some other process is on the CPU.
Runnable processes are kept in queues by the scheduler, which usually keeps a
separate queue for each possible priority level. When it is time to select a new
process for execution, the scheduler finds the highest priority non-empty queue,
and selects the first process in that queue.
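The selection rule just described can be sketched in a few lines of Python.
This is only a toy model: the class name Scheduler, the method names, and the
convention that level 0 is the highest priority are all assumptions made for
the example, not how any particular unix does it.

```python
from collections import deque

class Scheduler:
    """Toy model: one FIFO queue of runnable processes per priority level."""

    def __init__(self, levels):
        # Assumption for this sketch: index 0 is the highest priority.
        self.queues = [deque() for _ in range(levels)]

    def make_runnable(self, name, priority):
        # A process always joins the END of the queue for its priority.
        self.queues[priority].append(name)

    def pick_next(self):
        # Find the highest priority non-empty queue and take its first process.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None   # nothing is runnable at all

if __name__ == "__main__":
    s = Scheduler(levels=3)
    s.make_runnable("backup", 2)
    s.make_runnable("editor", 1)
    s.make_runnable("compiler", 1)
    print(s.pick_next(), s.pick_next(), s.pick_next(), s.pick_next())
```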
- sched.
When the scheduler selects a new process for execution, that process' volatile
state (the correct contents of the CPU registers, page table, etc.) is loaded from
wherever it was saved (virtual page 0, etc.), and the process is allowed to run.
The scheduler requests a timer interrupt with a very short delay (just a few ms),
so that it can take over again and give another process a turn.
- running
"running" is the state of the one process that is actually executing on the
CPU right now (on multi-cpu systems there may be more than one of course).
Descriptions based on unix tend not to distinguish between runnable and running.
When you use the "ps" command, the currently executing process is not normally
distinguished in any way. It doesn't need to be: you know that when the output
from the ps command was produced, it was the process that was executing ps that
was running.
- time
Just before letting a process run on the CPU, the scheduler requests a timer
interrupt for a short distance into the future. When that interrupt happens,
the scheduler takes over again. The process that was executing on the CPU is
put into suspended animation, its volatile state is saved somewhere safe,
and it is put back at the end of the scheduler's queue for the appropriate
priority level. Another process is then chosen.
Modern computers are so fast, and people interacting with computers so slow,
that most of the time there are no runnable processes on a typical
system. When a user finishes typing a command, the computer can usually
finish executing it before any other users manage to complete any more commands.
When there is only one runnable process in the highest priority non-empty
queue, a sensible scheduler will not bother stopping it when the timer interrupt
occurs.
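Time slicing within a single priority level can be simulated as below. It is a
sketch under simple assumptions: time is counted in abstract "ticks", each
process needs a known number of ticks, and the names round_robin, work and
quantum are invented for the example. It includes the sensible shortcut of not
interrupting a process when nothing else is runnable.

```python
from collections import deque

def round_robin(work, quantum):
    """Simulate one priority level.

    work    -- dict mapping process name to the CPU ticks it still needs
    quantum -- length of one turn, i.e. the timer interrupt delay in ticks
    Returns the list of turns as (name, ticks actually run).
    """
    queue = deque(work.items())
    turns = []
    while queue:
        name, remaining = queue.popleft()
        if not queue:
            # Only one runnable process: don't bother stopping it.
            turns.append((name, remaining))
            break
        ran = min(quantum, remaining)
        turns.append((name, ran))
        if remaining > ran:
            # Timer interrupt: back to the END of the queue.
            queue.append((name, remaining - ran))
    return turns

if __name__ == "__main__":
    print(round_robin({"a": 7, "b": 2}, quantum=2))
```

With these numbers, "a" and "b" each get a two-tick turn, "b" finishes, and
"a" then runs its remaining five ticks without being interrupted.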
- request
Very frequently, a running process will not use up all of its turn on the
CPU before it reaches a point from which it can not continue without some
co-operation from the outside world. This may be a request for input or
output to or from the user, a disc file, a network connection, or whatever,
or it might be a hard page fault that needs to be resolved, or the process
might voluntarily surrender the CPU, perhaps using the sleep() or wait()
function, because it has nothing else to do for a while.
When this happens, the process is suspended in the normal way, but it
is not put back on the end of one of the scheduler's queues. Instead
it is kept in a pool of inactive processes.
- waiting
"waiting" is the state of a process which has not terminated, but
for some reason can not run just yet, so can not be put on any of
the scheduler's queues. A process may be waiting for I/O, a hard page
fault, a specified sleep time, or because the user interactively suspended
it.
Most systems distinguish between a large number of different
"waiting" states. A process that is waiting for a hard page fault to
be resolved is likely to wake up again much sooner than a process
that is waiting for user input, or one that is waiting for 20
seconds to pass. It can be helpful (at least to users) to be able
to tell the difference between the different kinds of wait (see below
for more details).
- completed
When whatever event or action a waiting process was waiting for
happens (or is completed) the waiting process becomes able to run again.
It can't just take over the CPU immediately, because some other process
will be running. Instead, it is added to the end of the scheduler's
queue with the appropriate priority.
If a process with higher priority than the one currently running
becomes runnable again because of some event, most systems will
immediately simulate a timer interrupt, so the running process
is suspended, and the newly runnable high-priority process can start
to run immediately. Don't think of that as "unfair" in any way. The
whole point of process priorities is that this is what is supposed
to happen. If you wanted all your processes to behave "fairly", you
would have given them all the same priority level. High priority
is usually available only to essential system processes that only
rarely have to run.
A clever trick to make a slow system seem much faster involves
cheating a little with priority levels. When a process that was
waiting for interactive input from a user actually receives that
input (this is really the only time a user notices the responsiveness
of the system), it can be put at the end of the scheduler's queue
for a slightly higher priority level than it really should have.
This means that the interactive process will almost certainly be the
next to execute as soon as it receives interactive input, but after that
first accelerated turn on the CPU, it goes back to its normal placement
in the correct queue.
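The priority trick can be sketched like this. Everything here is illustrative:
the class name BoostingScheduler, the "boosted" flag, the one-level boost, and
the convention that level 0 is the highest priority are assumptions of the
example, not a description of a real scheduler.

```python
from collections import deque

class BoostingScheduler:
    """Toy scheduler with a one-turn boost for freshly woken interactive processes."""

    def __init__(self, levels):
        self.queues = [deque() for _ in range(levels)]   # index 0 = highest priority

    def enqueue(self, name, base_priority, boosted=False):
        # A boosted process is queued one level better than it deserves,
        # but it remembers its real (base) priority for later.
        level = max(0, base_priority - 1) if boosted else base_priority
        self.queues[level].append((name, base_priority))

    def pick_next(self):
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None

if __name__ == "__main__":
    s = BoostingScheduler(levels=4)
    s.enqueue("compile", 2)
    s.enqueue("backup", 2)
    s.enqueue("editor", 2, boosted=True)   # the user has just pressed a key
    name, base = s.pick_next()
    print(name)                            # the editor jumps the queue once
    s.enqueue(name, base)                  # after its turn: normal placement
    print(s.pick_next()[0])
```

After its accelerated turn the editor is re-queued at its real priority, so the
other two processes are not starved.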
- exit
In unix, the only way a process can terminate is by executing a call
to the system function exit(). Even when you type control-C to stop
a program, or if the super-user sends a "kill -9" signal to it, what
conceptually happens is that the program jumps to a type of interrupt
handling function (a signal handler) that then calls the exit() function.
You can change the handlers so that control-Cs are ignored, but
the handler for "kill -9" is not modifiable.
So only currently executing programs can be terminated. This simplifies
the state transition diagram, and ensures that when a process is terminated,
at least its most important virtual memory pages are readily available
in physical memory (otherwise it couldn't have been running).
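The difference between control-C and "kill -9" can be checked from a program.
The sketch below uses Python's signal module; the helper name
can_replace_handler is made up for the example. It tries to install "ignore"
as the handler for a signal, reports whether the system allowed it, and puts
the old handler back. It assumes it is called from the main thread, which is
the only place Python allows handlers to be changed.

```python
import signal

def can_replace_handler(signum):
    """Return True if this process is allowed to change the handler for signum."""
    try:
        previous = signal.signal(signum, signal.SIG_IGN)   # try to ignore it
    except OSError:
        return False        # the system refuses: the handler is not modifiable
    if previous is not None:
        signal.signal(signum, previous)                    # restore the old handler
    return True

if __name__ == "__main__":
    print("control-C (SIGINT): ", can_replace_handler(signal.SIGINT))
    print("kill -9   (SIGKILL):", can_replace_handler(signal.SIGKILL))
```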
- exiting
Under unix, a process can't completely terminate just because it wants to:
every process that creates another process (using fork) has the right
and duty to be notified when that process terminates, and find out
whether it completed its task successfully or not. A process that has
executed the exit() function will never run again, and has most of
its resources released back to the system, but is not completely
destroyed until its parent has acknowledged receipt of the termination
notification message.
Processes that have exited but whose parents have not (yet) acknowledged
that fact are in the "exiting" state, and in the unix-speaking world
are called Zombie Processes. If you write a program that executes
fork() but does not execute wait() correctly, you will clutter up the
system with zombie processes. They don't occupy many resources, but
do fill up a slot in the process table.
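The following Python sketch makes a zombie on purpose and then lays it to
rest. The function names are invented for the example. It relies on two
assumptions worth stating: that os.waitid() with the WNOWAIT option is
available (it lets the parent wait until the child has exited without yet
acknowledging it), and that sending signal number 0 with os.kill() is an
acceptable way of asking "is this PID still in the process table?".

```python
import os

def still_in_process_table(pid):
    """Signal 0 delivers nothing; it only checks that the PID exists."""
    try:
        os.kill(pid, 0)
        return True
    except ProcessLookupError:
        return False

def zombie_demo():
    """Return (slot occupied before wait, slot occupied after wait)."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                     # child: terminate immediately
    # Block until the child has exited, but do NOT acknowledge it yet.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = still_in_process_table(pid)    # the child is now a zombie
    os.waitpid(pid, 0)                  # acknowledge the termination
    after = still_in_process_table(pid)     # the slot has been vacated
    return before, after

if __name__ == "__main__":
    print(zombie_demo())
```

A parent that never makes the waitpid() call leaves the child stuck in the
first of those two situations for as long as the parent itself lives.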
If a parent process terminates without acknowledging the termination
messages from its children (perhaps because it doesn't bother with
wait() calls, or perhaps because the parent process just happens to
terminate before its children), the orphaned child processes are
adopted. On traditional unix systems they are not passed to their
grand-parent process; they go directly to the very top process of all
(the first one created, usually called init). One of its standing duties
is repeatedly executing wait() so that all the orphaned processes it
adopts can rest in peace.
- deleted
Once a process' termination signal has been acknowledged, the process
is completely removed. All its pages of memory and all other resources
are returned to the system, and its slot in the process table is vacated.
Eventually even its PID number may be re-used.
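The whole life cycle can be summarised as a small table of transitions. This
is one reading of the description above, not an official diagram: headings
such as "runnable", "running" and "waiting" are treated as states, headings
such as "sched", "time" and "request" as the events that move a process
between them, and None stands for a process that does not exist. The names
TRANSITIONS, step and run are invented for the sketch.

```python
# (current state, event) -> next state. None means "no such process".
TRANSITIONS = {
    (None, "fork"): "init",
    ("init", "ready"): "runnable",
    ("runnable", "sched"): "running",
    ("running", "time"): "runnable",
    ("running", "request"): "waiting",
    ("waiting", "completed"): "runnable",
    ("running", "exit"): "exiting",
    ("exiting", "deleted"): None,
}

def step(state, event):
    """Apply one event; refuse anything the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no {event!r} transition from state {state!r}") from None

def run(events, state=None):
    """Apply a sequence of events, starting from a non-existent process."""
    for event in events:
        state = step(state, event)
    return state

if __name__ == "__main__":
    print(run(["fork", "ready", "sched", "request"]))
    print(run(["fork", "ready", "sched", "time", "sched", "exit"]))
```

Note that there is no "exit" entry for a waiting process: as described above,
only a process that is actually running can terminate.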
Unix distinguishes between (usually) five different wait states, although
systems can vary. The most common wait states are:
- Page Wait
The process has had a hard page fault and is waiting for a virtual
page to be paged back into physical memory. Page file reads are sometimes
given a higher priority than normal disc file reads, and on more expensive
systems, the virtual memory page file has a whole fast disc drive to itself,
so page waits are usually the shortest waits of all.
- Disc Wait
The process is waiting for a normal disc operation to complete. This usually
only takes a few milli-seconds, so this is a very short kind of wait. Many
versions of unix do not distinguish between Page waits and Disc waits.
- Sleeping
The program has executed the sleep(), usleep(), or similar functions,
requesting less than a few seconds of delay.
- Idle Wait
The process is waiting for something that the system has guessed will
probably take at least a couple of seconds. All input from an interactive
user causes an Idle wait. Usually network I/O does too, as do calls to
sleep() or usleep() that requested more than just a couple of seconds'
delay.
- Stopped
Usually means that the user has stopped the process by typing control-Z
to it. Unlike control-C, control-Z does not kill or attempt to terminate
the process; it just puts it to sleep for an indeterminate time. The
user can wake it up with the "fg" command.
The "ps" command
The "ps" command tells you about processes on the system. It has a lot
of options, but only a few are generally useful. Normally it provides only
minimal information; the "-u" option asks for more, and is
nearly always a good idea. Normally ps only tells you about processes
that you created; the "-a" option makes it tell you about processes
owned by other users as well, and is a good idea if you want to see some
variety. The "-x" option is only useful if you really want to see everything
that is going on: it makes ps list even processes that are not attached to
any terminal, which are normally ignored. Options can
be combined; the most useful form of ps is "ps -au". Here is the
output produced by "ps -au" at 3.25pm on Wednesday 21st November 2001:
USER       PID %CPU %MEM   VSZ   RSS TT STAT STARTED    TIME COMMAND
root      9008  0.0  0.1   432   220 p3 R+    3:37PM 0:00.00 ps -au
root       366  0.0  0.2   924   516 v1 Is+   6Sep01 0:00.01 /usr/libexec/getty Pc ttyv1
root       367  0.0  0.2   924   516 v2 Is+   6Sep01 0:00.01 /usr/libexec/getty Pc ttyv2
root       368  0.0  0.2   924   516 v3 Is+   6Sep01 0:00.01 /usr/libexec/getty Pc ttyv3
root       369  0.0  0.2   924   516 v4 Is+   6Sep01 0:00.01 /usr/libexec/getty Pc ttyv4
root       370  0.0  0.2   924   516 v5 Is+   6Sep01 0:00.01 /usr/libexec/getty Pc ttyv5
root       371  0.0  0.2   924   516 v6 Is+   6Sep01 0:00.01 /usr/libexec/getty Pc ttyv6
root       372  0.0  0.2   924   516 v7 Is+   6Sep01 0:00.01 /usr/libexec/getty Pc ttyv7
stephen  98642  0.0  0.4  1404   948 p6 Is+  15Oct01 0:00.96 -tcsh (tcsh)
root      8291  0.0  0.2   924   580 v0 Is+  17Oct01 0:00.00 /usr/libexec/getty Pc ttyv0
stephen  81305  0.0  0.3  1260   836 p3 Is   Wed06PM 0:00.10 -tcsh (tcsh)
andrew    4817  0.0  0.3  1272   828 p1 Is   Tue12PM 0:00.03 -tcsh (tcsh)
andrew    4818  0.0  1.3  4716  3384 p1 I+   Tue12PM 0:00.36 pine -i
root      8476  0.0  0.3  1276   856 p3 S      1:23PM 0:00.06 _su (tcsh)
As you can see, it is arranged neatly in columns, one line per process. The columns are:
- USER: the owner of the process. Without the -a option, you would only see
processes that have your own username here.
- PID: the process' unique identification number. You will never see two processes
with the same PID, although after a process has been deleted, its PID may later be reused.
- %CPU: What share of the available CPU time this process used over the past minute.
- %MEM: What share of the available physical memory this process is currently using.
- VSZ: Total amount of virtual memory currently in use, measured in K bytes. Process
98642 is using about 1.4 MB.
- RSS: Total amount of physical memory currently in use, also in K bytes. (RSS = Resident Set Size).
- TT: Controlling Terminal name (TT = TeleType). Doesn't mean much in these days of network connections,
but they are at least consistent. "p" numbers usually correspond to telnet windows, so you
can see that processes 9008, 81305, and 8476 are all from the same telnet window (p3). Evidently
I logged in normally (process 81305 running the shell "tcsh") and issued the "su" command
(process 8476) to temporarily log me in as root; from that shell I typed "ps -au" (process
9008) to produce the output pasted above.
- STAT: The State of the process, exactly what we are interested in.
The first letter of the State column is the most important one: it tells you which
real state the process is in. The letters most commonly used are:
- R: Runnable or Running
- P: Page Wait (not used on rabbit)
- D: Disc Wait (also page wait on rabbit)
- S: Sleeping (short period)
- I: Idle Wait (long period, usually interactive input)
- T: Stopped
- Z: Exiting (Zombie)
Subsequent letters are very system dependent. The only interesting one
is W, which annoyingly doesn't appear in the sample because
(apparently) we have plenty of memory on this system. W means that
the process has been swapped out (i.e. lost nearly all of its physical
memory and been dumped on disc in the page file). The traditional
rule was that any process in the Idle state would automatically be
swapped out. If the system gets short of physical memory, processes
in other states are swapped out too, in reverse order of their likelihood
of waking up soon (i.e. S's are swapped out first, then D's, then P's, and
finally, if the system still needs to recover more memory, R's get swapped out).
If you ever see an RW process, you should probably consider buying more memory
or changing the system settings.
A Zombie process is never swapped out (ZW never appears) because
a zombie process has already given up nearly all of its pages anyway.
The other letters seen in the sample, "s" and "+", indicate that
the process is a "session leader" (s), which is rarely of any interest
and doesn't really mean much, and (+) that it is the "foreground process"
on a particular terminal. The foreground process is the one that would
receive any input typed on the keyboard, and would receive any kill
signals generated by control-Cs.
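If you want a program to read this kind of output, a sketch like the following
is enough for the sample shown above. It is written against this particular
eleven-column layout and the letters listed above; the names COLUMNS, STATES,
FLAGS, parse_ps_line and decode_stat are invented for the example, and other
systems use other columns and letters.

```python
COLUMNS = ["USER", "PID", "%CPU", "%MEM", "VSZ", "RSS",
           "TT", "STAT", "STARTED", "TIME", "COMMAND"]

STATES = {"R": "runnable or running", "P": "page wait", "D": "disc wait",
          "S": "sleeping", "I": "idle wait", "T": "stopped",
          "Z": "exiting (zombie)"}

FLAGS = {"W": "swapped out", "s": "session leader", "+": "foreground process"}

def parse_ps_line(line):
    """Split one line of output into a dict keyed by column name."""
    # At most 10 splits, so a COMMAND containing spaces stays in one piece.
    return dict(zip(COLUMNS, line.split(None, 10)))

def decode_stat(stat):
    """Return (state description, list of flag descriptions)."""
    state = STATES.get(stat[0], "unknown state")
    flags = [FLAGS.get(ch, f"unknown flag {ch!r}") for ch in stat[1:]]
    return state, flags

if __name__ == "__main__":
    row = parse_ps_line("root 366 0.0 0.2 924 516 v1 Is+ 6Sep01 0:00.01 "
                        "/usr/libexec/getty Pc ttyv1")
    print(row["PID"], row["COMMAND"], decode_stat(row["STAT"]))
```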
If you use all the interesting options: "ps -aux", you see every process
that the system has. Sometimes that can be interesting.