Stack Frames
This covers: the allocation of memory for variables and other things in a running
program. You should be more-or-less familiar with the material in the
variables section before reading this.
When a program is running, the computer's memory is usually divided up into a few completely
distinct areas. The exact details vary from one implementation to another, but usually
the very lowest addresses (zero to a couple of hundred) are reserved for operating system things,
the next portion is occupied by the code of the program, the next portion for global variables
(actually, for anything with global extent; this includes local statics), the next portion
is set aside for dynamic memory allocation (used by malloc or new), then a very large
area of memory is left unused, and finally at the highest addresses of all, the local variables
are kept.
You can see evidence of this arrangement by running a program
that simply pints the addresses of a number of different kinds of variables. For instance, this
program:
#include <stdio.h>
void f(void)
{ int x,y;
static int w;
printf("statics in f: &w=%10u\n", &w);
printf("locals in f: &x=%10u\n", &x);
printf(" &y=%10u\n", &y); }
int a,b;
void main(void)
{ int m,n;
static o;
printf("functions: &main=%10u\n", &main);
printf(" &f=%10u\n", &f);
printf("globals: &a=%10u\n", &a);
printf(" &b=%10u\n", &b);
printf("dynamic: malloc()=%10u\n", malloc(100));
printf(" malloc()=%10u\n", malloc(100));
printf("static in main: &o=%10u\n", &o);
f();
printf("locals in main: &m=%10u\n", &m);
printf(" &n=%10u\n", &n); }
When compiled by the default C compiler under freeBSD produces these results:
functions: &main= 5788
&f= 5500
globals: &a= 8416
&b= 8412
dynamic: malloc()= 94208
malloc()= 94336
static in main: &o= 8388
statics in f: &w= 8384
locals in f: &x=4022328508
&y=4022328504
locals in main: &m=4022328524
&n=4022328520
(note that the addresses are printed with the format %10u because some
of them are very high. On a computer with a 32 bit processor, the addresses of the first few
local variables will be very close to 232. On such a computer, any number between
231 and 232 appears to be negative unless forced into an unsigned
interpretation).
All but two of these regions of memory have
a fixed size while the program is running (programs do not grow, nor do they sprout new
global variables). However, as new malloc()s or news are executed, the
space needed for dynamic allocation grows. As new functions are called, bringing their
local variables into existence, the space required for locals also grows. That is the resaon
for the large gap left between those two memory areas. The local area can grow downwards into
lower addresses, and the dynamic area can grow upwards into higher addresses without ever
interfering with each other until the whole of the computer's memory is used up.
As the number and size of global and static variables
is fixed, the compiler and linker can allocate storage space for them before the program begins to
run. Those variables have Static Extent meaning that they "exist" (i.e. keep their
mnemory locations) throughout the entire run of a program, even when they are out of scope.
Local variables come and go at the whim of a running
program. If a function is called resursively, many different versions of its local variables
will exist at the same time, each having their own private allocation of memory. Local variables
are kept on a Stack. Before a program starts, the Stack Pointer, a hardware CPU
register that is reserved for pointing to the end of the stack, is set to point to the highest
addresses available.
Every time a function is called, space is made on the
stack for all of its local variables (plus a little extra) simply by subtracting from SP the number of
bytes required. Usually at this point the new value of the stack pointer is copied into another
reserved CPU register called the Frame Pointer. This allows the stack to still be used
for saving temporary results, without losing track of our position.
The new area of stack carved out for a function call
is called a Stack Frame. The job of the frame pointer is to point to the base of the
currently active stack frame, which is how it got its name. While a function is running, all
accesses to local variables are relative to the frame pointer. In the example program above,
a typical compiler might decide that main and f each require 16 bytes of stack space.
They both have two local ints (which normally require 4 bytes each), and the "little extra"
mentioned in the previous paragraph will amount to 8 bytes.
So, the first action of the running program is the calling of
main. This results in two instructions being executed:
SUB #16, SP
MOVE SP, FP
to move the stack pointer down by 16 bytes (hence creating a new 16 byte stack frame), then
save the address of the base of that new frame safely in the frame pointer.
Inside any function, the first two full-size (4 byte) things
in the stack frame (addressed as 0(FP) and 4(FP)) will probably be the "little extras".
These are the return address, indicating where in the program
to continue execution after this function is finished, and the
saved frame pointer. Every function needs to have the frame pointer pointing at its own stack frame.
When one function calls another the frame pointer will be moved, so its original value must be saved
so that it can be restored when the new function exits and the original is allowed to continue.
After those little extras, the function's local variables are
found. Inside main, m will be addressed as 8(FP) and n
will be addressed as 12(FP) (or something very similar at least). Inside f,
x will be addressed as 8(FP) and y will be addressed as 12(FP).
This simple mechanism makes recursion work without any extra effort.
It is actually harder to implememnt a storage allocation scheme that doesn't support recursion.
Look at a simple recursive program to see how well it works:
void f(int x)
{ int y;
y=1001*x;
printf("A: x=%d, y=%d\n", x, y);
if (x>1) f(x-1);
printf("B: x=%d, y=%d\n", x, y); }
void main(void)
{ int w=3;
f(w); }
At the point when the program is printing "A: x=3, y=3003", the stack would look something like this:
(I am assuming that every useful value requires 4 bytes, so addresses are only shown in 4 byte
increments. It is also assumed that the satck pointer was set to exactly 232=4294967296
when the program started. I am pretending that 12345 is the address of the instruction immediately
following the call of f in main, and that 11223 is the address of the instruction immediately
following the recursive call of f (i.e. just before the second printf)).
Use | Address | Contents | |
W | 4294967292 | 3 | main's stack frame |
saved fp | 4294967288 | 0 |
return address | 4294967284 | 0 |
Y | 4294967280 | 3003 | F's 1st stack frame |
X | 4294967276 | 3 |
saved fp | 4294967272 | 4294967284 |
resturn address | 4294967268 | 12345 | <- FP |
Of course, the column labelled "contents" is all that is actually stored in the computer.
After a couple of recursive calls,
we will reach a position just as "A: x=1, y=1001" is being
printed, when the stack looks like this:
Use | Address | Contents | |
W | 4294967292 | 3 | main's stack frame |
saved fp | 4294967288 | 0 |
return address | 4294967284 | 0 |
Y | 4294967280 | 3003 | F's 1st stack frame |
X | 4294967276 | 3 |
saved fp | 4294967272 | 4294967284 |
return address | 4294967268 | 12345 |
Y | 4294967264 | 2002 | F's 2nd stack frame |
X | 4294967260 | 2 |
saved fp | 4294967256 | 4294967268 |
return address | 4294967252 | 11223 |
Y | 4294967248 | 1001 | F's 3rd stack frame |
X | 4294967244 | 1 |
saved fp | 4294967240 | 4294967252 |
return address | 4294967236 | 11223 | <- FP |
Note that the first part of the stack (the part that appeared in the first diagram) remained
completely unchanged. As the program progresses, it will see that x is not greater than 1,
so no further recursive call happens. "B: x=1, y=1001" is printed, and f exits.
When a function exits, the special information it contains
(old frame pointer and return address) is retrieved, so in this case the FP set set to 4294967252
and the program continues to run from address 11223 (just before the second printf), and the
stack frame is deleted by simply moving the frame pointer back to coincide with the restored value
of the frame pointer. Although the old data from this deleted stack frame remains in memory, it is
ignored because anything beneath SP is considered to be unallocated memory, and could be overwritten
at any time. So, after the third call of f exits, the stack will look like this:
Use | Address | Contents | |
W | 4294967292 | 3 | main's stack frame |
saved fp | 4294967288 | 0 |
return address | 4294967284 | 0 |
Y | 4294967280 | 3003 | F's 1st stack frame |
X | 4294967276 | 3 |
saved fp | 4294967272 | 4294967284 |
return address | 4294967268 | 12345 |
Y | 4294967264 | 2002 | F's 2nd stack frame |
X | 4294967260 | 2 |
saved fp | 4294967256 | 4294967268 |
return address | 4294967252 | 11223 | <- FP |
and the program will continue running from location 11223, just before the second printf in f.
It now has access to the versions of X and Y that were left in the second stack frame for f,
automatically through 8(FP) and 12(FP), so it continues to print
"B: x=2, y=2002".
Exactly the same mechanism lets it get back to f's
first frame and print out "B: x=3, y=3003", before finally returning to main and halting.
In summary, the entire output would be:
A: x=3, y=3003
A: x=2, y=2002
A: x=1, y=1001
B: x=1, y=1001
B: x=2, y=2002
B: x=3, y=3003
If the program had been changed slightly, so that the declaration in f is changed from
{ int y;
to
{ static int y;
things would be very different. Static variables are not allocated space in the stack frames;
the are allocated permanent space in the global variable area. Every time any call of f accesses
the variable y, it will be using the same memory locations, so when recursive calls exit,
there is no way to retrieve the previous values. X is still a normal local variable, living in
the stack frames, so its previous values are retrieved. The changed program would produce this
output:
A: x=3, y=3003
A: x=2, y=2002
A: x=1, y=1001
B: x=1, y=1001
B: x=2, y=1001
B: x=3, y=1001
It is not possible to declare the parameters of a function
to be static in C, but if it were, or if we were using a different language that does support that
behaviour, we would of course find that X also failed to recover its original values. The recursion
would still go through the same stages and terminate after 3 calls of f, because each time a function
exits, one stack frame is removed and we only had three frames for f to start with.
All modern langauges allocated normal local variables on the stack in the way we have just seen.
Some allow you to specifically designate local variables as being static, some do not. In older
languages like Fortran, not only are all variables (including parameters) static, but so are
the stack frames. The stack frames for a Fortran program are created, exactly one per function or
subroutine, just once, before the program starts to run. This means that Fortran is absolutely incapable
of supporting recursion, as it has nowhere to store the different return addresses for each call of
a function. In fact, Fortran does not use a stack, so its stack frames are not really stack frames at
all.
What we have just seen has been a very low level view, concerned with the actual mechanics of
implementation, right down to the level of individual CPU registers and memory locations. It is
not necessary to use a stack at all; that just turns out to be the most efficient implementation.
So long as you can create a new frame each time a function is called, and get back to the old frame
when a function exits, everything will work just as well.
Often, it is better to take a more abstract and high-level view
of things. An Activation Record is a data structure that holds all the information needed to
support one call of a function. It contains all the local variables of that function, and a reference
(or pointer) to another activation record; that pointer is known as the Dynamic Link.
Stack Frames are an implementation of Activation Records. The dynamic link corresponds to the "saved FP"
entry; it tells you which activation record to return to when the current function is finished.
The frame pointer itself is simply a way of indicating which activation record is currently in use.
The dynamic links tie all the activation records for a program together in one long linked list,
showing the order they would appear in a stack.
Putting everything in terms of activation records (A.R.s), the rules
for a modern language are:
- There is always a current A.R.; but before the program starts
to run this is null.
- Whenever a function is called:
- A new A.R. is created,
with entries in it for all of the function's local variables plus a dynamic link.
- The dynamic link of the new A.R. is set to refer to
the current A.R.
- The new A.R. becomes the current A.R.
- When a function is running, all local variables are to
be found within the current A.R.
- When a function exits:
- The current A.R. becomes temporarily
known as the old A.R.
- The A.R. referred
to in the Dynamic Link becomes the current A.R.
- The old A.R.
is deleted.
The rules
for Fortran are:
- Before the program starts, create an A.R. for each
program unit, each having entries for all of the program unit's variables plus a dynamic link.
- There is always a current A.R.; before the program starts
to run this is set to be the A.R. for the main program.
- Whenever a function is called:
- Find the A.R. for
that function. Temporarily refer to it as the new A.R.
- The dynamic link of the new A.R. is set to refer to the current A.R.
- The new A.R. becomes the current A.R.
- When a function is running, all local variables are to
be found within the current A.R.
- When a function exits:
- The A.R. referred
to in the Dynamic Link becomes the current A.R.