Stack Frames

This covers: the allocation of memory for variables and other things in a running program. You should be more-or-less familiar with the material in the variables section before reading this.

When a program is running, the computer's memory is usually divided up into a few completely distinct areas. The exact details vary from one implementation to another, but usually the very lowest addresses (zero to a couple of hundred) are reserved for operating system things, the next portion is occupied by the code of the program, the next portion for global variables (actually, for anything with global extent; this includes local statics), the next portion is set aside for dynamic memory allocation (used by malloc or new), then a very large area of memory is left unused, and finally at the highest addresses of all, the local variables are kept.
You can see evidence of this arrangement by running a program that simply pints the addresses of a number of different kinds of variables. For instance, this program:

      #include <stdio.h>

      void f(void)
      { int x,y;
        static int w;
        printf("statics in f:   &w=%10u\n", &w);
        printf("locals in f:    &x=%10u\n", &x);
        printf("                &y=%10u\n", &y); }

      int a,b;

      void main(void)
      { int m,n;
        static o;
        printf("functions:   &main=%10u\n", &main);
        printf("                &f=%10u\n", &f);
        printf("globals:        &a=%10u\n", &a);
        printf("                &b=%10u\n", &b);
        printf("dynamic:  malloc()=%10u\n", malloc(100));
        printf("          malloc()=%10u\n", malloc(100));
        printf("static in main: &o=%10u\n", &o);
        f();
        printf("locals in main: &m=%10u\n", &m);
        printf("                &n=%10u\n", &n); }

When compiled by the default C compiler under freeBSD produces these results:

      functions:   &main=      5788
                      &f=      5500
      globals:        &a=      8416
                      &b=      8412
      dynamic:  malloc()=     94208
                malloc()=     94336
      static in main: &o=      8388
      statics in f:   &w=      8384
      locals in f:    &x=4022328508
                      &y=4022328504
      locals in main: &m=4022328524
                      &n=4022328520

(note that the addresses are printed with the format %10u because some of them are very high. On a computer with a 32 bit processor, the addresses of the first few local variables will be very close to 2³². On such a computer, any number between 2³¹ and 2³² appears to be negative unless forced into an unsigned interpretation).

       All but two of these regions of memory have a fixed size while the program is running (programs do not grow, nor do they sprout new global variables). However, as new malloc()s or news are executed, the space needed for dynamic allocation grows. As new functions are called, bringing their local variables into existence, the space required for locals also grows. That is the resaon for the large gap left between those two memory areas. The local area can grow downwards into lower addresses, and the dynamic area can grow upwards into higher addresses without ever interfering with each other until the whole of the computer's memory is used up.

       As the number and size of global and static variables is fixed, the compiler and linker can allocate storage space for them before the program begins to run. Those variables have Static Extent meaning that they "exist" (i.e. keep their mnemory locations) throughout the entire run of a program, even when they are out of scope.
       Local variables come and go at the whim of a running program. If a function is called resursively, many different versions of its local variables will exist at the same time, each having their own private allocation of memory. Local variables are kept on a Stack. Before a program starts, the Stack Pointer, a hardware CPU register that is reserved for pointing to the end of the stack, is set to point to the highest addresses available.
       Every time a function is called, space is made on the stack for all of its local variables (plus a little extra) simply by subtracting from SP the number of bytes required. Usually at this point the new value of the stack pointer is copied into another reserved CPU register called the Frame Pointer. This allows the stack to still be used for saving temporary results, without losing track of our position.
       The new area of stack carved out for a function call is called a Stack Frame. The job of the frame pointer is to point to the base of the currently active stack frame, which is how it got its name. While a function is running, all accesses to local variables are relative to the frame pointer. In the example program above, a typical compiler might decide that main and f each require 16 bytes of stack space. They both have two local ints (which normally require 4 bytes each), and the "little extra" mentioned in the previous paragraph will amount to 8 bytes.
       So, the first action of the running program is the calling of main. This results in two instructions being executed:

        SUB   #16, SP
        MOVE   SP, FP

to move the stack pointer down by 16 bytes (hence creating a new 16 byte stack frame), then save the address of the base of that new frame safely in the frame pointer.
       Inside any function, the first two full-size (4 byte) things in the stack frame (addressed as 0(FP) and 4(FP)) will probably be the "little extras". These are the return address, indicating where in the program to continue execution after this function is finished, and the saved frame pointer. Every function needs to have the frame pointer pointing at its own stack frame. When one function calls another the frame pointer will be moved, so its original value must be saved so that it can be restored when the new function exits and the original is allowed to continue.
       After those little extras, the function's local variables are found. Inside main, m will be addressed as 8(FP) and n will be addressed as 12(FP) (or something very similar at least). Inside f, x will be addressed as 8(FP) and y will be addressed as 12(FP).

       This simple mechanism makes recursion work without any extra effort. It is actually harder to implememnt a storage allocation scheme that doesn't support recursion. Look at a simple recursive program to see how well it works:

      void f(int x)
      { int y;
        y=1001*x;
        printf("A: x=%d, y=%d\n", x, y);
        if (x>1) f(x-1);
        printf("B: x=%d, y=%d\n", x, y); }
        
      void main(void)
      { int w=3;
        f(w); }

At the point when the program is printing "A: x=3, y=3003", the stack would look something like this: (I am assuming that every useful value requires 4 bytes, so addresses are only shown in 4 byte increments. It is also assumed that the satck pointer was set to exactly 2³²=4294967296 when the program started. I am pretending that 12345 is the address of the instruction immediately following the call of f in main, and that 11223 is the address of the instruction immediately following the recursive call of f (i.e. just before the second printf)).

Use	Address	Contents
W	4294967292	3	main's stack frame
saved fp	4294967288	0
return address	4294967284	0
Y	4294967280	3003	F's 1st stack frame
X	4294967276	3
saved fp	4294967272	4294967284
resturn address	4294967268	12345		`<- FP`

Of course, the column labelled "contents" is all that is actually stored in the computer.
After a couple of recursive calls, we will reach a position just as "A: x=1, y=1001" is being printed, when the stack looks like this:

Use	Address	Contents
W	4294967292	3	main's stack frame
saved fp	4294967288	0
return address	4294967284	0
Y	4294967280	3003	F's 1st stack frame
X	4294967276	3
saved fp	4294967272	4294967284
return address	4294967268	12345
Y	4294967264	2002	F's 2nd stack frame
X	4294967260	2
saved fp	4294967256	4294967268
return address	4294967252	11223
Y	4294967248	1001	F's 3rd stack frame
X	4294967244	1
saved fp	4294967240	4294967252
return address	4294967236	11223		`<- FP`

Note that the first part of the stack (the part that appeared in the first diagram) remained completely unchanged. As the program progresses, it will see that x is not greater than 1, so no further recursive call happens. "B: x=1, y=1001" is printed, and f exits.
When a function exits, the special information it contains (old frame pointer and return address) is retrieved, so in this case the FP set set to 4294967252 and the program continues to run from address 11223 (just before the second printf), and the stack frame is deleted by simply moving the frame pointer back to coincide with the restored value of the frame pointer. Although the old data from this deleted stack frame remains in memory, it is ignored because anything beneath SP is considered to be unallocated memory, and could be overwritten at any time. So, after the third call of f exits, the stack will look like this:

Use	Address	Contents
W	4294967292	3	main's stack frame
saved fp	4294967288	0
return address	4294967284	0
Y	4294967280	3003	F's 1st stack frame
X	4294967276	3
saved fp	4294967272	4294967284
return address	4294967268	12345
Y	4294967264	2002	F's 2nd stack frame
X	4294967260	2
saved fp	4294967256	4294967268
return address	4294967252	11223		`<- FP`

and the program will continue running from location 11223, just before the second printf in f. It now has access to the versions of X and Y that were left in the second stack frame for f, automatically through 8(FP) and 12(FP), so it continues to print "B: x=2, y=2002".
Exactly the same mechanism lets it get back to f's first frame and print out "B: x=3, y=3003", before finally returning to main and halting.
In summary, the entire output would be:

      A: x=3, y=3003
      A: x=2, y=2002
      A: x=1, y=1001
      B: x=1, y=1001
      B: x=2, y=2002
      B: x=3, y=3003

If the program had been changed slightly, so that the declaration in f is changed from

      { int y;

      { static int y;

things would be very different. Static variables are not allocated space in the stack frames; the are allocated permanent space in the global variable area. Every time any call of f accesses the variable y, it will be using the same memory locations, so when recursive calls exit, there is no way to retrieve the previous values. X is still a normal local variable, living in the stack frames, so its previous values are retrieved. The changed program would produce this output:

      A: x=3, y=3003
      A: x=2, y=2002
      A: x=1, y=1001
      B: x=1, y=1001
      B: x=2, y=1001
      B: x=3, y=1001

       It is not possible to declare the parameters of a function to be static in C, but if it were, or if we were using a different language that does support that behaviour, we would of course find that X also failed to recover its original values. The recursion would still go through the same stages and terminate after 3 calls of f, because each time a function exits, one stack frame is removed and we only had three frames for f to start with.
       All modern langauges allocated normal local variables on the stack in the way we have just seen. Some allow you to specifically designate local variables as being static, some do not. In older languages like Fortran, not only are all variables (including parameters) static, but so are the stack frames. The stack frames for a Fortran program are created, exactly one per function or subroutine, just once, before the program starts to run. This means that Fortran is absolutely incapable of supporting recursion, as it has nowhere to store the different return addresses for each call of a function. In fact, Fortran does not use a stack, so its stack frames are not really stack frames at all.

What we have just seen has been a very low level view, concerned with the actual mechanics of implementation, right down to the level of individual CPU registers and memory locations. It is not necessary to use a stack at all; that just turns out to be the most efficient implementation. So long as you can create a new frame each time a function is called, and get back to the old frame when a function exits, everything will work just as well.
       Often, it is better to take a more abstract and high-level view of things. An Activation Record is a data structure that holds all the information needed to support one call of a function. It contains all the local variables of that function, and a reference (or pointer) to another activation record; that pointer is known as the Dynamic Link. Stack Frames are an implementation of Activation Records. The dynamic link corresponds to the "saved FP" entry; it tells you which activation record to return to when the current function is finished. The frame pointer itself is simply a way of indicating which activation record is currently in use. The dynamic links tie all the activation records for a program together in one long linked list, showing the order they would appear in a stack.

Putting everything in terms of activation records (A.R.s), the rules for a modern language are:

There is always a current A.R.; but before the program starts to run this is null.
Whenever a function is called:
- A new A.R. is created, with entries in it for all of the function's local variables plus a dynamic link.
- The dynamic link of the new A.R. is set to refer to the current A.R.
- The new A.R. becomes the current A.R.
When a function is running, all local variables are to be found within the current A.R.
When a function exits:
- The current A.R. becomes temporarily known as the old A.R.
- The A.R. referred to in the Dynamic Link becomes the current A.R.
- The old A.R. is deleted.

The rules for Fortran are:

Before the program starts, create an A.R. for each program unit, each having entries for all of the program unit's variables plus a dynamic link.
There is always a current A.R.; before the program starts to run this is set to be the A.R. for the main program.
Whenever a function is called:
- Find the A.R. for that function. Temporarily refer to it as the new A.R.
- The dynamic link of the new A.R. is set to refer to the current A.R.
- The new A.R. becomes the current A.R.
When a function is running, all local variables are to be found within the current A.R.
When a function exits:
- The A.R. referred to in the Dynamic Link becomes the current A.R.