Arrays, Strings, and Pointers

To properly understand how pointers work, and how they relate to arrays and strings, it is essential to be able to understand declarations properly. Simple declarations are easy, but once stars and square brackets start to appear together, things can get complicated.

Here are the essential things to know: That's all it takes ever, and that last example was one that most professional programmers would get a panic attack from. Don't worry if it is hard to see exactly what is happening right away. Just having this much understanding of how declarations really work puts you well ahead of the crowd. A bit of experience will make it all seem easy.


Another important thing to keep in mind is that a pointer is just a number, the numeric address of the memory location that contains the thing we're pointing to. In fact, most useful things don't fit in a single location (an integer normally takes 4, an array of 20 integers would then require 80), so a pointer is only really a pointer to the beginning of something, and a pointer to a single integer is exactly the same in every way as a pointer to an array of a million integers.
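        If you have declared an array like this: int a[20];, then a pointer to the whole array holds exactly the same address as a pointer to its first element, so &a and &a[0] refer to exactly the same memory location. (Strictly speaking, C gives the two expressions different types, int (*)[20] and int *, but the address itself is identical, and the plain name a used in an expression gives the same thing as &a[0].)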
        A consequence of that fact is that if you see a declaration like this: int *p;, you can't tell whether p is supposed to be a pointer to an integer, or a pointer to a whole array of integers. Normally it doesn't matter, but sometimes it will be much easier to understand a program if you keep in mind the idea that the type int * could be read as both "pointer to integer" and "pointer to a whole bunch of integers".
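The following is a small sketch of those two points, not a definitive demonstration. The function names same_start and sum are made up for illustration. The first function compares the address of a whole array with the address of its first element; the casts are there because the compiler treats the two expressions as different types even though the addresses match. The second shows that an int * parameter works equally well for one integer or for many, as long as the caller says how many there are.

```c
#include <stddef.h>

/* Returns 1 if the array and its first element start at the same address. */
int same_start(void)
{
    int a[20];
    return (void *)&a == (void *)&a[0];
}

/* Adds up count integers starting at p. Nothing in the type of p says
   whether it points to a single int or to the start of a large array. */
int sum(const int *p, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i += 1)
        total += p[i];
    return total;
}
```

With int one = 7; and int many[3] = {1, 2, 3};, both sum(&one, 1) and sum(many, 3) are valid calls to the same function.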


Yet another important thing to be aware of is just what a name refers to. (It might sound better if I wrote "variable" instead of "name"; it would certainly make more sense, but it would be wrong, as you'll see soon.)
        Here are the important cases, starting with the really obvious one, just so that you see the pattern:

The Differences Between Arrays and Pointers

You have seen that pointers and arrays are completely interchangeable, and even indistinguishable, in many circumstances. In other circumstances there are real differences.

If you make these two declarations:
        char x[80];
        char *y;
80 bytes of memory are allocated for x, and the name x refers to the address of the first of those bytes. The name x behaves like a pointer, but it is more like one that was declared as char * const x rather than the normal char *x. It is important to remember that x is not a variable that you can reassign; it behaves as a constant pointer.
        You cannot say x=y; The name x refers to the address of the array, so the assignment would be trying to change the array's address, which is impossible.
        You can of course say x[0]=44; or x[7]=x[72]*3; or anything like that. If you couldn't, arrays would be useless.
        And, because arrays and pointers are very similar, you can also say y[0]=44; or y[7]=y[72]*3; or anything like that, but be careful: y is just a simple variable, with just enough space to store a pointer to an array. The declaration char *y does not create an array, it just gives you a place to store a pointer to one that already exists. If you say y[3]=99 straight away, your program will go wrong, because y does not yet hold the address of any real memory. Make y point to something real before you use it.
        You can say y=x; The name x refers to the address of the array, so it behaves as a pointer; the name y refers to a variable that can hold pointers, so it is just right. Once you have said y=x, you have given y something real to point to, so it is now reasonable to say y[0]=44; or y[7]=y[72]*3; or anything like that. Until you make y point somewhere else, y[i] will be exactly the same as x[i] always.
        Remember x[0] is the same as *x, &x[0] is the same as x, &x[17] is the same as x+17, and x[17] is the same as *(x+17).
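As a rough check of the rules above, here is a minimal sketch that uses the same x and y declarations. The function name equivalences_hold is invented for this example; it returns 1 only if every equivalence holds and if writing through y after y=x really changes x.

```c
/* Returns 1 if the array/pointer equivalences hold for a sample array. */
int equivalences_hold(void)
{
    char x[80] = {0};   /* 80 bytes, all set to zero */
    char *y;            /* holds no useful address yet: y[...] is not safe here */

    x[0] = 44;
    x[17] = 5;

    y = x;              /* now y points to the first byte of x */
    y[7] = 99;          /* this writes to x[7] */

    return x[0] == *x
        && &x[0] == x
        && &x[17] == x + 17
        && x[17] == *(x + 17)
        && x[7] == 99
        && y[17] == x[17];
}
```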
        You can make use of those equivalences to make arrays a little more convenient. If you have a large array, something like this: int large[1000]; and at some point you find that a small segment of that array (perhaps elements 360 to 369) is very frequently accessed, almost as though it were its own little 10 element sub-array buried inside the whole, you can declare a convenient extra access method: int *small=large+360;. With this, small is a pointer to integers, and the place it points to is 360 locations into the array large. So, *small is the same thing as *(large+360), which is the same thing as large[360]. Similarly, small[3] is the same thing as *(small+3), which is the same thing as *(large+360+3), which is the same thing as large[363]. The name small behaves like an array in its own right, but it does not occupy any new memory. small[0] to small[9] are just alternate, more convenient, names for large[360] to large[369].
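The sub-array idea can be sketched as follows. The function name window_demo is hypothetical; it fills large, writes through small, and returns 1 if the write landed in large[363] and small[9] is the same location as large[369].

```c
/* Returns 1 if small really is a 10 element window onto large[360..369]. */
int window_demo(void)
{
    int large[1000];
    for (int i = 0; i < 1000; i += 1)
        large[i] = i;

    int *small = large + 360;   /* small[0] is large[360] */
    small[3] = -1;              /* same as large[363] = -1 */

    return large[363] == -1 && &small[9] == &large[369];
}
```

Nothing stops you from writing small[10] or small[-1]; those are simply large[370] and large[359], so it is up to you to stay inside the ten elements you meant.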

Copying Arrays and Strings

A string is just an array of chars, so anything that applies to an array also applies to a string. The one difference is that strings always have a zero at their ends, but arrays in general don't. If you want to copy an array, you need to know how long it is. If you want to copy a string, you don't need to know how long it is: you can just copy characters until you reach the zero.

Suppose we have two strings:
        char one[50];
        char two[50];
The array one has useful data in it, and we want to copy it into two.
        You have already seen that two=one is not allowed. The name two does not refer to the contents of the array, but to its address, so the assignment wouldn't do the right thing even if it were allowed. The only way to do it is with a loop. There are plenty of ways to arrange the loop; the following is a reasonable example:
            int i=0;
            while (i<50)
            { two[i]=one[i];          /* copy one char from one into two */
              if (one[i]==0) break;   /* stop once the terminating zero is copied */
              i+=1; }
If you don't know the maximum size of the destination string, just replace the while (i<50) with while (1) and hope that the source string isn't too long.
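If "hope" is not good enough, one possible variation is to pass the size of the destination to a small helper and cut the string short when it doesn't fit. This is only a sketch: the name copy_string and its parameter order are assumptions made for this example, not a standard library function, and it differs from the loop above in that it always leaves the destination zero-terminated.

```c
#include <stddef.h>

/* Copies the string src into dst, which has room for cap chars.
   Stops at the terminating zero or when dst is full, and always
   leaves dst zero-terminated. Returns the number of chars copied,
   not counting the zero. */
size_t copy_string(char *dst, size_t cap, const char *src)
{
    size_t i = 0;
    if (cap == 0)
        return 0;               /* no room even for the zero */
    while (i + 1 < cap && src[i] != 0) {
        dst[i] = src[i];
        i += 1;
    }
    dst[i] = 0;
    return i;
}
```

For the two arrays above, the call would be copy_string(two, sizeof two, one);. Note that sizeof only gives the array size when it is applied to a real array such as two, not to a pointer.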

If you have a string and a pointer:
        char one[50];
        char *ptr;
The array one has useful data in it, and we want to copy it into ptr.
        We need to decide what is meant by "copy it into ptr". You can't copy a string into a pointer; it won't fit.
There are three different things that we might reasonably want to happen:
  1. Perhaps we want to make the pointer point to the string held in one. That would be easy, just say ptr=one. After this, there is still only one copy of the original string, and it is stored in one. Ptr is just an extra reference to the same data. If we change the string in one, we will also be changing the string in ptr because they are the same thing.
  2. Perhaps we want to make the thing that ptr already points to contain a copy of the string in one. For that, we would use a loop like the previous one:
                int i=0;
                while (1)
                { ptr[i]=one[i];
                  if (one[i]==0) break;
                  i+=1; }
    
    For this to work, ptr would already have to be pointing to an array of characters long enough to take the new string. Whatever string ptr was pointing to will be overwritten with the contents of one.
  3. Perhaps we want to make a totally new copy of the string, and make ptr point to that new copy. For that, we would first need to find the length of the string to be copied, then use malloc to allocate new memory, then use the same old loop to actually copy the characters.
            Malloc is a system-provided function that finds an amount of new memory for you. Unless no memory is available (in which case it returns NULL), it returns a pointer to an array that is guaranteed to be free for your use, and not accessed by anyone else unless you allow them access to it.
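                /* needs <string.h> for strlen and <stdlib.h> for malloc */
                int len, i;
                len=strlen(one);            /* length, not counting the zero */
                ptr=malloc(len+1);          /* one extra char for the zero */
                if (ptr!=NULL)              /* malloc gives NULL if it fails */
                  for (i=0; i<=len; i+=1)   /* <= so the zero is copied too */
                    ptr[i]=one[i];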
    
    Now you have a completely new and private copy of the string, in a permanently safe place, pointed to by ptr. If later the string in one is modified, it doesn't matter, the new string is not connected with it in any way.
            The one disadvantage to this method is that malloc makes a permanent allocation of memory, not from the current stack frame. That memory will remain allocated until the entire program exits, even if you exit from the current function and the pointer ptr that tells you where the memory is ceases to exist. This means that it is possible to completely lose access to memory that you have allocated.
            That isn't really a disadvantage, you just have to be careful. You wouldn't use malloc unless you particularly wanted a permanent allocation. And it isn't really permanent anyway. When you no longer need the new string, just say free(ptr) and the system will take it back.
            Many short-lived programs don't bother to use free to recycle memory found by malloc. When a program completely finishes, all of its memory is taken back anyway, so a program that is about to exit is just wasting effort in freeing all its mallocs. But, if a program is to continue running for a while, it can't just keep saying malloc every time it needs some more memory, unless it also says free to return old memory that it doesn't need any more. There is only a finite amount of memory in the computer, so if you don't recycle, your program will suffocate itself.
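The three kinds of "copy" can be put side by side in one sketch. The names duplicate and three_copies_demo are invented for this example, and the sample string "hello" is just test data. The demo function returns 1 if the alias sees a later change to one while the two real copies do not.

```c
#include <stdlib.h>
#include <string.h>

/* Case 3: returns a new malloc'd copy of s, or NULL if malloc fails.
   The caller is responsible for calling free on the result. */
char *duplicate(const char *s)
{
    size_t len = strlen(s);
    char *copy = malloc(len + 1);        /* one extra char for the zero */
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i <= len; i += 1) /* <= so the zero is copied too */
        copy[i] = s[i];
    return copy;
}

/* Returns 1 if the three kinds of copy behave as described. */
int three_copies_demo(void)
{
    char one[50] = "hello";
    char buffer[50];
    char *ptr;

    ptr = one;                  /* case 1: ptr is another name for one */
    one[0] = 'j';
    int alias_sees_change = (ptr[0] == 'j');

    ptr = buffer;               /* case 2: ptr already points to real storage */
    int i = 0;
    while (1) {
        ptr[i] = one[i];
        if (one[i] == 0) break;
        i += 1;
    }
    one[0] = 'h';               /* buffer still holds "jello" */
    int buffer_unchanged = (buffer[0] == 'j');

    ptr = duplicate(one);       /* case 3: a brand new private copy */
    if (ptr == NULL)
        return 0;
    one[0] = 'x';               /* the new copy still holds "hello" */
    int copy_unchanged = (ptr[0] == 'h');
    free(ptr);                  /* hand the memory back when done */

    return alias_sees_change && buffer_unchanged && copy_unchanged;
}
```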