Arrays, Strings, and Pointers

To understand how pointers work, and how they relate to arrays and strings, you first need to be able to read declarations properly. Simple declarations are easy, but once stars and square brackets start to appear together, things can get complicated.

Here are the essential things to know:

In both declarations and expressions, the square brackets that represent an array access have a higher priority than the star that represents following a pointer. *a[3] is understood as *(a[3]), not (*a)[3], so a is primarily an array because the [3] operation is applied directly to it. a[3] is a pointer because the * operation is applied to it. Therefore a is an array of pointers.
You are allowed to write parentheses in declarations, just like in expressions, if you want to override the priorities of the operators, or if you want to make them more obvious.
When you are reading a complex declaration, it is useful to write in all the parentheses that are made implicit by the priorities of the operators. Or at least imagine that they are written in. So, when you see int *a[7];, read it as int *(a[7]);.
Once the parentheses are written in, or at least being imagined, identify what name is being declared. In the above example it is obviously a, and you may imagine that it is always equally obvious. It isn't. When pointers to functions are being declared it is sometimes quite difficult to work out just which name is being declared.
The name being declared is the centre of the declaration. Once you have identified it, and put in the parentheses, read the declaration from the centre outwards. Start with the name and always go to the next most closely connected thing. For example, if you see this declaration: int *(*a)[9];, there is no need to panic, just put in all the parentheses, to see: int (*((*a)[9]));, then start at the centre a and work outwards through these stages:
1. The name being declared is a;
2. A * is connected to it, so a is a pointer to something;
3. A [9] is connected to that, so a is a pointer to an array of nine things;
4. A * is connected to that, so a is a pointer to an array of nine pointers to somethings;
5. The int is connected to that, so a is a pointer to an array of nine pointers to integers.

That's all it ever takes, and that last example is one that would give many professional programmers a panic attack. Don't worry if it is hard to see exactly what is happening right away. Just having this much understanding of how declarations really work puts you well ahead of the crowd. A bit of experience will make it all seem easy.

Another important thing to keep in mind is that a pointer is just a number, the numeric address of the memory location that contains the thing we're pointing to. In fact, most useful things don't fit in a single location (an integer normally takes 4 bytes, so an array of 20 integers would then require 80), so a pointer is only really a pointer to the beginning of something, and a pointer to a single integer holds exactly the same kind of number as a pointer to the first of a million integers.
        If you have declared an array like this: int a[20];, then a pointer to the whole array holds exactly the same address as a pointer to its first element, so &a and &a[0] are the same number. The compiler does give them different types (&a is a pointer to an array of 20 integers, &a[0] is a pointer to one integer), but as addresses they are indistinguishable.
        A consequence of that fact is that if you see a declaration like this: int *p;, you can't tell whether p is supposed to be a pointer to an integer, or a pointer to a whole array of integers. Normally it doesn't matter, but sometimes it will be much easier to understand a program if you keep in mind the idea that the type int * could be read as both "pointer to integer" and "pointer to a whole bunch of integers".

Yet another important thing to be aware of is just what a name refers to. (It might sound better if I wrote "variable" instead of "name"; it would certainly make more sense, but it would be wrong, as you'll see soon.)
        Here are the important cases, starting with the really obvious one, just so that you see the pattern:

Declaration: int a;
- The name a refers to the value of the variable. So if you say 3+a it means "three plus the value of the variable"; if you say a=9 it means "make the value of the variable be 9".
- &a is allowed, and refers to the address of the variable.
- *a is not allowed, because integers are distinguished from pointers, even though pointers really are just integers deep inside the machine.
Declaration: const int b=3;
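- The name b refers to the number 3. It can be read just like a variable, but it can never be changed.
- &b is allowed, but what you get is a pointer to a constant integer (a const int *), so it can not be used to change b.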
- *b is also not allowed.
Declaration: int c[7];
- The name c refers to the address of the array. c is what you should type if you think you want to type &c.
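- &c is allowed, but it is rarely what you want. It holds the same address as c, but its type is "pointer to an array of 7 integers" rather than "pointer to integer". c already behaves as a pointer.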
- *c means exactly the same thing as c[0].
- Remember that you can do simple arithmetic + and - on pointers, if you get the priorities right. The * that means "follow this pointer" has a higher priority than all of the arithmetic operators.
  - *c+1 means exactly the same thing as c[0]+1, and is nothing special at all.
  - *(c+1) means the same thing as c[1]. If you add one to the address of the beginning of an array, you get the address of the second thing in that array.
Declaration: int *d=&a;
- The name d refers to the variable that you just declared. d is a perfectly normal variable that just happens to be able to point to other variables.
  - Little note: Variables can point to other variables, but no variable can ever point to itself. To make something capable of pointing to the integer a, I had to declare it as a pointer-to-integer: int *d; if I wanted something to point to d it would be a pointer-to-pointer-to-integer, declared int **e;. A pointer has to have more stars in its type than the thing it points to. There is one obscure exception to this rule, but there's no need to worry about that yet.
- &d is allowed, it is a pointer to a pointer to an integer. &d is a pointer to d itself; it is not the same thing as d, and it is not another pointer to the thing that d points to.
- *d is allowed, it refers to the variable whose address d contains. In other words, it means the thing that d points to.
- If you say d=something, you are changing d itself, making it point to something else; you are not changing the value of the variable that it points to.
- If you say *d=something, you are not changing d at all, it remains pointing to the same variable. You are changing the value of the variable that it points to.
- Pointers have to be used with care. Remember that there is no difference between a pointer to a thing, and a pointer to an array of things. So, if we have: int a=6; int *d=&a;, d is a pointer to a, a simple single integer variable, so *d has the value 6. But we are also allowed to say *(d+72) or even d[72] (which is exactly the same thing always). Both of these would be bad things to do; they are accessing unpredictable memory locations. If you were lucky, they would crash your program. It is much more likely that they would just induce wrong answers that might never be noticed.
Declaration: const int *f;
- This is not necessarily what everyone would expect. f is not a constant. It does not even have to point to a constant. In this context, the const, being most closely attached to the int, is saying that f is a pointer to an integer that it will treat as a constant. In other words, you can make f point to any integer at all (for example f=&a); that integer does not need to be a constant, and it won't suddenly become one. But you will not be able to use f to change it. *f=something is not allowed. This is a Useful Thing. It lets you protect your own variables from accidental changes.
- In every other way, f behaves the same as d did.
- Declarations of this sort are much more commonly used for parameters to functions than for normal variables.
Declaration: int * const g = &a;
- This is again not necessarily what everyone would expect. This time g is a constant (the word const is most closely attached to it), so although it is a perfectly normal pointer in every other way, it can not be changed. g will always point to a.
- &g is allowed, but what you get is a pointer to a constant pointer (an int * const *), so it can not be used to change g either.
- *g is allowed, and is the same thing as a.
- g=something is not allowed.
- *g=something is allowed; it changes the value of a.
- Note: int const *gg; is exactly the same thing as const int *gg;, which is not the same thing as int *const gg;.
Declaration: const int * const h = &a;
- This means what most people expect the previous two examples to mean. You can't change h, nor can you use it to change the thing it points to. Both h=something and *h=something are forbidden.
- Declarations of this sort are much more commonly used for parameters to functions than for normal variables.

The Differences Between Arrays and Pointers

You have seen that pointers and arrays are completely interchangeable, and even indistinguishable, in many circumstances. In other circumstances there are real differences.

If you make these two declarations:
        char x[80];
        char *y;
80 bytes of memory are allocated for x, and x is set to point to the first of those bytes. The name x behaves like a pointer, but it is more like one that was declared as char * const x rather than the normal char *x. It is important to remember that x is not a variable, it is a constant pointer.
        You can not say x=y; The name x refers to the address of the array, so the assignment would be trying to change the array's address, which is impossible.
        You can of course say x[0]=44; or x[7]=x[72]*3; or anything like that. If you couldn't, arrays would be useless.
        And, because arrays and pointers are very similar, you can also say y[0]=44; or y[7]=y[72]*3; or anything like that, but be careful: y is just a simple variable, with just enough space to store a pointer to an array. The declaration char *y does not create an array, it just gives you a place to store a pointer to one that already exists. If you say y[3]=99 straight away, your program will go wrong. Make y point to something real before you use it.
        You can say y=x; The name x refers to the address of the array, so it behaves as a pointer; the name y refers to a variable that can hold pointers, so it is just right. Once you have said y=x, you have given y something real to point to, so it is now reasonable to say y[0]=44; or y[7]=y[72]*3; or anything like that. Until you make y point somewhere else, y[i] will be exactly the same as x[i] always.
        Remember x[0] is the same as *x, &x[0] is the same as x, &x[17] is the same as x+17, and x[17] is the same as *(x+17).
        You can make use of those equivalences to make arrays a little more convenient. If you have a large array, something like this: int large[1000]; and at some point you find that a small segment of that array (perhaps elements 360 to 369) is very frequently accessed, almost as though it were its own little 10 element sub-array buried inside the whole, you can declare a convenient extra access method: int *small=large+360;. With this, small is a pointer to integers, and the place it points to is 360 locations into the array large. So, *small is the same thing as *(large+360), which is the same thing as large[360]. Similarly, small[3] is the same thing as *(small+3), which is the same thing as *(large+360+3), which is the same thing as large[363]. small behaves like an array in its own right, but it does not occupy any new memory. small[0] to small[9] are just alternate, more convenient, names for large[360] to large[369].