Linked Lists

The idea of a growable array is all fine and dandy, bu has one major disadvantage: Every time the array size is changed, a whole new array has to be created, and everything from the old array copied into it. When arays start to get big (and in computing terms, big could easily mean multiple millions of entries) all that copying is just too time-consuming. Linked Lists provide a more flexible storage method than arrays, and although they have their own significant disadvantages, you never have to worry about copying into a new array.


Building a linked list is like taking a large group of small children to a museum. Instead of hiring an omniscient super-teacher, who can keep direct control of all the children all the time (like an array directly holding all of its elements), you make the children into a crocodile (at least, that's what we used to call it. Perhaps you know it by a different name). The teacher holds the hand of the first child, the first child holds the hand of the second child, the second child holds the hand of the third child, and so on, right down the line until you reach the last child who is left with one free hand, and can slightly misbehave.
        With a linked list, the program just keeps hold of one of the items; each item is made responsible for holding on to one other item, so they all form a nice long chain or crocodile.
        For this to work, each object in the list has to be capable of holding on to another similar objects. Children have hands; most objects in computer programs don't. However, we can easily give an object a hand, in the form of a pointer. If we are going to hold a class of objects in a linked list, we have to add an extra member to each object (traditionally called "next") to act as the hand. Here is an example of a simplified version of Person, modified for linked-list-ification:
class Person
{ public:
    char *name;         // some real data
    char *address;      // some more real data
    Person *next;       // the "hand"
    Person(void);       // the default constructor
    Person(char *nam, char *adr)    // the other constructor
    void print(void);   // a useful method
    etc; };


If you've got a bunch of Persons it wouldn't be difficult to run through them all, linking them together. For example, if they were initially in an array, this would do the trick:
Person *world;              // to hold the linked list's beginning
Person *original[1000];     // the original array of them;
int i;
world=original[0];
for (i=0; i<999; i+=1)
  original[i]->next=original[i+1];
original[999]->next=NULL;
"World" corresponds to the teacher, holding on to child[0], each child[i] holds on to child[i+1]. We make the last child hold on to the special value NULL so that it hasn't got a free hand to misbehave with. If you were expecting a lot of ampersands in there to make pointers, don't. The array "original" is already an array of pointers, so there's no need for any ampersands.
        We know better than to use a plain array of Persons, but if we ever lost our marbles and used one, the example would have to be like this:
Person *world;             // to hold the linked list's beginning
Person original[1000];     // the original array of them;
int i;
world=&original[0];
for (i=0; i<999; i+=1)
  original[i].next=&original[i+1];
original[999].next=NULL;


Fairly easy stuff. If we want to do anything with the contents of the linked list, there isn't much more to it. Just start at the beginning with the object that "world" points to; after dealing with it, follow its "next" pointer, and deal with the object you find there. When you find an object whose "next" is NULL, you have finished. For example...
Printing them all:
Person *ptr=world;
while (ptr!=NULL)
{ ptr->print();
  ptr=ptr->next; }

Finding the average:
int num=0;
int total=0;
Person *ptr=world;
while (ptr!=NULL)
{ total+=ptr->weight;
  num+=1;
  ptr=ptr->next; }
printf("Average weight=%d\n", total/num);

Finding a particular one:
Person *ptr=world;
while (ptr!=NULL && strcmp(ptr->name,"Enry Iggins")!=0)
{ ptr=ptr->next; }
if (ptr==NULL)
  printf("Couldn't find Enry Iggins anywhere\n");
else
{ printf("We have found Enry Iggins:\n");
  ptr->print(); }


Normally of course, we would not have an array-full of data waiting to be made into a linked list. We would create a linked-list right from the beginning, with no middle-man to consume resources. In these cases, the procedure would not be much different. For instance, reading Person data line-by-line from a file and directly creating a linked list, would be something like this:
FILE *fil=fopen(......);
Person *world=NULL;
Person *temp;
char line[500];
while (1)
{ if (fgets(line,499,fil)==NULL) break;
  temp=new Person(line);
  temp->next=world;
  world=temp; }
fclose(fil);
At each stage, world contains the whole list as read so far. When we read a new person, it is to become the first element in the list, so we store a pointer to the whole list so far in its next member, and let it take over.

In general, the recurring theme in linked list operations: while&nsbp;(ptr!=NULL) { ...; ptr=ptr->next; } is very easy to see, and is the source of the one main weakness of linked lists: with an array, if you want the millionth element, you can just say
ptr=array[1000000];
With a linked list, you must say
ptr=world;
for (int i=0; i<1000000; i+=1)
  ptr=ptr->next;
And that takes a lot of work. If you want to process every object, a linked list is just as fast as an array, and better in just about every other way. If you want to find a particular object, arrays give you instant access, but linked lists take time proportional to the size of the list.
        With a modern computer like a Pentium II, you can expect to be able to run through a linked list of a million elements in no more than a tenth of a second. It doesn't sound like much, but in a large program, little times soon add up to a long time. Accessing an element of an array should take less than a tenth of a millionth of a second. Quite a difference.
        But remember, a linked list is only slow if it is long and you have to find a particular element. In just about all other circumstances they are pretty good.

An Improvement

Making each Person object contain a special member (next) to support membership in a linked list introduces its own problems. Consider the plight of the poor Goldfish. We need one very long linked list to keep the membership book's list of all goldfish: no problem, just give the Goldfish objects a "next" member just like the Persons. However, as each Person may own a large number of Goldfish, we have a second use for a linked list. The club has a linked list of all goldfish, and each human member has a linked list of the goldfish that he, she, or it owns. Every goldfish has to be in two linked lists at the same time. This is possible to do (just have two members, "next1" and "next2"), but starts to get very confusing.
        A much better solution is to leave the Persons and Goldfish alone. Let them do their own thing, representing people and fish, without having to worry about the details of how they are stored. Our "real" objects can play a totally passive role in their own storage, if we introduce artificial container objects to handle the storage tasks instead. To represent a linked list or Persons, we introduce a new object with just two members, a pointer to a Person, and the "next" pointer:
class PersonLink
{ public:
    Person *thisone;
    PersonLink *next; 
A similar link is introduced for Goldfish:
class GoldfishLink
{ public:
    Goldfish *thisone;
    GoldfishLink *next; 
Clearly, it is possible to make linked lists from each of these kinds of objects. These lists would not be linked lists of Persons or of Goldfish, because they do not contain any Person or Goldfish data. However, they would be linked lists of pointers to Persons and pointers to Goldfish, and as we already have seen, pointers to things are often better than the things themselves.

        It is very important to see the difference. The Person objects are not directly involved in these new linked lists. Persons are pointed to by the links, but they do not have to do any pointing themselves. This means that any Person or Goldfish could be involved in a million different linked lists without ever knowing it. It could also be involved in none at all. The PersonLinks form the crocodile, all pointing to one another, and each PersonLink in the crocodile points to a Person.

When we are creating a linked list, it really doesn't get very complex. I'll take the original example of reading Persons from a file, and modify it to fit the new idea:
FILE *fil=fopen(......);
PersonLink *world=NULL;
PersonLink *temp;
Person *p;
char line[500];
while (1)
{ if (fgets(line,499,fil)==NULL) break;
  p=new Person(line);
  temp=new PersonLink();
  temp->thisone=p;
  temp->next=world;
  world=temp; }
fclose(fil);
Processing the list of Persons is not much different: just walk through the PersonLinks in the normal way, at each step, follow the "thisone" pointer to find the appropriate Person. I'll combine two examples in one: Listing all the people, and calculating their average weight at the same time:
int num=0;
int total=0;
PersonLink *ptr=world;
while (ptr!=NULL)
{ total+=ptr->weight;
  num+=1;
  ptr->print();
  ptr=ptr->next; }
printf("Average weight=%d\n", total/num);
Very similar really. But now, to illustrate the possibility, we'll build a second linked list, containing all the people whose names begin with the letter 'W'. All Persons will remain in the original list, but those with names beginning with 'W' will also appear on the new list. All we have to do is construct the new list in exacty the same way as we constructed the original, except that this time we are not reading the Persons from a file, we are retrieving them from the original list:
PersonLink *WList=NULL;
PersonLink *temp;
PersonLink *ptr=world;
while (ptr!=NULL)
{ if (ptr->thisone->name[0]=='W')
  { temp=new PersonLink();
    temp->thisone=ptr->thisone;
    temp->next=WList;
    WList=temp; }
  ptr=ptr->next; }
Notice that we play around with pointers a lot, but the Person objects themselves are totally untouched. If all this gets a bit confusing, and it can, make up a smallish data set, and run through the algorithm by hand. Draw a nice big and very clear picture of all the objects with all the pointers, and just watch it happen.