Playing with Strings

Part II - Your Turn To Do The Work

For these exercises, let's go back to the early 70's -- a glamorous time of bell-bottoms, platform shoes with transparent heels, and Led Zeppelin with Robert Plant screeching like a woman. You are one of the programmers at Bell Labs and your buddy Mennis M. Ditchie just finished writing a compiler for a new programming language called C.

Mennis has got a big problem on his hands: he would like to spend some more time on C, but now his supervisor wants him to work on the prototype for a new operating system called Unix. So he left you in charge of writing the string library. The following is a handout he left on your desk this morning with a brief description of how strings are supposed to work in C, and some functions he would like you to implement.

The standard way of representing strings in C is with an array of characters, where the last character is a null character (i.e. '\0').

So, in memory, a string containing the word hello would look like this:

'h' 'e' 'l' 'l' 'o' '\0'

Something else worth mentioning is that, in both C and C++, when people talk about strings, they really mean the address of the first character of the string. That's why you can declare strings by writing

  char *s;

Just keep in mind that when you have a declaration like the one above, NO MEMORY IS ALLOCATED FOR s. All you have declared is a variable that can point to a character (i.e. hold the address of a character, which is exactly what we were just talking about).

If you want to allocate memory for a string created this way, you have to call malloc().

On the other hand, you could also make s point to another string that already exists in memory, just by assigning the address of that string to s. For instance, if you have a string pointed to by a variable p, and you want s to point to the same string, you can write:

  s = p;

This also works when p is an array of characters. This is because in C and C++, an array name used in an expression like this one is converted to a pointer to the first element of the array. Therefore if p is declared as

  char p[10];

the assignment s = p makes s point to the first character of the array.

Thanks for helping me,

Mennis M. Ditchie

1. Write a function that takes in a string, a character, and an integer that contains the length of the string. This function should fill the string with the character that was passed to it.

The prototype for this function should look like this:

  void fill(char *s, char c, int length);

And this is how you would use this function:

  int main(void)
  { char name[30];
    fill(name, 'x', 30);
    return 0; }

In this example, we first create a string of 30 characters and call it name. Note that if instead we said

  char *name;

this program would be incorrect, because we never allocated any memory for the string name. The call would write through an uninitialized pointer, so it might even end up overwriting some memory that was being used by something else, and, if you were using a bad operating system like DOS, running this program would probably crash the computer.

2. Now that you are finished with the first exercise, is it really necessary for the length of the string to be passed in as a parameter? Of course not. All you have to do is replace every character of the string with the character that was passed in, until you reach the terminating '\0'.

Rewrite the fill() function so that it doesn't need to have the length of the string as one of its parameters.

Something else that's very important to understand is the difference between the size of a string and the length of a string. These are really not the same thing.

For instance, take a look at the following block of code:

  { char str[100];
    strcpy(str, "rabbit");
    ... }

In the first line, we declare an array of 100 characters. Then, we store the string "rabbit" in that array. At this point, the length of this string is 6 (the number of characters in "rabbit"), and the size of this string is 100 bytes (str is an array of 100 characters, and the size of a character is 1 byte). Now, suppose that later on in this block, we call strcpy:

  strcpy(str, "abc");

Right after this statement is executed, the length of the string will be 3. Note that the size of the string remains constant.

3. Write a function that takes a string and returns its length.

The prototype for this function should look like this:

  int strlen(char *s);

And here is an example of how you would use it:

  #include <stdio.h>

  int main(void)
  { char str[] = "hello";
    printf("The length of the string is %d\n", strlen(str));
    return 0; }

In the first line of main(), we declare a string called str, which is initialized to "hello" (take a look at the diagram above). The size of str doesn't have to be specified in the declaration because we initialize it right there: the compiler counts the characters in "hello" and adds one for the terminating '\0'. Without the initializer, the compiler wouldn't have a way of knowing how much memory to set aside for str.

Then we call our function, and print out the returned value with printf().

4. One of the most basic things people often have to do with strings is copying them. Write a function that takes in two strings. One string will be the destination string, and the other will be the source string. Your function should copy the contents of the source string into the destination string.

The prototype should look like this:

  void strcpy(char *destination, char *source);

Keep in mind that the destination string must have enough memory allocated for it (i.e. enough memory to hold all of the characters in the source string) prior to calling the function.

5. Another really useful string operation is comparison. Write a function whose prototype is:

  int strcmp(char *s1, char *s2);

and returns:

a positive value if s1 > s2

zero if s1 is equal to s2

a negative value if s1 < s2

Example 1: strcmp("7", "seven") returns a negative number, because the ASCII code of the character 7 is less than that of the character s.
Example 2: strcmp("bob", "boB") returns a positive number, because the ASCII code of the character b is greater than that of the character B.

6. Sometimes people need to add strings together. For example, suppose you had a string with a customer's first name, and another with the customer's last name. If you wanted to make them into a single string, you would normally call strcat(), one of C's string library functions. strcat() takes two arguments: a destination string, and a source string. It appends the source string to the end of the destination string, leaving the source string untouched. It is assumed that the destination string will have enough space to hold the resulting string.

Here is the prototype to use for strcat() in this exercise (the standard library version returns dest instead of void):

  void strcat(char *dest, char *source);

Write the body of this function.

7. This last one is a little tricky, but I'm sure you can do it if you have a good think and spend enough time on it.

Whenever someone writes a program that involves some string processing, it is often useful to be able to know if a certain string contains another. For instance, the string

  "C is not a programming language for kids"

(which will from now on be referred to as the big string) contains the string "programming", but does not contain the string "BASIC". Keep in mind that this is not restricted to words only: the string "ogram" is also contained in big string.

If we are writing a function that does that, we might as well add some functionality and, instead of returning 1 or 0 (i.e. true or false), we could return a pointer to the first occurrence of "programming" in the big string. This will come in handy whenever you want to modify the big string. But don't worry about that right now.

We'll call this function strstr(), and it should have the following prototype:

  char *strstr(char *bigstring, char *littlestring);

If littlestring appears inside bigstring, strstr() should return a pointer to the beginning of the first occurrence of littlestring. If littlestring does not appear in bigstring, the function should return NULL. If littlestring appears more than once in bigstring, the function should still return a pointer to the first occurrence only.

aw 3-Apr-2000