Playing with Strings

Part I - The Worked-Out Exercises

The following worked-out exercises were designed to help you become more comfortable with using strings in C. Make sure you understand and are able to use the language constructs used in the answers provided, as well as the algorithms they were used to implement.

1. Write a function that takes one character and one string as its parameters, and returns the number of times that character appears in the string.

Ans: This should be pretty simple. All we have to do is declare a variable that keeps track of the number of times we have seen the character (let's call it count, and initially set it to zero). Then we go through every character of the string, and if it is equal to the character we are counting, we increment count. Once we are done going through the string (i.e. we reach the '\0' at its end), we return count. Here is the code:

code	comments
`int count_char_in_string(char c, char *s)`	// `count_char_in_string` is a function that takes in // a character `c` and a string `s`, and returns an integer
`{ int count = 0, i = 0;`	// declare `count` and `i` (the current position in the string), // both integers initialized to zero
`while (s[i] != '\0')`	// while we are not at the end of the string
`{ if (s[i] == c) count++;`	// if the current character is equal to `c`, increment `count`
`i++; }`	// either way, move on to the next character
`return count; }`	// done counting, so return `count`

This will work, but it's important to understand that the same function could be written in a slightly different way. To see how, first consider the following function:

  void increment(int i)
  { i++; }

Now let's try to predict what would happen if we were to execute this block:

  { int x = 2;
    increment(x);
    printf("%d\n", x); }

What value would be printed?

The answer is 2. Don't panic if you didn't get this right. The reason x didn't change after increment was called is that in C, whenever you pass an argument to a function, the function does not have access to the actual variable you passed in. Instead, it gets its own copy of that variable's value, which lives in the function's own memory space. (This is known as passing by value.)

And here is how we could take advantage of this:

code	comments
`int count_char_in_string(char c, char *s)`	// `count_char_in_string` is a function that takes in // a character `c` and a string `s`, and returns an integer
`{ int count = 0;`	// declare `count` and initialize it to zero
`while (*s != '\0')`	// while the value pointed to by `s` is not `'\0'` // (i.e. while we are not at the end of the string)
`{ if (*s == c) count++;`	// if the current character is equal to `c`, increment `count`
`s++; }`	// either way, move on to the next character
`return count; }`	// done counting, so return `count`

In this example, we used s as an actual pointer, instead of an array (as in the previous example). Instead of using the square brackets [], we used the indirection operator, or *. All the * does is tell the compiler "follow this pointer". Therefore, if you have a pointer p, whenever you write *p, you are actually referring to the "thing" that p points to.

So the initial value of s is the address of the first character of the string that we are looking at, and we increment it (that's what s++ does) every time around the loop, until s is pointing to a '\0' (i.e. s is the address of a memory location that holds a value of '\0').

Some of you might be thinking "hmm... but that's not very good, because after the function returns, the string that was passed to it will have been changed (after all, we changed what s was pointing to)!" That is not a problem, because we never touch the pointer that the caller passed in to our function. We are modifying s, which is a copy of it, and that means that we can do whatever we want with it!

One thing to keep in mind, though, is that although a function cannot permanently modify the value of a pointer that was passed to it, it can modify the things that pointer points to! For instance, we could write a function that empties strings:

  void make_empty(char *s)
  { *s = '\0'; }

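Although we never changed the value of s, we made a permanent change to the character s points to. This means that if you executed the following block (where s is declared as an array, because the characters of a string literal must not be modified):

  { char s[] = "hello";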
    printf("%s\n", s);
    make_empty(s);
    printf("%s\n", s); }

The output would be:

  hello

The first call to printf() printed the word hello, then we called make_empty(), which put a '\0' in the first character of string s, thus making it an empty string. That's why the second line of the output is empty (actually, it is the newline character printed by printf()).

2. Write a function that takes one character and one string as its parameters, and returns the location (i.e. index in the string) of the first occurrence of the character that was passed in. If the character does not appear in the string, the function should return -1.

Ans: This one should be even easier than the last one! All we have to do is go through the string a character at a time, and if we ever see the character we are looking for, we can return our current position. If we ever get to the end of the string, that means that the character is not part of the string, and we return -1. The code for this is really simple:

code	comments
`int find_first_occurence(char c, char *s)`	// `find_first_occurence` is a function that takes in // a character `c` and a string `s`, and returns an integer
`{ int pos = 0;`	// declare `pos` (the current position) and initialize it to // zero (we'll start at the first character of `s`).
`while (s[pos] != '\0')`	// while we are not at the end of the string
`{ if (s[pos] == c) return pos;`	// if the current character is equal to `c`, return the current position
`pos++; }`	// otherwise move on to the next character
`return -1; }`	// if we ever get this far, `c` never appears in `s` // so we return `-1`

Pretty straightforward, right?

So, what if we were to try writing this function the other way (using pointers instead of arrays)? Let's see what it would look like:

code	comments
`int find_first_occurence(char c, char *s)`	// `find_first_occurence` is a function that takes in // a character `c` and a string `s`, and returns an integer
`{ int pos = 0;`	// declare `pos` (the current position) and initialize it to // zero (we'll start at the first character of `s`).
`while (*s != '\0')`	// while we are not at the end of the string
`{ if (*s == c) return pos;`	// if the current character is equal to `c`, return the current position
`s++; pos++; }`	// otherwise move on to the next character, // without forgetting to update the position
`return -1; }`	// if we ever get this far, `c` never appears in `s` // so we return `-1`

It didn't turn out as nice as we had hoped, huh? There is absolutely no advantage to doing it this way, since we weren't able to get rid of pos. If we didn't have pos, we wouldn't be able to return the position, because all we have is a pointer to the string, right?

Well, not quite. With a slight modification, we could actually get rid of pos. Watch:

code	comments
`int find_first_occurence(char c, char *s)`	// `find_first_occurence` is a function that takes in // a character `c` and a string `s`, and returns an integer
`{ char *orig = s;`	// declare `orig` (the original string) and initialize it to `s`.
`while (*s != '\0')`	// while we are not at the end of the string
`{ if (*s == c) return s - orig;`	// if the current character is equal to `c`, return the current position (explained later)
`s++; }`	// otherwise move on to the next character
`return -1; }`	// if we ever get this far, `c` never appears in `s` // so we return `-1`

The only difference between the body of this function and that of the last one is that now, instead of keeping track of the current position, we are saving the address of the first character of the string that was passed in. Later, this address is used to calculate the current position (i.e. if c appears in the string) so that we can return it.

"So," you ask, "how exactly does the expression ( s - orig ) yield the current position?"

That just happens to be one of the weird things about C. When you subtract one pointer from another, the result is the number of elements (not bytes) that separate them, which is the same as the difference between their indices. Since orig points to the element at index 0, s - orig is exactly the index of the element s points to. Granted, it's not a very straightforward way of doing things... But some people like doing this sort of thing, so you might as well become familiar with it.

Note: Pointer subtraction is not always allowed. You may only subtract two pointers when both point to elements of the same array (or to the position just past its last element); otherwise the behavior is undefined.

Now that you have seen these solutions, try to do the following exercises on your own. They are slight variations on the previous ones, and will help these concepts sink in more easily. Remember, programming is all about practice. The more you do it, the better you will be!

3. Write a function that takes one character and one string as its parameters, and returns a pointer to the first occurrence of the character in the string. If the character never appears in the string, the function should return NULL.

4. Write a function that takes one character and one string as its parameters, and returns the location (i.e. index in the string) of the last occurrence of the character that was passed in. If the character does not appear in the string, the function should return -1.

Click here to go to part II, where you get to put these concepts into practice by writing your very own string library.

aw 4-Apr-2000