1

5. (33%)

I want to make a list of all the words that appear in a text document, without repetitions of course. I have decided that the easiest way to solve this problem is to make you do it. The plan is to have a program which creates a linked list of strings, and just lets the user type. Every time the user types a string, the program will search the list to see if that string is already there. If the string is already in the list, nothing happens. If the string is not already in the list, it is added.

A. (6%)

Write correct struct or class definitions for creating a linked list of strings suitable for use in the program described above.

B. (8%)

Write a function which searches through a linked list, looking for a particular string. (The linked list and the string should both be parameters to the function) The function should return the result true if the string does appear in the list, and false if it does not.

C. (8%)

Write a function which adds a string to a linked list. (The linked list and the string should both be parameters to the function) The function does not need to return any result.

D. (7%)

Write a function that prints all the strings in a linked list.

E. (4%)

Write a main function that brings all your previous answers together into a working program. It does not need a good user interface. When the program is run, it should simply let the user type in a lot of strings (it doesn’t matter what the strings are, any space-separated sequences of characters will do), and add the unique ones to the linked list. If the user types “****”, that should not be added to the list, but instead taken as a signal to stop, and print out all the words.

The following program answers all five parts in one. Each part of the program is labelled with the letter A-E for the part of the question that it answers:

#include <iostream>

#include <string>

struct link

{ string word;

link * next; };

struct list

{ link * first; };

void add(list & L, string s)

{ link * n = new link;

n->word = s;

n->next = L.first;

L.first = n; }

bool ispresent(const list & L, string s)

{ link * curr = L.first;

while (curr!=NULL)

{ if (curr->word == s)

return true;

curr = curr->next; }

return false; }

void printall(const list & L)

{ link * curr = L.first;

while (curr!=NULL)

{ cout << curr->word << "\n";

curr = curr->next; } }

(This next function is not required for the question, but is something that a person might add if they want to show off for a chance at some extra credit. It makes the program much more useful)

string standardise(string s)

{ string r = "";

for (int i=0; i<s.length(); i+=1)

{ char c=s[i];

if (c>='A' && c<='Z')

c-='A'-'a';

if (c>='a' && c<='z')

r=r+c; }

return r; }

void main(void)

{ list words;

words.first=NULL;

while (1)

{ string w;

cin >> w;

if (w == "****") break;

w = standardise(w);

if (! ispresent(words, w))

add(words, w); }

printall(words); }

All it needed was the basic linked list functions, with a tiny bit of thought about input and output.

6. (33%)

Here are a bunch of functions. If you had a really really cheap computer whose hardware was incapable of doing any of the standard arithmetic operations (+, -, *, etc), but it could increment and decrement variables (just add or subtract 1), these functions would help out:

int add(int x, int y)

{ int result=x;

while (y>0)

{ result+=1;

y-=1; }

return result; }

int subtract(int x, int y)

{ int result=x;

while (y>0)

{ result-=1;

y-=1; }

return result; }

int multiply(int x, int y)

{ int result=0;

while (y>0)

{ result=add(result, x);

y-=1; }

return result; }

int power(int x, int y)

{ int result=1;

while (y>0)

{ result=multiply(result, x);

y-=1; }

return result; }

The operations I really want to work with are doubling, squaring, and finding n-to-the-power-of-n. With the functions shown above as a library, I can easily define the functions I care about:

int ddouble(int n)

{ return add(n, n); }

int square(int n)

{ return multiply(n, n); }

int super(int n)

{ return power(n, n); }

A. (7%)

Using the “big-O” notation, how long does it take ddouble(n) to produce its result?

add(a,b) involves one loop that executes b times; each time around the loop, it does the same simple operations (one increment and one decrement), so takes the same time. Therefore the total time is proportional to the number of times round the loop, which is equal to b. thus the time for add(a,b) is O(b).

ddouble(n) is defined to be add(n,n), therefore the time for ddouble(n) is O(n).

B. (7%)

Using the “big-O” notation, how long does it take square(n) to produce its result?

multiply(a,b) involves one loop that executes b times; each time around the loop, it calls add(result,a) [note that the second parameter does not change] we have already established that add(result,a) takes time O(a), and this happens b times, so multiply(a,b) takes time O(ab).

square(n) is defined to be multiply(n,n), therefore the time for ddouble(n) is O(n²).

C. (7%)

Using the “big-O” notation, how long does it take super(n) to produce its result?

This one doesn’t just blindly follow the same pattern; there is a small difference...

power(a,b) involves one loop that executes b times; each time around the loop, it calls multiply(result,a) [note that the second parameter does not change] we have established that add(result,a) takes time that depends on both parameters, so the simple reasoning doesn’t apply.

(However, this question is about the technique, not the details, so failure to notice that was minimally penalised)

To work out the time, we need to take into account the values that the variable result will have each time around the loop. In computing nⁿ, the values of result are 1, n¹, n², n³, n⁴, n⁵, ..., n^n-2, n^n-1, the time to compute multiply(result,a) is O(result´a), and in this loop, a is always equal to n. So the times are n, n², n³, n⁴, n⁵, ..., n^n-1, nⁿ, and the total time is found by adding them all together,

which comes to O(nⁿ).

Which is not the O(n³) that nearly everybody wrote as the answer.

But as I said before, that doesn’t matter. The point of this question was to see you applying the technique in a reasonable way; the final answer to c was not important. In fact one person did get it right, and I was so completely caught off guard that I marked it as wrong.

Division is a little harder than the other operations, and I’m not feeling very bright today, so the best algorithm I can come up with, to work out the value of A divided by B, is to search for a number X which when multiplied by B gives A:

int divide(int A, int B)

{ int result=0;

while (result < A)

{ if (multiply(B, result) == A)

return result;

result+=1; } }

This divide function is obviously not very good. If A is not exactly divisible by B, it doesn’t return anything.

Regardless of that, I want to use it to find half of something.

int half(int n)

{ return divide(n, 2); }

D. (7%)

Using the “big-O” notation, how long does it take half(n) to produce its result?

If you were using the approximation that the time for multiply is O(n²), as most did, the accepted reasoning is that every time round the loop, a O(n²) multiplication is done, and the loop is quite likely to execute n times, so the total time is O(n³).

Using the completely correct time for multiply, multiply(a,b) takes O(ab), we notice that the second argument to multiply is increasing each time round the loop, and in dividing n by 2, we compute multiply(2,0), multiply(2,1), multiply(2,2), multiply(2,3), multiply(2,4), multiply(2,5), ..., multiply(2,n-2), multiply(2,n-1). Of course, sometimes an answer is found halfway through that list, but sometimes the answer is never found. The worst case is that all the computations are performed, but the time still really only comes out to O(n²). Again, points were for doing the right things, not necessarily for getting the right final answer.

E. (5%)

Define a better version of divide. One that works all the time, and if possible is a somewhat faster. Of course, you must follow the same restrictions: don’t use the normal arithmetic operators. Using the “big-O” notation, how could you represent the speed of your divide function?

A much more reasonable way to divide A by B, is to simply count the number of Bs you can add together before the result exceeds A:

int divide(int A, int B)

{ int sum = 0, number = 0;

while (sum <= A)

{ sum = add(sum, B);

number += 1; }

return number; }

Extra Credit

There is a much simpler way to find half of a number, using only a loop, increment, and decrement. It doesn’t even need to use the other functions (add, subtract, etc), and is relatively fast too. What is that method, and (using big-O) how fast is it?

Division by two can be achieved very simply by simply counting down in a special way. For every two that you subtract from N, add only 1 to the result:

int half(int N)

{ int r = 0;

while (N >= 0)

{ N -= 1;

N -= 1;

r += 1; }

return r; }

The time for this function is O(n). It performs the same basic operations each time round the loop, and the loop executes half N times.

7. (33%)

For the first parts of this question, you need to refer to your answers to question 6. If your answers to question 6 are wrong, don’t worry. This one will be graded pretending that your previous answers are correct (if they are even slightly reasonable). If you could not answer question 6 in any reasonable way, make up some answers that allow you to show some knowledge on this question; state what those assumed answers are, and just carry on as though they were correct.

Referring to your answers to question 6:

For these questions, I will take the most commonly given answers for question 6:

A ddouble(n) is O(n)

B square(n) is O(n²)

C super(n) is O(n³)

A. (7%)

If it takes a particular computer 10μS to compute ddouble(100), how long would you expect it to take to compute:

i. ddouble(200);

ii. ddouble(10000);

The easy way: ddouble is linear O(n), so if you multiply the data size by m, you also multiply the time it takes by m. part (i) is multiplying the original data size by 2, and part (ii) by 100, so the times will increase by 2 and 100:

i. 20ms

ii. 1000ms = 1mS

The hard way: time is O(n) means time is approx kn for some k. if t=10^-5S when n=10², then k is approx (10^-5S¸10²) = 10^-7S. Now the answers can be calculated:

i. t = 10^-7S ´ 200 = 20mS

ii. t = 10^-7S ´ 10000 = 1mS

B. (7%)

If it takes a particular computer 1mS to compute square(100), how long would you expect it to take to compute:

i. square(200);

ii. square(10000);

The easy way: square is quadratic O(n²), so if you multiply the data size by m, you must multiply the time it takes by m². part (i) is multiplying the original data size by 2, and part (ii) by 100, so the times will increase by 4 and 10000:

i. 4mS

ii. 10000ms = 10S

The hard way: time is O(n²) means time is approx kn² for some k. if t=10^-3S when n=10², i.e. n²=10⁴, then k is approx (10^-3S¸10⁴) = 10^-7S again. Now the answers can be calculated:

i. t = 10^-7S ´ 200² = 10^-7S ´ 40000 = 4mS

ii. t = 10^-7S ´ 10000² = 10^-7S ´ 10⁸ = 10S

C. (7%)

If it takes a particular computer 100mS to compute super(100), how long would you expect it to take to compute:

i. super(200);

ii. super(10000);

The easy way: (assuming) super is quadratic O(n³), so if you multiply the data size by m, you must multiply the time it takes by m³. part (i) is multiplying the original data size by 2, and part (ii) by 100, so the times will increase by 8 and 1000000:

i. 800mS = nearly 1 second

ii. 100´1000000ms = 10⁵S = about a day

The hard way: time is O(n³) means time is approx kn³ for some k. if t=10^-1S when n=10², i.e. n³=10⁶, then k is approx (10^-1S¸10⁶) = 10^-7S again! (Don’t rely on the constant always coming out the same; I arranged for the calculations to be easy this time, but they won’t always be). Now the answers can be calculated:

i. t = 10^-7S ´ 200³ = 10^-7S ´ 8´10⁶ = 0.8S

ii. t = 10^-7S ´ 10000³ = 10^-7S ´ 10¹² = 10⁵S, which is still about a day.

D. (6%)

Read this carefully.

If it takes a particular computer 10μS to compute ddouble(100), how long would you expect it to take to compute:

i. square(100);

ii. super(100);

iii. super(10000);

This time the easy was isn’t available, we are changing algorithms as well as data sizes, so we need to calculate the approximate value of the constant. Fortunately it is very easy.

ddouble is linear, so time is approx kn. so k = 10^-5S ¸ 100, which is 10^-7 S again!

i. If square is O(n²), then t is approx kn² = 10^-7 S ´ 100² = 1mS

ii. If super is O(n³), then t is approx kn³ = 10^-7 S ´ 100³ = 100mS

iii. If super is O(n³), then t is approx kn³ = 10^-7 S ´ 10000³ = 10⁵S, which is still a day.

E. (6%)

We are now comparing two sorting functions, a simple one (based on selection sort, or bubble sort if you prefer), and a fast one (based on merge-sort):

i. If the slow function takes 1 second to sort 1000 data items, how long would you expect the fast function to take?

ii. If the fast function takes 10 mS to sort 1000 data items, how long would you expect the slow function to take?

For this, you need to know that the slow sorts are O(n²) and mergesort is O(nlog₂n), and to remember that the constant comes out to be very similar for nearly all sorting functions.

i. t=1, n=10³, for O(n²) tells us k » 1¸(10³)² , so k is about 10^-6S.

when t is O(nlog₂n), t » k´n´log₂n = 10^-6S´10³´10, so it should be about 10mS.

ii. t=10^-2, n=10³, for O(nlog₂n) tells us k » 10^-2¸(10³´10) , so k is about 10^-6S.

when t is O(n²), t » k´n² = 10^-6S´(10³)², so it should be about 1S.

8. (1%)

Draw a picture of a dog with at least three legs.

^_^ /

(o o) /

| | /

W \______/

(______)

/ | | | \

\ / | \ |