5. (33%)
I want to make a list of all the words that appear in
a text document, without repetitions of course. I have decided that the easiest
way to solve this problem is to make you do it. The plan is to have a program
which creates a linked list of strings, and just lets the user type. Every time
the user types a string, the program will search the list to see if that string
is already there. If the string is already in the list, nothing happens. If the
string is not already in the list, it is added.
A. (6%)
Write correct struct or class definitions for
creating a linked list of strings suitable for use in the program described
above.
B. (8%)
Write a function which searches through a linked list,
looking for a particular string. (The linked list and the string should both be
parameters to the function) The function should return the result true if the string does appear in the list, and false if it does not.
C. (8%)
Write a function which adds a string to a linked list.
(The linked list and the string should both be parameters to the function) The
function does not need to return any result.
D. (7%)
Write a function that prints all the strings in a
linked list.
E. (4%)
Write a main function
that brings all your previous answers together into a working program. It does not need a good user interface. When
the program is run, it should simply let the user type in a lot of strings (it
doesn’t matter what the strings are, any space-separated sequences of
characters will do), and add the unique ones to the linked list. If the user
types “****”, that should not be added to the list, but instead
taken as a signal to stop, and print out all the words.
The
following program answers all five parts in one. Each part of the program is labelled with the letter A-E for the part of the question
that it answers:
E:
#include <iostream>
#include <string>
A:
struct link
{ string word;
link * next; };
A:
struct list
{ link * first; };
C:
void add(list & L,
string s)
{ link * n = new link;
n->word = s;
n->next = L.first;
L.first = n; }
B:
bool ispresent(const
list & L, string s)
{ link * curr
= L.first;
while (curr!=NULL)
{ if (curr->word
== s)
return true;
curr = curr->next; }
return false; }
D:
void printall(const
list & L)
{ link * curr
= L.first;
while (curr!=NULL)
{ cout << curr->word << "\n";
curr = curr->next; } }
(This
next function is not required for the question, but is something that a person
might add if they want to show off for a chance at some extra credit. It makes
the program much more useful)
string standardise(string
s)
{ string r = "";
for (int i=0; i<s.length();
i+=1)
{ char c=s[i];
if (c>='A' && c<='Z')
c-='A'-'a';
if (c>='a' && c<='z')
r=r+c; }
return r; }
E:
void main(void)
{ list words;
words.first=NULL;
while (1)
{ string w;
cin >> w;
if (w == "****") break;
w = standardise(w);
if (! ispresent(words,
w))
add(words, w); }
printall(words); }
All
it needed was the basic linked list functions, with a tiny bit of thought about
input and output.
6. (33%)
Here are a bunch of functions. If you had a really really cheap computer whose hardware was incapable of doing
any of the standard arithmetic operations (+, -, *, etc), but it could
increment and decrement variables (just add or subtract 1), these functions
would help out:
int
add(int x, int y)
{ int result=x;
while (y>0)
{ result+=1;
y-=1; }
return result; }
int
subtract(int x, int y)
{ int result=x;
while (y>0)
{ result-=1;
y-=1; }
return result; }
int
multiply(int x, int y)
{ int result=0;
while (y>0)
{ result=add(result, x);
y-=1; }
return result; }
int
power(int x, int y)
{ int result=1;
while (y>0)
{ result=multiply(result, x);
y-=1; }
return result; }
The operations I really want to work with are doubling, squaring, and finding n-to-the-power-of-n.
With the functions shown above as a library, I can easily define the functions
I care about:
int
ddouble(int n)
{
return add(n, n); }
int
square(int n)
{
return multiply(n, n); }
int
super(int n)
{
return power(n, n); }
A. (7%)
Using the “big-O” notation, how long does it take ddouble(n) to produce its result?
add(a,b)
involves one loop that executes b times; each time around the loop, it does the
same simple operations (one increment and one decrement), so takes the same
time. Therefore the total time is proportional to the number of times round the
loop, which is equal to b. thus the time for add(a,b)
is O(b).
ddouble(n) is defined to be add(n,n), therefore the time for ddouble(n)
is O(n).
B. (7%)
Using the “big-O” notation, how long does it take square(n) to produce its result?
multiply(a,b) involves one loop that executes b times; each time
around the loop, it calls add(result,a) [note that
the second parameter does not change] we have already established that add(result,a) takes time O(a), and this happens b times, so
multiply(a,b) takes time O(ab).
square(n) is defined to be
multiply(n,n), therefore the time for ddouble(n) is O(n2).
C. (7%)
Using the “big-O” notation, how long does it take super(n) to produce its result?
This one doesn’t just
blindly follow the same pattern; there is a small difference...
power(a,b)
involves one loop that executes b times; each time around the loop, it calls
multiply(result,a) [note that the second parameter
does not change] we have established that add(result,a)
takes time that depends on both parameters, so the simple reasoning doesn’t
apply.
(However, this question is
about the technique, not the details, so failure to notice that was minimally penalised)
To work out the time, we
need to take into account the values that the variable result will have each
time around the loop. In computing nn, the
values of result are 1, n1, n2, n3, n4,
n5, ..., nn-2, nn-1, the time to compute
multiply(result,a) is O(result´a), and in this loop, a is
always equal to n. So the times are n, n2, n3, n4,
n5, ..., nn-1, nn,
and the total time is found by adding them all together,
which comes to O(nn).
Which is not the O(n3)
that nearly everybody wrote as the answer.
But as I said before, that
doesn’t matter. The point of this question was to see you applying the
technique in a reasonable way; the final answer to c was not important. In fact
one person did get it right, and I was so completely caught off guard that I
marked it as wrong.
Division is a little harder than the other operations,
and I’m not feeling very bright today, so the best algorithm I can come up
with, to work out the value of A divided by B, is to search for a number X
which when multiplied by B gives A:
int
divide(int A, int B)
{ int result=0;
while (result < A)
{ if (multiply(B, result) == A)
return result;
result+=1; } }
This divide function is obviously not very good. If A
is not exactly divisible by B, it doesn’t return anything.
Regardless
of that, I want to use it to find half of something.
int
half(int n)
{
return divide(n, 2); }
D. (7%)
Using the “big-O” notation, how long does it take half(n) to produce its result?
If you were using the
approximation that the time for multiply is O(n2), as most did, the
accepted reasoning is that every time round the loop, a O(n2)
multiplication is done, and the loop is quite likely to execute n times, so the
total time is O(n3).
Using the completely
correct time for multiply, multiply(a,b) takes O(ab), we notice that the second argument to multiply is
increasing each time round the loop, and in dividing n by 2, we compute
multiply(2,0), multiply(2,1), multiply(2,2), multiply(2,3), multiply(2,4),
multiply(2,5), ..., multiply(2,n-2), multiply(2,n-1). Of course, sometimes an
answer is found halfway through that list, but sometimes the answer is never
found. The worst case is that all the computations are performed, but the time
still really only comes out to O(n2). Again, points were for doing
the right things, not necessarily for getting the right final answer.
E. (5%)
Define a better version of divide. One that works all the time, and if possible is a
somewhat faster. Of course, you must
follow the same restrictions: don’t use the normal arithmetic operators. Using the “big-O” notation, how could you
represent the speed of your divide function?
A much more reasonable way
to divide A by B, is to simply count the number of Bs you can add together
before the result exceeds A:
int
divide(int A, int B)
{ int sum = 0, number = 0;
while (sum <= A)
{ sum = add(sum, B);
number += 1; }
return number; }
Extra Credit
There is a much simpler way to find half of a number,
using only a loop, increment, and decrement. It doesn’t even need to use the
other functions (add, subtract, etc), and is relatively fast too. What is that
method, and (using big-O) how fast is it?
Division by two can be
achieved very simply by simply counting down in a special way. For every two
that you subtract from N, add only 1 to the result:
int
half(int N)
{ int r = 0;
while (N >= 0)
{ N -= 1;
N -= 1;
r += 1; }
return r; }
The time for this function
is O(n). It performs the same basic operations each time round the loop, and
the loop executes half N times.
7. (33%)
For the first parts of this question, you need to
refer to your answers to question 6. If your answers to question 6 are wrong,
don’t worry. This one will be graded pretending that your previous answers are
correct (if they are even slightly reasonable). If you could not answer
question 6 in any reasonable way, make up some answers that allow you to show
some knowledge on this question; state what those assumed answers are, and just
carry on as though they were correct.
Referring to your answers to question 6:
For these questions, I
will take the most commonly given answers for question 6:
A ddouble(n) is O(n)
B square(n) is O(n2)
C super(n) is O(n3)
A. (7%)
If it takes a particular computer 10μS to compute
ddouble(100), how long would you expect it to take to compute:
i. ddouble(200);
ii. ddouble(10000);
The easy way: ddouble is linear O(n), so if you multiply the data size by
m, you also multiply the time it takes by m. part (i)
is multiplying the original data size by 2, and part (ii) by 100, so the times
will increase by 2 and 100:
i. 20ms
ii. 1000ms = 1mS
The hard way: time is O(n)
means time is approx kn for some k. if t=10-5S
when n=102, then k is approx (10-5S¸102) = 10-7S.
Now the answers can be calculated:
i. t = 10-7S ´ 200 = 20mS
ii. t = 10-7S ´ 10000 = 1mS
B. (7%)
If it takes a particular computer 1mS to compute square(100), how long would you expect it to take to compute:
i. square(200);
ii. square(10000);
The easy way: square is
quadratic O(n2), so if you multiply the data size by m, you must
multiply the time it takes by m2. part (i)
is multiplying the original data size by 2, and part (ii) by 100, so the times
will increase by 4 and 10000:
i. 4mS
ii. 10000ms = 10S
The hard way: time is O(n2)
means time is approx kn2 for some k. if t=10-3S when n=102,
i.e. n2=104, then k is approx (10-3S¸104) = 10-7S
again. Now the answers can be calculated:
i. t = 10-7S ´ 2002 = 10-7S
´
40000 = 4mS
ii. t = 10-7S ´ 100002 = 10-7S
´
108 = 10S
C. (7%)
If it takes a particular computer 100mS to compute super(100), how long would you expect it to take to compute:
i. super(200);
ii. super(10000);
The easy way: (assuming)
super is quadratic O(n3), so if you multiply the data size by m, you
must multiply the time it takes by m3. part (i)
is multiplying the original data size by 2, and part (ii) by 100, so the times
will increase by 8 and 1000000:
i. 800mS = nearly 1 second
ii. 100´1000000ms = 105S = about a day
The hard way: time is O(n3)
means time is approx kn3 for some k. if t=10-1S when n=102,
i.e. n3=106, then k is approx (10-1S¸106) = 10-7S
again! (Don’t rely on the constant always coming out the same; I arranged for
the calculations to be easy this time, but they won’t always be). Now the
answers can be calculated:
i. t = 10-7S ´ 2003 = 10-7S
´
8´106
= 0.8S
ii. t = 10-7S ´ 100003 = 10-7S
´
1012 = 105S, which is still about a day.
D. (6%)
Read this carefully.
If it takes a particular computer 10μS to compute
ddouble(100), how long would you expect it to take to compute:
i. square(100);
ii. super(100);
iii. super(10000);
This time the easy was
isn’t available, we are changing algorithms as well as data sizes, so we need
to calculate the approximate value of the constant. Fortunately it is very
easy.
ddouble is linear, so time is
approx kn. so k = 10-5S ¸ 100, which is 10-7
S again!
i. If square is O(n2), then t is
approx kn2 = 10-7 S ´ 1002 = 1mS
ii. If super is O(n3), then t is
approx kn3 = 10-7 S ´ 1003 = 100mS
iii. If super is O(n3), then t is approx
kn3 = 10-7 S ´ 100003 = 105S,
which is still a day.
E. (6%)
We are now comparing two sorting functions, a simple
one (based on selection sort, or bubble sort if you prefer), and a fast one
(based on merge-sort):
i. If the slow function takes 1 second to sort
1000 data items, how long would you expect the fast function to take?
ii. If the fast function
takes 10 mS to sort 1000 data items, how long would
you expect the slow function to take?
For this, you need to know
that the slow sorts are O(n2) and mergesort
is O(nlog2n), and to remember that the constant comes out to be very
similar for nearly all sorting functions.
i. t=1, n=103, for O(n2)
tells us k »
1¸(103)2
, so k is about 10-6S.
when t is O(nlog2n), t » k´n´log2n = 10-6S´103´10, so it should be about
10mS.
ii. t=10-2, n=103, for
O(nlog2n) tells us k » 10-2¸(103´10) , so k is about 10-6S.
when t is O(n2), t » k´n2 = 10-6S´(103)2,
so it should be about 1S.
8. (1%)
Draw a picture
of a dog with at least three legs.
^_^ /
(o o)
/
| | /
W \______/
(______)
/ | | | \
/ | | | \
\ / | \ |