EEN218 Class Notes Tuesday 2004-02-10
1: Defining Objects
A fraction, we found, is best represented by three values: a number for the top,
a number for the bottom, and a true/false for whether or not it is valid. It is
possible to write something like 2/0, which is not a valid fraction, and anything
created by adding, subtracting, multiplying, or dividing invalid fractions
is also invalid, so a record needs to be kept. This is not what we are used to
for simpler kinds of numbers: there is no such thing as an invalid int; any
sequence of digits is acceptable.
So to create fraction as a new data-type that can be used
in programs, this declaration is used:
struct fraction
{ int top, bot;
bool ok; };
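
As a rough illustration of why the ok field needs to be kept, here is a minimal sketch of multiplication that passes invalidity along. The function names make_fraction and multiply are made up for this example (they are not part of the notes so far), and the result is not reduced to lowest terms.

```cpp
#include <cassert>

struct fraction
{ int top, bot;
  bool ok; };

// Hypothetical helper: build a fraction, marking it invalid if the bottom is zero.
fraction make_fraction(int top, int bot)
{ fraction f = { top, bot, bot != 0 };
  return f; }

// Hypothetical helper: multiply two fractions.
// The result is valid only if both of the inputs were valid.
fraction multiply(fraction a, fraction b)
{ fraction r = { a.top * b.top, a.bot * b.bot, a.ok && b.ok };
  return r; }

// A small self-check; call demo() from main() to run it.
void demo()
{ fraction half = make_fraction(1, 2);
  fraction bad = make_fraction(2, 0);
  fraction quarter = multiply(half, half);
  assert(quarter.top == 1 && quarter.bot == 4 && quarter.ok);
  assert(!multiply(half, bad).ok); }   // an invalid input gives an invalid result
```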
*
A vector in three-dimensional space is represented by three real numbers.
It is not normally possible to have an invalid vector, so we might create
vector as a new data-type with this declaration:
struct vector
{ double x, y, z; };
*
Chemical elements have a huge amount of information associated with them. If
we were writing a program that is supposed to work on them, we would have to
decide which pieces of information are going to be useful (Perhaps if we
were just creating a giant chemistry database we would want to record everything,
but in most cases we have to be practical: any information we choose to record will
occupy memory in a running program, and somebody is going to have to type
it all). So making an assumption because this is just an example, perhaps
for each element we need its name (e.g. Carbon or Gold), its symbol (e.g. C or Au),
its atomic number (e.g. 6 or 79), its melting point (e.g. 3900.0 K or 1336.1 K), and
its boiling point (e.g. 5100.0 K or 3239.0 K). The first two are clearly strings,
and the third is an int. Normally all floating point numbers should be coded as
doubles. In this case it is not necessary to use doubles: the melting and boiling points of
chemical elements are not normally known to any great accuracy, so a simple float
would do the job just as well, and use a little less memory too.
To create element as a new data-type for a chemistry program, we would
use this declaration:
struct element
{ string name, symbol;
int at_number;
float m_point, b_point; };
It is worth putting a little effort into thinking of good names for the contents
(fields) of a struct. Should we type out "atomic_number", or could we reduce
it even further to something like "an"? There is no one correct answer; it
depends on your intentions for the program. Anyone using a chemistry program could
be expected to realise that at_number stands for Atomic Number, but something like
"an" might be too small to be unambiguous. On the other hand, the less
typing that is required, the happier your users are going to be.
Also, we should ask ourselves whether enough information is given to make the program
reliable in use. When we see the data listed in a book, units are always given.
Nobody could reasonably say that the melting point of gold is "1336.1".
Is it 1336.1 degrees Fahrenheit, Centigrade, or Kelvin? Once something is
reduced to just a number in a program, there is no way that units can remain associated
with it: a number is a number, and nothing else. What are we going to do about it?
A recent space probe, the Mars Climate Orbiter, was lost at great expense in 1999
because one team's software reported thruster impulse in pound-force seconds while
the navigation software that read those figures expected newton-seconds.
We don't want to be responsible for any mix-ups.
One good solution to the problem
of units is to remember that nobody can access any of the data inside a struct
without typing its name. If we make the name include a reminder of the units,
perhaps "melting_point_in_Kelvin", nobody who uses the data provided
by this program can ever claim they thought it was centigrade. Of course, that
name is too long, so again we have to use some judgement. Chemists are used to
working with melting and boiling points, and know what is meant by mp
and bp, so perhaps a better definition of the struct could be this:
struct element
{ string name, symbol;
int at_number;
float mp_K, bp_K; };
There is an interesting alternative solution to the problem of knowing the
units of melting points and boiling points. We could create a new
type to represent temperatures as a normal number plus a letter
(either F, C, or K) to indicate the units:
struct temperature
{ float value;
char scale; };
Then we could redefine the element data-type so that it has
two of those values instead of two floats:
struct element
{ string name, symbol;
int at_number;
temperature mp, bp; };
We could even define special functions for doing arithmetic operations
on temperatures, so that the units are automatically checked, and converted
if they aren't already the same. That would be an interesting and useful
thing, but it will have to wait a few weeks.
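
As a preview only, here is a minimal sketch of what such functions might look like. The names to_kelvin and hotter are invented for this example, and it assumes the scale is always one of 'F', 'C', or 'K'; anything else is passed through unchanged.

```cpp
#include <cassert>
#include <cmath>

struct temperature
{ float value;
  char scale; };   // 'F', 'C', or 'K'

// Hypothetical helper: convert a temperature to the Kelvin scale.
// An unrecognised scale letter is returned unchanged.
temperature to_kelvin(temperature t)
{ temperature r = t;
  if (t.scale == 'C')
  { r.value = t.value + 273.15f;
    r.scale = 'K'; }
  else if (t.scale == 'F')
  { r.value = (t.value - 32.0f) * 5.0f / 9.0f + 273.15f;
    r.scale = 'K'; }
  return r; }

// Hypothetical helper: true if a is hotter than b, whatever scales they use.
bool hotter(temperature a, temperature b)
{ return to_kelvin(a).value > to_kelvin(b).value; }

// A small self-check; call demo() from main() to run it.
void demo()
{ temperature boiling_c = { 100.0f, 'C' };
  temperature boiling_f = { 212.0f, 'F' };
  temperature gold_mp = { 1336.1f, 'K' };
  assert(std::fabs(to_kelvin(boiling_c).value - 373.15f) < 0.01f);
  assert(std::fabs(to_kelvin(boiling_f).value - 373.15f) < 0.01f);
  assert(hotter(gold_mp, boiling_c)); }
```

Comparing through to_kelvin means the caller never has to check the scale letters by hand.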
*
If we are writing a program that has to deal with real people (perhaps a
customer database or an address book application), again we are faced with
an enormous amount of data, and the need to decide which items are important.
Without knowing the intended use of the program, we can only guess. Normally
you will either be told what needs to be recorded, or you'll be able to work
it out by knowing what the program is meant for. Let's assume that for each
person we need to know their title (Mr., Mrs., Ms., etc), their first name
(Sally, Bub, etc), their middle initial (J., P., etc), their last name
(Jones, Slugge, Smith, etc), their street address (123 N.W. Cat St., etc),
their city (Miami, Fort Lauderdale, Timbuktu, etc), their state (FL, GA, etc),
their zip code (33333, 12542, etc), their account number (6327627, 0234639, etc),
and their phone number (305-111-2222, 212-694-2347, etc). That's a lot, but
only a fraction of what might be expected in a real commercial application.
How should all of those be
represented? It is clear that many should be strings. Middle initial could
perhaps be a single char, but is that a good decision? What about
people with two middle initials, or with none? Maybe a string would be best
there too. Zip codes can be ints: they have a small
size and a well-defined format (although a zip code that begins with 0 loses
that digit when stored as an int, so it has to be put back when printing).
Phone numbers can be thought of as ints,
but give a range problem. On common computers, the maximum int value is just
over 2000000000; that is enough for nine-digit social security numbers, and
probably for account numbers too (but it depends on the company), but isn't
quite enough for ten-digit phone numbers. A long long int would be big enough,
but it only became standard in C++11, and nobody does arithmetic on phone
numbers anyway, so phone numbers should probably be strings. Unless we
are going to be concerned about international customers, whose addresses
might have a completely different format, we can probably leave it like this.
struct person
{ string title, fname, midinit, lname, streetaddr, city, state, phone;
int zipcode, account; };
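
The range problem with phone numbers can be checked directly. This is a small sketch, assuming a machine with 32-bit ints; fits_in_int is a made-up name, and the sketch uses long long (standard since C++11) only to hold the ten-digit test value.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true if n can be stored in an int on this machine.
bool fits_in_int(long long n)
{ return n >= std::numeric_limits<int>::min()
      && n <= std::numeric_limits<int>::max(); }

// A small self-check; call demo() from main() to run it.
// The second assert assumes int is 32 bits wide.
void demo()
{ assert(fits_in_int(999999999LL));       // the largest nine-digit number
  assert(!fits_in_int(3051112222LL)); }   // 305-111-2222 written as a number
```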
Of course, other decisions are possible. We could decide not to record
state at all, knowing that the leading digits of a zip code usually
tell you the state. We could decide that the whole name (title, fname, midinit, lname)
could be stored as one big string instead of four little ones. The
same could be done for the address, giving:
struct person
{ string name, address, phone;
int account; };
This would remove all the problems associated with foreign addresses, but
would make common operations difficult and inefficient to perform. For example,
head office might want to know how many customers live in a particular state,
and that would take a lot of work if the state had to be extracted from the
address string for every single customer.
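
With the state kept as its own field, that head-office question takes only a short loop. The following is a sketch using the longer person definition; count_in_state is a name invented for this example.

```cpp
#include <cassert>
#include <string>
using namespace std;

struct person
{ string title, fname, midinit, lname, streetaddr, city, state, phone;
  int zipcode, account; };

// Hypothetical helper: how many of the first n customers live in the wanted state?
int count_in_state(const person customers[], int n, const string & wanted)
{ int total = 0;
  for (int i = 0; i < n; i += 1)
    if (customers[i].state == wanted)
      total += 1;
  return total; }

// A small self-check; call demo() from main() to run it.
void demo()
{ person customers[3] = {};        // empty strings and zeros everywhere
  customers[0].state = "FL";
  customers[1].state = "GA";
  customers[2].state = "FL";
  assert(count_in_state(customers, 3, "FL") == 2);
  assert(count_in_state(customers, 3, "AK") == 0); }
```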
2: Using Objects
Once a new struct has been declared, the data-type that it defines
can be used anywhere in a program, just like any standard type (int,
double, string, etc). You can have variables of type
person, arrays of persons, functions that take person
parameters, and functions that return person results. Just about
anything. Of course, the system will not know how to perform operations
like +, *, < on a
person, and it won't know how to print or read them, but that
makes sense: how could it know how to add two persons together? It just
doesn't make sense. Even for temperatures, where the idea does make sense,
we can't expect C++ to have built-in knowledge of how temperature scales
work. It is our job to teach the system how to perform the required
operations on our new data-types by defining functions that do the job.
Already-Known Operations
There are a few things that C++ does already know how to do, even if we
don't define a function to do the job. As soon as you define a new struct,
C++ is already capable of the following operations:
- Assignment (=): when one object is assigned to another
of the same type (e.g. both persons or both elements), C++ simply copies
each component individually. If the two objects are not of the same type,
assignment is not defined.
- Initialisation (=): when a variable is first declared
and created, it may be given an initial value by assigning the list of
values of its components, separated by commas, and surrounded by curly
brackets. Example: fraction half = { 1, 2, true }; Another example:
element gold = { "gold", "Au", 79, { 1336.1, 'K' }, { 3239.0, 'K' } };
This is only allowed when initialising a new variable, not in a normal
assignment statement.
- Component Access (.): the individual components
may be extracted by giving the name of the whole object, followed by a dot,
followed by the name of the component. For example, if we have already
made this declaration: person cust; we could change that
person's name like this: cust.fname="Jilly"; or we could
check her state like this: if (cust.state=="AK")
cout << "The customer is cold";
Here are some examples of things that can be done, using a slightly reduced
version of the person definition:
#include <iostream>
#include <string>
using namespace std;

struct person
{ string fname, lname, street, city, state;
  int zip;
  string phone; };

int main()
{ person a = { "Jilly", "Jones", "1234 Cat St.", "Monkeyville", "PA", 21432, "414-555-2323" };
  person b = { "Arthur", "O'Pod", "72 N.W. 14th Ave., #22A", "Hellzapoppin", "PA", 21427, "414-555-7264" };
  person c = { "Jane", "Grit", "221B Baker St.", "Frog City", "FL", 33314, "305-555-1234" };
  // this will print Jilly Jones's phone number is 414-555-2323
  cout << a.fname << " " << a.lname << "'s phone number is " << a.phone << endl;
  // this will allow someone to correct b's zip code if it was wrong
  cout << "Enter new zip code for " << b.fname << " " << b.lname << ": ";
  cin >> b.zip;
  // this creates a new person object with exactly the same data as c
  person d = c;
  // Perhaps a and b get married, and b moves in with a:
  b.street = a.street;
  b.city = a.city;
  b.state = a.state;
  b.zip = a.zip;
  // This looks to see if b and d have the same phone number:
  if (b.phone == d.phone)
    cout << "same phone number\n"; }
Good Design
It is often a good rule of thumb (almost a rule, but not quite) that the objects
in your program should correspond fairly exactly with the objects in the real world
that your program is trying to model. In most of the examples above, that rule
of thumb was followed perfectly, but in the case of the person, it wasn't.
A person's address represents a home, a real object. It might be more sensible
if we realised that addresses are meant to represent real things, and created
a special address data-type for that job. It would simplify the
definition of person too:
struct address
{ string streetaddr, city, state;
int zip; };
struct person
{ string fname, lname;
address home;
string phone; };
The examples given above would remain almost the same, but with a few notable exceptions:
int main()
{ person a = { "Jilly", "Jones", { "1234 Cat St.", "Monkeyville", "PA", 21432 }, "414-555-2323" };
  person b = { "Arthur", "O'Pod", { "72 N.W. 14th Ave., #22A", "Hellzapoppin", "PA", 21427 }, "414-555-7264" };
  person c = { "Jane", "Grit", { "221B Baker St.", "Frog City", "FL", 33314 }, "305-555-1234" };
  // this will print Jilly Jones's phone number is 414-555-2323
  cout << a.fname << " " << a.lname << "'s phone number is " << a.phone << endl;
  // this will allow someone to correct b's zip code if it was wrong
  cout << "Enter new zip code for " << b.fname << " " << b.lname << ": ";
  cin >> b.home.zip;
  // this creates a new person object with exactly the same data as c
  person d = c;
  // Perhaps a and b get married, and b moves in with a:
  b.home = a.home;
  // This looks to see if b and d have the same phone number:
  if (b.phone == d.phone)
    cout << "same phone number\n"; }
We normally expect a better design for the structure of our data-types to result
in a smaller, simpler, and clearer program.
*
Now of course, we have to create functions that perform all the common operations on
our new data-types. For the examples above, the only operations we can be fairly sure
will be needed are those for reading and writing the objects; anything else will be
totally application-dependent.
struct address
{ string streetaddr, city, state;
int zip; };
struct person
{ string fname, lname;
address home;
string phone; };
address read_address()
{ address temp;
cout << "Street addr: ";
cin >> temp.streetaddr;
cout << "City: ";
cin >> temp.city;
cout << "State: ";
cin >> temp.state;
cout << "Zip: ";
cin >> temp.zip;
return temp; }
person read_person()
{ person temp;
cout << "First Name: ";
cin >> temp.fname;
cout << "Last Name: ";
cin >> temp.lname;
temp.home = read_address();
cout << "Phone: ";
cin >> temp.phone;
return temp; }
void print(address a)
{ cout << a.streetaddr << ", " << a.city << ", " << a.state << " " << a.zip; }
void print(person p)
{ cout << p.fname << " " << p.lname << ", of ";
print(p.home);
cout << ", tel: " << p.phone; }
There is a difference between the styles of definition of the print functions
and the read functions. We can have two functions called print because
they have differently typed parameters, so the system can always tell which one
to use just by looking at the parameter. The read functions return a
new object as their result, so they do not need parameters. That means there
is no information the system can use to tell which is needed; they can't
both be called read, the name needs to provide the necessary information.
In many cases, possibly even this one,
it may be preferable to have nice simple unforgettable names for all functions. If
we were to redefine the read functions so that they are given a parameter (the
empty object to read data into) instead of returning a result, this could be done.
(Remember that only reference parameters (with an & in their declaration)
can successfully be modified by a function).
The alternative version of the two read functions would be:
void read(address & a)
{ cout << "Street addr: ";
cin >> a.streetaddr;
cout << "City: ";
cin >> a.city;
cout << "State: ";
cin >> a.state;
cout << "Zip: ";
cin >> a.zip; }
void read(person & p)
{ cout << "First Name: ";
cin >> p.fname;
cout << "Last Name: ";
cin >> p.lname;
read(p.home);
cout << "Phone: ";
cin >> p.phone; }
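
The & in those parameter lists is what makes the read functions work. As a reminder, here is a minimal sketch of the difference; the two function names are made up for this example.

```cpp
#include <cassert>
#include <string>
using namespace std;

struct address
{ string streetaddr, city, state;
  int zip; };

// Value parameter: the function gets its own copy,
// so the caller's object is left unchanged.
void set_zip_by_value(address a, int new_zip)
{ a.zip = new_zip; }

// Reference parameter: the function works on the caller's own object.
void set_zip(address & a, int new_zip)
{ a.zip = new_zip; }

// A small self-check; call demo() from main() to run it.
void demo()
{ address home = { "1234 Cat St.", "Monkeyville", "PA", 21432 };
  set_zip_by_value(home, 21427);
  assert(home.zip == 21432);     // nothing happened to home
  set_zip(home, 21427);
  assert(home.zip == 21427); }   // this time home really was changed
```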
So now, if we want to read the information on a whole load of people into
a database, we could simply create a large array of person objects
to be the database, and have a simple loop:
const int max_people=1000;
person database[max_people];
int num=0;
.
.
.
int main()
{ .
.
.
while (num<max_people)
{ cout << "More data to enter? (Y or N) ";
string s;
cin >> s;
if (s=="N") break;
if (s!="Y") continue;
read(database[num]);
num+=1; }
.
.
.
cout << "The database contains " << num << " people:\n";
for (int i=0; i<num; i+=1)
{ cout << i << ": ";
print(database[i]);
cout << "\n"; }
.
.
. }
Everything becomes much simpler, doesn't it?
You should note that there is a slight flaw in this implementation. It is
not a problem with the objects themselves; everything in that respect is
perfectly correct. The problem is with using cin for input.
When cin is told to read a string, it skips over any White-Space
(that is spaces, tabs, and end-of-lines) that appear before any solid characters,
then it reads as many solid (visible) characters as are available, but it stops
reading as soon as it meets another white-space character. This means that if
someone dares to have a space in their address (for example "123 N.W. Cat St."),
then cin >> a.streetaddr; will just read the "123",
then stop, leaving the "N.W." to be read as the city.
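
The effect is easy to reproduce without typing anything, by reading from a string instead of the keyboard. This sketch uses istringstream from the standard sstream header, which behaves like cin but takes its input from a string; read_two is a made-up name.

```cpp
#include <cassert>
#include <sstream>
#include <string>
using namespace std;

// Hypothetical helper: reads two strings with >>,
// the same way the read function reads the street address and then the city.
void read_two(istream & in, string & first, string & second)
{ in >> first;
  in >> second; }

// A small self-check; call demo() from main() to run it.
void demo()
{ istringstream typed("123 N.W. Cat St.\nMiami\n");
  string streetaddr, city;
  read_two(typed, streetaddr, city);
  assert(streetaddr == "123");   // reading stopped at the first space
  assert(city == "N.W."); }      // the next word, not the city that was typed
```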
If you really want to use cin and cout for input and output,
you will have to learn how to take control of them, and it isn't
always easy. There is a function called getline, whose
job is to read a whole line of input, regardless of whether or not
it contains spaces. As an illustration of it, here is a handy
little function:
string read_line()
{ string temp;
getline(cin, temp);
return temp; }
As usual, cin.eof() and cin.fail() become true after
using getline if it failed to read anything.
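
As one possible way of putting this to use, here is a sketch of an address reader built on getline. It is a variation, not the version given above: read_line and read take the input stream as an extra parameter (so that they can be tried out on a string), the prompts are left out, and the ignore call with its limit of 1000 characters is an assumption about how long the rest of the zip code line could be.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

struct address
{ string streetaddr, city, state;
  int zip; };

// Variation on read_line that reads from any input stream, not just cin.
string read_line(istream & in)
{ string temp;
  getline(in, temp);
  return temp; }

// Variation on read(address &): one whole line for each string component.
void read(istream & in, address & a)
{ a.streetaddr = read_line(in);
  a.city = read_line(in);
  a.state = read_line(in);
  in >> a.zip;
  in.ignore(1000, '\n'); }   // throw away the rest of the zip code line

// A small self-check; call demo() from main() to run it.
void demo()
{ istringstream typed("123 N.W. Cat St.\nFort Lauderdale\nFL\n33333\n");
  address a;
  read(typed, a);
  assert(a.streetaddr == "123 N.W. Cat St.");
  assert(a.city == "Fort Lauderdale");
  assert(a.zip == 33333);
  assert(read_line(typed) == "");   // nothing left to read
  assert(typed.fail()); }           // so the stream reports failure
```

Without the ignore, the end-of-line left behind after the zip code would be taken by the next getline as an empty line.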