Starting a little example to remind us of all the essential things.

Plan: A slightly useful, slightly interactive program to save time in introductory chemistry. The user enters a molecular formula, such as CuSO4, UF6, CO, or Co, and the program calculates and prints the molecular weight.

Requirements:

1. Something that analyses the input string. It is initialised with the user's input, perhaps "CuSO4", and is somehow able to give us the parts of it, "Cu", "S", "O", "4", one at a time as needed.

2. Internal data storage: something to represent the elements' symbols, names, and atomic weights, allowing easy retrieval. Given a symbol such as "S", it would give back the atomic weight of Sulphur, 32.06.

3. Database initialisation: We would expect the raw data to be stored in a disc file, looking something like this:

      H  Hydrogen 1.008
      He Helium   4.003
      ......

   This needs to be read from the file and inserted into the internal data storage system.

AN ESSENTIAL PRINCIPLE OF SOFTWARE ENGINEERING: Always try to make sure that the data objects in your program correspond as strongly and clearly as possible to the real objects in the real world that you are trying to model.

The unsophisticated programmer would look at the contents of the data file as three arrays:

      an array of strings holding the symbols "H", "He", "Li", ...
      an array of strings containing the names "Hydrogen", "Helium", ...
      an array of doubles containing the weights 1.008, 4.003, ...

But that would be missing the essential structure. The file really represents just one single array of ELEMENTS. Elements are real things that our program works with, so elements should have their own proper representation in the program.

      struct element
      { string name, symbol;
        double weight; };

This introduces a new type, just like int or string, but called "element". We can now create variables that contain elements, arrays of elements, functions that return elements, etc.

      element He, S;

creates two perfectly ordinary variables called He and S.
These two variables each have their own three internal parts, which are variables themselves. In a way, we have created eight variables:

      two elements: He and S,
      four strings: He.name, He.symbol, S.name, and S.symbol,
      two doubles: He.weight and S.weight.

The whole internal database can be implemented as one single array of elements (called "table" because it is a sort of periodic table):

      element table[105];

So I can do things like this:

      He.symbol = "He";
      He.name = "Helium";
      He.weight = 4.003;
      table[2] = He;

or

      table[1].symbol = "H";
      table[1].name = "Hydrogen";
      table[1].weight = 1.008;

Or even better, I could make a function to do the repetitive parts:

      element make(string s, string n, double w)
      { element r;
        r.symbol = s;
        r.name = n;
        r.weight = w;
        return r; }

allowing

      table[1] = make("H", "Hydrogen", 1.008);
      table[2] = make("He", "Helium", 4.003);
      table[92] = make("U", "Uranium", 238.030);

It would be very annoying to have to type that whole table, but of course, nobody has to do that. It will be read from a file.
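As a rough sketch of how reading the table from a file and looking up a weight might go, here is one possible version. It is not the definitive program. The names read_table, weight_of, and max_elements are made up for illustration, and the code assumes that each record in the file is exactly "symbol name weight" separated by spaces, with no spaces inside a name. Entries are stored starting at index 1, so the index only matches the atomic number if the file lists the elements in periodic-table order with none missing.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
using namespace std;

struct element {
    string name, symbol;
    double weight;
};

// Hypothetical capacity, chosen only for this sketch.
const int max_elements = 120;

// Reads "symbol name weight" records from in and stores them in
// table[1], table[2], ... in the order they appear. Index 0 is left unused.
// Stops at the end of the input, at the first malformed record, or when
// the table is full. Returns the number of elements stored.
int read_table(istream &in, element table[], int capacity) {
    assert(capacity > 1);
    int n = 0;
    element e;
    while (n + 1 < capacity && in >> e.symbol >> e.name >> e.weight) {
        n += 1;
        table[n] = e;
    }
    return n;
}

// Simple linear search through table[1..n] for a matching symbol.
// Returns the atomic weight, or -1.0 if the symbol is not in the table.
double weight_of(const element table[], int n, const string &symbol) {
    for (int i = 1; i <= n; i += 1)
        if (table[i].symbol == symbol)
            return table[i].weight;
    return -1.0;
}
```

To use it, open the data file with an ifstream (or, for a quick experiment, wrap a string in an istringstream), pass the stream to read_table together with an array such as element table[max_elements], and then call weight_of(table, n, "He") with the count that read_table returned. A linear search is perfectly adequate for a table of about a hundred entries.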