Duality of Tables and Relations

an example

Consider two tables in a database: If we want to pair people with animals born on the same date, a simple SQL query would do the trick:
        SELECT p.name, a.name
        FROM people p, animals a
        WHERE p.birthdate=a.birthdate
The condition p.birthdate=a.birthdate is a simple calculated condition; in an ER diagram we would naturally represent it with a diamond shape.    However, we could just ask that query once, then create a new table: and just copy the results of the query into it. If we have a sufficiently sophisticated implementation of SQL, we could even automate that task, with this command:
        INSERT INTO samebirthdays
               SELECT p.id, a.id
               FROM people p, animals a
               WHERE p.birthdate=a.birthdate
Then, any future query about people and animals who were born on the same days could be answered by a table lookup instead of the computation. For example, this query:
        SELECT p.name, p.address, a.name, c.name
        FROM people p, animals a, circusses c
        WHERE p.birthdate=a.birthdate AND a.circus_id=c.id
could become:
        SELECT p.name, p.address, a.name, c.name
        FROM people p, animals a, samebirthdays s, circusses c
        WHERE p.id=s.person_id AND s.animal_id=a.id AND a.circus_id=c.id
Even for simple queries like this, it could speed things up (perhaps for some reason you can't create an index on animal.birthdate but you can create an index on a new "samebirthdays" table); if the condition is complex, converting it into a table could make a really significant difference to the speed.

It should be clear that the same idea could be applied to any condition: any true-or-false expression can be precomputed and stored in a new table; diamonds in an ER diagram can always be converted into rectangles if you are willing to expend the storage space. It may not be sensible to do so, but it is always possible.


Now consider a database with four tables, extracts from which are shown below:
AnimalNames
idname
17Flossie
......
102Oinky
......
144Fluffy
......

Species
idnamekindsize
4Elephantmammallarge
............
13Sealmammalmedium
............
92Penguinbirdsmallish
............

Circusses
idname
1Ringling Brothers
2Al's Discount Circus
3Barnum and Bailey's
......
6Circusses 'R' Us
......

Animals
idnameidspeciesidownerid
114442
217133
310246
............

This collection of tables results from avoiding the redundancies that naturally arise because lots of animals have the same names, lots of animals have the same species, and each circus owns lots of animals. The original non-normalised table might have looked like this:
Animals
idnamespeciesowner
1FluffyElephantAl's Discount Circus
2FlossieSealBarnum and Bailey's
3OinkyElephantCircusses 'R' Us
............

so you can see that normalisation probably saved a lot of space. Anyway, The first version of the "Animals" table (with only numbers in it) could be replaced by a computation:
         name(id) = 106*id*id - 445*id + 483
      species(id) =  -9*id*id +  36*id -  23
        owner(id) =     id*id -   2*id +   3
You can easily verify that this is correct: name(2) = 106*2*2-445*2-483 = 17. The point is that mathematics guarantees that a formula exactly fitting any data, no matter how unnatural it seems, can always be produced. It takes a lot of work to produce the formula, and it is easy to make a mistake along the way, but it can be done. The formulæ will generally involve really big numbers, so computation can take a long time, but it is always possible. We can even simulate indexes on different fields by working out an inverse formula, for example:
       id(owner) = -1*owner*owner/6 + 11*owner/6 -2
so that id(6) = -1*6*6/6+11*6/6-2 = 3.

So, a query that involves accessing data from the animals table, such as:
        SELECT n.name
        FROM animalnames n, animals a, circusses c
        WHERE c.name='Circusses R Us'
          AND a.ownerid=c.id
          AND a.nameid=n.id
(which finds the name of the animal owned by 'Circusses R Us') can be replaced by an equivalent query that calculates the connection instead:
        SELECT n.name
        FROM animalnames n, circusses c
        WHERE c.name='Circusses R Us'
          AND n.id=106*(-1*c.id*c.id/6+11*c.id/6-2)*(-1*c.id*c.id/6+11*c.id/6-2)
                  -445*(-1*c.id*c.id/6+11*c.id/6-2)
                  +483
The same transformation could be applied to any table.

It os not often a sensible thing to do, but any condition can be turned into a table, and any table can be turned into a condition, so tables and conditions are really equivalent at a deep level. There is only one kind of thing in a relational database, and it is called a Relation. A relation can be seen as a table or as a condition, depending on how you choose to look at (or implement) it.