EEN521 Autumn 2001 Test 23rd October Solutions
1. [The original question]
a.
What is a page fault? What are the two kinds of page fault and how do they differ? What causes a page fault? What happens when a page fault occurs? What could possibly be done to reduce the number of page faults?
b.
Explain and illustrate how
the hardware supporting a modern virtual memory system works, and what it does.
c.
Examine the following two snippets of code (they have the same overall result, and differ only in their last lines):

float A[1000][1000], sum=0.0;
... /* A is filled with data */
for (int i=0; i<1000; i+=1)
    for (int j=0; j<1000; j+=1)
        sum+=A[i][j];

float A[1000][1000], sum=0.0;
... /* A is filled with data */
for (int i=0; i<1000; i+=1)
    for (int j=0; j<1000; j+=1)
        sum+=A[j][i];
How could a virtual memory
system affect each of them differently? Which would you expect to be best? Why?
[End of question]
a.
Recall
that information on page translation is stored in two places: the operating
system maintains in normal memory a "page table" for each process,
which records the complete mapping from virtual page number to physical page
number plus protections. The memory management hardware at any given time only
contains a small subset of this mapping. This hardware performs automatic and
almost instantaneous translations from virtual addresses to physical addresses
every time memory is accessed. If the information required for this translation
is not part of the small subset stored in the hardware, a page fault occurs. A
page fault is an interrupt generated by the address translation hardware to
tell the operating system that its small subset of the process' page table
needs to be changed.
When
a page fault is signalled, the operating system takes over from the current
process, and attempts to resolve the problem. First, it searches the process'
page table to find the needed translation information. If it is found and there
are no complications, the translation hardware is given the new information, and
the original process continues. This is a small, simple, fast operation, and is
called a soft page fault.
Occasionally,
the page table will record the fact that the virtual page that the process was
trying to access has been "paged out". This means that it is no
longer resident in physical memory, but has been saved to disc, in order to
free up some memory for use elsewhere. In this case it will be a long time
before the process is able to continue running, because the page of memory it
needs will have to be read back from disc. This is called a hard page fault,
and results in execution of the process being suspended for some time.
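As a rough illustration of the two cases, here is a sketch of a page-fault handler in C. Every name in it (page_table_entry, tlb_insert, and so on) is invented for illustration; no real operating system is exactly like this.

/* Hypothetical page-fault handler: distinguishes a soft fault (translation
   known, just not loaded into the hardware) from a hard fault (page on disc). */

typedef struct {
    int  present;         /* 1 if the page is currently in physical memory  */
    int  physical_page;   /* physical page number, valid when present == 1  */
    long disc_location;   /* where the page was saved when it was paged out */
} page_table_entry;

extern page_table_entry *current_page_table;    /* this process' page table     */
extern void tlb_insert(int vpage, int ppage);   /* update translation hardware  */
extern int  allocate_physical_page(void);       /* find (or free up) a frame    */
extern void read_page_from_disc(long where, int ppage);
extern void suspend_process_until_io_done(void);

void page_fault_handler(int virtual_page)
{
    page_table_entry *e = &current_page_table[virtual_page];

    if (e->present) {
        /* Soft fault: the page table already has the answer; just reload
           the hardware's small subset and let the process carry on.      */
        tlb_insert(virtual_page, e->physical_page);
        return;
    }

    /* Hard fault: the page has been paged out and must be read back from
       disc before the process can continue -- a matter of milliseconds.  */
    int frame = allocate_physical_page();
    read_page_from_disc(e->disc_location, frame);   /* start the transfer  */
    suspend_process_until_io_done();                /* process sleeps here */
    e->present = 1;
    e->physical_page = frame;
    tlb_insert(virtual_page, frame);
}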
The
number of page faults could be reduced by increasing the number of entries that
the translation hardware can hold at the same time, or by increasing the size
of a page, or by making a better choice of which old translation entry should
be removed when a new one has to be added, or by writing programs so that
temporally close memory accesses are more likely to be grouped together in a
few pages.
b.
Basic material from the notes on virtual memory hardware.
c.
Assuming
that it takes 4 bytes to store a float, we can easily reproduce the calculation
that the running program performs to access an element of the array "float
A[1000][1000]": the element A[x][y] is at address A+4000x+4y. You can
verify that A[0][0] and A[0][1] are immediate neighbours in memory, as are
A[0][1] and A[0][2], A[1][0] and A[1][1], and A[0][999] and A[1][0].
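A few lines of C confirm that arithmetic (assuming, as the answer does, that a float occupies 4 bytes):

#include <stdio.h>

static float A[1000][1000];

int main(void)
{
    /* &A[x][y] should equal (char*)A + 4000*x + 4*y */
    printf("%ld\n", (long)((char*)&A[0][1] - (char*)&A[0][0]));   /* 4: neighbours    */
    printf("%ld\n", (long)((char*)&A[1][0] - (char*)&A[0][999])); /* 4: neighbours    */
    printf("%ld\n", (long)((char*)&A[1][0] - (char*)&A[0][0]));   /* 4000: a full row */
    return 0;
}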
In
the first version of the loop, the second index (y) is varied most quickly:
after accessing A[0][0], it next accesses A[0][1], then A[0][2], etc. As these
addresses are immediate neighbours in memory, the loop has strong
"locality of access". We will get 128 accesses to each 512 byte page
of memory before moving on to another page, therefore 128 accesses for each
page fault.
In
the second version of the loop, the first index (x) is varied most quickly:
after accessing A[0][0], it next accesses A[1][0], then A[2][0], etc. As these
addresses are not even near neighbours in memory (they are 4000 bytes apart),
the loop has no "locality of access". We will get just one access to
each 512-byte page of memory before moving on to another page, therefore one
page fault for each access.
So
the first version would be much faster.
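One way to make the difference concrete is to count how often each loop moves onto a new 512-byte page; this sketch models only the page boundaries, not the translation hardware itself:

#include <stdio.h>

static float A[1000][1000];

int main(void)
{
    long row_major = 0, col_major = 0, prev, page;

    prev = -1;                         /* first version: second index varies fastest */
    for (int i = 0; i < 1000; i += 1)
        for (int j = 0; j < 1000; j += 1) {
            page = ((char*)&A[i][j] - (char*)A) / 512;
            if (page != prev) row_major += 1;
            prev = page;
        }

    prev = -1;                         /* second version: first index varies fastest */
    for (int i = 0; i < 1000; i += 1)
        for (int j = 0; j < 1000; j += 1) {
            page = ((char*)&A[j][i] - (char*)A) / 512;
            if (page != prev) col_major += 1;
            prev = page;
        }

    printf("page changes, first version:  %ld\n", row_major);  /* about 1,000,000/128 = 7,813 */
    printf("page changes, second version: %ld\n", col_major);  /* about 1,000,000             */
    return 0;
}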
2. [The original question]
a.
Define concisely the
following terms:
Surface
Cylinder
Track
Sector
Block
Rotational Latency
Track-to-track seek time
Cluster
b.
Given that a particular disc
drive has the following parameters:
Rotation speed: 7200 rpm
Track-to-track seek: 3 ms
Capacity: 50 GB (i.e. 50×2^30 bytes)
Cylinders: 100,000
Surfaces: 8
How many
sectors will it have?
What is the
average time taken to read a random block?
What are the
best and worst possible times to read 28,000 bytes?
If the disc
contains very sensitive secret information which you want to completely erase,
how long would it take to overwrite every single byte?
c.
Describe, illustrate, and
explain the two level index file format.
For this format, what is:
The maximum file capacity?
The minimum (worst possible) disc use efficiency?
The expected time to read the next byte (sequential
access)?
The expected time to read a random byte?
[End of question]
Part a is just basic
definitions from your notes.
Part b:
The
maximum capacity of a disc is surfaces times cylinders times sectors times
(bytes per block), so 50×2^30
is 8 times 100,000 times sectors times 512 (512 is a standard assumption). A
little calculation (and no calculator is needed) gives the number of sectors as
131.072.
This
tells us that either one of the supplied pieces of data was an approximation,
or that the number of sectors can vary from track to track (perhaps not so many
sectors on the inner tracks), but either way, there must be on average
approximately 131 sectors per track.
Average
time for a random read is rotational latency plus track seek time. Rotational
latency is half of the time for one rotation. At 7200 revolutions per minute,
one revolution takes 8.3 ms, so the rotational latency is just over 4 ms. That
gives a random read time of just over 7 ms.
This
relies on the assumption that the track-to-track seek time is a good average
seek time. This is not always a reasonable assumption. The average of the
track-to-track seek time and the full-stroke seek time might be better.
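Written out as a calculation (512-byte blocks assumed, and the track-to-track figure standing in for the average seek, as the answer does):

#include <stdio.h>

int main(void)
{
    double capacity  = 50.0 * 1024 * 1024 * 1024;   /* 50 x 2^30 bytes            */
    double surfaces  = 8, cylinders = 100000, block = 512;
    double sectors   = capacity / (surfaces * cylinders * block);

    double rotation  = 60000.0 / 7200;              /* ms per revolution          */
    double latency   = rotation / 2;                /* average rotational latency */
    double seek      = 3.0;                         /* track-to-track, in ms      */

    printf("sectors per track:   %.3f\n", sectors);            /* 131.072         */
    printf("rotation time:       %.2f ms\n", rotation);        /* 8.33 ms         */
    printf("average random read: %.2f ms\n", latency + seek);  /* just over 7 ms  */
    return 0;
}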
28,000 bytes occupies 55 blocks (56 if they do not start at a block boundary; call it 56), which is less than one track, so the best possible time to read 28,000 bytes (which happens when those bytes are written in consecutive blocks on the same track, and the first of those blocks is just about to pass under the heads) is simply the time the disc takes to rotate by 56 blocks. We calculated about 131 blocks per track, so 56 blocks can be read in 56/131 of a rotation time, or about 3.5 ms.
Actually, it can be 8 times faster. If the 56 blocks occupied by the data are actually 7 consecutive blocks in the same positions on each of the 8 surfaces, we can read 8 blocks at a time (assuming our disc controller is capable of such a feat, which IDE, EIDE, and ATA discs are not), and the best time is reduced to just less than ½ ms.
The worst possible time would arise if consecutive blocks were at opposite ends of the disc and in exactly the most unfortunate sectors, with a very unintelligent controller, so that a full-stroke seek and a full rotation would be required before each of the 56 blocks. It would be acceptable to use the track-to-track seek time instead of the full-stroke seek time, since the full-stroke time was not given.
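Putting numbers on the best and worst cases (131 blocks per track and 8.33 ms per rotation from above; the 3 ms track-to-track figure again stands in for the unspecified full-stroke seek):

#include <stdio.h>

int main(void)
{
    double rotation = 60000.0 / 7200;                /* 8.33 ms per revolution */
    double blocks   = 56, per_track = 131.072, seek = 3.0;

    /* Best: 56 consecutive blocks on one track, already under the heads.  */
    printf("best, one surface: %.2f ms\n", blocks / per_track * rotation);

    /* Better still if the controller can read all 8 surfaces at once.     */
    printf("best, 8 surfaces:  %.2f ms\n", blocks / 8 / per_track * rotation);

    /* Worst: a seek and a full rotation before every one of the blocks.   */
    printf("worst:             %.2f ms\n", blocks * (seek + rotation));
    return 0;
}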
With
a reasonable disc controller, we would only have to wait for 100,000 times (one
rotation time plus one track-to-track seek time) to overwrite every byte. Each
track can be written in a single rotation of the disc, and a good controller
would allow every track of a cylinder to be written at the same time. This
would give 100,000 times (8.3 plus 3) ms, or 1130 seconds, or about 19 minutes.
However,
most disc controllers could not perform a write on all tracks of a cylinder
simultaneously, so the most likely time would be 8 times longer than that, or
something less than 3 hours.
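The two estimates as a calculation:

#include <stdio.h>

int main(void)
{
    double rotation  = 60000.0 / 7200;   /* 8.33 ms            */
    double seek      = 3.0;              /* track-to-track, ms */
    double cylinders = 100000;

    /* A controller that writes all 8 surfaces of a cylinder at once.  */
    double parallel = cylinders * (rotation + seek) / 1000;
    /* A more ordinary controller: roughly 8 rotations per cylinder.   */
    double serial   = 8 * parallel;

    printf("all surfaces at once: %.0f s (about %.0f minutes)\n", parallel, parallel / 60);
    printf("one track at a time:  %.0f s (about %.1f hours)\n",   serial,   serial / 3600);
    return 0;
}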
Part c:
In
the two-level-index file format, each file is represented by a header block
which contains a few bytes of general data about the file, and then enough
block pointers to fill the remainder of the header block. (Let's assume 100
such pointers).
Each
of those pointers gives access to another block, completely filled with
block-pointers. Assuming 512 bytes per block and 4 bytes per pointer, that
gives 128 such pointers.
Each
of those pointers gives access to a block of data, being the contents of the
file.
This
gives a maximum capacity of 100×128×512 = 6,553,600
bytes.
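A sketch of that layout in C, under the same assumptions (512-byte blocks, 4-byte block pointers, 100 pointers in the header block); the type names are purely illustrative:

#define BLOCK_SIZE         512
#define HEADER_POINTERS    100
#define POINTERS_PER_BLOCK (BLOCK_SIZE / 4)          /* 128 */

typedef unsigned int block_ptr;                      /* assumed to be 4 bytes */

typedef struct {                                     /* the file's header block */
    char      general_info[BLOCK_SIZE - HEADER_POINTERS * 4];  /* general data about the file */
    block_ptr index_blocks[HEADER_POINTERS];         /* point to second-level index blocks    */
} header_block;

typedef struct {                                     /* one second-level index block */
    block_ptr data_blocks[POINTERS_PER_BLOCK];       /* point to blocks of file data  */
} index_block;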
A
file containing only one block's worth of real data would still have to occupy
3 blocks on the disc (the header, one block of pointers, and the block of
data). This gives a worst-case efficiency of 33.3%.
Each
time one byte of data is accessed, the whole block containing it must be read
from the disc. Any sensible system will save that block in a buffer, so only
one access in 512 (the number of bytes in a block) will need an actual disc
access. The expected time for a random block read was previously calculated as
7 ms, so the average time to read one byte (with sequential access) would be
7/512 ms, or about 14 microseconds.
With
random access, the chances of one byte being in the same block as the
previously wanted one are negligible. Each access can be assumed to require a
different block to be read from the disc. Naively, this would seem to require 2
disc accesses: one to read the block of pointers, then one to read the
appropriate data block. However, with a two-level-index there are at most 100
pointer blocks, which would occupy a total of at most 51,200 bytes. It is very
likely that a programmer who knows that random accesses on the file are to be
performed would read all 100 pointer blocks into memory first, and keep them
there. Then each random access will require only one disc access, for an
average of 7 ms.
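The four figures asked for, computed from the same assumptions:

#include <stdio.h>

int main(void)
{
    double block = 512, header_ptrs = 100, ptrs_per_block = 512 / 4;
    double random_read = 7.0;                                    /* ms, from part b */

    printf("maximum capacity:          %.0f bytes\n",
           header_ptrs * ptrs_per_block * block);                /* 6,553,600 */

    printf("worst-case efficiency:     %.1f %%\n", 100.0 / 3);   /* header + index + 1 data block */

    printf("sequential read, per byte: %.3f ms\n",
           random_read / block);                                 /* about 14 microseconds */

    printf("random read, per byte:     %.1f ms\n", random_read); /* index blocks cached   */
    return 0;
}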