EEN521 Autumn 2001 Test 23rd October Solutions

 

1. [The original question]

        a.

What is a page fault? What are the two kinds of page fault and how do they differ?

What causes a page fault? What happens when a page fault occurs?

What could possibly be done to reduce the number of page faults?

 

        b.

Explain and illustrate how the hardware supporting a modern virtual memory system works, and what it does.

 

        c.

Examine the following two snippets of code (they have the same overall result, and differ only in their last lines):

 

float A[1000][1000], sum=0.0;

... /* A is filled with data */

for (int i=0; i<1000; i+=1)

  for (int j=0; j<1000; j+=1)

    sum+=A[i][j];

 

 

float A[1000][1000], sum=0.0;

... /* A is filled with data */

for (int i=0; i<1000; i+=1)

  for (int j=0; j<1000; j+=1)

    sum+=A[j][i];

 

How could a virtual memory system affect each of them differently? Which would you expect to be best? Why?

[End of question]

 

a.

Recall that information on page translation is stored in two places: the operating system maintains in normal memory a "page table" for each process, which records the complete mapping from virtual page number to physical page number plus protections. The memory management hardware at any given time only contains a small subset of this mapping. This hardware performs automatic and almost instantaneous translations from virtual addresses to physical addresses every time memory is accessed. If the information required for this translation is not part of the small subset stored in the hardware, a page fault occurs. A page fault is an interrupt generated by the address translation hardware to tell the operating system that its small subset of the process' page table needs to be changed.

When a page fault is signalled, the operating system takes over from the current process and attempts to resolve the problem. First, it searches the process' page table to find the needed translation information. If it is found and there are no complications, the translation hardware is given the new information, and the original process continues. This is a small, simple, fast operation, and is called a soft page fault.

Occasionally, the page table will record the fact that the virtual page that the process was trying to access has been "paged out". This means that it is no longer resident in physical memory, but has been saved to disc, in order to free up some memory for use elsewhere. In this case it will be a long time before the process is able to continue running, because the page of memory it needs will have to be read back from disc. This is called a hard page fault, and results in execution of the process being suspended for some time.
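
To make the soft/hard distinction concrete, here is a minimal sketch in C of the decision the interrupt handler makes. It is not any real operating system's code; all the helper routines (load_translation_hardware, allocate_physical_page, read_from_disc) are hypothetical names, stubbed out so the sketch is self-contained.

#include <stdio.h>

/* Hypothetical helpers, stubbed out for illustration only. */
static void load_translation_hardware(int vp, int pp)
{ printf("hardware now maps virtual page %d to physical page %d\n", vp, pp); }
static int allocate_physical_page(void)
{ return 42; /* a real OS might have to page out a victim here */ }
static void read_from_disc(int location, int physical_page)
{ printf("reading disc location %d into physical page %d\n", location, physical_page); }

typedef struct
{ int present;        /* is the page currently in physical memory?  */
  int physical_page;  /* valid only when present is true            */
  int disc_location;  /* where the page was saved, if paged out     */
} page_table_entry;

void page_fault_handler(int virtual_page, page_table_entry page_table[])
{ page_table_entry *e = &page_table[virtual_page];
  if (e->present)
  { /* Soft fault: the translation exists, the hardware just didn't
       have it loaded.  Quick to fix; the process continues almost
       immediately. */
    load_translation_hardware(virtual_page, e->physical_page); }
  else
  { /* Hard fault: the page was paged out.  The process must be
       suspended while the page is read back from disc. */
    e->physical_page = allocate_physical_page();
    read_from_disc(e->disc_location, e->physical_page);
    e->present = 1;
    load_translation_hardware(virtual_page, e->physical_page); } }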

The number of page faults could be reduced by increasing the number of entries that the translation hardware can hold at the same time, or by increasing the size of a page, or by making a better choice of which old translation entry should be removed when a new one has to be added, or by writing programs so that temporally close memory accesses are more likely to be grouped together in a few pages.

 

b.

Part b is basic material from the notes; the description of the address translation hardware in the answer to part a covers the essentials.

c.

Assuming that it takes 4 bytes to store a float, we can easily reproduce the calculation that the running program performs to access an element of the array "float A[1000][1000]": the element A[x][y] is at address A+4000x+4y. You can verify that A[0][0] and A[0][1] are immediate neighbours in memory, as are A[0][1] and A[0][2], A[1][0] and A[1][1], and A[0][999] and A[1][0].
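
The address arithmetic is easy to check with a few lines of C (assuming, as above, 4-byte floats):

#include <stdio.h>

float A[1000][1000];

int main(void)
{ char *base = (char *) &A[0][0];
  printf("A[0][1]   is at offset %ld\n", (long) ((char *) &A[0][1]   - base));  /* 4    */
  printf("A[1][0]   is at offset %ld\n", (long) ((char *) &A[1][0]   - base));  /* 4000 */
  printf("A[0][999] is at offset %ld\n", (long) ((char *) &A[0][999] - base));  /* 3996 */
  return 0; }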

In the first version of the loop, the second index (the y of the address formula) is varied most quickly: after accessing A[0][0], it next accesses A[0][1], then A[0][2], etc. As these addresses are immediate neighbours in memory, the loop has strong "locality of access". We will get 128 accesses to each 512-byte page of memory before moving on to another page, therefore at most one page fault for every 128 accesses (roughly 1,000,000 / 128, or about 7,800 faults in total).

In the second version of the loop, the first index (the x of the address formula) is varied most quickly: after accessing A[0][0], it next accesses A[1][0], then A[2][0], etc. As these addresses are not even near neighbours in memory (they are 4000 bytes apart), the loop has no "locality of access". We could get just one access to each 512-byte page of memory before moving on to another page, therefore up to one page fault for each access (as many as 1,000,000 in total).

So the first version would be much faster.

 

2. [The original question]

        a.

Define concisely the following terms:

Surface

Cylinder

Track

Sector

Block

Rotational Latency

Track-to-track seek time

Cluster

 

        b.

Given that a particular disc drive has the following parameters:

Rotation speed:                    7200 rpm

Track-to-track seek:             3 ms

Capacity:                               50 GB  (i.e. 50 × 2^30 bytes)

Cylinders:                              100,000

Surfaces:                                8

 

How many sectors will it have?

What is the average time taken to read a random block?

What are the best and worst possible times to read 28,000 bytes?

If the disc contains very sensitive secret information which you want to completely erase, how long would it take to overwrite every single byte?

 

        c.

Describe, illustrate, and explain the two level index file format.

For this format, what is:

                The maximum file capacity?

                The minimum (worst possible) disc use efficiency?

                The expected time to read the next byte (sequential access)?

                The expected time to read a random byte?

[End of question]

 

Part a is just basic definitions from your notes.

If in doubt, refer to the notes.

 

 

Part b:

The maximum capacity of a disc is surfaces times cylinders times sectors-per-track times bytes-per-block, so 50 × 2^30 = 8 times 100,000 times sectors times 512 (512 bytes per block is a standard assumption). A little calculation (and no calculator is needed) gives the number of sectors as 131.072 (i.e. just over 131 per track).

This tells us that either one of the supplied pieces of data was an approximation, or that the number of sectors can vary from track to track (perhaps not so many sectors on the inner tracks), but either way, there must be on average approximately 131 sectors per track.
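
A few lines of C reproduce the arithmetic (using 50 × 2^30 bytes and the assumed 512-byte blocks):

#include <stdio.h>

int main(void)
{ double capacity = 50.0 * 1024 * 1024 * 1024;      /* 50 x 2^30 bytes */
  double surfaces = 8, cylinders = 100000, bytes_per_block = 512;
  printf("sectors per track = %.3f\n",
         capacity / (surfaces * cylinders * bytes_per_block));   /* 131.072 */
  return 0; }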

 

Average time for a random read is rotational latency plus track seek time. Rotational latency is half of the time for one rotation. At 7200 revolutions per minute, one revolution takes 8.3 ms, so the rotational latency is just over 4 ms. That gives a random read time of just over 7 ms.

This relies on the assumption that the track-to-track seek time is a good average seek time. This is not always a reasonable assumption. The average of the track-to-track seek time and the full-stroke seek time might be better.
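
As a quick check of that arithmetic, under the same assumption about seek time:

#include <stdio.h>

int main(void)
{ double rotation = 60000.0 / 7200;        /* one rotation: 8.33 ms */
  double latency  = rotation / 2;          /* 4.17 ms               */
  double seek     = 3.0;                   /* track-to-track, in ms */
  printf("average random read = %.2f ms\n", latency + seek);   /* 7.17 */
  return 0; }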

 

28,000 bytes occupies 55 blocks (28,000 / 512 = 54.7, so 55 blocks once rounded up), which is less than one track, so the best possible time to read 28,000 bytes (which happens when those bytes are written in consecutive blocks on the same track, and the first of those blocks is just about to pass under the heads) is simply the time the disc takes to rotate by 55 blocks. We calculated about 131 blocks per track, so 55 blocks can be read in 55/131 of a rotation time, or about 3.5 ms.

Actually, it can be nearly 8 times faster. If the data occupies 7 consecutive blocks in the same positions on each of the 8 surfaces (56 block positions, enough to hold the 55 blocks), and we can read 8 blocks at a time (assuming our disc controller is capable of such a feat, which IDE and EIDE and ATA discs are not), the best time could be reduced to just under 0.5 ms.

The worst possible time would be based on successive blocks being at opposite ends of the disc and in exactly the most unfortunate sectors, and having a very unintelligent controller, so a full-stroke seek and a full rotation would be required before each of the 55 blocks. It would be acceptable to use the track-to-track seek time instead of the full-stroke seek time, giving 55 times (8.3 + 3) ms, or about 0.6 seconds.
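
All three estimates, reproduced in C under the same assumptions (131 blocks per track, and the track-to-track seek standing in for the unknown full-stroke seek):

#include <stdio.h>

int main(void)
{ double rotation = 60000.0 / 7200;             /* 8.33 ms                */
  int    blocks   = (28000 + 511) / 512;        /* 55 blocks, rounded up  */
  printf("best, one track:        %.2f ms\n", blocks * rotation / 131);  /* ~3.5  */
  printf("best, 8 surfaces:       %.2f ms\n", 7 * rotation / 131);       /* ~0.45 */
  printf("worst, seek per block:  %.0f ms\n", blocks * (3.0 + rotation)); /* ~620 */
  return 0; }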

 

With a reasonable disc controller, we would only have to wait for 100,000 times (one rotation time plus one track-to-track seek time) to overwrite every byte. Each track can be written in a single rotation of the disc, and a good controller would allow every track of a cylinder to be written at the same time. This would give 100,000 times (8.3 + 3) ms, or 1,130 seconds, or about 19 minutes.

However, most disc controllers could not perform a write on all tracks of a cylinder simultaneously, so the most likely time would be 8 times longer than that: about 9,000 seconds, or 2½ hours.
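
The same arithmetic in C:

#include <stdio.h>

int main(void)
{ double per_cylinder = 60000.0 / 7200 + 3.0;     /* rotation + seek, ms */
  double total = 100000 * per_cylinder / 1000.0;  /* in seconds          */
  printf("all 8 surfaces at once: %.0f s (about 19 minutes)\n", total);
  printf("one surface at a time:  %.1f hours\n", 8 * total / 3600.0);
  return 0; }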

 


Part c:

In the two-level-index file format, each file is represented by a header block which contains a few bytes of general data about the file, and then enough block pointers to fill the remainder of the header block. (Let's assume 100 such pointers).

Each of those pointers gives access to another block, completely filled with block-pointers. Assuming 512 bytes per block and 4 bytes per pointer, that gives 128 such pointers.

Each of those pointers gives access to a block of data, being the contents of the file.

This gives a maximum capacity of 100 × 128 × 512 = 6,553,600 bytes.
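
The layout can be sketched as a pair of C structures (assuming, as above, 512-byte blocks, 4-byte block pointers, and 100 pointers in the header; the names are made up for illustration):

#include <stdio.h>

typedef unsigned int block_ptr;    /* assumed to be 4 bytes */

typedef struct
{ char      general_data[512 - 100 * sizeof(block_ptr)];  /* 112 bytes left over */
  block_ptr index_blocks[100];     /* pointers to the pointer blocks             */
} header_block;

typedef struct
{ block_ptr data_blocks[512 / sizeof(block_ptr)];  /* 128 pointers to data blocks */
} index_block;

int main(void)
{ printf("header block: %d bytes, index block: %d bytes\n",
         (int) sizeof(header_block), (int) sizeof(index_block));   /* 512, 512   */
  printf("maximum capacity: %d bytes\n", 100 * 128 * 512);         /* 6,553,600  */
  return 0; }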

 

A file containing only one block's worth of real data would still have to occupy 3 blocks on the disc (the header, one block of pointers, and the block of data). This gives a worst-case efficiency of 33.3%.

 

Each time one byte of data is accessed, the whole block containing it must be read from the disc. Any sensible system will save that block in a buffer, so only one access in 512 (the number of bytes in a block) will need an actual disc access. The expected time for a random block read was previously calculated as 7 ms, so the average time to read one byte (with sequential access) would be 7/512 ms, or about 14 µs.

 

With random access, the chances of one byte being in the same block as the previously wanted one are negligible. Each access can be assumed to require a different block to be read from the disc. Naively, this would seem to require 2 disc accesses: one to read the block of pointers, then one to read the appropriate data block. However, with a two-level-index there are at most 100 pointer blocks, which would occupy a total of at most 51,200 bytes. It is very likely that a programmer who knows that random accesses on the file are to be performed would read all 100 pointer blocks into memory first, and keep them there. Then each random access will require only one disc access, for an average of 7mS.
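
A sketch in C of that strategy; read_block is a hypothetical stand-in for the real disc-read routine, stubbed out here so the sketch is self-contained:

#include <string.h>

typedef unsigned int block_ptr;

static void read_block(block_ptr b, void *buffer)   /* stub for the real disc read */
{ (void) b; memset(buffer, 0, 512); }

block_ptr pointer_cache[100][128];   /* every pointer block: 51,200 bytes in all */

void load_pointer_blocks(block_ptr header_pointers[100])
{ for (int i = 0; i < 100; i += 1)
    read_block(header_pointers[i], pointer_cache[i]); }

char read_random_byte(long position)
{ char buffer[512];
  /* each pointer block covers 128 x 512 = 65,536 bytes of file */
  block_ptr data_block = pointer_cache[position / (128 * 512)]
                                      [(position / 512) % 128];
  read_block(data_block, buffer);    /* the single disc access per random read */
  return buffer[position % 512]; }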