Low Level Disc Operations
The best way to learn how file systems work is
to implement one yourself, using only the low-level operations supported by the discs themselves:
reading and writing a disc block (512 unformatted bytes of data) at a particular location on the disc
to or from the CPU's memory.
Naturally, students on a shared computer can't
be allowed access to these functions for the computer's real discs; you could easily read, modify,
or erase all other users' data by accident, or on purpose. So we have a simulator. This small
collection of functions allows you to create a pretend disc drive (which is really just a large
file in your own directory) and access it in the same way that the real operating system accesses
the real disc drives. Of course, you can't create a pretend disc quite as big as real disc drives
are today; the pretend disc you create has to fit in your quota of real disc space. That is not a
problem, size is not important. A large file system should work in the same way as a small one.
You can download the C++ source and header files
directly from here. They are quite simple, so you should have no trouble using
them anywhere. The files on this server have names ending in ".txt", so that your browser won't
try doing anything strange with them. You should change them to ".cpp" and ".h" before use.
fakedisc.h
the header file (about 800 bytes).
fakedisc.cpp
the C++ source file (about 1400 bytes).
If you wish to run this software on your
own computer you may have to make some very small modifications. On some unix systems, required system library .h files
may be in different places, requiring you to edit the #includes in the source file. If you can't
find (for example) file.h, the unix command "find /usr/include -name file.h -print"
should tell you where it is. When a filename appears in <pointy brackets> after #include, it
is automatically assumed to be in /usr/include, so leave out that part of the path.
Using the Simulation Software
This system assumes the standard of 512 bytes
per block. The .h file first defines two types: byte to be equivalent to
unsigned char, giving a range of 0-255, avoiding the confusion caused by two's
complement representations, and block to be equivalent to an array of 512
bytes.
Before doing anything with a filesystem on a pretend
disc, you will have to create the pretend disc as a file in your own directory. The function
createdisc does this. It takes two parameters; the first is a string giving the name
of the new disc, the second is an integer saying how big it is to be, in blocks. The name
you provide is used both as the name of the file created in your directory to hold the pretend disc,
and the name that other functions use to refer to this new disc. If a file of that name already
exists, it will be overwritten and destroyed, so be careful. The createdisc function
returns 1 if the pretend disc was successfully created, 0 if it failed. A newly created disc
is given the equivalent of a low-level format. All its blocks are filled with 512 zero bytes.
Before you can do anything with a pretend disc,
your program has to open it (this corresponds to the physical operation of mounting a disc). The
function mountdisc takes just one parameter: the name of an already created
pretend disc. You can not do anything with a pretend disc until it has been mounted. The function
returns -1 if it fails. If it succeeds, it returns a small non-negative integer, called
the disc identification number, that you use to identify the open disc when using other functions.
This is because it is possible to be using more than one pretend disc at the same time, so when
you want to read a block, you must be able to say which disc it should be read from.
The discsize function may be used
to find the number of blocks in a mounted pretend disc. Its one parameter is a disc identification
number; its result is an integer giving the number of blocks. Remember that block numbers
range from 0 to one-less-than this result.
Before a program that has been using pretend discs
exits, it should dismount those discs. If you fail to dismount a disc, changes you made to its
contents may be lost. The function dismountdisc takes one parameter: the
identification number of the disc to be dismounted.
A new disc will have nothing but zeros in it.
To write information to a disc, you must first construct an array of 512 bytes, containing exactly
the data that is to be written. A block is the smallest unit of a disc that can be read or
written in a single operation. The function writeblock takes three parameters:
a disc identification number (saying which pretend disc to write to), a block number (saying
which block on that disc to overwrite), and (a pointer to) the block itself. Remember that in C++,
the name of an array is automatically considered to be a pointer, so you won't
need the &. This does not apply to objects or structs, for which you will need an
&. Blocks on a disc are numbered from 0 up; if you specified a disc size of 200
blocks when you used createdisc, block numbers will be integers in the range 0 to 199.
To read a block from a pretend disc, the function
readblock works in almost exactly the same way as writeblock; it takes
exactly the same parameters: a disc identification number, a block number, and (a pointer to) a
block of memory. This time, the block/array of memory that you pass in is overwritten with the
512 bytes read from the disc. It is valid to read from a block that you have not previously written to.
Remember that this is simulating a real disc; all blocks exist right from the beginning.
Both writeblock and readblock
return 1 to signify success and 0 for failure. The two functions are declared as taking a void*
instead of a block or block* as their third parameter.
Example
If for some unimaginable reason, you wanted to
create a new pretend disk with just 20 blocks, and write the string "hello!" into the first
block, and the string "TEST" into the remaining 19, then read back block number 12 to check that
it still says "TEST", this piece of program would do the job:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "fakedisc.h"
void copy(void * destination, const void * source, int numbytes)
{ byte *d=(byte *)destination;
byte *s=(byte *)source;
for (int i=0; i<numbytes; i+=1)
d[i]=s[i]; }
void main(void)
{ block buffer;
while (1)
{ printf("\n1: create new disc\n");
printf("2: write test data to disc\n");
printf("3: read from a single disc block\n");
printf("choice? ");
int ch;
scanf("%d", &ch);
if (ch==1)
{ int ok=createdisc("myfirstdisc", 20);
if (ok)
printf("OK\n");
else
{ fprintf(stderr, "Could not create the disc\n");
exit(1); } }
else if (ch==2)
{ int discid=mountdisc("myfirstdisc");
if (discid<0)
{ fprintf(stderr, "Could not access the disc\n");
exit(1); }
int nblocks=discsize(discid);
if (nblocks>0)
printf("disc opened with %d blocks\n", nblocks);
else
{ fprintf(stderr, "There is something the disc\n");
exit(1); }
for (int i=0; i<512; i+=1)
buffer[i]=0;
copy(buffer, "hello!", 6);
int ok=writeblock(discid, 0, buffer);
if (!ok)
{ fprintf(stderr, "Write to block 0 failed\n");
exit(1); }
for (int i=0; i<512; i+=1)
buffer[i]=0;
copy(buffer, "TEST", 4);
for (int blocknum=1; blocknum<20; blocknum+=1)
{ ok=writeblock(discid, blocknum, buffer);
if (!ok)
{ fprintf(stderr, "Write to block %d failed\n", blocknum);
exit(1); } }
dismountdisc(discid);
printf("OK\n"); }
else if (ch==3)
{ int blocknum;
printf("Which block? ");
scanf("%d", &blocknum);
int discid=mountdisc("myfirstdisc");
if (discid<0)
{ fprintf(stderr, "Could not access the disc\n");
exit(1); }
for (int i=0; i<512; i+=1)
buffer[i]='X';
int ok=readblock(discid, blocknum, buffer);
if (!ok)
{ fprintf(stderr, "Read from block %d failed\n", blocknum);
exit(1); }
printf("Block %d contains the text \"%s\"\n", blocknum, buffer);
dismountdisc(discid);
printf("OK\n"); } } }
Complex Structures
When you design your disc structures, you will probably
come up with things that are not convenient to convert item-by-item into an array of 512 unsigned
chars. You don't need to. Just design a C++ struct that contains all the information you want to put
in a block, in whatever format you choose. Make sure its size is exactly 512 bytes (using C++'s sizeof
function), then pass a pointer to it as the third parameter to readblock
or writeblock. The declaration of the third parameter as a void*
means that it will accept any pointer type. You just have to make sure that what you give
it is big enough. For example:
struct superblock
{ char discname[20];
int rootdiraddr;
int numblocks;
short int somethingelse;
byte padding[482]; };
void main(void)
{ int discid;
superblock sss;
if (sizeof(superblock)==512)
printf("struct superblock is 512 bytes long\n");
else
{ printf("struct superblock is NOT 512 bytes long\n");
exit(1); }
discid=mountdisc("myfirstdisc");
if (discid<0) exit(1);
strcpy(sss.discname, "DISK1");
sss.rootdiraddr=123;
sss.numblocks=20;
sss.somethingelse=987;
writeblock(discid, 0, &sss);
dismountdisc(discid); }
We can then check that this worked correctly by reading the information back thus:
struct superblock
{ /* as before */ };
void main(void)
{ int discid;
superblock sss;
/* as before */
readblock(discid, 0, &sss);
printf("\"%s\" %d %d %d\n", sss.discname, sss.rootdiraddr,
sss.numblocks, sss.somethingelse);
/* as before */ }