Assignments 2 and 3
File-System Implementation, part 1
When a new operating system is being designed, very early in the process someone has to write
the functions that give access to the low-level, hardware-supported operations of the disc drives
(and other peripherals). The result will be something like the readblock and writeblock
functions described in "Low Level Disc Operations". You give them a 512-byte
buffer and tell them which disc block to use, and they will either read that block into the
buffer or write the buffer's contents to the block.
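To make that concrete, here is a minimal sketch of what a pair of block-level functions could look like. It is not the real readblock and writeblock: the names sim_readblock and sim_writeblock, the in-memory array standing in for the disc, the 64-block disc size, and the 1/0 return convention are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE 512
#define NUM_BLOCKS 64          /* assumed size of the pretend disc */

/* For this sketch the whole "disc" is just an array in memory. */
static unsigned char disc[NUM_BLOCKS][BLOCK_SIZE];

/* Copy one whole block from the disc into buf.
   Returns 1 on success, 0 if the block number is out of range. */
int sim_readblock(int blocknum, void *buf)
{
    assert(buf != NULL);
    if (blocknum < 0 || blocknum >= NUM_BLOCKS) return 0;
    memcpy(buf, disc[blocknum], BLOCK_SIZE);
    return 1;
}

/* Copy one whole block from buf onto the disc.
   Returns 1 on success, 0 if the block number is out of range. */
int sim_writeblock(int blocknum, const void *buf)
{
    assert(buf != NULL);
    if (blocknum < 0 || blocknum >= NUM_BLOCKS) return 0;
    memcpy(disc[blocknum], buf, BLOCK_SIZE);
    return 1;
}
```

Notice that nothing here knows about files: the caller chooses the block number, and always moves exactly 512 bytes.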
No normal programmer would ever use those functions.
No normal programmer would be allowed to, because they give direct access to the whole disc, so anyone
using them can see or overwrite any data anywhere. No normal programmer would want to use them either,
because it would require too much effort. These functions know nothing about files or
directories. The programmer has to remember which block contains what data, and can only transfer
data one 512-byte block at a time.
Someone has to write the functions that give more
convenient access. That person has to decide on the disc format (superblock contents and location,
how to keep track of which blocks are free, file formats, etc), write a function that "formats" a
new disc with this information, and design and implement functions for creating new files,
opening old files, reading data from files, and writing data to files, in a way that is reasonably
convenient to a programmer. That person is you.
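As a starting point for thinking about the disc format, here is a minimal sketch of one possible layout: a superblock in block 0, a free-block bitmap in block 1, and the root directory starting in block 2. Everything in it is an assumption, not a requirement: the struct fields, the magic number, the block numbers, the names format_disc and alloc_block, and the in-memory array standing in for the disc. Your own design may look quite different.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512
#define NUM_BLOCKS 64              /* assumed size of the pretend disc */
#define MAGIC      0x534D4653u     /* arbitrary value marking a formatted disc */

static unsigned char disc[NUM_BLOCKS][BLOCK_SIZE];   /* stand-in for the real disc */

struct superblock {
    uint32_t magic;           /* lets us recognise a formatted disc */
    uint32_t num_blocks;      /* total number of blocks on the disc */
    uint32_t bitmap_block;    /* block holding the free-block bitmap */
    uint32_t rootdir_block;   /* first block of the root directory */
};

/* Bitmap convention: one bit per block, 1 means "in use". */
static void mark_used(unsigned char *bitmap, int b)
{
    bitmap[b / 8] |= (unsigned char)(1u << (b % 8));
}

int is_used(const unsigned char *bitmap, int b)
{
    return (bitmap[b / 8] >> (b % 8)) & 1;
}

/* Write a fresh superblock, bitmap, and empty root directory. */
void format_disc(void)
{
    struct superblock sb = { MAGIC, NUM_BLOCKS, 1, 2 };

    assert(sizeof sb <= BLOCK_SIZE);
    memset(disc, 0, sizeof disc);          /* all-zero root directory = empty */
    memcpy(disc[0], &sb, sizeof sb);       /* block 0: superblock */
    mark_used(disc[1], 0);                 /* block 1: bitmap, with the three */
    mark_used(disc[1], 1);                 /* system blocks marked as taken   */
    mark_used(disc[1], 2);
}

/* Find the first free block, mark it used, and return its number (-1 if full). */
int alloc_block(void)
{
    int b;
    for (b = 0; b < NUM_BLOCKS; b += 1) {
        if (!is_used(disc[1], b)) {
            mark_used(disc[1], b);
            return b;
        }
    }
    return -1;
}
```

A bitmap is only one way to track free blocks; a linked list of free blocks is another, and the choice affects how expensive allocation and deletion become.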
You do not need to write all the C library
input and output functions such as fopen, fclose, getc, putc, printf, scanf, and fseek; there
is no need to go that far (although it isn't that difficult, and would earn you some extra credit).
You should think more in terms of the unix library input and output functions such as open, creat
(called create here), close, read, write, and lseek. You can use the man command (e.g. man write) to see in
detail what they do. The basic idea of these functions is:
create
    You give it a filename, and it creates a new empty file with that name, leaving it open
    for you to write data to.
open
    You give it a filename, and it looks for an already existing file with that name. If found,
    it opens the file, ready for you to read or write.

    Those two functions return a special value (usually a small integer) to refer to
    an open file. All the other functions take that integer as a parameter (in addition to the
    obvious parameters that describe what data you want to read or write) so that they know which
    file to work on. You should be allowed to have a reasonable number of files open at the same time
    (to allow copying from one file to another, or merging two files into one, etc).
close
    Used when you no longer need to use a file that was previously opened.
lseek
    You give it some description of a position (e.g. 0=beginning, 1=end) and it prepares the
    file so that the next read or write operation will happen from that position.
read
    The parameters are: file identifier, pointer to array, and amount of data. It attempts to read
    the requested amount of data into the provided array.
write
    The parameters are: file identifier, pointer to array, and amount of data. It attempts to write
    the requested amount of data from the provided array into the file.
These functions must be made to work in a useful way.
When a file is open, you must maintain some data concerning that file, in particular its
current position and a block-sized buffer. If the user opens a file, then uses the read function
on it twice, requesting 5 bytes each time, the function should first deliver the first 5 bytes of the file,
and next deliver the second 5 bytes. It must not go back to the beginning of the file, or skip ahead to
the next block, on each request. Also, the first request for 5 bytes will obviously require a block
to be read from the disc; that block must be kept, so that the second request for 5 bytes will not
result in the same block being read again. Buffering is an important part of efficiency.
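The following is a minimal sketch of that behaviour, not a finished design. It keeps a table of open files, each with a current position and a one-block buffer, and only goes to the disc when the byte it needs is not in the buffered block. The names demo_open, demo_read and demo_readblock, the in-memory disc, the blocks_read counter, and the simplifying assumption that a file occupies consecutive blocks are all invented for illustration.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE 512
#define NUM_BLOCKS 16
#define MAX_OPEN   8

static unsigned char disc[NUM_BLOCKS][BLOCK_SIZE];   /* stand-in for the real disc */
int blocks_read = 0;        /* counts disc reads, so buffering can be observed */

static void demo_readblock(int n, unsigned char *buf)
{
    memcpy(buf, disc[n], BLOCK_SIZE);
    blocks_read += 1;
}

struct openfile {
    int  in_use;
    int  first_block;       /* assumption: the file occupies consecutive blocks */
    long length;            /* file size in bytes */
    long position;          /* offset of the next byte to deliver */
    int  buffered_block;    /* which block of the file is in buffer, -1 = none */
    unsigned char buffer[BLOCK_SIZE];
};

static struct openfile table[MAX_OPEN];

/* Stand-in for open: a real one would find first_block and length by
   looking the filename up in a directory. Returns a small integer, or -1. */
int demo_open(int first_block, long length)
{
    int f;
    for (f = 0; f < MAX_OPEN; f += 1) {
        if (!table[f].in_use) {
            table[f].in_use = 1;
            table[f].first_block = first_block;
            table[f].length = length;
            table[f].position = 0;
            table[f].buffered_block = -1;
            return f;
        }
    }
    return -1;
}

/* Deliver up to amount bytes starting at the current position.
   Returns the number of bytes delivered; 0 means end-of-file or a bad f. */
int demo_read(int f, void *dest, int amount)
{
    unsigned char *out = dest;
    struct openfile *of;
    int done = 0;

    assert(dest != NULL);
    if (f < 0 || f >= MAX_OPEN || !table[f].in_use) return 0;
    of = &table[f];
    while (done < amount && of->position < of->length) {
        int blk = (int)(of->position / BLOCK_SIZE);
        int off = (int)(of->position % BLOCK_SIZE);
        if (blk != of->buffered_block) {        /* only now touch the disc */
            demo_readblock(of->first_block + blk, of->buffer);
            of->buffered_block = blk;
        }
        out[done] = of->buffer[off];
        done += 1;
        of->position += 1;
    }
    return done;
}
```

With a 10-byte file, two calls asking for 5 bytes each deliver bytes 0 to 4 and then bytes 5 to 9, and blocks_read ends up at 1 rather than 2. Returning the byte count also fits the "if (!ok)" tests used in the examples further down, since 0 then means nothing could be read.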
Your assignment is to implement those functions,
and provide a program that tests them convincingly. Your main guidelines in the design of these functions
should be "are these functions convenient enough for me to use them in my own programs instead of
the standard C or unix ones?" and "If I had to use these functions, what would I complain about
the most?"
You do not need to worry about directory hierarchies
or protections (yet), but you will need to design a simple format for a directory file. For this
assignment just assume that every file belongs in the root directory.
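One simple possibility for the directory file, sketched here under stated assumptions, is a block of fixed-size entries, each holding a name and enough information to find the file. The field names, the 23-character name limit, the convention that an empty name marks a free slot, and the functions dir_find and dir_add are all hypothetical; they are only meant to show how little is needed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512
#define MAX_NAME   23          /* assumed longest filename, not counting the zero */

struct direntry {
    char     name[MAX_NAME + 1];   /* zero-terminated; empty name = free slot */
    uint32_t first_block;          /* where the file's data starts */
    uint32_t length;               /* file size in bytes */
};                                 /* 32 bytes with typical compilers */

#define ENTRIES (BLOCK_SIZE / (int)sizeof(struct direntry))

/* Return the slot number of the entry called name, or -1 if there is none. */
int dir_find(const struct direntry *dir, const char *name)
{
    int i;
    assert(dir != NULL && name != NULL);
    for (i = 0; i < ENTRIES; i += 1)
        if (dir[i].name[0] != 0 && strcmp(dir[i].name, name) == 0)
            return i;
    return -1;
}

/* Add a new, empty file to the directory. Returns its slot number, or -1
   if the name is unusable, already present, or the directory is full. */
int dir_add(struct direntry *dir, const char *name, uint32_t first_block)
{
    int i;
    assert(dir != NULL && name != NULL);
    if (name[0] == 0 || strlen(name) > MAX_NAME) return -1;
    if (dir_find(dir, name) >= 0) return -1;
    for (i = 0; i < ENTRIES; i += 1) {
        if (dir[i].name[0] == 0) {
            strcpy(dir[i].name, name);
            dir[i].first_block = first_block;
            dir[i].length = 0;
            return i;
        }
    }
    return -1;
}
```

Fixed-size entries waste some space on short names, but they make searching and deleting trivial, and a whole number of them fits in one block.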
Warning: You will experience some very strange problems if you give your functions the same
names as functions in the C or unix library. A good way to avoid all risk of this is to add your
own initials to the front of each function name, for example SM_create instead of create.
Here are some small examples hinting at how your functions should behave:
Example 1: fill a file with user-provided text.
{ int f, ok;
  char line[100];
  f=SM_create("file1");
  if (f<0)
  { printf("failed to create file\n");
    exit(1); }
  printf("Enter contents of file, blank line at end...\n");
  while (1)
  { if (fgets(line, 100, stdin)==NULL) break;
      /* fgets reads a line from a file (stdin is your terminal), and
         stores it in a string, with a proper zero at the end. It also
         leaves the \n newline character in the string. It returns NULL
         when there is no more input, so we stop in that case too. */
    if (line[0]=='\n') break;
      /* if the \n is at the beginning, the line must have been empty */
    ok=SM_write(f, line, strlen(line));
      /* treat the string as an array to be sent to the file. strlen
         counts the number of chars in it, which is exactly the amount
         of data we want to write. */
    if (!ok)
    { printf("failed to write\n");
      break; } }
  SM_close(f); }
Example 2: Show the contents of a file (hope it's text), counting number of lines.
{ int f, ok, numlines=0;
  char line[1];
  f=SM_open("file1");
  if (f<0)
  { printf("failed to open file\n");
    exit(1); }
  while (1)
  { ok=SM_read(f, line, 1);
      /* Just ask for a single character. The place to put it must still
         be an array (or at least a pointer), but can be much smaller
         this time. */
    if (!ok) break;
      /* This time failure probably means end-of-file */
    putchar(line[0]);
      /* putchar is standard C function for writing character to terminal.
         Nothing to do with our functions. The only element of a
         one-element array is line[0]. */
    if (line[0]=='\n') numlines+=1; }
  printf("\n\nThe file had %d lines\n", numlines);
  SM_close(f); }
For extra credit, once you have written
the unix-level functions described above, also write another layer of C-level functions that behave
like the familiar fopen, fclose, getc, putc, printf, etc. These C-level functions should use your
unix-level functions to access the actual files. This is exactly what happens in a unix/C system.
File-System Implementation, part 2
This part is due a few weeks after the first part. It relies on the first part, so make sure you
have got it working. Your new assignment is to add proper support for directories.
The create and open functions
should accept filenames with paths attached to them. You can design the syntax of a path, but make
it something sensible. For example: open("bunny") would look for a file called "bunny" in
the root directory of your filesystem; open("sally/hw1/bunny") should look for a directory
called "sally" in the root directory, then look for a directory called "hw1" in sally, then look for
and open a file called "bunny" in the hw1 directory.
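Before a path can be followed like that, it has to be broken into its component names. Here is a minimal sketch of one way to do it, assuming "/" as the separator, a 23-character limit on each name, and at most 16 components; the function name split_path and those limits are illustrative choices, not part of the assignment.

```c
#include <assert.h>
#include <string.h>

#define MAX_PARTS 16       /* assumed limit on the depth of a path */
#define MAX_NAME  23       /* assumed longest name, not counting the zero */

/* Split "sally/hw1/bunny" into "sally", "hw1", "bunny".
   Returns the number of components, or -1 if the path is malformed
   (an empty component, a name that is too long, or too many components). */
int split_path(const char *path, char parts[MAX_PARTS][MAX_NAME + 1])
{
    const char *p = path;
    int n = 0;

    assert(path != NULL);
    while (*p != 0) {
        const char *slash = strchr(p, '/');
        size_t len = slash ? (size_t)(slash - p) : strlen(p);

        if (len == 0 || len > MAX_NAME || n == MAX_PARTS) return -1;
        memcpy(parts[n], p, len);
        parts[n][len] = 0;
        n += 1;
        if (slash == NULL) break;        /* that was the last component */
        p = slash + 1;
        if (*p == 0) return -1;          /* path ended with a slash */
    }
    return n;
}
```

For "sally/hw1/bunny" it returns 3. All components except the last must then name existing directories, starting the search from the root; the last one is the file itself.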
create and open should look
for directories that appear in the path of their filename parameter; if any of those directories
don't exist, the functions should fail. Do not create a directory just because a function looks
for it. You will need to add two new functions: makedir(path) which is used to create
new directories (that is all it does, and nothing else can do that job), and delete
which is used to delete files. You can either make delete capable of deleting directories
(so long as it checks they are empty first), or provide a special function capable only of
deleting directories, to keep things separate.
You will also need to be able to list the
contents of a directory (the equivalent of the "dir" or "ls" commands). There are a number of ways
of doing that. One would be to have a single function that takes the path to a directory as its
argument, and returns a single string containing the names of all the files in that directory
as its result. A more convenient way would be to have a set of three much simpler functions:
opendir is just like open except that it can only open directories. closedir
is just like close except that it can only close directories. (Of course, the normal
function open should not be capable of opening a directory). readdir simply
returns the name of the next file in a directory each time it is called.
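The three-function approach can be sketched as follows. This is only an illustration under several assumptions: the directory is taken to be an array of fixed-size entries already in memory, an empty name marks an unused slot, and demo_opendir takes that array directly where a real version would take a path and read the directory from disc. The names demo_opendir, demo_readdir and demo_closedir are invented to stay clear of the library functions.

```c
#include <assert.h>
#include <string.h>

#define ENTRIES       16   /* assumed number of slots in a directory */
#define MAX_NAME      23   /* assumed longest name, not counting the zero */
#define MAX_OPEN_DIRS 4

struct direntry {
    char name[MAX_NAME + 1];         /* empty name marks an unused slot */
};

struct dirstate {
    int in_use;
    const struct direntry *entries;  /* the directory's contents */
    int next;                        /* slot to examine on the next call */
};

static struct dirstate open_dirs[MAX_OPEN_DIRS];

/* Returns a small integer identifying the open directory, or -1. */
int demo_opendir(const struct direntry *entries)
{
    int d;
    assert(entries != NULL);
    for (d = 0; d < MAX_OPEN_DIRS; d += 1) {
        if (!open_dirs[d].in_use) {
            open_dirs[d].in_use = 1;
            open_dirs[d].entries = entries;
            open_dirs[d].next = 0;
            return d;
        }
    }
    return -1;
}

/* Copies the next filename into name and returns 1,
   or returns 0 when there are no more names (or d is not open). */
int demo_readdir(int d, char *name)
{
    struct dirstate *ds;
    assert(name != NULL);
    if (d < 0 || d >= MAX_OPEN_DIRS || !open_dirs[d].in_use) return 0;
    ds = &open_dirs[d];
    while (ds->next < ENTRIES) {
        const struct direntry *e = &ds->entries[ds->next];
        ds->next += 1;
        if (e->name[0] != 0) {           /* skip unused slots */
            strcpy(name, e->name);
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if d was open and is now closed, 0 otherwise. */
int demo_closedir(int d)
{
    if (d < 0 || d >= MAX_OPEN_DIRS || !open_dirs[d].in_use) return 0;
    open_dirs[d].in_use = 0;
    return 1;
}
```

The only state an open directory needs is "which slot comes next", which is why this version is so much simpler than building one long string of names.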
Remember the warning from the first part. Think of
new function names. opendir, closedir, etc., all appear in the standard libraries.
Example: This program should print out "file1" and "file2" then stop.
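{ int f, d, ok;
  char fname[100];
    /* The SM_ prefix follows the warning above: names such as close,
       opendir, and readdir already exist in the standard libraries. */
  ok=SM_makedir("sally/hw1");
    /* sally must already exist if makedir follows the same rule as
       create and open */
  if (!ok)
  { printf("failed to create directory\n");
    exit(1); }
  f=SM_create("sally/hw1/file1");
  SM_close(f);
  f=SM_create("sally/hw1/file2");
  SM_close(f);
  d=SM_opendir("sally/hw1");
  while (1)
  { ok=SM_readdir(d, fname);
    if (!ok) break;
    printf("\"%s\"\n", fname); }
  SM_closedir(d); }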
Provide a program that convincingly tests the
functionality of your creations.