EEN512 Third Homework

There are two independent parts to this assignment, and two hand-in dates. The two problems require a bit of trickiness to come up with a good solution, and I can't expect you to see the right trick immediately.
        The first hand-in date (Tuesday 26th October) is for a reasonable attempt at a solution to both problems, but no tricks are expected. Do your best, but don't worry if your solution isn't very efficient.
        On 28th October in class, we will discuss problems of this kind, and you'll probably see some new ideas. The second hand-in date (Tuesday 2nd November) is for a good clean, efficient solution to both problems.
        On both hand-in dates, points are awarded for good programming style, an object-oriented design (as far as is reasonable), and working programs. On the second hand-in date, points will also be awarded for efficient algorithms.
        Be responsible! If your program gets into an infinite (or nearly infinite) loop, stop it (control-C). People who leave processes in the background consuming all the computer's power will be penalised.

Problem A

Imagine you are working for a large transportation or shipping company. You have to deliver a lot of packages of different sizes, all to the same place. Perhaps you are in charge of one of the major air routes; although packages go from all over Florida to all over Colorado, those packages will be taken to the hub at Miami airport, flown to the other hub at Denver airport, then sent on their way to their ultimate destinations. You are in charge of the Miami-to-Denver bottleneck.
        You usually have more packages than you could possible put on the aeroplane at the same time. The aeroplane has a maximum capacity that you know in advance, and must not exceed. Each package has an exact weight, and must not be split. You must make the best possible use of the plane's capacity.
        That means, given the plane's maximum capacity (in pounds) and the weights (in pounds) of a large number of packages, find the collection of packages (i.e. subset) that comes closest to the planes maximum capacity without exceeding it.
        The input to your program will be a list of numbers, one per line. At the end, there will be a line containing just the word "END". The first number is the maximum capacity of the plane. All the other numbers are the weights of packages.
        Your program should print out the list of weights of the packages that you choose to put on the plane.

Example input (unnaturally small to make it readable):
    100
    60
    10
    42
    93
    25
    END
This descibes a situation in which the plane can carry only 100 pounds, and you have 5 packages weighing 60, 10, 42, 93, and 25 pounds each. Unless I am mistaken (which could happen) the best combination is 60+10+25=95. I think all other combinations come to either less than 95 or more than 100.
So your program should print out:
    60
    10
    25
Of course, you could print those three numbers in any order.


Problem B

This time you are dealing with employee records. Employee records are always stored in plain text files, one employee per line, no blank lines. Each line contains a nine-digit social security number followed by a space then the employee's last name followed by a space and finally the employee's first name.
Example of an employee file:
    264827593 Soup Sally
    298345703 Iguana Ian
    059038385 Dinosaur Dan
    128723984 Slug Silvia
    387452352 Jones Jughead
    382984392 Denver Billy_Joe_Bob
You have to write a program that reads two such files, and prints out either (depending on the user's request) a list of those employees who appear in BOTH files, or a list of all employees who appear in EITHER file. So you are calculating the union and intersection of two large sets of data.
        You can not rely on the input files being sorted in any way. You do not need to produce the output an any particular order. For the union operation, make sure that you only print each employee once.

When your program is run, the user will type three lines of input; the first two lines will just contain file names (telling you the names of the two employee files to be read). The third line will contain just one word, either "UNION" or "INTERSECTION", telling you which operation to perform.
        The output of your program should simply be a list of employees in the same format as the input files.


Assistance

It is not cheating to use this code in programs you write for this class.

Reading a bunch of numbers, one per line, the secure way:
{ char line[100];
  int i, n;
  char *x;
  int numbers[100];
  int datasize=0;
  while (1)
  { printf("< ");
    x=fgets(line,99,stdin);
    if (x==NULL) break;
    if (strcmp(line,"END\n")==0) break;
    if (datasize>=100)
    { printf("Too much data\n");
      exit(1); }
    n=sscanf(line,"%d",&numbers[datasize]);
    if (n!=1)
    { printf("Error in input, not a number: %s", line);
      exit(1); }
    datasize+=1; }
  printf("Received %d numbers:\n", datasize);
  for (i=0; i<datasize; i+=1) 
    printf("%2d: %d\n", i, numbers[i]); }
Notes
1. fgets will never read more than the number of characters you specify, so you don't have to worry about the buffer (in this case, line) overflowing.
2. the third parameter to fgets can be any file that has been opened for input. stdin just refers to your terminal.
3. fgets returns NULL when you reach end of file.
4. fgets (unlike gets) gives you the whole line, including the end of line character \n at the end. Remember that when you do strcmps.
5. scanf, fscanf, and sscanf all return an integer; its value is the number of assignments the function successfully did, so you can use it to check for success or failure.
6. sscanf does not read any input; you have to give it a string that you have already read in.
7. sscanf will never modify in any way the string that it is reading from.
8. You must #include <stdio.h> for this.


Reading a load of strings:
{ char line[100];
  char *lines[100];
  char *x;
  int i, n;
  int datasize=0;
  while (1)
  { printf("< ");
    x=fgets(line,99,stdin);
    if (x==NULL) break;
    if (datasize>=100)
    { printf("Too much data\n");
      exit(1); }
    n=strlen(line)-1;
    line[n]=0;
    lines[datasize]=malloc(n);
    strcpy(lines[datasize],line);
    datasize+=1; }
  printf("Received %d strings:\n", datasize);
  for (i=0; i<datasize; i+=1) 
    printf("%2d: \"%s\"\n", i, lines[i]); }
Notes
1. To remove the \n from the end of a line, find the length of the line, and use that to overwrite the last character with a zero.
2. It is no good saving each line by saying lines[datasize]=line; you need to make a permanent copy of each string.
3. You must #include <stdio.h> and <stdlib.h> for this.

To read from a file
    FILE *f;
    f=fopen("filename","r");
    if (f==NULL)
    { printf("Couldn't open file \"filename\"\n");
      exit(1); }
Then just use f instead of stdin throughout the program. Don't forget to say fclose(f); at the end.

Or, reading from a file specified by an interactive user:
    char fname[100];
    int n;
    FILE *f;
    printf("File: ");
    fgets(fname,99,stdin);
    n=strlen(fname)-1;
    fname[n]=0;
    f=fopen(fname,"r");
    if (f==NULL)
    { printf("Couldn't open file \"%s\"\n", fname);
      exit(1); }



And they all lived happily ever after. The End.