Assignment 4

Program quicksort and mergesort. Make the code as fast as you can within reason. Working correctly is of course more important than running quickly.

Data for sorting will be provided in files that follow this format exactly:
19670607 114680858 Matilda Vincent MI 71734
19471024 114930037 Desdemona Hanover ID 69743
19790110 115550206 Xanadu Perlman ND 12193
19630921 116520629 Alexander Hall SD 94976
19301016 117050976 David Lamprey GA 50895
19650202 119610646 Thomas Porlock IL 83582
19621126 120330928 Cary Cartman NC 90387
19620411 122460462 Bella Oldman SD 94495
19220213 123040628 Elizabeth Watson NC 90369
19230308 123580905 Gustav Hornswaggle MN 37568
19840613 125040813 Godfrey Tumor OR 55645
19580903 125610677 Gustav Trentham IA 73590
19521219 126470499 Justin Oddly MA 33458
19300616 127700250 Ursula Farnes LA 74163
19791114 129540334 Betty Eaton NH 25807
19361114 130020412 Maggie McIntosh NV 16628
19631118 132680826 Raul Kringle NJ 53633
19490427 135040001 Arthropod Gravedigger ID 68521
19561012 135590854 Aloysius Pornman MO 27649
19521224 136870683 Alexandra Hanover LA 74067
Each line represents a data record for a person. The fields are
  1. Date of birth, YYYYMMDD,
  2. Social security number,
  3. First name,
  4. Last Name,
  5. State of residence,
  6. Zip code.
The sorting order is to be alphabetical, A to Z, based primarily on last name. When two people have the same last name, use first name. If two people have the same first and last names, use date of birth.

Your program should take three inputs from the command line:
  1. The letter Q or M to indicate Quicksort or Mergesort,
  2. The number of lines of data to read
    (so you can easily do small tests from large data files),
  3. The name of the file to read the data from.
These data files are on rabbit, ready for use: database1.txt, database2.txt, database3.txt, database5.txt, database10.txt, database20.txt, database30.txt, database50.txt, database100.txt.
The file called databaseN.txt contains N thousand data records, and its full path on rabbit is /home/www/class/een318/databaseN.txt.

The sorted results should be output (in the same format) to a file called sorted.txt.

For example, if you run the program by typing
     a.out Q 100 data5.txt
It should read only the first 100 lines from data5.txt and use Quicksort to sort them.

When you are experimenting and measuring the run time, here is a convenient but mysterious function to use:
#include <time.h>
#include <sys/resource.h>

double get_cpu_time()
{ struct rusage ruse;
  getrusage(RUSAGE_SELF, &ruse);
  return ruse.ru_utime.tv_sec+ruse.ru_utime.tv_usec/1000000.0 + 
         ruse.ru_stime.tv_sec+ruse.ru_stime.tv_usec/1000000.0; }
Each time it is called, it returns the amount of CPU time your program has used so far, in seconds, as a double. So call it once before sorting and again after, and look at the difference.

The resolution of the clock is only around 10mS, so you will see nothing for very small sorts.

Make sure you do not include the time taken to read the data or to write the results in your measurements.