Notes on Mergesort, relating to classes 27/1-1/2/2000

This is the merge function that we worked out to start with:
void Merge(int *A, int Astart, int Aend,
           int *B, int Bstart, int Bend,
           int *C, int Cstart, int Cend)
{ int Aptr=Astart, Bptr=Bstart, Cptr;
  for (Cptr=Cstart; Cptr<=Cend; Cptr+=1)
  { if (Aptr>Aend)
    { C[Cptr]=B[Bptr]; Bptr+=1; }
    else if (Bptr>Bend)
    { C[Cptr]=A[Aptr]; Aptr+=1; }
    else if (A[Aptr]<B[Bptr])
    { C[Cptr]=A[Aptr]; Aptr+=1; }
    else
    { C[Cptr]=B[Bptr]; Bptr+=1; } } }
it merges the elements A[Astart]...A[Aend] of array A (which must already be in order) with the elements B[Bstart]...B[Bend] of array B (which must also already be ordered) putting them into elements C[Cstart]...C[Cend] of array C, producing an ordered result.

C[Cstart]...C[Cend] must not overlap either A[Astart]...A[Aend] or B[Bstart]...B[Cstart].

The number of elements in C must be the sum of the number in A and B (i.e. (Cend-Cstart+1)=(Aend-Astart+1)+(Bend-Bstart+1)) but A and B may be different sizes.


Remember all that?

One thing to note is that I have changed the declarations of the array parameters from int A[] to int *A. Its a subtle difference, which we'll address in class soon.

And there's also a little function that copies from one array to another. Its just used at the end to tidy up the loose ends.

void Copy(int *A, int Astart, int Aend,
          int *C, int Cstart, int Cend)
{ int i, num=Aend-Astart+1;
  for (i=0; i<num; i+=1)
    C[Cstart+i]=A[Astart+i]; }
Now, remember the pattern behind the sorting method. Itv relies on the fact that an array with only one thing in it can not possibly be un-sorted. First it treats the whole array A from A[0] to A[N-1] as if it were N separate tiny little (pre-sorted) arrays of one element each. These one element arrays are merged with their neighbours to produce a sequence of two element arrays in B. Then we swap arround the arrays (T=A; A=B; B=T), which is why I changed the declarations to make A and B be pointers, and start again. Except that this time we merge the slightly larger two element arrays that were produced in the previous step, noting that merging preserves sortedness, so these will be sorted two-element arrays, and we'll be making sorted four-element arrays from them. And we keep on the same way, swapping A and B, and then merging in ever increasing block sizes until there's nothing left to do. So the main sorting function has two loops. The main one varies the block size from 1 up to whatever power of two is just less that the whole array size.

An inner loop controls which pair of sublists are about to be merged, computing values in integer variables to mark the beginnings and ends of each of the sublists. After merging all the pairs of sublists, we may be left with a single partial sublist all alone at the end. It doesn't need to be merged with anything (because there is nothing left to merge it with), but it does need to be copied into the second array (I think I forgot about that in class). Giving us this:
void MergeSort(int *A, int N, int *B)
{ int *T;
  int *orig=A;
  int blocksize, firstblockstart, i, lastblockend;
  for (blocksize=1; blocksize<N; blocksize*=2)
  { for (firstblockstart=0; firstblockstart+blocksize<N; firstblockstart+=2*blocksize)
    { int firstblockend=firstblockstart+blocksize-1;
      int secondblockstart=firstblockend+1;
      int secondblockend=secondblockstart+blocksize-1;
      if (secondblockend>=N) secondblockend=N-1;
      Merge(A,firstblockstart,firstblockend,
            A,secondblockstart,secondblockend,
            B,firstblockstart,secondblockend);
      lastblockend=secondblockend; }
    Copy(A,lastblockend+1,N-1,
         B,lastblockend+1,N-1);
    T=A;
    A=B;
    B=T; }
  if (A!=orig)
    Copy(A,0,N-1,orig,0,N-1); }
The standard library contains a function called "mergesort", so I called this one "MergeSort" (with capitals) to avoid a clash.

A is the array of integers to be sorted, N is its size. B is an array of the same size, only used as temporary workspace (remember you can't merge two lists into themselves, you need a second array).

Each stage of MergeSort merges lists from A into B, then swaps around A and B ready for the next time round the loop. If the algorithm finishes after an odd number of times around the loop, our sorted data will still be in the temporary array (passed in as B) and has to be copied back into the original (A) before the function can exit.

In order to test this sort of function, we want to be able to sort big arrays without having to read through 1000's of numbers to make suer it sorted them correctly. A little function can be made to do the testing for us. It just makes sure there aren't any out-of-order numbers in an array:
int isordered(int *A, int N)
{ int i;
  for (i=0; i<N-1; i+=1)
    if (A[i]>A[i+1])
      return 0;
  return 1; }
A whole program to demonstrate or test MergeSort would now be trivial to construct. It would be sensible to generate an array full of random numbers, make sure that they are not (by a very unlikely but still possible chance) already sorted, sort them, and finally make sure that they are now all in order.

We need to include stdio.h to be able to print things, and stdlib.h because that's where the random number generator is.
#include <stdio.h>
#include <stdlib.h>

#define N 100

void main(void)
{ int A[N], B[N], i;

  srandomdev();        /* randomises the random number generator */
  for (i=0; i<N; i+=1)
    A[i]=random()%1000;   /* make a three digit number */

  printf("Before sort, random numbers ");  
  if (isordered(A,N))
    printf("sorted\n");
  else
    printf("not sorted\n");

  MergeSort(A, N, B);

  printf("After sort, numbers ");  
  if (isordered(A,N))
    printf("Sorted!\n");
  else
    printf("***NOT*** sorted.\n"); }


And they all lived happily ever after.