Mergesort notes

Notes on Mergesort, relating to classes 29/8-5/9/2000

This is the merge function that we worked out to start with in class:

void Merge(int A[], int Astart, int Aend,
           int B[], int Bstart, int Bend,
           int C[], int Cstart)
{ while (1)
  { if (Astart>Aend && Bstart>Bend)
      break;
    if (Astart<=Aend && Bstart<=Bend)
    { if (A[Astart]<B[Bstart])
      { C[Cstart]=A[Astart]; Cstart+=1; Astart+=1; }
      else
      { C[Cstart]=B[Bstart]; Cstart+=1; Bstart+=1; } }
    else if (Astart<=Aend && Bstart>Bend)
    { C[Cstart]=A[Astart]; Cstart+=1; Astart+=1; }
    else if (Astart>Aend && Bstart<=Bend)
    { C[Cstart]=B[Bstart]; Cstart+=1; Bstart+=1; } }

Here is an equivalent version that may be a little more efficient:

void Merge(int A[], int Astart, int Aend,
           int B[], int Bstart, int Bend,
           int C[], int Cstart)
{ while (Astart<=Aend && Bstart<=Bend)
  { if (A[Astart]<B[Bstart])
    { C[Cstart]=A[Astart];
      Astart+=1; }
    else
    { C[Cstart]=B[Bstart];
      Bstart+=1; }
    Cstart+=1; }
  while (Astart<=Aend)
  { C[Cstart]=A[Astart];
    Astart+=1;
    Cstart+=1; }
  while (Bstart<=Bend)
  { C[Cstart]=B[Bstart];
    Bstart+=1; 
    Cstart+=1; } }

This is an alternate version that some may prefer. They are the same in all important respects:

void Merge(int *A, int Astart, int Aend,
           int *B, int Bstart, int Bend,
           int *C, int Cstart, int Cend)
{ int Aptr=Astart, Bptr=Bstart, Cptr;
  for (Cptr=Cstart; Cptr<=Cend; Cptr+=1)
  { if (Aptr>Aend)
    { C[Cptr]=B[Bptr]; Bptr+=1; }
    else if (Bptr>Bend)
    { C[Cptr]=A[Aptr]; Aptr+=1; }
    else if (A[Aptr]<B[Bptr])
    { C[Cptr]=A[Aptr]; Aptr+=1; }
    else
    { C[Cptr]=B[Bptr]; Bptr+=1; } } }

it merges the elements A[Astart]...A[Aend] of array A (which must already be in order) with the elements B[Bstart]...B[Bend] of array B (which must also already be ordered) putting them into elements C[Cstart]...C[Cend] of array C, producing an ordered result.

C[Cstart]...C[Cend] must not overlap either A[Astart]...A[Aend] or B[Bstart]...B[Bend].

We will also need a little function that copies from one array to another. Its just used at the end to tidy up the loose ends.

void Copy(int *A, int Astart, int Aend,
          int *C, int Cstart, int Cend)
{ int i;
  int num=Aend-Astart+1;
  for (i=0; i<num; i+=1)
    C[Cstart+i]=A[Astart+i]; }

Now, remember the pattern behind the sorting method. Itv relies on the fact that an array with only one thing in it can not possibly be un-sorted. First it treats the whole array A from A[0] to A[N-1] as if it were N separate tiny little (pre-sorted) arrays of one element each. These one element arrays are merged with their neighbours to produce a sequence of two element arrays in B.

A[0] and A[1] are merged to produce B[0]...B[1];
A[2] and A[3] are merged to produce B[2]...B[3];
A[4] and A[5] are merged to produce B[4]...B[5];
A[6] and A[7] are merged to produce B[6]...B[7];
A[8] and A[9] are merged to produce B[8]...B[9];
etc.

Then we swap arround the arrays (T=A; A=B; B=T), which is why I changed the declarations to make A and B be pointers, and start again. Except that this time we merge the slightly larger two element arrays that were produced in the previous step, noting that merging preserves sortedness, so these will be sorted two-element arrays, and we'll be making sorted four-element arrays from them.

A[0]...A[1] and A[2]...A[3] are merged to produce B[0]...B[3];
A[4]...A[5] and A[6]...A[7] are merged to produce B[4]...B[7];
A[8]...A[9] and A[10]...A[11] are merged to produce B[8]...B[11];
A[12]...A[13] and A[14]...A[15] are merged to produce B[12]...B[15];
etc.

And we keep on the same way, swapping A and B, and then merging in ever increasing block sizes until there's nothing left to do.

A[0]...A[3] and A[4]...A[7] are merged to produce B[0]...B[7];
A[8]...A[11] and A[12]...A[15] are merged to produce B[8]...B[15];
A[16]...A[19] and A[20]...A[23] are merged to produce B[16]...B[23];
etc.

So the main sorting function has two loops. The main one varies the block size from 1 up to whatever power of two is just less that the whole array size.

An inner loop controls which pair of sublists are about to be merged, computing values in integer variables to mark the beginnings and ends of each of the sublists. After merging all the pairs of sublists, we may be left with a single partial sublist all alone at the end. It doesn't need to be merged with anything (because there is nothing left to merge it with), but it does need to be copied into the second array (I think I forgot about that in class). Giving us this:

void MergeSort(int *A, int N, int *B)
{ int *T;
  int *orig=A;
  int blocksize, firstblockstart, i, lastblockend;
  for (blocksize=1; blocksize<N; blocksize*=2)
  { for (firstblockstart=0;
         firstblockstart+blocksize<N;
         firstblockstart+=2*blocksize)
    { int firstblockend=firstblockstart+blocksize-1;
      int secondblockstart=firstblockend+1;
      int secondblockend=secondblockstart+blocksize-1;
      if (secondblockend>=N) secondblockend=N-1;
      Merge(A,firstblockstart,firstblockend,
            A,secondblockstart,secondblockend,
            B,firstblockstart);
      lastblockend=secondblockend; }
    Copy(A,lastblockend+1,N-1,
         B,lastblockend+1,N-1);
    T=A;
    A=B;
    B=T; }
  if (A!=orig)
    Copy(A,0,N-1,orig,0,N-1); }

The standard library contains a function called "mergesort", so I called this one "MergeSort" (with capitals) to avoid a clash.

A is the array of integers to be sorted, N is its size. B is an array of the same size, only used as temporary workspace (remember you can't merge two lists into themselves, you need a second array).

Each stage of MergeSort merges lists from A into B, then swaps around A and B ready for the next time round the loop. If the algorithm finishes after an odd number of times around the loop, our sorted data will still be in the temporary array (passed in as B) and has to be copied back into the original (A) before the function can exit.

In order to test this sort of function, we want to be able to sort big arrays without having to read through 1000's of numbers to make sure it sorted them correctly. A little function can be made to do the testing for us. It just makes sure there aren't any out-of-order numbers in an array:

int isordered(int *A, int N)
{ int i;
  for (i=0; i<N-1; i+=1)
    if (A[i]>A[i+1])
      return 0;
  return 1; }

A whole program to demonstrate or test MergeSort would now be trivial to construct. It would be sensible to generate an array full of random numbers, make sure that they are not (by a very unlikely but still possible chance) already sorted, sort them, and finally make sure that they are now all in order.

We need to include stdio.h to be able to print things, and stdlib.h because that's where the random number generator is.

#include <stdio.h>
#include <stdlib.h>

#define N 100

void main(void)
{ int A[N], B[N], i;

  srandomdev();           /* randomises the random number generator */
  for (i=0; i<N; i+=1)
    A[i]=random()%1000;   /* make a three digit number */

  printf("Before sort, random numbers ");  
  if (isordered(A,N))
  { printf("ALREADY SORTED!!\nSorry, try again.\n");
    exit(1); }
  else
    printf("not sorted\n");

  MergeSort(A, N, B);

  printf("After sort, numbers ");  
  if (isordered(A,N))
    printf("Sorted!\n");
  else
    printf("***NOT*** sorted.\n"); }

And they all lived happily ever after.