Skip to main content

How MergeSort works

· 5 min read
Strider

Mergesort is a sorting method to sort relatively fast huge amounts of data. Mergesort belongs to the divide & conquer algorithms with a big notation of nlog(n)n log(n)

The idea behind this is that an array is always divided into 2 halves down to the smallest element, and these are then sorted together.

The picture below shows the flow of Mergesort and clarifies the idea behind it.

dia.png

The procedure is such that the algorithm gets some array of elements and then always splits them into two halves. The whole thing happens recursively. In the first stage we have divided the array into 2 seperated arrays. These two partial arrays are now also divided into two halves, as you can see in stage 2. The whole thing is now repeated until you only have partial arrays consisting of one element.

If this is the case, the sorting phase comes where we merge the individual arrays and compare each element with each other, so that each subarray always goes from small to large. In Sort 1 you can see that the elements 3 and 5 have been compared and merged. The whole thing continues (Sort 2-4) until we have an array with all elements which are sorted from small to large.

void MergeSort(vector &a, int low, int high)
{
if(low < high)
{
int q = (low + high) / 2;
MergeSort(a, low, q);
MergeSort(a, q + 1, high);
_mergesort(a, low, q, high);
}
}

We look at the head of the procedure. Here we pass once the array to be sorted and low and high. Low and High are both points from where to where which parts of the array go. Low and High are nothing else than "from A to B". They are always compared with each other on each call. The procedure is only executed if low is smaller than high which is logical since it makes no sense to go from BB to AA since there is no negative index here either with the arrays.

Then another variable qq is created which represents the average of low and high. Our qq is used as a breakpoint to split the array into two parts, as you can see in the two MergeSort calls. The first call MergeSort(a, low, q);, is passed the array a with the position markers where exactly the array starts and ends at the call. The same is with the call MergeSort(a, q + 1, high); except that here q+1q+1 is taken as the start position and stops at high. Then the method _mergesort is called with the parameters aa, lowlow, qq, and highhigh. This piece of code is the divide-part of the algorithm.

void _mergesort(vector &arr, int p, int q, int r)
{
int n1 = q - p + 1;
int n2 = r - q;


int left[n1], right[n2];

for(int i = 0; i < n1; i++)
{
left[i] = arr[p + i];
}

for(int i = 0; i < n2; i++)
{
right[i] = arr[q + 1 + i];
}

int i = 0, j = 0, k = p;

while(i < n1 && j < n2)
{
if(left[i] <= right[j])
{
arr[k] = left[i];
i++;
}
else
{
arr[k] = right[j];
j++;
}

k++;
}

while(i < n1)
{
arr[k] = left[i];
i++;
k++;
}

while(j < n2)
{
arr[k] = right[j];
j++;
k++;
}
}

Let's look at the merge part of the procedure. The method _mergesort is passed the parameters arrarr, pp, qq, rr. The parameter arrarr is the array, parameter pp is our low which we got from the head of the procedure. Parameter qq is our breakpoint or arithmetic mean which is the average of highhigh and lowlow. The parameter rr is our high.

What is done here first is that our two breakpoints are created where the array arr is divided. The two breakpoints are n1n_1 and n2n_2. After that the two subarrays (left & right) are created which are only created temporarily to store both subarrays in between. These two arrays are now filled with the elements from the array arrarr. The result looks like in stage 1.

Now two more variables are created which are responsible for both subarrays ii, jj. These two variables are used to go through both subarrays. The value from the array leftleft at position ii is compared with the value from the array rightright to see if the value from leftleft is less than or equal to the value from the array rightright. If this is the case, no exchange takes place and the value from the array leftleft is assigned at position kk of the origninal array arrarr. The value kk is equal to the low-position. But if it is not the case then the value from the array rightright is smaller than leftleft and is assigned to the position kk of the array arrarr instead. In both cases the respective index ii or jj is increased by one. The whole thing is now done as long as ii is smaller than n1n_1 and jj is smaller than n2n_2.

But it is possible that there are still some elements left over. For this there are the two while-loops which copy the remaining elements into the array arrarr.

All this is done until the complete input array is sorted, then you get the display as above.

MergeSort(array,0 , array.size() - 1);

Suppose we have an array array with some elements that need to be sorted. Then we have to call the head of MergeSort which gets as first parameter the array arrayarray, the low position 00 and the n1n-1 of elements of the array arrayarray as high position. And MergeSort starts sorting.