Next up on our list is to implement Merge Sort. This is considerably faster than Bubble Sort (pretty much anything is better than bubble sort), and I ran a test case with an array of size
n=12345. Before we get into the code, look at this performance difference:
./sortingpractice.py 12345 23456 Beginning Bubble Sort . . . ( N = 12345 ) Bubble Sort finished in 19.896456956863403 seconds Beginning Merge Sort . . . ( N = 12345 ) Merge Sort finished in 0.026005983352661133 seconds
How crazy is that? At only
n=12345, you can see the massive performance difference.
Before writing the code for merge sort, let's see what makes it so much better. According to the Wikipedia page for merge sort,
- Divide the unsorted list into n sublists, each containing one element (a list of one element is considered sorted).
- Repeatedly merge sublists to produce new sorted sublists until there is only one sublist remaining. This will be the sorted list.
Sounds an awful lot like recursion, doesn't it? Take the array, split it up into
len(array) arrays, then merge the arrays back together in order.
Now, let's implement it!
If you're following along at home, just open up your
sortingpractice.py file. Somewhere in the file, you'll want to add in our new method.
In the main method,be sure to add the timing and the call to
print('Beginning Merge Sort . . . ( N = ', len(arr), ')') start = time.time() mergeSort(arr) end = time.time() print('Merge Sort finished in ', end-start ,'seconds')
It should be at the same indentation level as
Okay, we've got the skeleton up, and if you add a dummy print statement to the mergeSort function, you should be able to run it without issue. Time for the logic!
arrLength = len(arr) if arrLength > 1: mid = arrLength // 2 leftArr = arr[:mid] rightArr = arr[mid:] # Sort both halves mergeSort(leftArr) mergeSort(rightArr)
This first chunk does a few things. Since we want each array to be only of length 1, that needs to be our first check. Then, break it up into two arrays. We'll call the function recursively (the guts are going to be added shortly), until we only have two arrays of length 1.
leftIndex = rightIndex = index = 0 # Utilize the two temp arrays while leftIndex < len(leftArr) and rightIndex < len(rightArr): if(leftArr[leftIndex] < rightArr[rightIndex]): arr[index] = leftArr[leftIndex] leftIndex += 1 else: arr[index] = rightArr[rightIndex] rightIndex += 1 index += 1
Here, we start by setting all indices (left temp array, right temp array, and original array) to 0. Then, we want to loop through the temp arrays to
len(rightArr)-1, respectively. This is where the actual sorting takes place - if the left array is smaller than the right array, insert the left array into the original at the index. Else, add the right array.
# Check for remaining items while leftIndex < len(leftArr): arr[index] = leftArr[leftIndex] leftIndex += 1 index =+ 1 while rightIndex < len(rightArr): arr[index] = rightArr[rightIndex] rightIndex += 1 index =+ 1
In this final chunk, we're just checking to see if we're done or not.
As always, you can find the full source code on GitHub!