It's been a while since I've dealt with algorithms, so I figured why not refresh my memory by learning Python 3 at the same time?

To get started, I created a file called sortingpractice.py. This file will contain various sorts - Bubble, Merge, etc., in order to see the performance differences between them.

Now, the first thing we need to do is accept some command line arguments. To do this, we just need

if __name__ == '__main__':
    print("Hello world!")

If you just go to where the file is and run ./sortingpractice.py, you should get a little message saying "Hello World!". Simple enough, right?

In order to refresh some memories on what a bubble sort does, let's have a quick example.
Bubble sort is not a viable sorting algorithm, it's normally used only for educational purposes. For a deeper look at it, you can view the Wikipedia page for Bubblesort. In a nutshell, it compares each number in an array to the one next to it, and swaps them if necessary. For instance;

[5, 2, 1, 3, 4]
[5, 2, 1, 3, 4]
[2, 5, 1, 3, 4]
[2, 1, 5, 3, 4]
[2, 1, 3, 5, 4]
[2, 1, 3, 4, 5]

The first check is between 5 and 2. Clearly, 2 is less than 5, so they swap. This repeats until the number is either the last one in the array, or the next number is higher than the current number. Then, the same thing would happen for the number 2, and it just happens that there'd only be one comparison. You can see how, as N gets higher, this sort becomes less and less efficient.

So, how do we do this in Python?
First, we need to accept arguments from the CLI. That way, we can run sort an array of an arbitrary length.
 

def is_int(s):

def generateList(numItems, arrayLimit):

def bubbleSort(arr):

if __name__ == '__main__':
    for arg in sys.argv[1:]:
        if not is_int(arg):
            sys.exit("Both Array Size and Array Limit must be integers")
    arr = generateList(int(sys.argv[1]), int(sys.argv[2]))
    if not arr:
        sys.exit("The first parameter must be smaller than the second.")
    else:
        bubbleSort(arr)

There's a bit going on here, so let's take it step by step. 

for arg in sys.argv[1:]:
        if not is_int(arg):
            sys.exit("Both Array Size and Array Limit must be integers")

Here, we're looping through the arguments that were given at the command line. Each argument is checked if its' an integer or a string, via this function:
 

def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

This just tries to cast the input as an int. If it can cast, great! If not, throw an error, and tell the user to try again.

The next function, generateList, just creates an array of numbers between 1 and a user-defined limit.

def generateList(numItems, arrayLimit):
    if numItems > arrayLimit:
        print('The number of items must be lower than the array limit.')
        return False
    arrayToSort = random.sample(range(1, arrayLimit), numItems)
    return arrayToSort

That function is where the user-defined arguments come into play. The array that gets created has a length of numItems, and these numbers are randomly selected from a list between 1 and arrayLimit; arrayLimit is the second number that the user passes in, which is the highest number that can be used.

Finally, on to the bubbleSort function:

def bubbleSort(arr):
    start = time.time()
    print('Beginning Bubble Sort . . . ( N = ', len(arr), ')')
    while not arr == sorted(arr):
        for i in range(len(arr)-1):
            if arr[i] < arr[i+1]:
                pass    # Skip if it's already sorted
            else:
                arr[i], arr[i+1] = arr[i+1], arr[i]
    end = time.time()
    print('Bubble Sort finished in ', end-start ,'seconds')

Here, we want to see how long it takes to sort the array. That's the whole point of this, right? So we start by defining a start time, then letting the user know that the sort is starting and how long the array is.
In the while loop, we use Python's sorted function to check when the bubble sort is complete. This allows us to loop through the array however many times we need to.
Inside the loop, we want to check if the current number is less than the next number. If it is, great, we're already sorted! Move on to the next one. If it isn't, then swap it with the number next to it, and continue this behavior until the array is sorted.
At last, we want to see what time it stopped, so we set an end variable. We let the user know the sorting is finished, and how long it took.

In order to run it from the CLI, just navigate to the directory where your file is saved, then run

./sortingpractice.py ## ###

If, for instance, I ran ./sortingpractice.py 12314 435654364536 , I get the following result:

Beginning Bubble Sort . . . ( N =  12314 )
Bubble Sort finished in  24.44393801689148 seconds

So, for an array of size 12,314, it took almost 25 seconds to sort it. This code can (and will!) still be played around with - better error handling, more descriptive responses, etc., but it gets the job done.

As always, source code can be found on GitHub

Tags
Category