Mastering Sorting Algorithms with Python: A Comprehensive Guide for Data Scientists and Developers
Exploring the Most Popular Sorting Algorithms and Their Performance Characteristics for Efficient Data Manipulation
Bubble Sort
Bubble sort is a simple sorting algorithm that repeatedly steps through the list, compares adjacent elements and swaps them if they are in the wrong order. The algorithm gets its name from the way smaller elements “bubble” to the top of the list.
Here is the implementation of bubble sort in Python:
def bubble_sort(arr):
swapped = False # If already sorted, skip
n = len(arr)
# track the number of passes through the array.
for i in range(n):
# to compare adjacent elements in the array.
for j in range(0, n - i - 1):
if arr[j] > arr[j + 1]:
arr[j], arr[j + 1] = arr[j + 1], arr[j]
swapped = True
if swapped == False: # already sorted
break
return arr
Bubble sort has a time complexity of O(n²).
Insertion Sort
Insertion sort is a simple sorting algorithm that builds the final sorted array one item at a time.
def insertion_sort(arr):
for i in range(len(arr)-1):
# j is always i + 1
j = i + 1
# internal loop for j>0 and j moves from right to left
while(j > 0):
if arr[j] < arr[j-1]:
arr[j], arr[j-1] = arr[j-1], arr[j]
j = j - 1
# if j = 0, we break the internal loop
else:
break
return arr
Insertion sort has a time complexity of O(n²) in the worst-case scenario, making it inefficient for large arrays.
Cyclic Sort
Cyclic sort is an in-place and linear-time sorting algorithm that finds the minimum unsorted element and swaps it with the element at its correct position.
Here is the implementation of the cyclic sort in Python:
def cyclic_sort(arr):
i = 0
# keep track of the progress through the array.
while i < len(arr):
j = arr[i] - 1 # correct position
if arr[i] != arr[j]:
arr[i], arr[j] = arr[j], arr[i] # swap if incorrect position
else:
i += 1
return arr
Cyclic sort has a time complexity of O(n), making it more efficient than other sorting algorithms such as bubble sort or insertion sort.
Merge Sort
Merge sort is a divide-and-conquer algorithm that sorts an array by dividing it into smaller sub-arrays and then merging them in a sorted manner.
def merge_sort(arr):
if len(arr) > 1:
# array is divided into two halves
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
# two halves are then sorted recursively
merge_sort(left_half)
merge_sort(right_half)
# two sorted halves are then merged back together in a sorted manner
i = j = k = 0
while i < len(left_half) and j < len(right_half):
# adds the smaller value to the result array arr.
if left_half[i] < right_half[j]:
arr[k] = left_half[i]
i += 1
else:
arr[k] = right_half[j]
j += 1
k += 1
# add any remaining values from the left and right halves
while i < len(left_half):
arr[k] = left_half[i]
i += 1
k += 1
while j < len(right_half):
arr[k] = right_half[j]
j += 1
k += 1
return arr
Merge sort has a time complexity of O(n log n).
Quick Sort
Quick sort is a divide-and-conquer algorithm that sorts an array by choosing a “pivot” element and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot.
def quickSort(nums):
sort(nums, 0, len(nums) - 1)
return nums
# takes the array, lowest value and highest value
def sort(nums, low, high):
if low >= high:
return
s, e = low, high
# get the mid of the array
m = int(s + (e - s) / 2)
# get the mid element
p = nums[m]
# while the start is less than or equal to the end
while s <= e:
# if the array is sorted, it won't swap
# i.e the the left half is already sorted
while nums[s] < p:
s += 1
# same for the other half
while nums[e] > p:
e -= 1
# for start < end index, we swap them
if s <= e:
nums[s], nums[e] = nums[e], nums[s]
s += 1
e -= 1
# Now that the pivot is at correct index, we sort the two halfs
sort(nums, low, e)
sort(nums, s, high)
Quick sort has an average time complexity of O(n log n)
That’s all folks!
Happy Coding 💛