I was given the task with the merge-insertion sort described as(paraphrased):
Starting off with merge sort, once a threshold S(small positive integer) is reached, the algorithm will then sort the sub arrays with insertion sort.
We are tasked to find the optimal S value for varying length of inputs to achieve minimum key comparisons. I implemented the code by modifying what was available online to get:
def mergeSort(arr, l, r, cutoff): if l < r: m = l+(r-l)//2 if len(arr[l:r+1]) > cutoff: return mergeSort(arr, l, m, cutoff)+mergeSort(arr, m+1, r, cutoff)+merge(arr, l, m, r) else: return insertionSort(arr, l, r+1) return 0 def merge(arr, l, m, r): comp = 0 n1 = m - l + 1 n2 = r - m L =  * (n1) R =  * (n2) for i in range(0, n1): L[i] = arr[l + i] for j in range(0, n2): R[j] = arr[m + 1 + j] i = 0 j = 0 k = l while i < n1 and j < n2: if L[i] <= R[j]: arr[k] = L[i] i += 1 else: arr[k] = R[j] j += 1 k += 1 comp +=1 while i < n1: arr[k] = L[i] i += 1 k += 1 while j < n2: arr[k] = R[j] j += 1 k += 1 return comp def insertionSort(arr, l, r): comp = 0 for i in range(l+1, r): key = arr[i] j = i-1 while j >= l: if key >= arr[j]: comp += 1 break arr[j + 1] = arr[j] j -= 1 comp += 1 arr[j + 1] = key return comp
This means that a near-pure mergesort is almost always preferred over the hybrid. Which is against what is available online, saying that insertion sort will perform faster than mergesort at low values of S(~10-25). I can’t seem to find any error with my code, so is hybrid sort really better than merge sort?
IMO the question is flawed.
Mergesort always performs N Lg(N) key comparisons, while Insertionsort takes N²/2 of them. Hence as of N=2, the comparison count favors Mergesort in all cases. (This is only approximate, as N does not always divide evenly).
But the number of moves as well as the overhead will tend to favor Insertionsort. So a more relevant metric is the actual running time which, unfortunately, will depend on the key length and type.