Searching and Sorting

Sorting and searching are two of the most fundamental operations in programming. Levels 1 and 2 cover sorting - starting with the basic merge operation and building up to custom comparators and practical applications. Level 3 covers searching - including binary search and how it gets applied in the real world. Understanding when and how to sort data is often the key first step to solving a problem efficiently.

1. Merging Two Sorted Arrays

The simplest sorting building block

Before we can sort, we need to be able to merge. Given two arrays that are already sorted, we want to combine them into one sorted array. This is the foundation that mergesort is built on.

The trick is the two-pointer technique. Keep one pointer at the start of each array. Compare the elements at both pointers, take whichever is smaller, and advance that pointer. When one array runs out, tack the rest of the other one on the end.

Think of it like merging two queues of people sorted by height. You always let the shorter person through next. When one queue is empty, the remaining people file straight through.

The code

def merge_sorted_arrays(a0, a1):
    i = 0
    j = 0
    result = []
    while i < len(a0) and j < len(a1):
        if a0[i] < a1[j]:
            result.append(a0[i])
            i += 1
        else:
            result.append(a1[j])
            j += 1
    # whatever's left, stick it on the end
    result += a0[i:] + a1[j:]
    return result
vector<int> MergeSortedArrays(const vector<int>& a0, const vector<int>& a1)
{
    vector<int> returnArray;
    int i = 0, j = 0;
    while (i < a0.size() && j < a1.size())
    {
        if (a0[i] < a1[j])
            returnArray.push_back(a0[i++]);
        else
            returnArray.push_back(a1[j++]);
    }
    // add the remainder
    while (i < a0.size())
        returnArray.push_back(a0[i++]);
    while (j < a1.size())
        returnArray.push_back(a1[j++]);
    return returnArray;
}
static List<int> MergeSortedArrays(List<int> a0, List<int> a1)
{
    var result = new List<int>();
    int i = 0, j = 0;
    while (i < a0.Count && j < a1.Count)
    {
        if (a0[i] < a1[j])
            result.Add(a0[i++]);
        else
            result.Add(a1[j++]);
    }
    // shove in whatever's left from either side
    while (i < a0.Count) result.Add(a0[i++]);
    while (j < a1.Count) result.Add(a1[j++]);
    return result;
}
static List<Integer> mergeSortedArrays(List<Integer> a0, List<Integer> a1) {
    List<Integer> result = new ArrayList<>();
    int i = 0, j = 0;
    while (i < a0.size() && j < a1.size()) {
        if (a0.get(i) < a1.get(j))
            result.add(a0.get(i++));
        else
            result.add(a1.get(j++));
    }
    // dump the rest in
    while (i < a0.size()) result.add(a0.get(i++));
    while (j < a1.size()) result.add(a1.get(j++));
    return result;
}

Both pointers only ever move forward, so we visit each element exactly once. This makes the merge O(n + m) where n and m are the lengths of the two arrays. It also uses O(n + m) extra space for the result.
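Incidentally, Python ships this exact operation in the standard library: heapq.merge combines any number of sorted iterables lazily, which is handy when the inputs are large or streamed.

```python
import heapq

a0 = [1, 4, 9]
a1 = [2, 3, 10]

# heapq.merge returns an iterator over the combined sorted sequence
merged = list(heapq.merge(a0, a1))
print(merged)  # → [1, 2, 3, 4, 9, 10]
```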

2. Mergesort

Using merge as a tool

Now that we can merge two sorted arrays, we can sort anything. The idea is beautifully simple: split the array in half, sort each half (recursively), and merge the two sorted halves back together.

This is a classic divide and conquer approach. The recursion bottoms out when you have a single element - which is already sorted by definition.

The code

def merge_sort(vals):
    if len(vals) <= 1:        # zero or one element? already sorted
        return vals
    L = len(vals)
    a0 = merge_sort(vals[0:L//2])   # sort the left half
    a1 = merge_sort(vals[L//2:])   # sort the right half
    return merge_sorted_arrays(a0, a1)  # stitch them together
vector<int> mergeSort(vector<int> vals) {
    if (vals.size() <= 1) return vals;  // nothing to split
    int mid = vals.size() / 2;
    // copy into two new vectors for each half
    vector<int> a0(vals.begin(), vals.begin() + mid);
    vector<int> a1(vals.begin() + mid, vals.end());
    a0 = mergeSort(a0);
    a1 = mergeSort(a1);
    return MergeSortedArrays(a0, a1);
}
static List<int> MergeSort(List<int> vals) {
    if (vals.Count <= 1) return vals;  // base case
    int mid = vals.Count / 2;
    // GetRange copies into new lists for each half
    var a0 = MergeSort(vals.GetRange(0, mid));
    var a1 = MergeSort(vals.GetRange(mid, vals.Count - mid));
    return MergeSortedArrays(a0, a1);
}
static List<Integer> mergeSort(List<Integer> vals) {
    if (vals.size() <= 1) return vals;  // one element is sorted
    int mid = vals.size() / 2;
    // subList gives a view - wrap in new ArrayList to copy
    var a0 = mergeSort(new ArrayList<>(vals.subList(0, mid)));
    var a1 = mergeSort(new ArrayList<>(vals.subList(mid, vals.size())));
    return mergeSortedArrays(a0, a1);
}
Language note: Mergesort reads very cleanly in Python because slicing creates new lists. In C++, C#, and Java you typically copy each half into a new container to perform the split, as the comments above show. In C or C++ you can get similar brevity by passing pointers or iterator ranges instead of copying.

Why O(n log n)?

Each level of recursion splits the array in half, so there are log n levels. At each level, every element gets touched once during the merge step, which costs O(n). Multiply those together: O(n log n).

Mergesort also uses O(n) extra space for the temporary arrays during merging. This is the trade-off for a guaranteed O(n log n) runtime regardless of input order.
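One nice consequence of the guaranteed runtime: you can cheaply check an implementation against the built-in sort on random data. This sketch restates the Python version compactly (repeated here so it runs standalone) and compares it with sorted().

```python
import random

def merge_sorted_arrays(a0, a1):
    # two-pointer merge, as in the previous section
    i = j = 0
    result = []
    while i < len(a0) and j < len(a1):
        if a0[i] < a1[j]:
            result.append(a0[i])
            i += 1
        else:
            result.append(a1[j])
            j += 1
    return result + a0[i:] + a1[j:]

def merge_sort(vals):
    if len(vals) <= 1:
        return vals
    mid = len(vals) // 2
    return merge_sorted_arrays(merge_sort(vals[:mid]), merge_sort(vals[mid:]))

data = [random.randint(0, 99) for _ in range(1000)]
assert merge_sort(data) == sorted(data)
```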

3. Sort Stability

What does "stable" mean?

A sort is stable if elements that compare as equal keep their original relative order after sorting. If two items have the same sort key, a stable sort guarantees they appear in the same order they were in before.

This sounds like a minor detail, but it matters a lot when you sort on multiple keys.

A concrete example

Suppose you have a list of students, already sorted alphabetically by name:

Name      Grade
Alice     B
Bob       A
Charlie   B
Diana     A

Now you sort by grade. A stable sort gives you:

Name      Grade
Bob       A
Diana     A
Alice     B
Charlie   B

Bob and Diana both have grade A. Because the sort is stable, they stay in alphabetical order (Bob before Diana), which was their original order. Same for Alice and Charlie with grade B.

An unstable sort might swap Bob and Diana, or swap Alice and Charlie. The grade order would still be correct, but within each grade the alphabetical order could be lost.

Why does this matter?

Stability lets you sort on multiple keys by sorting repeatedly. Sort by the least important key first, then by the most important key using a stable sort. The earlier ordering is preserved within groups of equal keys.

In the example above, if you first sort by name (alphabetically) and then stable-sort by grade, you get a list sorted by grade with alphabetical ordering within each grade - without writing a custom comparator.
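In Python this looks like the following, using the student data from the table above. sorted() is stable, so one pass per key is enough:

```python
students = [("Alice", "B"), ("Bob", "A"), ("Charlie", "B"), ("Diana", "A")]

# the list is already alphabetical by name; a stable sort on grade
# keeps that alphabetical order within each grade
by_grade = sorted(students, key=lambda s: s[1])
print(by_grade)
# → [('Bob', 'A'), ('Diana', 'A'), ('Alice', 'B'), ('Charlie', 'B')]
```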

Which sorts are stable?

Algorithm                    Stable?   Time
Mergesort                    Yes       O(n log n)
Insertion sort               Yes       O(n^2)
Timsort (Python's default)   Yes       O(n log n)
Heapsort                     No        O(n log n)
Quicksort                    No        O(n log n) avg

Python's built-in sorted() and .sort() use Timsort, which is stable. Java's Arrays.sort() for objects is also stable (it uses Timsort). C++'s std::sort is not stable - use std::stable_sort if you need stability. C#'s List.Sort() is also not guaranteed stable.

Rule of thumb: if you are sorting data that has already been sorted on a different key, or if the relative order of equal elements matters to you, reach for a stable sort.

4. Heapsort

Leveraging the heap data structure

Heapsort is a sort algorithm that uses a heap to do the heavy lifting. The basic idea is short and sweet:

  • We can build a heap from an array in O(n).
  • We can remove the smallest element from the heap in O(log n).
  • If we remove the smallest element n times, we have sorted the array in O(n log n).

Heapsort has a very predictable runtime - the worst case is only about 20% above the average case. It also works in-place, meaning no extra memory is required beyond the array itself (though the versions below use language built-ins which may allocate internally).

See Week 4 for more on heaps and how they work under the hood.

The code

The simplest way to heapsort in most languages is to push everything into a min-heap, then pop elements out one at a time. They come out in sorted order.

import heapq

def heap_sort(vals):
    # heapify turns the list into a heap in-place - O(n)
    heapq.heapify(vals)
    # pop the smallest remaining element each time - O(log n) per pop
    return [heapq.heappop(vals) for _ in range(len(vals))]
vector<int> heapSort(vector<int> vals) {
    // priority_queue is a max-heap by default
    // use greater<> to flip it to min-heap
    priority_queue<int, vector<int>, greater<int>>
        pq(vals.begin(), vals.end());
    vector<int> sorted;
    while (!pq.empty()) {
        sorted.push_back(pq.top());
        pq.pop();
    }
    return sorted;
}
// PriorityQueue was added in .NET 6
static List<int> HeapSort(List<int> vals) {
    var pq = new PriorityQueue<int, int>();
    foreach (int v in vals)
        pq.Enqueue(v, v);  // element and priority are the same
    var sorted = new List<int>();
    while (pq.Count > 0)
        sorted.Add(pq.Dequeue());
    return sorted;
}
static List<Integer> heapSort(List<Integer> vals) {
    // Java's PriorityQueue is a min-heap out of the box
    PriorityQueue<Integer> pq = new PriorityQueue<>(vals);
    List<Integer> sorted = new ArrayList<>();
    while (!pq.isEmpty())
        sorted.add(pq.poll());
    return sorted;
}
Remember: heapsort is not stable. If you need equal elements to keep their original order, use mergesort instead.

5. Sorting on a Key

Sorting by something other than the value itself

Often you do not want to sort elements by their natural order. You want to sort by some property of each element - a specific field, a computed value, or a custom rule. Every language has a way to do this, but the syntax differs quite a bit.

Python: the key parameter

Python's sorted() accepts a key function. This function is called on each element, and the return values are what get compared during sorting.

def arg_min(vals):
    # return index of the smallest element in an array
    indices = [i for i in range(len(vals))]
    sorted_indices = sorted(indices, key=lambda x: vals[x])
    return sorted_indices[0]

This creates a list of indices and sorts them - but instead of sorting by the index value, it sorts by vals[x]. The first element of the sorted list is the index of the smallest value.

Here is another way to do the same thing, using tuples:

def arg_min_2(vals):
    # same thing but with tuples - Python sorts tuples element by element
    indices_and_vals = [(vals[i], i) for i in range(len(vals))]
    return sorted(indices_and_vals)[0][1]

This builds a list of (value, index) tuples. Python sorts tuples lexicographically - by the first element, then the second. So sorting these tuples sorts by value, and the index tags along for the ride.
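For completeness, the same key= idea also works with min() directly, which finds the arg-min in a single O(n) pass instead of sorting (arg_min_3 is just an illustrative name):

```python
def arg_min_3(vals):
    # min() compares the keys but returns the original element - here, an index
    return min(range(len(vals)), key=lambda i: vals[i])

print(arg_min_3([7, 2, 9, 2]))  # → 1 (ties go to the first occurrence)
```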

C# and C++: custom comparators

In C# and C++, you typically define how your objects compare by implementing an interface or overloading an operator. Here is a struct that sorts by its x2 field:

struct Thing
{
    int x1;
    int x2;
    // overload < so std::sort knows how to compare these
    bool operator < (const Thing& other) const
    {
        return x2 < other.x2;  // sort by x2, ignore x1
    }
};
struct Thing : IComparable<Thing>
{
    public int x1;
    public int x2;
    public Thing(int x1, int x2)
    {
        this.x1 = x1;
        this.x2 = x2;
    }

    // IComparable tells Sort() how to order these
    public int CompareTo(Thing other)
    {
        if (x2 == other.x2)
            return 0;
        else if (x2 < other.x2)
            return -1;
        else
            return 1;
    }
}
class Thing implements Comparable<Thing> {
    int x1;
    int x2;
    Thing(int x1, int x2) {
        this.x1 = x1;
        this.x2 = x2;
    }

    // same idea as C# - return negative, zero, or positive
    public int compareTo(Thing other) {
        return Integer.compare(this.x2, other.x2);
    }
}

In C#, implementing IComparable<T> lets the built-in Sort() know how to order your objects. The CompareTo method returns a negative number if this object should come first, zero if they are equal, and a positive number if the other should come first.

In C++, overloading operator< achieves the same thing. When you call std::sort on a vector of these structs, it uses your < operator to decide the order.

In Java, implementing Comparable<T> follows the same pattern as C#. Alternatively, you can pass a Comparator lambda to Collections.sort() without modifying the class.

Key takeaway: Python uses key= lambdas. C# uses IComparable. C++ uses operator<. Java uses Comparable or Comparator. Different syntax, same idea - tell the sort function what to compare on.

6. Merging Intervals

A practical application of sorting

A very common problem: given a list of intervals (each with a start and end), merge any that overlap. For example, [1,3] and [2,5] overlap and should become [1,5].

The trick is to sort the intervals by their start value first. Once sorted, overlapping intervals will always be next to each other, so you can merge them in a single pass.

The code

def merge_intervals(intervals):
    intervals.sort()  # sorts [start, end] lists by start, then end
    merged = [intervals[0]]
    for start, end in intervals[1:]:
        if start > merged[-1][1]:
            # no overlap - start a new interval
            merged.append([start, end])
        else:
            # overlaps - extend the current one if needed
            merged[-1][1] = max(merged[-1][1], end)
    return merged
vector<pair<int,int>> mergeIntervals(vector<pair<int,int>> intervals) {
    sort(intervals.begin(), intervals.end());
    vector<pair<int,int>> merged = {intervals[0]};
    for (int i = 1; i < intervals.size(); i++) {
        if (intervals[i].first > merged.back().second) {
            // gap between intervals - new group
            merged.push_back(intervals[i]);
        } else {
            // they overlap - stretch the end if needed
            merged.back().second = max(merged.back().second, intervals[i].second);
        }
    }
    return merged;
}
public static List<Tuple<int, int>> MergeIntervals(List<Tuple<int, int>> intervals)
{
    List<Tuple<int, int>> merged = new List<Tuple<int, int>>();
    List<Tuple<int, int>> sorted = new List<Tuple<int, int>>(intervals);
    sorted.Sort();
    Tuple<int, int> curInterval = sorted[0];

    for (int i = 1; i < sorted.Count; i++)
    {
        // do we have a new interval?
        if (sorted[i].Item1 > curInterval.Item2)
        {
            merged.Add(curInterval);
            curInterval = sorted[i];
        }
        else
        {
            // does this interval extend the current one?
            if (sorted[i].Item2 > curInterval.Item2)
                curInterval = new Tuple<int,int>(curInterval.Item1, sorted[i].Item2);
        }
    }
    // add the current interval
    merged.Add(curInterval);
    return merged;
}
static List<int[]> mergeIntervals(List<int[]> intervals) {
    intervals.sort((a, b) -> Integer.compare(a[0], b[0]));  // sort by start, no subtraction overflow
    List<int[]> merged = new ArrayList<>();
    merged.add(intervals.get(0));
    for (int i = 1; i < intervals.size(); i++) {
        int[] last = merged.get(merged.size() - 1);
        if (intervals.get(i)[0] > last[1]) {
            // no overlap - fresh interval
            merged.add(intervals.get(i));
        } else {
            // overlaps - extend if it goes further
            last[1] = Math.max(last[1], intervals.get(i)[1]);
        }
    }
    return merged;
}

The sort is O(n log n) and the merge pass is O(n), so the total is O(n log n). This is a pattern you will see again and again: sort first, then solve in a single scan.
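A quick check of the example from the start of this section - the Python function is repeated here (working on a sorted copy rather than sorting in place) so the snippet runs standalone:

```python
def merge_intervals(intervals):
    intervals = sorted(intervals)          # sorted copy, input untouched
    merged = [list(intervals[0])]
    for start, end in intervals[1:]:
        if start > merged[-1][1]:
            merged.append([start, end])    # gap - start a new interval
        else:
            merged[-1][1] = max(merged[-1][1], end)  # overlap - extend
    return merged

print(merge_intervals([[1, 3], [2, 5], [7, 8]]))  # → [[1, 5], [7, 8]]
```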

7. Binary Search

Searching sorted data in O(log n)

If your data is sorted, you do not need to check every element to find something. Binary search cuts the search space in half with every comparison. Start in the middle. If the target is smaller, search the lower half. If it is bigger, search the upper half. Repeat until you find it or run out of space.

This gives O(log n) time, which is dramatically faster than a linear scan for large datasets. Searching a million elements takes at most about 20 comparisons.

The code

def find_element(values, to_find):
    lower = 0
    upper = len(values) - 1
    while lower <= upper:
        mid = (lower + upper) // 2
        if values[mid] == to_find:
            return mid            # found it
        elif values[mid] > to_find:
            upper = mid - 1       # too big - search lower half
        else:
            lower = mid + 1       # too small - search upper half
    return -1                    # not in there
int FindElement(const vector<int>& values, int toFind)
{
    int iLower = 0;
    int iUpper = (int)values.size() - 1;

    while (iLower <= iUpper)
    {
        int iMid = iLower + (iUpper - iLower) / 2;  // avoids overflow
        if (values[iMid] == toFind)
            return iMid;        // found
        else if (values[iMid] > toFind)
            iUpper = iMid - 1;  // search lower half
        else
            iLower = iMid + 1;  // search upper half
    }
    return -1;                 // not found
}
static int FindElement(List<int> values, int toFind)
{
    int lower = 0, upper = values.Count - 1;
    while (lower <= upper)
    {
        int mid = lower + (upper - lower) / 2;
        if (values[mid] == toFind)
            return mid;        // got it
        else if (values[mid] > toFind)
            upper = mid - 1;   // look lower
        else
            lower = mid + 1;   // look higher
    }
    return -1;                 // nope
}
static int findElement(List<Integer> values, int toFind) {
    int lower = 0, upper = values.size() - 1;
    while (lower <= upper) {
        int mid = lower + (upper - lower) / 2;
        if (values.get(mid) == toFind)
            return mid;        // found it
        else if (values.get(mid) > toFind)
            upper = mid - 1;   // search left
        else
            lower = mid + 1;   // search right
    }
    return -1;                 // not here
}

Each iteration halves the search range. After k iterations the range has been divided by 2^k, so the loop runs at most log2(n) times. This is why binary search is so fast.

Precondition: binary search only works on sorted data. If the data is not sorted, sort it first (O(n log n)) or use a different search method. Many problems that look like search problems are really "sort first, then binary search" problems.
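In practice you rarely hand-roll binary search in Python: the standard library's bisect module does it for you. bisect_left returns the leftmost insertion point, so an exact-match lookup is a short wrapper:

```python
import bisect

def find_element_bisect(values, to_find):
    # bisect_left returns the index where to_find would be inserted
    i = bisect.bisect_left(values, to_find)
    if i < len(values) and values[i] == to_find:
        return i
    return -1

print(find_element_bisect([1, 3, 5, 7, 9], 7))  # → 3
print(find_element_bisect([1, 3, 5, 7, 9], 4))  # → -1
```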

8. Quick Selection with Binary Search

The roulette wheel problem

Binary search is not just for finding exact values. One of its most practical uses is selecting from a weighted distribution - sometimes called roulette wheel selection.

Imagine a roulette wheel where each slot has a different width based on its weight. You spin the wheel (generate a random number) and need to figure out which slot the ball lands in. With a small number of slots you could scan through them, but with thousands of slots that gets slow. Binary search on the cumulative weights gives you the answer in O(log n).

How it works

Suppose you have 6 items with these weights:

Item   Weight   Cumulative
0      1        1
1      2        3
2      2        5
3      3        8
4      5        13
5      2        15

Total weight is 15. To pick a random item proportional to its weight:

  1. Generate a random number between 0 and 1. Say we get 0.61.
  2. Multiply by total weight: 0.61 x 15 = 9.15. This is our target value.
  3. Binary search the cumulative array [1, 3, 5, 8, 13, 15] to find where 9.15 falls.

Walking through the binary search

Cumulative array: [1, 3, 5, 8, 13, 15]. Target: 9.15.

Step 1: Range is [0, 5]. Middle index = 2, value = 5.
        9.15 > 5, so the answer is in the upper half.
        New range: [3, 5].

Step 2: Middle index = 4, value = 13.
        13 > 9.15, so we have gone too far.
        New range: [3, 4].

Step 3: Range is [3, 4]. Middle index = 3, value = 8.
        8 < 9.15, so the answer is above index 3.
        The range collapses to [4, 4] -> selected item is 4.

Item 4 has weight 5 (the largest weight), so it occupies the biggest slice of the wheel. It makes sense that it gets selected often. The binary search found the right slot in just 3 steps instead of scanning all 6 items.

The code

import random

def build_cumulative(weights):
    # running total of weights
    cumulative = []
    total = 0
    for w in weights:
        total += w
        cumulative.append(total)
    return cumulative

def roulette_select(cumulative):
    # pick a random target in the total range
    target = random.random() * cumulative[-1]
    # binary search for the right slot
    lo, hi = 0, len(cumulative) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if cumulative[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo
vector<double> buildCumulative(const vector<double>& weights) {
    vector<double> cumulative;
    double total = 0;
    for (double w : weights) {
        total += w;
        cumulative.push_back(total);
    }
    return cumulative;
}

int rouletteSelect(const vector<double>& cumulative) {
    // pick a random spot on the wheel
    double target = ((double)rand() / RAND_MAX) * cumulative.back();
    // binary search to find which slot we landed on
    int lo = 0, hi = cumulative.size() - 1;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (cumulative[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
static List<double> BuildCumulative(List<double> weights) {
    var cumulative = new List<double>();
    double total = 0;
    foreach (double w in weights) {
        total += w;
        cumulative.Add(total);
    }
    return cumulative;
}

static readonly Random rng = new Random();

static int RouletteSelect(List<double> cumulative) {
    // spin the wheel - reuse one Random so repeated calls do not reseed
    double target = rng.NextDouble() * cumulative[cumulative.Count - 1];
    // binary search for the slot
    int lo = 0, hi = cumulative.Count - 1;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (cumulative[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
static double[] buildCumulative(double[] weights) {
    double[] cumulative = new double[weights.length];
    double total = 0;
    for (int i = 0; i < weights.length; i++) {
        total += weights[i];
        cumulative[i] = total;
    }
    return cumulative;
}

static int rouletteSelect(double[] cumulative) {
    // random spot on the wheel
    double target = Math.random() * cumulative[cumulative.length - 1];
    // find which slot via binary search
    int lo = 0, hi = cumulative.length - 1;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (cumulative[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

Building the cumulative array is O(n). Each selection is O(log n). If you need to select many times from the same distribution, build the cumulative array once and reuse it.

Where this comes up: roulette wheel selection is used in genetic algorithms, reinforcement learning, randomised load balancing, and anywhere you need to pick items with probability proportional to their weight. The binary search step is what makes it efficient for large numbers of items.
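A good way to sanity-check the selection code is to draw many samples and compare the empirical frequencies against the weights. The sketch below repeats the Python functions so it runs standalone; with the weights from the table, item 4 should come up roughly 5/15, about a third of the time.

```python
import random
from collections import Counter

def build_cumulative(weights):
    cumulative, total = [], 0
    for w in weights:
        total += w
        cumulative.append(total)
    return cumulative

def roulette_select(cumulative):
    target = random.random() * cumulative[-1]
    lo, hi = 0, len(cumulative) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if cumulative[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return lo

cumulative = build_cumulative([1, 2, 2, 3, 5, 2])
counts = Counter(roulette_select(cumulative) for _ in range(100_000))
print(counts[4] / 100_000)  # roughly 0.33
```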

Summary

Merging two sorted arrays is O(n + m) and forms the building block of mergesort.
Mergesort splits, sorts recursively, and merges - giving guaranteed O(n log n) and stable output.
Heapsort leverages the heap for O(n log n) with very predictable runtime and no extra memory.
Sorting on a key lets you sort by any criterion using comparators, lambdas, or operator overloading.
Stable sorts preserve the original order of equal elements - crucial for multi-key sorting.
Binary search finds elements in O(log n) but requires sorted data.
Roulette wheel selection is a real-world application of binary search on cumulative weights.