HT2026 Design & Analysis of Algorithms notes


Remaining TODOs: 59

Relevant reading:

T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein. Introduction to Algorithms


1. Program cost and asymptotic notation

Definition 1.1: An algorithm is a finite set of well-defined instructions to accomplish a specific task.
Definition 1.2: An efficient algorithm runs in polynomial time.

Definition 1.3: Insertion sort takes each element and compares it with the previously sorted elements, inserting it in the correct place.

In CLRS-style pseudocode (as used in Introduction to Algorithms):

Input: A 1-indexed array A of integers
Output: Array A is sorted in non-decreasing order
InsertionSort(A):
for j = 2 to A.length
key = A[j]
// insert A[j] into the sorted sequence A[1..j-1]
i = j - 1
while i > 0 and A[i] > key
A[i + 1] = A[i]
i = i - 1
A[i + 1] = key
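The pseudocode above can be ported to runnable Python (0-indexed, so the loop starts at the second element); this is an illustrative sketch, not part of the course material:

```python
def insertion_sort(a):
    """Sort the list a in place, in non-decreasing order."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # shift sorted elements greater than key one slot to the right
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```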

Definition 1.4: The running time of a CLRS program is defined as follows:

  • Each line takes constant time
  • When a loop exits normally, the test is executed one more time than the loop body

Remark: The running time of insertion sort as given is T(n) = Θ(n + Σ_{j=2}^{n} t_j), where t_j is the number of times the test of the while loop is executed for a given value of j.

Then in the worst case, t_j = j, so T(n) = an² + bn + c for some constants a, b, c. Hence T(n) is quadratic in n.

In the best case, t_j = 1, so T(n) is linear.

Definition 1.5: Let f, g : ℕ → ℝ_{≥0}. Then

O(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ c g(n) for all n ≥ n₀ }.

If f(n) ∈ O(g(n)) then g is an asymptotic upper bound for f.

Proposition 1.6: The insertion sort algorithm is correct.

Proof:

By a loop-invariant argument:

  • Initialisation - Prove the invariant holds prior to the first iteration
  • Maintenance - Prove that if the invariant holds just before an iteration, then it holds just before the next iteration
  • Termination - Prove that, when the loop terminates, the invariant along with the reason the loop terminates imply the correctness of the program

This is similar to mathematical induction, but rather than proving the statement for all natural numbers, we expect to exit the loop.

Invariant: At the start of the iteration with index j, A[1..j-1] is sorted.

When j = 2, A[1..1] is a singleton so is trivially sorted.

The outer loop terminates when j = A.length + 1. So the loop invariant at termination says that A[1..A.length] = A is sorted.

To prove maintenance, we need to prove that, at the end of the while loop, the sequence A[1], ..., A[i], key, A[i+2], ..., A[j] is sorted.

The invariant of the while loop is: TODO

Remark: Some nice properties of insertion sort:

  • It is stable (preserves relative order of equal keys)
  • In-place
  • Online (can sort the list as it is received)

Lemma 1.7: Let k ∈ ℕ. Then:

  • P_k ⊆ O(n^k), where P_k is the set of degree-k polynomials
Definition 1.8: If f(n) ∈ Ω(g(n)), we say that g is an asymptotic lower bound for f.
Corollary 1.9: .

Definition 1.10: If f(n) ∈ Θ(g(n)), g is an asymptotically tight bound for f.

Remark: Big O/Ω/Θ are not closed under function composition.

2. Divide and conquer algorithms

Definition 2.1: A divide and conquer algorithm divides the problem into subproblems, solves each subproblem separately and combines the results.

Remark: In general, a divide and conquer algorithm working on an input of size n can be defined as:

  • If n is small, say n < c for some constant c, use a constant-time brute force solution
  • Otherwise, divide the problem into a subproblems, each 1/b the size of the original

Let D(n) be the time taken to split a size-n problem up, and C(n) be the time taken to combine solutions of subproblems.

Then the running time, T(n), of a divide-and-conquer algorithm can in general be expressed by the following recurrence:

T(n) = Θ(1) if n < c, and T(n) = aT(n/b) + D(n) + C(n) otherwise.
Theorem 2.2 (Master Theorem):

Suppose T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants (and n/b is interpreted as ⌊n/b⌋ or ⌈n/b⌉).

Then:

  1. If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
  2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) log n).
  3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and a f(n/b) ≤ c f(n) for some constant c < 1 and all sufficiently large n, then T(n) = Θ(f(n)).

Proof:

TODO

Example (merge sort):

  1. Split the array into 2 halves
  2. Carry out merge sort on the subarrays (or for singletons, do nothing)
  3. Merge the elements from the sorted sublists in order, which is easy as we just pick the smallest first element from each

Pseudocode:

MergeSort(A, p, r):
if p < r
q = floor((p + r) / 2)
MergeSort(A, p, q)
MergeSort(A, q + 1, r)
Merge(A, p, q, r)

Where Merge is defined as:

// Merge A[p..q] and A[q+1..r] so that A[p..r] is sorted
Merge(A, p, q, r):
n1 = q - p + 1
n2 = r - q
let L[1..n1 + 1] and R[1..n2 + 1] be new arrays
for i = 1 to n1:
L[i] = A[p + i - 1]
for j = 1 to n2:
R[j] = A[q + j]
// Set the last values of the arrays to be larger than
// anything else we could encounter, so we don't need
// to check if we have exhausted all the elements of
// either array if they aren't balanced
L[n1 + 1] = ∞
R[n2 + 1] = ∞
i = 1
j = 1
for k = p to r:
if L[i] <= R[j]:
A[k] = L[i]
i = i + 1
else:
A[k] = R[j]
j = j + 1
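A runnable Python port of the two procedures (0-indexed, inclusive bounds; sentinels as in the pseudocode) is sketched below for illustration:

```python
import math

def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive) so a[p..r] is sorted."""
    left = a[p:q + 1] + [math.inf]    # sentinel avoids exhaustion checks
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):
        if left[i] <= right[j]:       # <= keeps the merge left-biased (stable)
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1

def merge_sort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = (p + r) // 2
        merge_sort(a, p, q)
        merge_sort(a, q + 1, r)
        merge(a, p, q, r)
    return a
```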

Remark: Merging should be left-biased (i.e. in case of tie, pick the left one) so that merge sort is a stable sort.

Merge sort is not in-place (requires extra space) and is not online.

Merge is clearly linear.

Then for MergeSort, in the form given above we have a = 2, b = 2, D(n) = Θ(1) and C(n) = Θ(n).

Then

T(n) = Θ(1) if n = 1, and T(n) = 2T(n/2) + cn otherwise,

for some constant c.

There are a few approaches we could use to solve for T(n):

  • Guess a solution, and use induction to find constants and prove that it works
  • Draw out a recursion tree and sum each level, either to obtain an exact answer or to gain a heuristic to guess and check
  • Or use the Master Theorem.

Using the Master Theorem, we get that T(n) = Θ(n log n).

Example (Integer multiplication): Suppose we want to multiply two n-bit integers, x and y.

A simple divide-and-conquer approach would be to split each number into a top and bottom half, x = 2^(n/2) x₁ + x₀ and y = 2^(n/2) y₁ + y₀, then

xy = 2^n x₁y₁ + 2^(n/2) (x₁y₀ + x₀y₁) + x₀y₀.

Therefore we can compute four multiplications of n/2-bit numbers and combine them with three additions (which are linear time in n).

The time complexity of this is T(n) = 4T(n/2) + Θ(n), which by the Master Theorem is Θ(n²).

However, we can do better.

Definition 2.3: The Karatsuba algorithm, or Karatsuba-Ofman algorithm, is a subquadratic algorithm for integer multiplication. It utilises the fact that

x₁y₀ + x₀y₁ = (x₁ + x₀)(y₁ + y₀) − x₁y₁ − x₀y₀

so we only need three multiplications. This needs more additions, but that is still linear time.

The recurrence is now T(n) = 3T(n/2) + Θ(n), which by the Master Theorem is Θ(n^(log₂ 3)) ≈ Θ(n^1.585).
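A runnable sketch of the three-multiplication trick (my illustration; the base-case cutoff of 16 is an arbitrary choice):

```python
def karatsuba(x, y):
    """Multiply non-negative integers with three recursive multiplications."""
    if x < 16 or y < 16:
        return x * y                  # small inputs: brute force
    half = max(x.bit_length(), y.bit_length()) // 2
    x1, x0 = x >> half, x & ((1 << half) - 1)   # x = 2^half * x1 + x0
    y1, y0 = y >> half, y & ((1 << half) - 1)
    a = karatsuba(x1, y1)
    c = karatsuba(x0, y0)
    b = karatsuba(x1 + x0, y1 + y0) - a - c     # = x1*y0 + x0*y1
    return (a << (2 * half)) + (b << half) + c
```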

Example (matrix multiplication):

A naïve matrix multiplication algorithm, based on the definition, operating on n × n matrices has time complexity Θ(n³), as it needs to compute n multiplications for each of the n² entries.

A naïve divide-and-conquer approach might split each matrix into four n/2 × n/2 matrices, and compute C = AB blockwise.

This requires computing 8 subproblems (C₁₁ = A₁₁B₁₁ + A₁₂B₂₁, etc), and the addition of the submatrices is quadratic, so the time complexity is T(n) = 8T(n/2) + Θ(n²) = Θ(n³).

This is no better than the previous approach, but it is possible to improve upon this.

Definition 2.4: Strassen’s algorithm utilises a divide-and-conquer approach with only 7 subproblems, at the cost of extra additions and subtractions of submatrices.

The recurrence is now T(n) = 7T(n/2) + Θ(n²), which by the Master Theorem is Θ(n^(log₂ 7)) ≈ Θ(n^2.81).

Remark: Suppose a recursion splits into two subproblems of size √n, and suppose that the division and combining together take Θ(log n). Then we have T(n) = 2T(√n) + Θ(log n), which isn’t directly expressible with the Master Theorem. We can substitute n = 2^m to get T(2^m) = 2T(2^(m/2)) + Θ(m); if we let S(m) = T(2^m) then S(m) = 2S(m/2) + Θ(m). We can then use the Master Theorem, with a = b = 2 and f(m) = Θ(m), giving that S(m) = Θ(m log m). Hence T(n) = Θ(log n · log log n).

TODO binary search

TODO th-order statistic

Theorem 2.5: The running time of every comparison-based sorting algorithm is Ω(n log n), i.e. at best it is Θ(n log n).

Proof:

In a decision tree of a comparison-based sorting algorithm on an input of length n, there are n! leaves. Every binary tree of depth d has at most 2^d leaves; so to get n! leaves we need a depth of at least log₂(n!) = Ω(n log n).

Definition 2.6: If we know our elements come from a certain discrete interval [0..k], we can use counting sort: just count how often each element appears, then insert the relevant number of each element. This is Θ(n + k) (Θ(n) if k = O(n)) but is not in-place; however we can design it to be stable.

Input: An array A of n elements with elements in the interval [0..k]
Output: An array B consisting of a sorted permutation of A

TODO
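Pending the pseudocode, here is a runnable Python sketch of a stable counting sort over the interval 0..k (an illustration, not the course's version):

```python
def counting_sort(a, k):
    """Stable sort of list a, whose elements lie in 0..k, into a new list."""
    count = [0] * (k + 1)
    for x in a:
        count[x] += 1
    # turn counts into the number of elements <= each value
    for v in range(1, k + 1):
        count[v] += count[v - 1]
    b = [None] * len(a)
    for x in reversed(a):          # reverse scan keeps equal keys in order
        count[x] -= 1
        b[count[x]] = x
    return b
```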

Definition 2.7: The Fast Fourier Transform (FFT) can be used to multiply two polynomials of degree n in Θ(n log n) time.

  1. Split each polynomial into 2 polynomials of even degree:

    This is nice because each polynomial is evaluated at x², an even function. We now have that A(x) = A_e(x²) + x A_o(x²).

  2. Represent each polynomial by a list of its values at a set of points; evaluating a polynomial at a single point can be done in Θ(n) by Horner’s rule,

TODO

Essentially, we split into two polynomials and recurse. Once we are in point-value form, we can multiply pointwise, all in Θ(n), and then convert back to coefficient form (interpolation).

3. Data structures: heaps and queues

Remark: Unsorted arrays have O(1) insertion and O(n) finding the maximum.
Remark: Sorted arrays have O(n) insertion (find the position by binary search but then shifting is linear) but O(1) finding the maximum.
Corollary 3.1: It is not possible to have a data structure with O(1) insertion and O(1) maximum extraction, as this would give a linear-time comparison-based sorting algorithm, which we have seen isn’t possible.

Definition 3.2: A heap is a data structure that is a tree that is completely filled on all levels except the lowest, which is filled from the left to right.

Heaps can be represented with arrays, with each level stored next to each other, read left-to-right and top-to-bottom. This is unambiguous because the tree is complete, so we know how many children each element has.

For a binary tree, if we have the root at A[0], the left child of A[i] is at A[2i + 1] and the right child is at A[2i + 2]. If arrays are 1-indexed, we instead get A[i]‘s children at A[2i] and A[2i + 1].

Definition 3.3: A max-heap is a heap that satisfies the max-heap property.
Definition 3.4: The max-heap property states that, for every node v excluding the root, the key of the parent of v is greater than or equal to the key of v.
Remark: The maximum element of a max-heap is at the root.
Remark: Min-heaps can be defined similarly.
Remark: Children of a node in a max-heap are not necessarily sorted.
Definition 3.5: The height of a node in a heap is defined to be the number of edges in the longest simple path from the node to a leaf. The height of a heap is the height of its root; this means that a singleton heap has a height of 0.

Definition 3.6: The MaxHeapify procedure turns a heap into a max heap, where the two subtrees of the root are already max-heaps. This works by ‘bubbling’ the root down into a permissible location.

Input: A 1-indexed array A and index i such that the subtrees of the node A[i] are max-heaps.
Output: A max heap with root at A[i].

MaxHeapify(A, i):
n = A.size
l = 2i
r = 2i + 1
if l <= n and A[l] > A[i]:
largest = l
else:
largest = i
if r <= n and A[r] > A[largest]:
largest = r
if largest != i:
swap A[i] and A[largest]
MaxHeapify(A, largest)

This is O(log n), because the procedure runs at most once for each depth level in the tree; we can also see this by the master theorem: the two subtrees have a maximum size of 2n/3, occurring on the left if the bottom row is exactly half full, so T(n) ≤ T(2n/3) + Θ(1) = O(log n).

Definition 3.7: The MakeMaxHeap procedure forms a max heap from an unordered array.

We carry out MaxHeapify on every node, starting from the bottom, but skipping leaves, as they are already singleton max-heaps; this allows us to skip about half of the nodes.

MakeMaxHeap(A):
n = A.size
for i = floor(n / 2) downto 1
MaxHeapify(A, i)

This is definitely O(n log n), but in fact a tighter bound is O(n).

Proof: MaxHeapify is linear in the height of the node it operates on; therefore given a node of height h, MaxHeapify takes time at most ch for some constant c. Moreover, in any binary tree there are at most ⌈n / 2^(h+1)⌉ nodes at height h.

So, the running time of MakeMaxHeap is

Σ_{h=0}^{⌊log n⌋} ⌈n / 2^(h+1)⌉ · ch ≤ cn Σ_{h=0}^{∞} h / 2^h.

Let x = 1/2, and consider Σ_{h=0}^{∞} h x^h = x / (1 − x)². Then Σ_{h=0}^{∞} h / 2^h = 2.

Therefore the running time is at most 2cn, so MakeMaxHeap is O(n).

Remark: To extract the maximum element from a max-heap, we extract the root, place the last vertex into the root, and carry out MaxHeapify on the root. This is O(log n).
Remark: To insert an element, append it to the end of the backing array, and bubble the new vertex up by swapping with its parent while it is larger. This is O(log n).

Definition 3.8: Heap sort:

  1. Build a max-heap
  2. Starting from the root, swap the maximum with the final element in the backing array, and ignore the final node from now on (it remains in the backing array, but we decrement the heap size), then carry out MaxHeapify on the root (this is fine because the children are still max-heaps)
  3. Repeat until the heap size is 1

This is worst case O(n log n) like merge sort, but is in-place like insertion sort.
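The three heap procedures above can be sketched together in Python (0-indexed backing array, so the children of i are 2i+1 and 2i+2); an illustrative port, not the course pseudocode:

```python
def max_heapify(a, i, n):
    """Bubble a[i] down within the heap a[0..n-1], assuming both subtrees
    of i are already max-heaps."""
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < n and a[l] > a[i] else i
    if r < n and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # MakeMaxHeap: bottom-up, skip leaves
        max_heapify(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]       # move current maximum to the back
        max_heapify(a, 0, end)            # heap now occupies a[0..end-1]
    return a
```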

Definition 3.9: A priority queue maintains a set of elements, each with an associated key. Max-priority queues give priority to elements with larger keys; min-priority keys are defined similarly.

Max-priority queues have the following operations:

  • Insert(S, x, k) inserts element x with key k into S
  • Maximum(S) returns the element of S with the largest key
  • ExtractMax(S) removes and returns the element of S with the largest key
  • IncreaseKey(S, x, k) increases the value of x’s key to k, which is assumed to be at least as large as the existing key

If we implement the max-/min-priority queue as a max-/min-heap, both Insert and ExtractMax are O(log n), and are as described above. Maximum is O(1) and simply returns the root of the heap.

Definition 3.10: IncreaseKey works by modifying the key, and then bubbles the node up until it satisfies the max-heap property. Note that it does not just call MaxHeapify on the parents of the node, as this would unnecessarily visit sibling nodes as well.

Here A is 1-indexed as before.

IncreaseKey(A, i, key):
require(key >= A[i])
A[i] = key
while i > 1 and A[floor(i / 2)] < A[i]:
swap A[i] and A[floor(i / 2)]
i = floor(i / 2)

This is also O(log n) as the while loop cannot run more times than the height of the heap.

4. Dynamic programming

Definition 4.1: Dynamic programming is an optimisation in which we define a sequence of subproblems such that:

  • The subproblems are ordered from smallest to largest
  • The largest is the one we want to solve
  • The optimal solution of a subproblem can be constructed from the optimal solutions of smaller subproblems (this property is optimal substructure)

We then solve from smallest to largest and store the solutions.


Example (Change-making): Suppose we have lots of coins of k different denominations, d₁ < d₂ < ... < d_k, and want to give n units of change using as few coins as possible.

Then C(n) = 1 + min_i C(n − d_i).

More formally, let C(m) be the minimum number of coins needed to make m units of change; then C(0) = 0 and C(m) = 1 + min { C(m − d_i) : d_i ≤ m } (taken as ∞ if no denomination fits).

The running time is O(nk). This is polynomial in n and k (i.e. in the number of inputs), but is exponential in the size of the inputs.
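The recurrence can be sketched as a bottom-up table in Python (my illustration; unbounded coin supply assumed):

```python
import math

def min_coins(denoms, n):
    """Fewest coins from denoms (unlimited supply) summing to n; inf if impossible."""
    c = [0] + [math.inf] * n
    for amount in range(1, n + 1):
        for d in denoms:
            if d <= amount:
                # one coin of denomination d, plus optimal change for the rest
                c[amount] = min(c[amount], 1 + c[amount - d])
    return c[n]
```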


Example (Knapsack problem): A burglar has a knapsack with capacity W. There are n items to pick from, of size/weight w_i and value v_i. We want to find the most valuable configuration of items to steal, assuming that there is either 1 of each item, or an unlimited quantity of each.

Assuming first an unlimited supply of each item, let K(w) be the maximum value achievable with capacity w. Then K(0) = 0 and K(w) = max { K(w − w_i) + v_i : w_i ≤ w }.

Total running time is O(nW).

Where there is only one of each item, we could parameterise the subproblems by the set of items still available.

The running time is then exponential, as there are 2^n subsets.

But we can modify this to make it not exponential: rather than passing a set of indices, pass the largest index j that is considered - since if we remove item j, we should have an optimal solution for items 1, ..., j − 1.

This formulation is stating that for each index j, we either pick that item or discard it: K(w, j) = max( K(w, j − 1), K(w − w_j, j − 1) + v_j ), the second option only being available when w_j ≤ w.

Now the running time is O(nW).
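The index-based 0/1 formulation can be sketched bottom-up in Python (an illustration under the assumptions above):

```python
def knapsack(weights, values, W):
    """Max value with capacity W, at most one of each item."""
    n = len(weights)
    # K[j][w] = best value using only items 0..j-1 with capacity w
    K = [[0] * (W + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for w in range(W + 1):
            K[j][w] = K[j - 1][w]                  # discard item j-1
            if weights[j - 1] <= w:                # or pick it, if it fits
                K[j][w] = max(K[j][w],
                              K[j - 1][w - weights[j - 1]] + values[j - 1])
    return K[n][W]
```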


Example (Longest increasing subsequences): Task: find the longest increasing subsequences of a sequence; the elements in a subsequence don’t need to all be adjacent to each other in the original sequence. TODO

TODO edit distance

Example (Longest simple path): This is an example of a problem that does not have optimal substructure: the longest simple paths from s to u and from u to t might share a vertex, so cannot be combined. We would therefore need to store not only the length of the longest path of a subproblem, but also the path itself; so dynamic programming cannot be used for this.

However, if the graph is acyclic, the problem does have optimal substructure.

Example (Travelling salesperson problem): Given a complete undirected graph with a weight w(e) associated with each edge e, find the Hamiltonian cycle with minimal total distance (sum of edge weights).

This is NP-complete, so we don’t know if there is a polynomial time solution, but we can verify a solution in polynomial time.

Brute force is very bad, but dynamic programming gives a better (but not polynomial) solution.

Subproblems: for every subset S ⊆ V containing the start vertex s, and for every j ∈ S, find the shortest path that starts from s, ends in j, and passes only once through all the other nodes in S. Define L(S, j) to be the length of that path.

Then L(S, j) = min_{i ∈ S, i ≠ j} ( L(S ∖ {j}, i) + w(i, j) ).

TODO

Example (A better algorithm for longest increasing subsequence): TODO

5. Graph decomposition

Remark: A directed acyclic graph can be abbreviated to dag.

Definition 5.1: An adjacency matrix of a graph G = (V, E) on n vertices is an n × n matrix A where a_ij = 1 if (i, j) ∈ E, and a_ij = 0 otherwise.

TODO

Remark: Given an adjacency matrix A, (A^k)_ij is the number of walks of length k from i to j.

If A represents an undirected graph, (A²)_ii is the degree of i.

Remark: An adjacency list for a graph is a list of linked lists, one per vertex: the list for vertex u holds the indices of vertices to which u has an outgoing edge.

This has size Θ(V + E).

TODO

Definition 5.2: Depth-first search (DFS) is a linear-time algorithm that tells us what parts of a graph are reachable from a given vertex. It works for both digraphs and undirected graphs.

As soon as a new vertex is discovered, explore from it. As DFS progresses, we assign each vertex a colour:

  • not discovered yet (e.g. white)
  • discovered, but not fully explored yet (e.g. grey)
  • finished (e.g. black)

Input: A graph G = (V, E)
Output: for each vertex v, a backpointer π(v) (the predecessor of v) and two timestamps, the discovery time d(v) and finishing time f(v).

DFS(V, E):
for u in V:
colour[u] = white
pi[u] = null
time = 0
for u in V:
if colour[u] = white:
DFSVisit(u)

DFSVisit(u):
time += 1
d[u] = time
colour[u] = grey
for v in neighbours(u):
if colour[v] = white:
pi[v] = u
DFSVisit(v)
time = time + 1
f[u] = time
colour[u] = black

Note that d[u] < f[u] for every vertex u.

Note that d, f and π are dependent upon the order in which the vertices in the graph are visited, and on the order of the vertices in the adjacency/neighbour lists.

This has running time Θ(V + E) because each vertex is visited once, and DFSVisit(v) takes Θ(deg(v)) where deg(v) is the degree of v; Σ_v deg(v) = Θ(E).

If we map out the edges visited by DFS, we get a forest that we call the DFS forest, consisting of DFS trees (note that this is a slight abuse of notation because trees and forests are usually undirected).
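The DFS pseudocode above can be ported to Python with dictionaries for the colours, timestamps and backpointers (an illustrative sketch; the adjacency structure is assumed to be a dict mapping each vertex to its neighbour list):

```python
def dfs(vertices, adj):
    """Return discovery times, finishing times and predecessors."""
    colour = {u: 'white' for u in vertices}
    pi = {u: None for u in vertices}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time
        colour[u] = 'grey'            # discovered but not fully explored
        for v in adj.get(u, []):
            if colour[v] == 'white':
                pi[v] = u
                visit(v)
        time += 1
        f[u] = time
        colour[u] = 'black'           # finished

    for u in vertices:
        if colour[u] == 'white':
            visit(u)
    return d, f, pi
```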

Theorem 5.3 (Parenthesis theorem): For all u, v, exactly one of the following holds:

  1. The intervals [d[u], f[u]] and [d[v], f[v]] are disjoint, and neither of u and v is a descendant of the other in the DFS forest
  2. [d[u], f[u]] ⊂ [d[v], f[v]] and u is a descendant of v in a DFS tree
  3. [d[v], f[v]] ⊂ [d[u], f[u]] and v is a descendant of u in a DFS tree

Using shorthand TODO

Corollary 5.4: Vertex v is a descendant of vertex u iff d[u] < d[v] < f[v] < f[u].

TODO

Definition 5.5: We can classify edges of the searched graph:

  • Tree edges are edges (u, v) of the DFS forest; v is white when (u, v) is explored;
  • Back edges lead from a node to an ancestor in the DFS tree; v is grey when (u, v) is explored;
  • Forward edges lead from a node to a non-child descendant in the DFS tree, i.e. they lead to a descendant but are not edges in the DFS forest; v is black when (u, v) is explored;
  • Cross edges do not lead to an ancestor or descendant; this could be between nodes in the same tree or in different trees; v is black when (u, v) is explored.

TODO discovery and finishing times vs edge classification

Theorem 5.6: A directed graph G has a cycle iff a DFS of G yields a back edge.

Proof:

⇐: If (u, v) is a back edge, then there is a cycle consisting of the back edge and the path in the DFS tree from v to u.

⇒: TODO

Definition 5.7: A topological sort of a DAG is a total ordering of its vertices, v₁, ..., v_n, such that if (v_i, v_j) ∈ E then i < j.

TopologicalSort(V, E):
f[v | v in V] = DFS(V, E)
return V sorted according to decreasing f[v]

TODO correctness of topological sort

Remark: When DFS is applied to an undirected graph, the DFS trees correspond to the connected components of the graph. These can be identified by the discovery and finishing times.
Definition 5.8: In a digraph, two vertices u and v are strongly connected if there is a path from u to v and from v to u.
Definition 5.9: A strongly connected component (SCC) of a digraph is a maximal set of vertices C such that for all u, v ∈ C, u and v are strongly connected.

Lemma 5.10: Given distinct strongly connected components C₁, C₂ of a digraph G, if there is a path from C₁ to C₂ in G, then there is not a path from C₂ to C₁ in G.

Proof: Suppose there is a path from some u ∈ C₁ to some v ∈ C₂. Then for every u' ∈ C₁ and v' ∈ C₂, there exists a path from u' to v' via the path connecting u and v. If there were also a path from C₂ to C₁, there would similarly be a path from every vertex of C₂ to every vertex of C₁, so C₁ ∪ C₂ would be strongly connected and would be part of the same SCC; contradiction. Therefore there is no path from C₂ to C₁.

Definition 5.11: The SCC graph of a digraph G = (V, E) is the graph G^SCC = (V^SCC, E^SCC), such that

  • V^SCC has one vertex for every SCC in G
  • (C₁, C₂) ∈ E^SCC iff there exist u ∈ C₁, v ∈ C₂ such that there is a path from u to v.
Remark: From Lemma 5.10, it follows that any SCC graph is a DAG.

Definition 5.12: We extend discovery and finishing times to sets of vertices U ⊆ V:

  • d(U) = min_{u ∈ U} d(u), the earliest discovery time amongst U;
  • f(U) = max_{u ∈ U} f(u), the latest finishing time amongst U.
Definition 5.13: We say that there is an edge between SCCs C₁, C₂ if there exist u ∈ C₁, v ∈ C₂ s.t. (u, v) ∈ E.

Lemma 5.14: Let C₁, C₂ be distinct SCCs in G. If there is an edge from C₁ to C₂, then f(C₁) > f(C₂).

Proof: TODO same as correctness of topological sort

Lemma 5.15: If f(C₁) > f(C₂), then there cannot be an edge from C₂ to C₁.

Proof: Equivalent to Lemma 5.14 by contraposition.

Definition 5.16: For a digraph G = (V, E), the transpose of G is G^T = (V, E^T) where E^T = { (u, v) : (v, u) ∈ E }.

Remark: We can create G^T in Θ(V + E) time using adjacency lists.
Theorem 5.17: G and G^T have the same SCCs.

Remark: For distinct SCCs C₁, C₂ in G, if f(C₁) > f(C₂), then

  • in G there cannot be an edge from C₂ to C₁
  • in G^T there cannot be an edge from C₁ to C₂.

This means that, running DFS on G^T starting from the SCC with the largest finishing time, we will not find edges to any other SCC. TODO why? intuition

Definition 5.18: The SCC algorithm, or Kosaraju’s algorithm, identifies all SCCs in a digraph G:

  • run DFS on G
  • run DFS on G^T, exploring vertices in order of decreasing finishing time from the first DFS
  • the trees in the DFS forest of the second DFS correspond to the SCCs of G
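The three steps above can be sketched in Python (an illustration; a recursive DFS records each vertex when it finishes, which stands in for the finishing times):

```python
def sccs(vertices, adj):
    """Kosaraju's algorithm: DFS on G, then DFS on the transpose of G
    in decreasing order of first-pass finishing time."""
    visited, order = set(), []

    def visit(u, graph, component):
        visited.add(u)
        component.append(u)
        for v in graph.get(u, []):
            if v not in visited:
                visit(v, graph, component)
        order.append(u)               # u finishes after all its descendants

    for u in vertices:                # first pass on G
        if u not in visited:
            visit(u, adj, [])

    transpose = {}                    # build G^T from the adjacency lists
    for u in adj:
        for v in adj[u]:
            transpose.setdefault(v, []).append(u)

    visited.clear()
    components = []
    for u in list(reversed(order)):   # decreasing finishing time
        if u not in visited:
            comp = []
            visit(u, transpose, comp)
            components.append(comp)   # each second-pass tree is one SCC
    return components
```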

6. Paths in graphs - finding shortest paths

Definition 6.1: Consider a digraph G = (V, E) with weight function w : E → ℝ.

The weight or length of a path TODO

Definition 6.2: TODO FIFO queue

Definition 6.3: Consider first the simple case where all weights are equal to 1. Let δ(s, v) be the minimum number of edges on a path from s to v, if such a path exists, and ∞ otherwise.

Given a source vertex s, breadth-first search (BFS) finds all vertices v reachable from s, the shortest path lengths δ(s, v), and the shortest paths from s to those vertices.

Idea:

  • send out waves of increasing length from the source
  • when a vertex u is reached, put it in the (FIFO) queue Q
  • when all of u’s neighbours have been reached, remove u from Q

TODO the algorithm

Remark: BFS takes time .

Definition 6.4: The BFS tree consists of the vertices reachable from s and the edges (π(v), v), where π(v) is the predecessor of v (the vertex from which v was first reached).

More formally, the BFS tree is the graph

Lemma 6.5: TODO there exists a path of length
Corollary 6.6: TODO lower bound on

Lemma 6.7: If u is enqueued before v, then d[u] ≤ d[v]. Furthermore, if Q = ⟨v₁, ..., v_r⟩ is the queue at any given step of BFS, then d[v_r] ≤ d[v₁] + 1 and d[v_i] ≤ d[v_{i+1}] for each i.

Proof: TODO by induction
Lemma 6.8: TODO upper bound on

Theorem 6.9: At termination of BFS, d[v] = δ(s, v) for all v ∈ V.

Proof: TODO just follows from previous lemmas; also consider .
Theorem 6.10: If , then is the predecessor of TODO
Remark: BFS uses a queue but DFS uses a stack (possibly implicitly).

Definition 6.11: Dijkstra’s algorithm is essentially BFS for weighted graphs. It solves the “single-source shortest path” problem for non-negative weights.

It uses a min-priority queue rather than a FIFO queue, with keys given by the shortest-path weight estimates d[v].

At termination:

  • d[v] is the distance δ(s, v) from s to v
  • π[v] is the TODO

TODO algorithm

Invariants for the while loop:

TODO initialisation

Then suppose that holds before an iteration of the while loop. For any vertex , either did not change, or it did change. If did not change, then by . Otherwise, let be the vertex selected by ExtractMin in this iteration of the loop. Then we have that

Now suppose that holds before an iteration. Then for any , either:

  • , then by
  • and there is no path from to , then
  • and there is a path from to . Then there is also a shortest path TODO

TODO finish maintenance of

At termination, , so .

TODO remark about optimal substructure

Lemma 6.12 (Convergence property): TODO

Theorem 6.13: For v ≠ s reachable from s, π[v] is the predecessor of v on a shortest path from s to v.

Proof: TODO
Remark: Total running time of Dijkstra’s algorithm is O((V + E) log V) using a binary min-heap. TODO proof/derivation
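Pending the pseudocode above, here is an illustrative Python sketch using the standard heapq module; since heapq has no DecreaseKey, stale queue entries are skipped instead (a common implementation choice, not necessarily the course's):

```python
import heapq
import math

def dijkstra(adj, s):
    """Shortest-path weights from s; adj maps u -> list of (v, weight >= 0)."""
    dist = {u: math.inf for u in adj}
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        du, u = heapq.heappop(pq)
        if du > dist[u]:              # stale entry: u was already finalised
            continue
        for v, w in adj[u]:
            if du + w < dist[v]:
                dist[v] = du + w      # relax edge (u, v)
                heapq.heappush(pq, (dist[v], v))
    return dist
```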

Definition 6.14: The Bellman-Ford algorithm takes a graph with weight function w (possibly negative) and a source s, and returns false if there exists a negative-weight cycle reachable from s, otherwise true, along with d[v] = δ(s, v) for each v.

TODO algorithm

This is O(VE).
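A runnable sketch (my illustration; edges given as (u, v, w) triples, with the standard structure of |V| − 1 relaxation rounds followed by one checking pass):

```python
import math

def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, w). Returns (ok, d); ok is False iff a
    negative-weight cycle is reachable from s."""
    d = {u: math.inf for u in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):   # |V| - 1 rounds of relaxing every edge
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in edges:                # any further improvement => negative cycle
        if d[u] + w < d[v]:
            return False, d
    return True, d
```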

TODO invariants

TODO full topological sort algorithm for DAGs to find shortest path

TODO all-pairs shortest paths

Definition 6.15: Floyd-Warshall algorithm

Suppose we have vertex set {1, ..., n}.

Let d_ij^(k) be the length of the shortest path from i to j, all of whose intermediate nodes are in the set {1, ..., k}.

TODO

TODO algorithm (just dynamic programming as usual) TODO note why the is ok (because we also have a in the last argument so we’ve already calculated it)

Running time is Θ(n³).
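The triple loop can be sketched in Python (an illustration; vertices 0..n−1, missing edges treated as ∞, and the outermost loop ranging over the allowed intermediate vertex k):

```python
import math

def floyd_warshall(n, weights):
    """weights: dict (i, j) -> w over vertices 0..n-1. Returns the distance matrix."""
    d = [[0 if i == j else weights.get((i, j), math.inf) for j in range(n)]
         for i in range(n)]
    for k in range(n):                  # allow vertex k as an intermediate node
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d
```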

7. Greedy algorithms

Definition 7.1: In a greedy algorithm, at each step we greedily make the choice that offers the greatest immediate benefit (the greedy choice). This choice is not reconsidered at subsequent steps.
Remark: Dijkstra’s algorithm is an example of a greedy algorithm.
Remark: The greedy approach doesn’t always work, but when it does it’s nice because it’s simple and doesn’t require keeping track of subproblem solutions.
Definition 7.2: Given a connected undirected graph G = (V, E) with weights w(e), a minimum spanning tree (MST) is a connected acyclic subgraph that has minimum weight and connects all the vertices of G. For positive weights, a minimum spanning tree always exists (because we can just look at all possible spanning trees and take the minimum).

We will identify a tree within G by its edge set A ⊆ E.

Lemma 7.3:

An undirected graph is a tree iff every pair of vertices is connected by a unique (simple) path.

Proof: A connected undirected graph has a cycle iff there exist two vertices connected by distinct paths.

Lemma 7.4: If a graph is a tree, than .

Proof:


.

TODO more formally

Definition 7.5: A spanning tree of a graph G = (V, E) is a subgraph with edge set A ⊆ E such that (V, A) is a tree, reaches all vertices of G, and TODO .

Remark: A spanning tree is:

  • a minimal connected subgraph (removing an edge disconnects it)
  • a maximal acyclic subgraph (adding an edge creates a cycle).
Remark: A graph can have multiple MSTs, but they must all have the same number of edges (because they are spanning trees).

The algorithm for building an MST:

Let A be a subset of an MST T. We say that an edge e ∈ E is safe to add to A iff A ∪ {e} is a subset of some MST.

Definition 7.6: A cut is a partition of the vertex set V into S and V ∖ S.

An edge crosses a cut (S, V ∖ S) if one endpoint is in S and the other is in V ∖ S.

A cut respects A if no edge in A crosses the cut.

An edge crossing a cut is light if its weight is minimum over all edges that cross that cut.

Lemma 7.7: Let A be a subset of some MST. If (S, V ∖ S) is a cut that respects A, and (u, v) is a light edge crossing the cut, then (u, v) is safe for A.

Proof: Let T be an MST that includes A. Since T is a tree, it contains a unique path p between u and v.

Path p must cross the cut at least once, since u and v lie on opposite sides of it. Let (x, y) be an edge of p that crosses the cut. Adding (u, v) and deleting (x, y) creates a tree T' = (T ∖ {(x, y)}) ∪ {(u, v)}.

T' is a tree because we create a cycle and then remove one edge from it, keeping the graph connected. T' is minimum because the weight of (u, v) is at most that of (x, y), as (u, v) is a light edge crossing the cut. T' contains A ∪ {(u, v)} because A was included in T, and A did not contain (x, y) (the cut respects A).

Definition 7.8: A disjoint-set data structure keeps track of a family of dynamic disjoint sets . Each set is identified by some arbitrary representative.

It has three operations:

  • MakeSet(x) adds the singleton {x} to S
  • FindSet(u) returns the representative of the set containing u
  • Union(x, y) removes the sets containing x and y from S and adds their union to S.

A good representation is as a disjoint-set forest, where elements are stored in an array with pointers to a parent, leading to a root for each set. Keeping this balanced gives amortised complexity O(m α(n)), where m is the number of operations, n is the number of MakeSet operations, and α is the inverse Ackermann function, which is extremely slowly-growing and is at most 4 for any feasible inputs.

Definition 7.9: Kruskal’s algorithm:

Start from A = ∅. At each step, pick the edge with the smallest weight and add it to A if it does not create a cycle.

To do this, we need to keep track of the connected components, for instance by using a disjoint-set data structure.

Kruskal(V, E, w):
A = ∅
for each v in V:
MakeSet(v)
sort E into increasing order by weight w
for each edge (u, v) from sorted edge list:
if FindSet(u) != FindSet(v):
A = A union {(u,v)}
Union(u, v)
return A

There are |V| MakeSet operations, and O(E) disjoint-set operations in total. So the disjoint-set operations take O(E α(V)) (for a good disjoint-set implementation). So the sorting of the edges dominates. Therefore Kruskal’s algorithm has time complexity O(E log E) = O(E log V).
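The pseudocode can be sketched in Python with a minimal disjoint-set forest (my illustration; path compression only, no union by rank, which is still fast in practice):

```python
def kruskal(vertices, edges):
    """edges: list of (w, u, v). Returns the MST edge set as a list of (w, u, v)."""
    parent = {v: v for v in vertices}

    def find(x):
        # find the set representative, halving the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):     # edges in increasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding (u, v) creates no cycle
            parent[ru] = rv           # Union
            mst.append((w, u, v))
    return mst
```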

Definition 7.10: Prim’s algorithm picks a vertex and grows the tree from that vertex.

We set S = {r} and A = ∅. Then at every step, find a light edge (u, v) crossing the cut (S, V ∖ S), with u ∈ S. Update A to A ∪ {(u, v)} and S to S ∪ {v}.

We use a min-heap/priority queue containing the vertices v ∈ V ∖ S, where the key of v is the minimum weight of any edge (u, v) with u ∈ S; if no such edge exists we set the key to be ∞.

To find a light edge crossing the cut (S, V ∖ S), take the minimum from the queue. If v = ExtractMin(Q), then there exists a light edge (u, v) for some u ∈ S. The vertex u can be retrieved by a backpointer: when the key of v is set to w(u, v), we define π[v] = u.

Prim(V, E, w, r):
for each u in V:
key[u] = oo
pi[u] = null
Insert(Q, u)
DecreaseKey(Q, r, 0)
while Q is not empty:
u = ExtractMin(Q)
for each v in Adj[u]:
if v in Q and w(u,v) < key[v]:
pi[v] = u
DecreaseKey(Q, v, w(u,v))

Then { (π[v], v) : v ∈ V, v ≠ r } gives us the edges.

Remark: TODO Prim vs Dijkstra
Remark: TODO running time

Definition 7.11: The activity selection problem is, given a set of activities a_i = (s_i, f_i), i = 1, ..., n, to select a maximum-size subset of activities that do not overlap. (Where s_i is the start time and f_i is the finish time.)

We can reduce this to a graph where edges between nodes indicate a scheduling conflict, and try to find a maximal independent set of vertices.

We could try to find this by greedily selecting vertices of minimal degree. However, this wouldn’t always work:

TODO ( one node at bottom, connected to 4 in next row, connected to four in row above which form a clique )

It turns out that finding a maximum independent set is an NP-hard problem (a proposed solution can be verified in polynomial time, but no polynomial-time algorithm for finding one is known).

However, this doesn’t mean that activity selection is hard, because not all graphs are an activity selection problem. For example, a graph with an induced cycle of length > 3 cannot arise from an activity selection problem. TODO image

Lemma 7.12: There exists an optimal solution to an activity selection problem that contains an activity with minimum finish time.

Proof: Let a be an activity with minimum finish time. Consider an optimal solution and assume it does not contain a. By removing the activity of the optimal solution with minimum finish time and adding a, we create a valid solution (a finishes no later, so it cannot overlap any other selected activity) that has the same number of activities.

Corollary 7.13: The greedy algorithm whereby we take the activity with minimal finish time at each step always gives an optimal solution to the activity selection problem.

ActivitySelection(s, f):
sort the activities in order of increasing f
A = { 1 }
k = 1
for j = 2 to n:
if s[j] >= f[k]:
A = A ∪ { j }
k = j
return A

Remark: The runtime is dominated by the sorting, so is O(n log n).
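The greedy procedure can be sketched in Python (an illustration; activities given as (start, finish) pairs):

```python
def activity_selection(activities):
    """activities: list of (start, finish). Greedy by earliest finish time."""
    chosen = []
    last_finish = -float('inf')
    for s, f in sorted(activities, key=lambda a: a[1]):
        if s >= last_finish:          # compatible with everything chosen so far
            chosen.append((s, f))
            last_finish = f
    return chosen
```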

8. Stable matching

Definition 8.1: A matching is an undirected graph where all connected components are pairs of vertices.

Equivalently, a matching is a set of ordered pairs with and , such that TODO

Goal: given a set of preferences among hospitals and med students, design a self-reinforcing admissions process.

Definition 8.2: Hospital h and student s form an unstable pair if:

  • h prefers s to one of its admitted students
  • and s prefers h to its assigned hospital.

Definition 8.3: A stable assignment is an assignment with no unstable pairs.

This is the desirable condition. Note that individual self-interest prevents any hospital-student side deal (if there are any unstable pairs though, the hospital and student could both complain).

Definition 8.4: A perfect matching is TODO
Definition 8.5: A stable matching is a perfect matching with no unstable pairs.

TODO Gale-Shapely deferred acceptance algorithm

Remark: This is O(n²).
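Pending the pseudocode above, here is a runnable sketch of hospital-proposing deferred acceptance (my illustration, assuming one position per hospital and equal numbers of hospitals and students):

```python
def gale_shapley(hospital_prefs, student_prefs):
    """hospital_prefs[h]: students in preference order; student_prefs[s] likewise.
    Returns a dict student -> hospital."""
    rank = {s: {h: i for i, h in enumerate(prefs)}
            for s, prefs in student_prefs.items()}
    next_proposal = {h: 0 for h in hospital_prefs}   # index of next student to try
    match = {}                                       # student -> hospital
    free = list(hospital_prefs)
    while free:
        h = free.pop()
        s = hospital_prefs[h][next_proposal[h]]
        next_proposal[h] += 1
        if s not in match:
            match[s] = h                             # first offer is always accepted
        elif rank[s][h] < rank[s][match[s]]:
            free.append(match[s])                    # s trades up; old hospital freed
            match[s] = h
        else:
            free.append(h)                           # s rejects h
    return match
```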

Lemma 8.6: The Gale-Shapley algorithm finds a perfect matching.

Proof: Suppose for the sake of contradiction that some hospital h is unmatched at termination. Then some student, say s, is also unmatched at termination. This means that s was never proposed to (a student who receives a proposal remains matched from then on), but h must have proposed to every student, because it finished unmatched. Contradiction, so the matching upon termination must be perfect.

Lemma 8.7: The matching produced by the Gale-Shapley algorithm is stable.

Proof: Consider a pair (h, s) that is not in the matching M.

Either:

  • h never proposed to s, in which case h prefers its student in M to s, since hospitals propose in order of preference
  • h proposed to s, in which case s rejected h at some point, so s ended up with a hospital it prefers to h.

TODO : why does this mean it’s stable?

Definition 8.8: A student s is a valid partner for hospital h if there exists any stable matching in which h and s are matched.

Remark: The Gale-Shapley algorithm gives the hospital-optimal assignment, i.e. each hospital receives its best valid partner. As a corollary, we get that the hospital-optimal assignment is stable.

This is also the student-pessimal assignment (each student gets the worst valid partner). As a corollary, we get that the student-pessimal assignment is stable.

Remark: A student can get a better outcome by lying about their preferences.