9. Minimum Spanning Trees

Definition: a subset of edges of a connected, weighted, undirected graph that:

  1. Connects all vertices in V
  2. Contains no cycles (it’s a tree)
  3. Minimizes the total edge weight

Approach: Connect a node, and the nodes connected to it, to the rest of the nodes (i.e. any other node) using an edge with minimal weight (called a safe edge if unique and therefore included in all MSTs).

Prerequisite: Union Find

The graph above has the array . Each node’s entry contains its ancestor.

How Find Works: Traverse the graph until you found a node that maps to itself.

How Union Works: Find the root node of both nodes. Replace the ancestor of one connected components with the ancestor of other connected component.

By Size (Rank): Always merge the connected component with less nodes (hight) into the larger one. Keep an array tracking size (height). When merging, set the size to their sum (or the height to ).

Path compression: Each time you traverse the graph to find the root node, remember all the nodes you pass through. Then connect them directly to the root node (possible by connecting them after performing a recursive call).

Note on the lectures: In the lectures, we saw the highly unusual approach having an array of ints, , and an array of lists, . contains the parent of each element. contains the children of each element. When we unionize and , we check whether or has more members. Let’s say has more. We then change the parent of and of all the children of to , unionize the list with ‘s, and delete its entries.

Runtime

Only find is relevant. Union is simply two finds + updating one root node and component size entry.

  • Naive: : you may have an extremely unbalanced tree if you always merge a tall tree into a single node.
  • By size/rank:
  • By size/rank + path compression: amortized, where is the inverse Ackermann function (effectively for practical purposes)
    • Generally, all nodes are directly connected to the root (or via one node in between if two connected components just merged)

Boruvka

How it works: Always pick the edge with the lowest weight incident to every connected component, starting with the individual nodes. Use DFS to get the minimum edge of each connected component: save minimum edge found during DFS walk that doesn’t connect parts of component (using Union Find to check), reset it, and perform DFS walk from another unvisited node; add all the saved edges to the MST.

Does it really use DFS or does it go through all edges, keeping track of the cheapest on for each connected component?

Runtime

  • iterations: In the worst case, you always connect pairs of connected components with each other.
  • for performing DFS and parallel minimum finding. Union Find it negligable.

Total:

Prim

How it works: Build up an MST from a single chosen node by iteratively adding the lowest edge going from the component to an unvisited node.

Runtime:

  • iterations: every node must be added to the MST
  • across all runs to insert edges into and extract edges from a priority queue of size

Total:

Kruskal

How it works:

  • Sort edges by weight
  • Take lowest edge
  • Check if edge connects two already connected nodes using Union Find. If no, add edge to MST
  • Repeat until there are no edges left

Runtime

  • Sorting:
  • Adding edges + performing Union Find w/o path compression:

Total: