Counting Connected Components

post-thumb

Number of Connected Components in an Undirected Graph

Given n nodes labeled from 0 to n - 1 and a list of undirected edges (each edge is a pair of nodes), find the number of connected components in the undirected graph.

Example 1:

0 – 1 – 2

3 – 4

Given n = 5 and edges = [[0, 1], [1, 2], [3, 4]], return 2.

Example 2:

0 – 1 – 2 – 3 – 4

Given n = 5 and edges = [[0, 1], [1, 2], [2, 3], [3, 4]], return 1.

Note: You can assume that no duplicate edges will appear in edges. Since all edges are undirected, [0, 1] is the same as [1, 0] and thus will not appear together in edges.

Idea

  1. Construct the graph from edges
  2. Initialize components = 0
  3. Start at an unvisited node, traverse the connected component that contains this node and mark every node in this component as visited
  4. Increment components by 1
  5. Repeat 3 and 4 until every node has been visited.
  6. Return components

Before Implementing …

How do we represent a graph?

We want to represent a graph in such a way that supports fast neighbor lookup (like a getNeighbors() method). So we can either use a HashMap to map a node to its neighbors or write a GraphNode class:

class GraphNode:
    def __init__(self, id):
        self._id = id
        self._neighbors = []

    def add_neighbor(self, neighbor):
        self._neighbors.append(neighbor)

    @property
    def neighbors(self):
        return self._neighbors

This is a good opportunity to discuss with your interviewer about the trade-offs between the two implementations.

HashTable:

  • Pro: Easy to implement
  • Con: Code is not self-explanatory

GraphNode[^1]:

  • Pro: Good design, self-explanatory code
  • Con: Extra overhead

[^1]: An important note about using a GraphNode class: after initializing GraphNode(0), GraphNode(1), ..., GraphNode(n-1), when we see [2, 3], we want to add GraphNode(3) to GraphNode(2).neighbors and vice versa. GraphNode(2) and GraphNode(3) have been created – how do we find them? I.e, given some i, how do we store the GraphNode's so we can quickly look up the GraphNode(i) that has been created?
Solution 1: Store GraphNode(0),GraphNode(1),…, GraphNode(n-1) in an array : python nodes = [GraphNode(i) for i in range(n)] Solution 2: Use a dictionary to map i to GraphNode(i): python nodes = {i: GraphNode(i) for i in range(n)} In this case when the node ids range from 0 to n-1, solution 1 is slightly better since the dictionary solution uses more space. In other cases, however, when the node ids are strings or integers but all are sparsely distributed within some range, using an array to store all the nodes will be either impossible or a huge waste of space.

Seeing that you understand how to use a GraphNode class to construct the graph, the interviewer will likely suggest you use a HashMap to represent the graph to save some time.

How do we traverse a connected component?

Both BFS and DFS will work.

How do we store the visited nodes?

In our algorithm, we need a way to retrieve an unvisited node. So instead of keeping track the visited nodes, we use a set to store the unvisited nodes and we can get retrive an unvisited node using unvisited.pop().

Implementation

Main code

def countComponents(n, edges):
    children = {i: [] for i in range(n)}
    for a, b in edges:
        children[a].append(b)
        children[b].append(a)
    components = 0
    unvisited = set(range(n))
    while unvisited:
        somenode = unvisited.pop()
        dfs(somenode, children, unvisited)  # bfs also works here
        components += 1
    return components

DFS

def dfs(root, children, unvisited):
    stack = [root]
    while stack:
        current = stack.pop()
        for x in children[current]:
            if x in unvisited:
                unvisited.discard(x)
                stack.append(x)

Note: set.discard(x) can be replaced by set.remove(x). The difference between discard(x) and remove(x) is that set.remove(x) will raise KeyError if x is not present.

BFS

from collections import deque

def bfs(root, children, unvisited):
    q = deque([root])
    while q:
        current = q.popleft()
        for x in children[current]:
            if x in unvisited:
                unvisited.discard(x)
                q.append(x)

A Union Find Solution

A Disjoint Set Union (DSU) data structure[^2] can also be used to find the number of connected components in a graph. We’ll cover DSU in detail in a future post.

[^2]: Also called Union Find because of its two main operations.

Similar Problems

comments powered by Disqus