Course Schedule (Topological Sort)
“You can’t build the roof before you pour the foundation.”
1. Problem Statement
This is the canonical “Dependency Resolution” problem.
There are numCourses courses you have to take, labeled from 0 to numCourses - 1. You are given an array prerequisites where prerequisites[i] = [a, b] indicates that you must take course b first if you want to take course a.
Return true if you can finish all courses. Otherwise, return false.
Example 1:
- Input:
numCourses = 2,prerequisites = [[1,0]] - Output:
true - Explanation: To take course 1 you should have finished course 0. So it is possible.
Example 2:
- Input:
numCourses = 2,prerequisites = [[1,0],[0,1]] - Output:
false - Explanation: To take 1 you need 0. To take 0 you need 1. Impossible (Cycle).
2. Understanding the Problem
2.1 Directed Acyclic Graphs (DAGs)
This problem models a Directed Graph.
- Nodes = Courses.
- Edges = Dependencies (
b -> aimpliesbcreatesa).
The question “Can I finish?” is synonymous with: “Does this graph contain a Cycle?”
If there is a cycle (A -> B -> A), you can never start A or B.
If there is no cycle (a DAG), there always exists a valid linear ordering, called a Topological Sort.
2.2 The “Indegree” Intuition
How do you know which course to take first? Simple: Take the one with zero prerequisites.
- If Course 0 has no prerequisites (Indegree = 0), take it.
- Once taken, Course 0 is “done”. Now, any course that only needed Course 0 might now have its prerequisites satisfied.
- We effectively “remove” Course 0 and its outgoing edges from the graph.
- Repeat until empty.
3. Approach 1: DFS (Cycle Detection)
We can process each node with DFS.
- States:
Unvisited(0),Visiting(1),Visited(2). - If we encounter a node marked
Visiting, we found a back-edge -> Cycle.
Pros: Easy to write. Cons: Harder to generate the actual sort order (requires reversing the post-order). Large recursion depth.
4. Approach 2: Kahn’s Algorithm (BFS) – The Standard
This approach directly simulates the “peeling onion” strategy.
Algorithm
- Build Graph: Adjacency list + Indegree array.
Adj[u] = [v1, v2]Indegree[v1]++,Indegree[v2]++
- Initialize Queue: Add all nodes with
Indegree == 0to Queue. - Process:
- While Queue is not empty:
- Pop
u. Add to list oftaken_courses. - For each child
vofu:Indegree[v]--(We satisfied one prerequisite).- If
Indegree[v] == 0, pushvto Queue.
- Pop
- While Queue is not empty:
- Check: If
count(taken_courses) == numCourses, return True. Else, False (Cycle detected, some nodes never reached Indegree 0).
5. Implementation: Kahn’s Algorithm
from collections import deque
class Solution:
def canFinish(self, numCourses: int, prerequisites: list[list[int]]) -> bool:
# 1. Initialize Graph
adj = [[] for _ in range(numCourses)]
indegree = [0] * numCourses
# 2. Build Graph (Time: O(E), Space: O(E))
for dest, src in prerequisites:
# src -> dest (take src first)
adj[src].append(dest)
indegree[dest] += 1
# 3. Add seeds (Time: O(V))
queue = deque()
for i in range(numCourses):
if indegree[i] == 0:
queue.append(i)
processed_count = 0
# 4. BFS Traversal (Time: O(V + E))
while queue:
node = queue.popleft()
processed_count += 1
for neighbor in adj[node]:
indegree[neighbor] -= 1
if indegree[neighbor] == 0:
queue.append(neighbor)
# 5. Final check
return processed_count == numCourses
# Example Usage
sol = Solution()
print(sol.canFinish(2, [[1,0]])) # True
print(sol.canFinish(2, [[1,0], [0,1]])) # False
6. Testing Strategy
Test Case 1: Disconnected Components
0 -> 1 and 2 -> 3.
- Queue starts with
{0, 2}. - Pop 0 -> Add 1 to Queue.
- Pop 2 -> Add 3 to Queue.
- All processed. Valid.
Test Case 2: Deadlock (Cycle)
0 -> 1 -> 0.
- Indegree:
[1, 1]. - Queue starts Empty.
- Processed Count = 0.
- Returns False.
7. Complexity Analysis
- Time: $O(V + E)$.
- $V$: Initializing lists.
- $E$: Building graph edges.
- BFS loop visits every node ($V$) and every edge ($E$) exactly once.
- Space: $O(V + E)$.
- To store the Adjacency List.
This is asymptotically optimal.
8. Production Considerations
This algorithm is the backbone of Build Systems (Make, Maven, Bazel).
- Parallel Dependencies:
The Queue length represents the level of parallelism.
If Queue has 5 items, it means 5 tasks are ready to run simultaneously on 5 worker threads.
This is how
make -j8works!
9. Connections to ML Systems
This directly maps to DAG Pipeline Orchestration (ML System Design).
- Airflow / Kubeflow: These tools literally execute Kahn’s Algorithm.
- Nodes = ETL Tasks (Load Data, Train Model).
- Edges = Data dependencies.
- The Scheduler puts tasks with
Indegree 0into the worker pool.
10. Interview Strategy
- Identify Graph: Start by saying “This is a dependency problem, which can be modeled as a Graph.”
- Define Edge Direction: Be careful! “Prerequisite
[a, b]meansb -> a”. Getting this backward ruins the code. - Mention Cycle: Explicitly mention “If there’s a cycle, we can’t finish.”
- Compare DFS vs BFS: “I prefer Kahn’s (BFS) because it’s iterative (no stack overflow) and easy to extend to parallel execution logic.”
11. Key Takeaways
- Indegree is Key: It represents “Unsatisfied Constraints”.
- Topological Sort order is not unique: If queue has
[A, B],ABandBAare both valid sorts. - Cycle Detection: If
processed < total, the remaining nodes are locked in a cycle.
Originally published at: arunbaby.com/dsa/0049-course-schedule
If you found this helpful, consider sharing it with others who might benefit.