Sudoku Solver

Q: How do bitmasks optimize Sudoku constraint checking?

Maintain three arrays of integers (rows, cols, boxes) where each bit represents a digit. To check if digit k is safe at (r,c), compute used = rows[r] | cols[c] | boxes[box_idx] and test if bit k is unset. This replaces three O(9) scans with a single O(1) bitwise operation.

Q: What is the time complexity of a Sudoku backtracking solver?

O(9^k) where k is the number of empty cells. For a standard puzzle with about 50 empty cells, constraint propagation prunes most branches. With bitmasking, a 9x9 board typically solves in under 5 milliseconds.

Q: How does the box index mapping work in Sudoku?

For cell (r, c), the box index is (r // 3) * 3 + (c // 3). This maps the 9x9 grid into nine 3x3 sub-boxes indexed 0-8, starting from top-left to bottom-right.

Q: What is Dancing Links and when would you use it for Sudoku?

Dancing Links (DLX) is Knuth's Algorithm X implementation using circular doubly-linked lists for the exact cover formulation of Sudoku. It is the fastest known approach for competitive solvers, using zero-allocation backtracking through pointer manipulation.

9 minute read

“Sudoku Solver is the quintessential backtracking problem, it represents the transition from simple recursion to a multi-constraint search problem where every choice prunes a massive branch of the state space.”

TL;DR

Sudoku Solver fills empty cells using backtracking: try digits 1-9, check constraints, recurse, and undo on failure. Bitmask optimization encodes row, column, and box constraints as integers for O(1) safety checks instead of O(9) scans. Pre-collecting empty cells avoids re-scanning the board each recursive call. This constraint satisfaction approach applies to AutoML hyperparameter search and builds on the N-Queens backtracking pattern.

1. Introduction: The Complexity of the Grid

Sudoku is more than just a pastime in the Sunday newspaper. In computer science, it is a canonical example of a Constraint Satisfaction Problem (CSP). It is essentially a graph coloring problem on a graph with 81 nodes (cells) and a very specific set of edges (constraints).

At a glance, a 9x9 grid seems small. However, if you were to try filling an empty board with every possible digit combination, you would be dealing with 9^{81} possibilities, a number larger than the estimated number of atoms in the observable universe. Even a partially filled board generates a search tree that explodes exponentially.

Solving Sudoku efficiently is about Pruned Search. It is about making a move, checking if it satisfies a set of local and global constraints, and immediately “backtracking” if it leads to a dead end. In this deep dive, we will move from a basic recursive solution to a highly optimized bitmask solver, and finally discuss Dancing Links (DLX), the algorithm used by competitive solvers.

1.1 Why This Problem Matters

Backtracking Foundation: It is the purest form of “Try, Fail, Undo.”
Constraint Propagation: It teaches you how checking constraints early reduces work later.
Bitwise Optimization: It offers a perfect use case for using bits to represent sets.

2. The Problem Statement

Write a program to solve a Sudoku puzzle by filling the empty cells (represented by .). A sudoku solution must satisfy all of the following rules:

Each of the digits 1-9 must occur exactly once in each row.
Each of the digits 1-9 must occur exactly once in each column.
Each of the digits 1-9 must occur exactly once in each of the 9 3x3 sub-boxes of the grid.

Constraints:

The input board is guaranteed to have exactly one solution.
The board size is fixed at 9x9.

3. Thematic Link: Constraint Satisfaction and Search

We focus on Constraint Satisfaction and Search:

DSA: We use backtracking to solve a rule-based grid.
ML System Design: AutoML systems search through the space of hyperparameters while satisfying hardware and accuracy constraints.
Speech Tech: Neural Architecture Search (NAS) for speech is a search for the best Conformer topology under latency constraints.
Agents: Benchmarking agents involves measuring how efficiently an agent searches through a task space to find a valid solution.

4. Approach 1: Basic Backtracking (The Foundation)

Backtracking is a depth-first search (DFS) through the state space.

4.1 The Recursive Algorithm

Find the next empty cell: Ideally, we pick the first . we encounter.
Iterate through digits 1-9:
- For each digit, check if it is Safe to place.
- If safe, place it and move to the next empty cell (recursive call).
Check the result:
- If the recursion returns True, we found the solution.
- If it returns False, Backtrack: reset the cell to . and try the next digit.
Base Case: If no more empty cells exist, the puzzle is solved.

4.2 Code implementation

class Solution:
    def solveSudoku(self, board: List[List[str]]) -> None:
        self.solve(board)

    def solve(self, board):
        for r in range(9):
            for c in range(9):
                if board[r][c] == ".":
                    for digit in "123456789":
                        if self.is_safe(board, r, c, digit):
                            board[r][c] = digit
                            if self.solve(board):
                                return True
                                board[r][c] = "." # Backtrack
                                return False
                                return True

    def is_safe(self, board, r, c, digit):
        for i in range(9):
            # Row check
            if board[r][i] == digit: return False
            # Column check
            if board[i][c] == digit: return False
            # 3x3 Box check
            if board[3*(r//3) + i//3][3*(c//3) + i%3] == digit: return False
            return True

5. Approach 2: Optimized Backtracking with Bitmasks

The is_safe function in the previous approach is called hundreds of thousands of times. We can optimize this to O(1) by using Bitmasks.

5.1 The Logic

We maintain three arrays of integers (bitmasks):

rows[9]: rows[i] is a bitmask where the k-th bit is 1 if digit k is present in row i.
cols[9]: Same for columns.
boxes[9]: Same for the 9 sub-boxes.

5.2 Optimized Code

class Solution:
    def solveSudoku(self, board: List[List[str]]) -> None:
        rows = [0] * 9
        cols = [0] * 9
        boxes = [0] * 9
        to_fill = []

        # 1. Map initial state
        for r in range(9):
            for c in range(9):
                if board[r][c] != ".":
                    digit = int(board[r][c])
                    mask = 1 << digit
                    rows[r] |= mask
                    cols[c] |= mask
                    boxes[(r//3)*3 + (c//3)] |= mask
                else:
                    to_fill.append((r, c))

    def backtrack(idx):
        if idx == len(to_fill):
            return True

            r, c = to_fill[idx]
            box_idx = (r//3)*3 + (c//3)

            used = rows[r] | cols[c] | boxes[box_idx]

            for digit in range(1, 10):
                if not (used & (1 << digit)):
                    mask = 1 << digit
                    board[r][c] = str(digit)
                    rows[r] |= mask
                    cols[c] |= mask
                    boxes[box_idx] |= mask

                    if backtrack(idx + 1):
                        return True

                        # Backtrack (unset bits)
                        rows[r] ^= mask
                        cols[c] ^= mask
                        boxes[box_idx] ^= mask
                        return False

                        backtrack(0)

6. Implementation Deep Dive: Coordinate Mapping

Scaling a Sudoku solver often fails due to incorrect box index mapping.

A 9x9 grid has 9 boxes, indexed 0-8 starting from the top-left.
To map (r, c) to box_id: box_id = (r // 3) * 3 + (c // 3)
Advanced Tip: In a N \times N Sudoku where N is a perfect square, the formula generalizes to: sub_size = int(sqrt(N)) box_id = (r // sub_size) * sub_size + (c // sub_size)

7. Theoretical Maximum: Knuth’s Dancing Links (DLX)

If you are asked about the “Fastest possible Sudoku solver”, the answer is Algorithm X using Dancing Links.

7.1 Exact Cover

Sudoku can be modeled as an Exact Cover Problem. We have 324 constraints and 729 choices. The goal is to select 81 choices such that every constraint is satisfied exactly once.

7.2 The DLL Magic

Donald Knuth’s “Dancing Links” uses a circular doubly-linked list where a node can be removed and “re-inserted” with zero memory allocation. This is the ultimate optimization for the “Backtrack” step.

8. Complexity Analysis

Metric	Complexity	Rationale
Time	O(9^k)	`k` is the number of empty cells.
Space	O(k)	The recursion stack goes as deep as the empty cells.

For a 9x9 board, even a naive solver rarely takes more than 100ms. With bitmasking, it drops to < 5ms.

9. Common Pitfalls and Memory Management

Bitmask Indexing: Start your bits from 1 to match Sudoku digits (1-9).
The “Unique Solution” Trap: Your code should be robust enough to return False if no solution is found.
Mutating the Board: In Python, when you modify board[r][c], you are changing the actual list object. Always ensure you reset it.

10. Connections to Other Topics

10.1 Connection to ML (Neural Sudoku Solvers)

Recent research into SatNet attempts to teach neural networks to solve Sudoku by embedding a “Constraint Optimization” layer into the architecture.

10.2 Connection to AI Agents (Benchmarking)

A Sudoku solver is essentially a “Single-Purpose Agent.” We measure its “Trajectory Efficiency”, how few guesses did it make? A superior agent picks cells with the Fewest Remaining Possibilities (Heuristic).

11. Interview Strategy: The “Constraint First” Mindset

Define Safety: Explain the three rules clearly before writing code.
Backtracking Visualization: Draw a small grid and show how trying a digit forces decisions.
Mention Optimizations: Mention bitmasking and “Minimum Remaining Values” (MRV) heuristic.
Edge Cases: Mention what happens if the input board is already invalid.

12. Key Takeaways

Constraint Propagation is Power: The more constraints you check upfront, the smaller your search tree.
Backtracking is DFS on State: Every call represents a state in the puzzle’s life.
Bitmasking for Speed: O(1) constraint checking is the difference between a toy and a production-grade solver.

FAQ

How do bitmasks optimize Sudoku constraint checking?

Maintain three arrays of integers (rows, cols, boxes) where each bit position represents a digit. To check if digit k is safe at position (r,c), compute used = rows[r] OR cols[c] OR boxes[box_idx] and test if bit k is unset. This replaces three O(9) scans with a single O(1) bitwise operation.

What is the time complexity of a Sudoku backtracking solver?

O(9^k) where k is the number of empty cells. For a standard puzzle with roughly 50 empty cells, constraint propagation prunes most branches early. With bitmasking, a 9x9 board typically solves in under 5 milliseconds.

How does the box index mapping work in Sudoku?

For cell (r, c), the box index is (r // 3) * 3 + (c // 3). This maps the 9x9 grid into nine 3x3 sub-boxes indexed 0 through 8, starting from top-left to bottom-right row by row.

What is Dancing Links and when would you use it for Sudoku?

Dancing Links (DLX) is Donald Knuth’s Algorithm X implementation using circular doubly-linked lists for the exact cover formulation of Sudoku. It converts the puzzle into 324 constraints and 729 choices, then uses zero-allocation backtracking through pointer manipulation. It is the fastest known approach for competitive solvers.

Originally published at: arunbaby.com/dsa/0059-sudoku-solver

Sudoku Solver

TL;DR

1. Introduction: The Complexity of the Grid

1.1 Why This Problem Matters

2. The Problem Statement

3. Thematic Link: Constraint Satisfaction and Search

4. Approach 1: Basic Backtracking (The Foundation)

4.1 The Recursive Algorithm

4.2 Code implementation

5. Approach 2: Optimized Backtracking with Bitmasks

5.1 The Logic

5.2 Optimized Code

6. Implementation Deep Dive: Coordinate Mapping

7. Theoretical Maximum: Knuth’s Dancing Links (DLX)

7.1 Exact Cover

7.2 The DLL Magic

8. Complexity Analysis

9. Common Pitfalls and Memory Management

10. Connections to Other Topics

10.1 Connection to ML (Neural Sudoku Solvers)

10.2 Connection to AI Agents (Benchmarking)

11. Interview Strategy: The “Constraint First” Mindset

12. Key Takeaways

FAQ

Related across topics

Share on

TL;DR

1. Introduction: The Complexity of the Grid

1.1 Why This Problem Matters

2. The Problem Statement

3. Thematic Link: Constraint Satisfaction and Search

4. Approach 1: Basic Backtracking (The Foundation)

4.1 The Recursive Algorithm

4.2 Code implementation

5. Approach 2: Optimized Backtracking with Bitmasks

5.1 The Logic

5.2 Optimized Code

6. Implementation Deep Dive: Coordinate Mapping

7. Theoretical Maximum: Knuth’s Dancing Links (DLX)

7.1 Exact Cover

7.2 The DLL Magic

8. Complexity Analysis

9. Common Pitfalls and Memory Management

10. Connections to Other Topics

10.1 Connection to ML (Neural Sudoku Solvers)

10.2 Connection to AI Agents (Benchmarking)

11. Interview Strategy: The “Constraint First” Mindset

12. Key Takeaways

FAQ

Related across topics

AutoML Systems at Scale

Neural Architecture Search (NAS) for Speech

Agent Benchmarking: A Deep Dive

Share on