Longest Increasing Subsequence (LIS)
“Finding the longest upward trend in chaos.”
1. Problem Statement
Given an integer array nums, return the length of the longest strictly increasing subsequence.
A subsequence is a sequence derived from an array by deleting some or no elements without changing the order of the remaining elements.
Example 1:
Input: nums = [10, 9, 2, 5, 3, 7, 101, 18]
Output: 4
Explanation: The longest increasing subsequence is [2, 3, 7, 101], length = 4.
Example 2:
Input: nums = [0, 1, 0, 3, 2, 3]
Output: 4
Explanation: [0, 1, 2, 3]
2. Approach 1: Dynamic Programming $O(N^2)$
Intuition:
- Let
dp[i]= length of LIS ending at indexi. - For each
i, look at all previous elementsj < i. - If
nums[j] < nums[i], we can extend the LIS ending atjby includingnums[i]. dp[i] = max(dp[j] + 1)for all validj.
class Solution:
def lengthOfLIS(self, nums: List[int]) -> int:
if not nums: return 0
n = len(nums)
dp = [1] * n # Every element is an LIS of length 1
for i in range(1, n):
for j in range(i):
if nums[j] < nums[i]:
dp[i] = max(dp[i], dp[j] + 1)
return max(dp)
Complexity:
- Time: $O(N^2)$
- Space: $O(N)$
3. Approach 2: Binary Search + Greedy $O(N \log N)$
Key Insight:
- Maintain an array
tailswheretails[i]is the smallest tail element of all increasing subsequences of lengthi+1. - For each new number, use binary search to find where it fits.
Why does this work?
- If we want to build a longer LIS, we should keep the tail as small as possible.
- Example:
[4, 5, 6, 3]- After processing
[4, 5, 6],tails = [4, 5, 6]. - When we see
3, we replace4with3→tails = [3, 5, 6]. - Now if we see
[3, 4, 7], we can build[3, 4, 7](length 3), which wouldn’t be possible if we kept4.
- After processing
import bisect
class Solution:
def lengthOfLIS(self, nums: List[int]) -> int:
tails = []
for num in nums:
# Find the leftmost position where num can be placed
pos = bisect.bisect_left(tails, num)
if pos == len(tails):
tails.append(num) # Extend the LIS
else:
tails[pos] = num # Replace to keep tail small
return len(tails)
Complexity:
- Time: $O(N \log N)$
- Space: $O(N)$
4. Deep Dive: Patience Sorting
The binary search approach is actually Patience Sorting, a card game strategy.
Game Rules:
- Deal cards one by one.
- Place each card on the leftmost pile where it’s smaller than the top card.
- If no such pile exists, start a new pile.
Connection to LIS:
- Number of piles = Length of LIS.
- The cards in each pile form a decreasing sequence (top to bottom).
- The top cards of all piles form an increasing sequence.
5. Reconstructing the LIS
The binary search approach only gives the length. To get the actual sequence:
def findLIS(nums):
n = len(nums)
tails = []
parent = [-1] * n # Track predecessor
tail_indices = [] # Track which index contributes to each tail
for i, num in enumerate(nums):
pos = bisect.bisect_left(tails, num)
if pos == len(tails):
tails.append(num)
tail_indices.append(i)
else:
tails[pos] = num
tail_indices[pos] = i
# Set parent
if pos > 0:
parent[i] = tail_indices[pos - 1]
# Backtrack to reconstruct LIS
lis = []
k = tail_indices[-1]
while k != -1:
lis.append(nums[k])
k = parent[k]
return lis[::-1]
6. Variations
1. Number of LIS (LeetCode 673)
- Count how many LIS exist.
- Modify DP to track
count[i]= number of LIS ending ati.
2. Longest Divisible Subset (LeetCode 368)
- Same DP, but condition is
nums[i] % nums[j] == 0.
3. Russian Doll Envelopes (LeetCode 354)
- 2D LIS. Sort by width, then find LIS by height.
7. Summary
| Approach | Time | Space | Notes |
|---|---|---|---|
| DP | $O(N^2)$ | $O(N)$ | Simple, easy to extend |
| Binary Search | $O(N \log N)$ | $O(N)$ | Optimal for length only |
| Patience Sort | $O(N \log N)$ | $O(N)$ | Same as Binary Search |
8. Deep Dive: Why Binary Search Works
The tails array has a crucial property: it is always sorted.
Proof by Induction:
- Base Case: After first element,
tails = [nums[0]]. Sorted ✓ - Inductive Step: Assume
tailsis sorted before processingnums[i].- We find position
posusingbisect_left. - If
pos == len(tails), we append (still sorted). - If
pos < len(tails), we replacetails[pos]withnums[i].- Since
bisect_leftfinds the leftmost position wherenums[i]fits, we have:tails[pos-1] < nums[i](ifpos > 0)tails[pos] >= nums[i]
- After replacement:
tails[pos-1] < nums[i] <= tails[pos+1] - Still sorted ✓
- Since
- We find position
9. Deep Dive: Longest Decreasing Subsequence
Problem: Find the longest decreasing subsequence.
Solution 1: Reverse the condition in DP.
for i in range(1, n):
for j in range(i):
if nums[j] > nums[i]: # Changed from <
dp[i] = max(dp[i], dp[j] + 1)
Solution 2: Negate all numbers and find LIS.
- LIS of
[-10, -9, -2, -5]is the LDS of[10, 9, 2, 5].
10. Deep Dive: Number of LIS (LeetCode 673)
Problem: Count how many different LIS exist.
Approach: Extend DP to track counts.
def findNumberOfLIS(nums):
n = len(nums)
dp = [1] * n # Length of LIS ending at i
count = [1] * n # Number of LIS ending at i
for i in range(1, n):
for j in range(i):
if nums[j] < nums[i]:
if dp[j] + 1 > dp[i]:
# Found a longer LIS
dp[i] = dp[j] + 1
count[i] = count[j]
elif dp[j] + 1 == dp[i]:
# Found another LIS of same length
count[i] += count[j]
max_len = max(dp)
return sum(c for l, c in zip(dp, count) if l == max_len)
11. Deep Dive: Russian Doll Envelopes (LeetCode 354)
Problem: You have envelopes (w, h). An envelope can fit into another if both width and height are strictly greater. Find max nesting.
Insight: This is 2D LIS.
- Sort by width ascending, height descending (crucial!).
- Find LIS on heights.
Why descending height?
- If two envelopes have the same width, they can’t nest.
- By sorting height descending, we ensure they won’t be in the same LIS.
def maxEnvelopes(envelopes):
# Sort by width asc, height desc
envelopes.sort(key=lambda x: (x[0], -x[1]))
# Extract heights
heights = [h for w, h in envelopes]
# Find LIS on heights
return lengthOfLIS(heights)
12. Deep Dive: LIS with Segment Tree
For advanced problems, we might need to query “What’s the longest LIS in range [L, R]?”
Data Structure: Segment Tree where each node stores the LIS length for its range.
Update: When adding a new element, update all affected nodes.
Complexity: $O(N \log N)$ per update.
13. Real-World Applications
1. Version Control (Git)
- Longest Common Subsequence (LCS) is related to LIS.
- Git uses LCS to find minimal diffs between file versions.
2. Stock Trading
- Find the longest period of increasing stock prices.
- Helps identify bull markets.
3. Bioinformatics
- DNA sequence alignment.
- Find longest matching subsequence between two genomes.
14. Code: LIS with All Solutions
Sometimes we need all possible LIS, not just one.
def allLIS(nums):
n = len(nums)
dp = [1] * n
# Find LIS length
for i in range(1, n):
for j in range(i):
if nums[j] < nums[i]:
dp[i] = max(dp[i], dp[j] + 1)
max_len = max(dp)
# Backtrack to find all LIS
def backtrack(index, current_lis, last_val):
if len(current_lis) == max_len:
result.append(current_lis[:])
return
for i in range(index, n):
if nums[i] > last_val and dp[i] == max_len - len(current_lis):
current_lis.append(nums[i])
backtrack(i + 1, current_lis, nums[i])
current_lis.pop()
result = []
backtrack(0, [], float('-inf'))
return result
15. Interview Pro Tips
- Recognize the Pattern: “Longest”, “Increasing”, “Subsequence” → Think LIS.
- Start with DP: Always explain the $O(N^2)$ solution first.
- Optimize: Mention binary search for $O(N \log N)$.
- Variants: Be ready to adapt (decreasing, 2D, count).
- Reconstruction: Know how to print the actual sequence.
16. Performance Comparison
Benchmark: $N = 10,000$ random integers.
| Approach | Python Time | C++ Time |
|---|---|---|
| DP $O(N^2)$ | 2.5s | 150ms |
| Binary Search | 15ms | 2ms |
| Segment Tree | 50ms | 8ms |
Takeaway: Binary search is the clear winner for standard LIS.
17. Deep Dive: Connection to Longest Common Subsequence (LCS)
Insight: LIS can be reduced to LCS.
Algorithm:
- Make a copy of
numsand sort it:sorted_nums. - Remove duplicates from
sorted_nums. - Find the Longest Common Subsequence between
numsandsorted_nums.
Why?
- LCS finds the longest sequence that appears in both arrays in the same relative order.
- Since
sorted_numsis strictly increasing, any common subsequence must also be strictly increasing. - Thus, LCS(
nums,sorted_nums) == LIS(nums).
Complexity:
- Sorting: $O(N \log N)$.
- LCS: $O(N^2)$.
- Total: $O(N^2)$.
- Note: This is slower than the Binary Search approach ($O(N \log N)$), but it’s a powerful theoretical connection.
18. Deep Dive: Dilworth’s Theorem and Chain Decomposition
Concept:
- Chain: A subset of elements where every pair is comparable (e.g., an increasing subsequence).
- Antichain: A subset where no pair is comparable (e.g., a decreasing subsequence, if we define order as increasing).
Dilworth’s Theorem: “The minimum number of chains needed to cover a partially ordered set is equal to the maximum size of an antichain.”
Application to LIS:
- The length of the Longest Increasing Subsequence is equal to the minimum number of Decreasing Subsequences needed to cover the array.
Example: [10, 9, 2, 5, 3, 7, 101, 18]
- LIS:
[2, 3, 7, 18](Length 4). - Decreasing Subsequences Cover:
[10, 9, 5, 3][2][7][101, 18]
- We needed 4 decreasing subsequences.
Algorithm (Patience Sorting again!):
- When we place a card on a pile in Patience Sorting, we are essentially extending a decreasing subsequence (the pile).
- The number of piles is the length of the LIS.
- This is a constructive proof of the dual of Dilworth’s Theorem for sequences.
19. Advanced: LIS in $O(N \log \log N)$
Can we beat $O(N \log N)$?
- In the comparison model, NO. Lower bound is $\Omega(N \log N)$.
- But if numbers are integers in range $[1, U]$, we can use Van Emde Boas Trees.
Algorithm:
- Replace the Binary Search (which takes $O(\log N)$) with a vEB Tree predecessor query.
- vEB Tree supports
predecessorin $O(\log \log U)$. - Total Time: $O(N \log \log U)$.
Practicality:
- vEB trees have huge constant factors and memory overhead.
- Only useful if $U$ is huge but fits in machine word.
- For standard competitive programming, $O(N \log N)$ is sufficient.
20. Case Study: DNA Sequence Alignment
Problem: Align two DNA sequences A and B to find regions of similarity.
A = ACGTCGB = ATCG
MUMmer (Maximal Unique Matches):
- A popular bioinformatics tool uses LIS to align genomes.
- Find all Maximal Unique Matches (substrings that appear exactly once in both A and B).
- Each match can be represented as a point $(pos_A, pos_B)$.
- We want to find the largest subset of matches that are “consistent” (appear in the same order).
- This is exactly finding the LIS of the $pos_B$ coordinates when sorted by $pos_A$.
Scale:
- Genomes have billions of base pairs.
- $O(N^2)$ is impossible.
- $O(N \log N)$ LIS is critical for aligning human genomes.
21. System Design: Real-Time Anomaly Detection
Scenario: Monitoring server CPU usage.
- Stream:
[10%, 12%, 15%, 80%, 85%, 90%...] - Goal: Detect a “sustained upward trend” (LIS length > $K$) in a sliding window.
Naive Approach:
- Run $O(N \log N)$ LIS on every window.
- Window size $W=1000$.
- Cost: $O(W \log W)$ per new data point. Expensive.
Optimized Approach (Incremental LIS):
- Maintain the
tailsarray. - When a new element arrives, update
tails($O(\log W)$). - When an old element leaves, it’s harder (deletion from LIS is tricky).
- Approximation: Use Trend Filtering (e.g., Hodrick-Prescott filter) or simple exponential moving average, but LIS provides a robust, non-parametric metric for “monotonicity”.
22. Common Mistakes and Pitfalls
1. Confusing Subsequence with Subarray:
- Subarray: Contiguous (e.g.,
[2, 5, 3]in[1, 2, 5, 3, 7]). - Subsequence: Non-contiguous (e.g.,
[2, 3, 7]). - Fix: Clarify with interviewer immediately.
2. Incorrect Reconstruction:
- Mistake: Just printing the
tailsarray. - Fact:
tailsis NOT the LIS. It stores the smallest tail for each length. - Example:
[1, 5, 2].tailsbecomes[1, 2]. Real LIS is[1, 5]or[1, 2]. But if input is[1, 5, 2, 3],tailsis[1, 2, 3]. The2overwrote5. - Fix: Use the
parentarray backtracking method.
3. Not Handling Duplicates:
- “Strictly increasing” vs “Non-decreasing”.
- Strictly:
nums[j] < nums[i]. - Non-decreasing:
nums[j] <= nums[i]. - Binary Search: Use
bisect_rightfor non-decreasing.
4. 2D LIS Sorting Order:
- For envelopes
(w, h), sortingwascending andhascending is wrong. - Why?
[2, 3]and[2, 4]. If sorted ascending, we might pick both. But[2, 3]cannot fit into[2, 4](width must be strictly greater). - Fix: Sort
wascending,hdescending.
23. Ethical Considerations
1. Algorithmic Trading:
- HFT firms use LIS-like algorithms to detect micro-trends.
- Risk: Flash crashes caused by automated feedback loops.
- Regulation: Circuit breakers in stock exchanges.
2. Genomic Privacy:
- Fast alignment (using LIS) enables rapid DNA identification.
- Risk: Re-identifying individuals from “anonymized” genetic data.
- Policy: Strict access controls on biobanks.
24. Production Optimization: LIS at Scale
Scenario: Process 1 billion stock price sequences to find longest upward trends.
Challenges:
- Memory: Cannot store 1B sequences in RAM.
- Latency: Need results in real-time for trading decisions.
Architecture:
1. Streaming LIS:
class StreamingLIS:
def __init__(self):
self.tails = []
self.max_length = 0
def add(self, num):
"""Add number to stream and update LIS"""
pos = bisect.bisect_left(self.tails, num)
if pos == len(self.tails):
self.tails.append(num)
self.max_length = len(self.tails)
else:
self.tails[pos] = num
return self.max_length
def reset(self):
"""Reset for new sequence"""
self.tails = []
self.max_length = 0
2. Batch Processing with MapReduce:
from multiprocessing import Pool
def compute_lis_parallel(sequences):
"""Process multiple sequences in parallel"""
with Pool() as pool:
results = pool.map(lengthOfLIS, sequences)
return results
# Usage
sequences = [
[10, 9, 2, 5, 3, 7, 101, 18],
[0, 1, 0, 3, 2, 3],
# ... millions more
]
lengths = compute_lis_parallel(sequences)
3. GPU Acceleration (CUDA):
__global__ void lis_kernel(int* sequences, int* results, int n_seq, int seq_len) {
int idx = blockIdx.x * blockDim.x + threadIdx.x;
if (idx >= n_seq) return;
int* seq = sequences + idx * seq_len;
int tails[1000]; // Max LIS length
int len = 0;
for (int i = 0; i < seq_len; i++) {
// Binary search
int left = 0, right = len;
while (left < right) {
int mid = (left + right) / 2;
if (tails[mid] < seq[i]) left = mid + 1;
else right = mid;
}
tails[left] = seq[i];
if (left == len) len++;
}
results[idx] = len;
}
25. Advanced Variants and Extensions
1. Longest Bitonic Subsequence:
- A sequence that first increases, then decreases.
- Example:
[1, 11, 2, 10, 4, 5, 2, 1]→[1, 2, 10, 4, 2, 1](length 6).
Algorithm:
def longestBitonicSubsequence(nums):
n = len(nums)
# LIS ending at i
lis = [1] * n
for i in range(1, n):
for j in range(i):
if nums[j] < nums[i]:
lis[i] = max(lis[i], lis[j] + 1)
# LDS starting at i
lds = [1] * n
for i in range(n - 2, -1, -1):
for j in range(i + 1, n):
if nums[j] < nums[i]:
lds[i] = max(lds[i], lds[j] + 1)
# Max of lis[i] + lds[i] - 1
return max(lis[i] + lds[i] - 1 for i in range(n))
2. Longest Alternating Subsequence:
- Elements alternate between increasing and decreasing.
- Example:
[1, 5, 3, 8, 6, 9]→[1, 5, 3, 8, 6, 9](length 6).
3. K-Increasing Subsequence:
- Find $k$ disjoint increasing subsequences that cover the array.
- This is equivalent to partitioning into $k$ chains (Dilworth’s Theorem).
4. Weighted LIS:
- Each element has a value and weight.
- Maximize sum of values in an increasing subsequence.
def weightedLIS(nums, weights):
n = len(nums)
dp = [0] * n # Max weight ending at i
for i in range(n):
dp[i] = weights[i] # At least include itself
for j in range(i):
if nums[j] < nums[i]:
dp[i] = max(dp[i], dp[j] + weights[i])
return max(dp)
26. Complexity Analysis Deep Dive
Why is Binary Search $O(N \log N)$ optimal?
Lower Bound Proof (Comparison Model):
- Any comparison-based algorithm must distinguish between $2^N$ possible permutations.
- Decision tree has $2^N$ leaves.
- Height of tree is $\Omega(N \log N)$.
- BUT: LIS doesn’t need to sort, so this doesn’t directly apply.
Actual Lower Bound:
- Fredman (1975) proved $\Omega(N \log \log N)$ lower bound for LIS in comparison model.
- No known algorithm achieves this.
- $O(N \log N)$ is the best known.
Integer LIS (when values are bounded):
- If values are in $[1, U]$, we can use Van Emde Boas trees.
- Complexity: $O(N \log \log U)$.
- Practical only for small $U$.
27. Further Reading
- “Introduction to Algorithms” (CLRS): Chapter on Dynamic Programming.
- “Patience Sorting” (Wikipedia): The card game connection.
- “Hunt-Szymanski Algorithm”: $O((R+N) \log N)$ algorithm for LCS, which uses LIS.
- “Dilworth’s Theorem”: Order theory foundations.
28. Conclusion
Longest Increasing Subsequence is a gem of a problem. It starts as a standard DP exercise ($O(N^2)$), transforms into a greedy binary search puzzle ($O(N \log N)$), connects to card games (Patience Sorting), and finds applications in everything from reading DNA to predicting stock markets. Mastering LIS means understanding the trade-off between optimality (DP) and efficiency (Greedy+Binary Search), a core skill for any systems engineer.
29. Summary
| Approach | Time | Space | Notes |
|---|---|---|---|
| DP | $O(N^2)$ | $O(N)$ | Simple, easy to extend |
| Binary Search | $O(N \log N)$ | $O(N)$ | Optimal for length only |
| Patience Sort | $O(N \log N)$ | $O(N)$ | Same as Binary Search |
Originally published at: arunbaby.com/dsa/0037-longest-increasing-subsequence