System Design FAQ: Top Questions
8. How would you design a Search Autocomplete System?
A Search Autocomplete System suggests query completions in real-time as users type. It enhances user experience by speeding up input and guiding queries based on popularity, context, and intent.
๐ Functional Requirements
- Show top N suggestions while typing
- Personalized suggestions (optional)
- Real-time updates and fast response
๐ฆ Non-Functional Requirements
- Low latency (<100ms)
- Highly available and scalable
- Support language normalization (e.g., case insensitivity, diacritics)
๐๏ธ Architecture Components
- Frontend: Captures input and shows suggestions
- API Gateway: For routing and caching responses
- Search Service: Trie or prefix tree structure for lookups
- Suggestion Engine: Ranks and filters results
- Metrics Store: Tracks clicks and impressions
๐ง Data Structures
- Trie: Prefix tree for fast lookup and insertion
- Weighted Trie: Stores frequency or popularity
- Fallback: Use Elasticsearch/Lucene for rare terms
๐ง Trie Implementation in Python
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
self.frequency = 0
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word, freq=1):
node = self.root
for ch in word:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.is_end = True
node.frequency += freq
def autocomplete(self, prefix):
def dfs(node, path):
if node.is_end:
results.append((''.join(path), node.frequency))
for ch, child in node.children.items():
path.append(ch)
dfs(child, path)
path.pop()
results = []
node = self.root
for ch in prefix:
if ch not in node.children:
return []
node = node.children[ch]
dfs(node, list(prefix))
return sorted(results, key=lambda x: -x[1])[:5]
trie = Trie()
trie.insert("apple", 100)
trie.insert("app", 300)
trie.insert("applet", 50)
print(trie.autocomplete("app")) # [('app', 300), ('apple', 100), ('applet', 50)]
โ๏ธ Real-Time Suggestions with Redis Sorted Sets
ZADD suggestions 1000 "app"
ZADD suggestions 500 "apple"
ZADD suggestions 200 "applet"
ZRANGEBYLEX suggestions [app [apq LIMIT 0 5
๐ Metrics Tracking
- Track frequency of selected suggestions
- Store in Kafka โ aggregate in ClickHouse/BigQuery
- Use weekly aggregates to reweight Trie or Redis
๐ Security & Abuse Protection
- Limit rate per IP/user
- Filter profane/inappropriate completions
๐งช Observability
- Latency per keystroke (frontend + backend)
- Autocomplete cache hit ratio
- Abandoned suggestions vs. clicks
๐ Final Insight
Autocomplete systems combine structured data (tries), ranking (frequency or ML), and real-time responsiveness. Caching and careful tracking of user behavior are key to a snappy and useful UX.
