System Design FAQ: Top Questions
23. How would you design a Feature Store for Machine Learning?
A Feature Store is a system that manages ML features: it keeps feature definitions consistent between training and inference, serves feature values with low latency at prediction time, and versions those definitions.
📋 Functional Requirements
- Register, store, and retrieve features
- Serve real-time and batch data
- Ensure consistency between training and serving
- Provide metadata and lineage for features
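Consistency between training and serving usually comes down to point-in-time correctness: a training row should only see the feature value that was already known at the label's timestamp. The sketch below illustrates that idea with the standard library only. The function name `point_in_time_lookup` and the sample history are hypothetical and not part of any library.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def point_in_time_lookup(history, as_of, ttl):
    """Return the latest value recorded at or before `as_of`, or None.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    Values older than `ttl` relative to `as_of` are treated as missing.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, as_of) - 1  # last entry with ts <= as_of
    if idx < 0:
        return None  # nothing was known yet at `as_of`
    ts, value = history[idx]
    if as_of - ts > ttl:
        return None  # the newest known value is too stale to use
    return value


# Hypothetical conv_rate history for a single driver.
history = [
    (datetime(2024, 1, 1, 10), 0.50),
    (datetime(2024, 1, 1, 11), 0.55),
    (datetime(2024, 1, 1, 14), 0.70),
]
ttl = timedelta(hours=2)

value_for_training = point_in_time_lookup(history, datetime(2024, 1, 1, 11, 30), ttl)
```

Applying the same lookup rule when building training sets and when serving is what keeps a model from training on values it could not have seen in production.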
📦 Non-Functional Requirements
- Low latency for online feature access (<10ms)
- Scales to millions of entity rows and a large number of features
- Immutable versioned definitions
🏗️ Core Components
- Feature Registry: Metadata about features, owners, and transformations
- Batch (Offline) Store: Historical feature data in a data warehouse or data lake (e.g., BigQuery, S3)
- Online Store: Low-latency key-value store (e.g., Redis, Cassandra) for serving features at inference time
- Materialization Job: Scheduled pipeline to sync batch → online
- API Layer: Unified retrieval across stores
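To make the registry and the "immutable versioned definitions" requirement concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions, not how any particular feature store implements its registry: the class names `FeatureDefinition` and `FeatureRegistry` and the sample owner and transformation strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a registered definition cannot be mutated
class FeatureDefinition:
    name: str
    version: int
    owner: str
    transformation: str  # human-readable description of how it is computed


class FeatureRegistry:
    """Append-only registry: every change is stored as a new version."""

    def __init__(self):
        self._versions = {}  # feature name -> list of FeatureDefinition

    def register(self, name, owner, transformation):
        versions = self._versions.setdefault(name, [])
        definition = FeatureDefinition(name, len(versions) + 1, owner, transformation)
        versions.append(definition)
        return definition

    def get(self, name, version=None):
        """Return the requested version, or the latest one if none is given."""
        versions = self._versions[name]
        return versions[-1] if version is None else versions[version - 1]


registry = FeatureRegistry()
v1 = registry.register("conv_rate", "driver-team", "conversions / trips, hourly")
v2 = registry.register("conv_rate", "driver-team", "conversions / accepted trips, hourly")
```

Because old versions are never overwritten, a model trained against version 1 can still be reproduced after version 2 is registered.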
📝 Sample Feature Definition (Feast)
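# Assumes a recent Feast release, where FeatureView takes schema=[Field(...)]
# instead of the older features=[Feature(...)] argument.
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32

# Offline source: a Parquet file with one row per driver per hour.
driver_hourly_stats_source = FileSource(
    path="data/driver_stats.parquet",
    timestamp_field="event_timestamp",
)

# The entity defines the join key used to look up features.
driver = Entity(name="driver_id", join_keys=["driver_id"])

driver_hourly_stats_fv = FeatureView(
    name="driver_hourly_stats",
    entities=[driver],  # pass the Entity object, not its name
    ttl=timedelta(hours=2),  # values older than this are treated as stale
    schema=[
        Field(name="conv_rate", dtype=Float32),
        Field(name="acc_rate", dtype=Float32),
    ],
    online=True,  # make this view available in the online store
    source=driver_hourly_stats_source,
)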
📤 Real-Time Retrieval
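from feast import FeatureStore

# Points at the directory containing feature_store.yaml and the definitions.
fs = FeatureStore(repo_path="feature_repo")

# Features are addressed as "<feature_view>:<feature>"; each entity row
# supplies the join key. The online store must have been materialized first.
features = fs.get_online_features(
    features=["driver_hourly_stats:conv_rate"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()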
📁 Data Sync Pipeline
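- A scheduled job (e.g., an Airflow DAG or a Beam pipeline) extracts the latest batch features and loads them into the online store (e.g., Redis)
- A TTL is applied at ingestion (for example, as a key expiry) so that stale rows age out instead of being served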
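The sync step can be sketched as follows. This is a simplified illustration, not a production pipeline: a plain dict stands in for the online store, and the function name `materialize` and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta


def materialize(batch_rows, online_store, ttl, now):
    """Copy the newest non-expired row per entity into the online store.

    `batch_rows` is an iterable of dicts with `driver_id`, `event_timestamp`,
    and feature columns. `online_store` is a plain dict standing in for Redis.
    Returns the number of rows written.
    """
    # Keep only the most recent row for each entity key.
    latest = {}
    for row in batch_rows:
        key = row["driver_id"]
        if key not in latest or row["event_timestamp"] > latest[key]["event_timestamp"]:
            latest[key] = row

    written = 0
    for key, row in latest.items():
        if now - row["event_timestamp"] > ttl:
            online_store.pop(key, None)  # purge: even the newest row is stale
            continue
        online_store[key] = row
        written += 1
    return written


now = datetime(2024, 1, 1, 12)
batch = [
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 1, 10), "conv_rate": 0.50},
    {"driver_id": 1001, "event_timestamp": datetime(2024, 1, 1, 11), "conv_rate": 0.55},
    {"driver_id": 1002, "event_timestamp": datetime(2024, 1, 1, 8), "conv_rate": 0.30},
]
# The store starts with an old row for driver 1002 that should be purged.
store = {1002: {"driver_id": 1002, "event_timestamp": datetime(2024, 1, 1, 7), "conv_rate": 0.20}}
written = materialize(batch, store, ttl=timedelta(hours=2), now=now)
```

With a real Redis backend, the purge would more likely be delegated to the store itself by writing each key with an expiry, rather than deleting keys from the job.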
🛂 Access Control
- Each feature is tagged with an owner team and an environment (dev/prod)
- Use role-based access control (RBAC) to govern who can view, update, and delete features
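A permission check built on those tags might look like the sketch below. The role names, the `ROLE_PERMISSIONS` mapping, and the shape of the user and feature records are all assumptions made for illustration. The policy is deliberately simple: a user needs a role on the feature's owner team and access to its environment.

```python
# Hypothetical role-to-permission mapping; roles and actions are illustrative.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "update"},
    "admin": {"read", "update", "delete"},
}


def is_allowed(user, feature, action):
    """Return True if `user` may perform `action` on `feature`."""
    # The user must have access to the feature's environment (dev/prod).
    if feature["environment"] not in user["environments"]:
        return False
    # The user's role on the owning team decides which actions are permitted.
    role = user["team_roles"].get(feature["owner_team"])
    return role is not None and action in ROLE_PERMISSIONS[role]


feature = {"name": "conv_rate", "owner_team": "driver-team", "environment": "prod"}
editor = {"team_roles": {"driver-team": "editor"}, "environments": {"dev", "prod"}}
outsider = {"team_roles": {"pricing-team": "admin"}, "environments": {"dev", "prod"}}
dev_only = {"team_roles": {"driver-team": "admin"}, "environments": {"dev"}}
```

A real deployment would likely allow cross-team read access as well; that is omitted here to keep the check short.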
📈 Observability
- Feature freshness per entity
- Serving latency and cache hit rate
- Data drift between training-time and inference-time feature distributions
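Two of these signals are easy to compute directly. The sketch below shows feature freshness for one entity and a deliberately crude drift measure: the shift in the mean, expressed in training standard deviations. It is a stand-in for more robust drift statistics, and the function names and sample values are hypothetical.

```python
from datetime import datetime
from statistics import mean, pstdev


def freshness_seconds(last_event_timestamp, now):
    """Age of the most recent feature value for an entity, in seconds."""
    return (now - last_event_timestamp).total_seconds()


def mean_shift(train_values, serving_values):
    """Absolute difference in means, in units of the training std deviation."""
    spread = pstdev(train_values)
    if spread == 0:
        raise ValueError("training values have zero variance")
    return abs(mean(serving_values) - mean(train_values)) / spread


# Hypothetical conv_rate samples from the training set and from live traffic.
train = [0.4, 0.5, 0.6]
serving = [0.7, 0.8, 0.9]

age = freshness_seconds(datetime(2024, 1, 1, 11, 30), datetime(2024, 1, 1, 12))
shift = mean_shift(train, serving)
```

An alert would typically fire when `age` exceeds the feature's TTL or when `shift` crosses a threshold chosen per feature.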
📌 Final Insight
A Feature Store addresses training-serving skew by computing and serving features from a single set of versioned, centrally registered definitions. This supports faster experimentation, reproducibility, and more reliable model deployment pipelines.
