System Design FAQ: Top Questions
62. How would you design a Notification System (Email, SMS, Push)?
A Notification System delivers asynchronous messages such as emails, SMS, or push notifications to users in response to rules, preferences, or system events. It must handle retries, user targeting, templates, and vendor integrations.
📋 Functional Requirements
- Support for Email, SMS, and Push
- User preference handling
- Templating, personalization, and rate limits
- Retry failed sends with backoff
📦 Non-Functional Requirements
- High throughput, low latency
- Reliability and observability
- Vendor failover (e.g., fallback SMS provider)
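The vendor failover requirement can be sketched as walking an ordered list of providers until one accepts the message. This is a minimal sketch under assumptions: each provider is a hypothetical adapter object with a `name` and an async `send(message)` method, and the stubbed providers below stand in for real vendor SDKs.

```javascript
// Try each provider in order; fall through to the next one on failure.
// `providers` is a hypothetical list of { name, send } adapters, not a real SDK.
async function sendWithFailover(providers, message) {
  const errors = [];
  for (const provider of providers) {
    try {
      await provider.send(message);
      return { ok: true, provider: provider.name, errors };
    } catch (err) {
      // Record why this provider failed, then move on to the next one.
      errors.push({ provider: provider.name, reason: err.message });
    }
  }
  return { ok: false, provider: null, errors };
}

// Stubbed providers: the primary always fails, the fallback succeeds.
const primarySms = {
  name: 'primary-sms',
  send: async () => { throw new Error('503 from vendor'); },
};
const fallbackSms = { name: 'fallback-sms', send: async () => {} };

sendWithFailover([primarySms, fallbackSms], { to: '+15550100', body: 'Your order shipped' })
  .then((result) => console.log(result.provider, result.errors));
```

Returning the collected errors alongside the result keeps the primary vendor's failure visible to monitoring even when the fallback succeeds.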
🏗️ Architecture Overview
- Producer: Services publish events (e.g., order placed)
- Message Queue: Kafka or SQS decouples producers from consumers
- Worker: Consumes queue, evaluates preferences, renders templates
- Notifier: Sends via vendor APIs (Twilio, SendGrid, Firebase)
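The four stages above can be sketched end to end with in-memory stand-ins. Everything here is illustrative: the array replaces Kafka/SQS, and the `preferences`, `templates`, and `notifiers` objects are hypothetical placeholders for a database, a template engine, and vendor SDKs.

```javascript
// In-memory stand-ins for the pipeline stages; not a production design.
const queue = []; // Message Queue (stand-in for Kafka/SQS)
const sent = [];  // records what the Notifier "delivered"

// Hypothetical preference and template stores.
const preferences = { u1: { email: true, sms: false } };
const templates = { order_placed: (data) => `Order #${data.orderId} received.` };

// Producer: services publish events without knowing how they are delivered.
function publish(event) {
  queue.push(event);
}

// Notifier: one function per channel; here each one only records the send.
const notifiers = {
  email: (userId, body) => sent.push({ channel: 'email', userId, body }),
  sms: (userId, body) => sent.push({ channel: 'sms', userId, body }),
};

// Worker: drains the queue, evaluates preferences, renders, hands off to the notifier.
function drainQueue() {
  while (queue.length > 0) {
    const event = queue.shift();
    const prefs = preferences[event.userId] || {};
    const body = templates[event.type](event.data);
    for (const channel of Object.keys(notifiers)) {
      if (prefs[channel]) notifiers[channel](event.userId, body);
    }
  }
}

publish({ type: 'order_placed', userId: 'u1', data: { orderId: 42 } });
drainQueue();
console.log(sent);
```

Because the producer only calls `publish`, channels and vendors can change inside the worker and notifier without touching the services that emit events.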
📄 Notification Template Example (Handlebars)
Hi {{name}},
Your order #{{orderId}} has been shipped and will arrive by {{deliveryDate}}.
Thanks,
Team
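To show how a template like this is filled in, here is a minimal stand-in renderer. It is not Handlebars: it only substitutes `{{key}}` placeholders and HTML-escapes the values. The `render` and `escapeHtml` names are hypothetical, and throwing on a missing variable is an assumption of this sketch (a real engine may render an empty string instead).

```javascript
// Minimal stand-in for a template engine: replaces {{key}} with escaped values.
// This is NOT Handlebars; it only illustrates variable substitution.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

function render(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) => {
    // Fail loudly rather than send "Hi ," to a customer.
    if (!Object.prototype.hasOwnProperty.call(data, key)) {
      throw new Error(`Missing template variable: ${key}`);
    }
    return escapeHtml(data[key]);
  });
}

const template =
  'Hi {{name}},\nYour order #{{orderId}} has been shipped and will arrive by {{deliveryDate}}.';
console.log(render(template, { name: 'Ada', orderId: 1042, deliveryDate: 'June 3' }));
```

Escaping values by default matters for HTML email, where user-supplied fields such as names would otherwise be interpreted as markup.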
🔁 Retry Logic with Exponential Backoff
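// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// sendToVendor(payload) is assumed to be defined elsewhere and to reject on failure.
async function sendWithRetry(payload, maxRetries = 5) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      await sendToVendor(payload);
      return;
    } catch (err) {
      // Do not wait after the final attempt; nothing follows it.
      if (attempt === maxRetries) break;
      // Exponential backoff: 200 ms, 400 ms, 800 ms, ...
      const delay = Math.pow(2, attempt) * 100;
      await sleep(delay);
    }
  }
  console.error("Notification permanently failed", payload);
}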
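Two common refinements to a retry loop like this are randomized delays (so many workers do not retry at the same instant) and skipping retries for errors that will never succeed. The sketch below is illustrative: `backoffDelay` and `isRetryable` are hypothetical helpers, and the classification assumes the vendor error exposes an HTTP-style numeric `status`, which may not hold for every SDK.

```javascript
// Backoff with "full jitter": a random wait between 0 and the capped exponential bound.
// `random` is injectable so the function can be tested deterministically.
function backoffDelay(attempt, baseMs = 100, capMs = 30000, random = Math.random) {
  const bound = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * bound);
}

// Hypothetical classification, assuming errors carry an HTTP-style `status`.
// 429 and 5xx are treated as transient; other statuses are treated as permanent.
function isRetryable(err) {
  if (typeof err.status !== 'number') return true; // e.g. network errors with no status
  return err.status === 429 || err.status >= 500;
}

console.log(backoffDelay(3, 100, 30000, () => 0.5)); // half of the 800 ms bound
console.log(isRetryable({ status: 400 }), isRetryable({ status: 503 }));
```

Inside the catch block, a worker could check `isRetryable(err)` before sleeping, so that a rejected address or malformed payload fails immediately instead of consuming every attempt.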
🔌 Vendor Integration Example (SendGrid)
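const sgMail = require('@sendgrid/mail');
sgMail.setApiKey(process.env.SENDGRID_API_KEY);

// send() returns a promise; handle rejection so failures are not silently dropped.
sgMail
  .send({
    to: 'user@example.com',
    from: 'noreply@company.com', // must be a verified sender in SendGrid
    subject: 'Welcome!',
    text: 'Thanks for joining our app!',
  })
  .catch((err) => console.error('SendGrid send failed', err));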
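To keep workers vendor-agnostic, a vendor call like this one can be wrapped behind a small common interface. This is a sketch under assumptions: `makeSendGridAdapter` and the `{ to, subject, body }` message shape are hypothetical, and the client is injected so that a stub can replace the real `@sendgrid/mail` module.

```javascript
// Wrap a SendGrid-style client behind a common send({ to, subject, body }) shape.
// `sgClient` is expected to expose send({ to, from, subject, text }) returning a promise.
function makeSendGridAdapter(sgClient, fromAddress) {
  return {
    channel: 'email',
    send: ({ to, subject, body }) =>
      sgClient.send({ to, from: fromAddress, subject, text: body }),
  };
}

// Stub client that records calls instead of contacting the vendor.
const calls = [];
const stubClient = { send: async (msg) => { calls.push(msg); } };

const emailAdapter = makeSendGridAdapter(stubClient, 'noreply@company.com');
emailAdapter
  .send({ to: 'user@example.com', subject: 'Welcome!', body: 'Thanks for joining our app!' })
  .then(() => console.log(calls));
```

With this shape, swapping SendGrid for another email vendor means writing one new adapter; the worker code that calls `send` stays the same.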
🧠 User Preference Management
- Preferences table with one row per user and a toggle per channel (e.g., email=true, sms=false)
- Default settings for new users
- Consent audit logging
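The three points above can be sketched together: stored toggles are merged over defaults, and every change is appended to an audit log. All names and default values here are assumptions for illustration, and the `Map` stands in for a database table.

```javascript
// Defaults applied to users who have not saved preferences (assumed values).
const DEFAULT_PREFERENCES = { email: true, sms: false, push: true };

// Hypothetical stored rows keyed by user ID; a real system would query a database.
const storedPreferences = new Map([
  ['u1', { sms: true }],
  ['u2', { email: false }],
]);

// Append-only record of consent changes.
const consentAuditLog = [];

// Merge stored toggles over the defaults and return the enabled channels.
function enabledChannels(userId) {
  const prefs = { ...DEFAULT_PREFERENCES, ...(storedPreferences.get(userId) || {}) };
  return Object.keys(prefs).filter((channel) => prefs[channel]);
}

// Update one toggle and log who changed what, and when.
function setPreference(userId, channel, enabled, now = new Date()) {
  const current = storedPreferences.get(userId) || {};
  storedPreferences.set(userId, { ...current, [channel]: enabled });
  consentAuditLog.push({ userId, channel, enabled, at: now.toISOString() });
}

console.log(enabledChannels('u1'), enabledChannels('new-user'));
```

Storing only the overrides means a later change to the defaults applies to every user who never made an explicit choice.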
📈 Observability
- Delivery rate, bounce rate
- Failure trends by channel
- Time-to-notify metric (event → delivery)
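As a rough illustration of how these metrics could be derived, the sketch below aggregates per-channel delivery rate, bounce rate, and average time-to-notify from delivery records. The record shape (`channel`, `status`, `eventAt`, `deliveredAt`) and the sample data are assumptions; in practice these values would come from vendor webhooks or logs.

```javascript
// Hypothetical delivery records; timestamps are epoch milliseconds.
const records = [
  { channel: 'email', status: 'delivered', eventAt: 1000, deliveredAt: 1400 },
  { channel: 'email', status: 'bounced', eventAt: 2000, deliveredAt: null },
  { channel: 'sms', status: 'delivered', eventAt: 3000, deliveredAt: 3200 },
  { channel: 'sms', status: 'failed', eventAt: 4000, deliveredAt: null },
];

// Aggregate per-channel rates and average event-to-delivery latency.
function summarize(rows) {
  const stats = {};
  for (const row of rows) {
    if (!stats[row.channel]) {
      stats[row.channel] = { total: 0, delivered: 0, bounced: 0, latencySumMs: 0 };
    }
    const s = stats[row.channel];
    s.total += 1;
    if (row.status === 'bounced') s.bounced += 1;
    if (row.status === 'delivered') {
      s.delivered += 1;
      s.latencySumMs += row.deliveredAt - row.eventAt;
    }
  }
  const summary = {};
  for (const [channel, s] of Object.entries(stats)) {
    summary[channel] = {
      total: s.total,
      deliveryRate: s.delivered / s.total,
      bounceRate: s.bounced / s.total,
      avgTimeToNotifyMs: s.delivered > 0 ? s.latencySumMs / s.delivered : null,
    };
  }
  return summary;
}

console.log(summarize(records));
```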
🔐 Security & Compliance
- Rate limiting per user/org
- Secure token usage in templates (e.g., password reset)
- Unsubscribe and consent-withdrawal mechanisms to support GDPR compliance
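Per-user or per-org rate limiting can be sketched as a fixed-window counter keyed by user or org ID. This is one simple approach among several (token buckets and sliding windows are common alternatives); the `RateLimiter` class is hypothetical and keeps state in memory, whereas a multi-worker deployment would need a shared store.

```javascript
// Fixed-window limiter: at most `limit` notifications per key in each `windowMs` window.
class RateLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if the send is allowed; `now` is injectable for testing.
  allow(key, now = Date.now()) {
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      // No window yet, or the previous one expired: start a fresh window.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count < this.limit) {
      current.count += 1;
      return true;
    }
    return false;
  }
}

// Allow at most 2 notifications per user per second.
const limiter = new RateLimiter(2, 1000);
console.log(limiter.allow('user:1'), limiter.allow('user:1'), limiter.allow('user:1'));
```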
🧰 Tools & Infra
- Queue: Kafka, SQS, RabbitMQ
- Vendors: SendGrid, SES, Twilio, Firebase
- Templates: Handlebars, Mustache, MJML
- Workers: Node.js, Go, or Python
📌 Final Insight
Decouple message generation from delivery using queues and vendor-agnostic workers. Retry intelligently and monitor deliverability across all channels. Consider a UI for audit logs, preferences, and failure inspection.