System Design FAQ: Top Questions
42. How would you design a Payment Gateway System like Stripe or Razorpay?
A Payment Gateway System processes, validates, and routes transactions between customers, banks, and merchants. It handles security, compliance, and reliability under high concurrency.
📋 Functional Requirements
- Initiate and authorize card and UPI transactions
- Integrate with multiple banks/processors
- Deliver webhooks for payment status updates
- Store transactions and user metadata securely
📦 Non-Functional Requirements
- PCI-DSS compliance
- High availability (five nines, i.e. 99.999% uptime)
- Idempotent transaction handling
- Low-latency authorization flow
🏗️ Core Components
- Checkout API: Accepts client token, amount, metadata
- Authorization Service: Calls partner banks for approval
- Ledger: Double-entry transaction store (immutable)
- Webhook Processor: Sends confirmation to merchant backend
- Compliance & Vault: Secures card/PII with tokenization
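To make the Authorization Service's multi-processor role concrete, here is a minimal failover sketch. Everything in it is an assumption for illustration: `authorize_with_failover`, `ProcessorUnavailable`, and the `(name, authorize)` pairs are hypothetical names, not any real processor's API.

```python
class ProcessorUnavailable(Exception):
    """Raised by a (hypothetical) processor client when it cannot be reached."""


def authorize_with_failover(processors, request):
    """Try each (name, authorize) pair in priority order.

    Returns (processor_name, result) from the first processor that answers.
    A decline counts as an answer, so it is returned as-is and not retried
    against another processor.
    """
    errors = []
    for name, authorize in processors:
        try:
            return name, authorize(request)
        except ProcessorUnavailable as exc:
            # Only availability failures trigger a move to the next processor.
            errors.append((name, str(exc)))
    raise ProcessorUnavailable(f"all processors failed: {errors}")
```

One caveat with this design: failing over after a timeout is only safe if the first attempt is known not to have gone through, or if the processor honours an idempotency key. Otherwise the customer may be authorized twice.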
🔐 Vault Encryption Example (Vault CLI)
# Never persist the CVV: PCI-DSS forbids storing it after authorization.
vault kv put secret/card number=4111111111111111
vault kv get secret/card
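The tokenization idea behind the vault can be sketched as follows. This is an in-memory illustration only: `TokenVault` and its methods are hypothetical names, the dictionary stands in for an encrypted store such as Vault or an HSM-backed service, and the 13 to 19 digit check is just a simple sanity check.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory stand-in for a card vault (illustration only)."""

    def __init__(self) -> None:
        self._by_token: dict[str, str] = {}

    def tokenize(self, pan: str) -> str:
        """Swap a card number (PAN) for a random token unrelated to it."""
        digits = pan.replace(" ", "")
        if not digits.isdigit() or not 13 <= len(digits) <= 19:
            raise ValueError("PAN must be 13-19 digits")
        token = "tok_" + secrets.token_urlsafe(16)
        self._by_token[token] = digits
        return token

    def detokenize(self, token: str) -> str:
        """Return the original PAN; only the vault boundary should call this."""
        return self._by_token[token]

    @staticmethod
    def mask(pan: str) -> str:
        """Keep only the last four digits for display and logs."""
        digits = pan.replace(" ", "")
        return "*" * (len(digits) - 4) + digits[-4:]
```

The rest of the system would then pass only the token around, so services outside the vault never see the card number.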
🔁 Idempotency Token Example (Stripe)
POST /v1/charges
Idempotency-Key: 089e3e3a-35ad-4a92-a4f2-6e7b61da9a00
Content-Type: application/x-www-form-urlencoded

amount=2000&currency=usd&source=tok_visa
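On the server side, the Idempotency-Key header in the request above has to map to "run this at most once, then replay the stored result". A minimal sketch of that behaviour follows. `IdempotencyStore`, `run_once`, and `IdempotencyConflict` are hypothetical names, and the in-memory dictionary is an assumption standing in for a durable table with a unique constraint on the key.

```python
import hashlib
import json
import threading


class IdempotencyConflict(Exception):
    """The same key was reused with a different request body."""


class IdempotencyStore:
    """Hypothetical in-memory store; a real one would live in a database."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._records: dict[str, tuple[str, dict]] = {}

    def run_once(self, key: str, payload: dict, handler) -> dict:
        """Run handler(payload) at most once per key and replay its result."""
        fingerprint = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        # One global lock keeps the sketch simple; it serializes all requests.
        with self._lock:
            if key in self._records:
                saved_fingerprint, saved_response = self._records[key]
                if saved_fingerprint != fingerprint:
                    raise IdempotencyConflict(key)
                return saved_response
            # If the handler raises, nothing is stored and a retry runs it again.
            response = handler(payload)
            self._records[key] = (fingerprint, response)
            return response
```

A client that times out and resends the same key with the same body gets the first response back, and the charge handler runs only once. Reusing a key with a different body is rejected in this sketch rather than silently replayed.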
📜 Double-Entry Ledger Schema (PostgreSQL)
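CREATE TABLE ledger_entries (
  id UUID PRIMARY KEY,
  transaction_id UUID NOT NULL, -- groups the debit and credit legs of one transfer
  account TEXT NOT NULL,
  amount_cents BIGINT NOT NULL CHECK (amount_cents > 0), -- integer minor units; direction lives in entry_type
  entry_type TEXT NOT NULL CHECK (entry_type IN ('debit', 'credit')),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);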
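The invariant a double-entry ledger protects is that every transaction's debits equal its credits. The sketch below builds the two rows for a simple transfer and checks that invariant. `LedgerEntry`, `build_transfer`, and `is_balanced` are hypothetical helpers whose fields mirror the columns above. Actually persisting the rows (in one database transaction) is left out.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen mirrors the ledger's immutability
class LedgerEntry:
    id: str
    transaction_id: str
    account: str
    amount_cents: int
    entry_type: str  # 'debit' or 'credit'


def build_transfer(debit_account, credit_account, amount_cents):
    """Return the debit and credit legs for moving amount_cents between accounts."""
    if amount_cents <= 0:
        raise ValueError("amount_cents must be positive")
    if debit_account == credit_account:
        raise ValueError("accounts must differ")
    txn_id = str(uuid.uuid4())  # shared by both legs
    return [
        LedgerEntry(str(uuid.uuid4()), txn_id, debit_account, amount_cents, "debit"),
        LedgerEntry(str(uuid.uuid4()), txn_id, credit_account, amount_cents, "credit"),
    ]


def is_balanced(entries):
    """True when total debits equal total credits across the given entries."""
    total_debits = sum(e.amount_cents for e in entries if e.entry_type == "debit")
    total_credits = sum(e.amount_cents for e in entries if e.entry_type == "credit")
    return total_debits == total_credits
```

Corrections would be modelled as new, reversing entries rather than updates, which keeps the table append-only.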
📨 Webhook Handler Retry Logic
- Send POST to merchant endpoint
- Retry on 5xx responses and timeouts with exponential backoff
- Move deliveries that exhaust their retries to a dead-letter queue (DLQ)
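The retry steps above can be sketched as a small loop. This is a sketch under assumptions, not a production client: `deliver_with_retry` is a hypothetical helper, `send` is any callable that returns an HTTP status code, `dead_letters` is a plain list standing in for a real DLQ, and the jitter and the choice not to retry 4xx responses are design choices added here.

```python
import random
import time


def deliver_with_retry(send, payload, dead_letters, max_attempts=5,
                       base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call send(payload) until it succeeds or attempts run out.

    Returns True on a 2xx response. Otherwise the payload is appended to
    dead_letters and False is returned.
    """
    for attempt in range(max_attempts):
        try:
            status = send(payload)
        except (ConnectionError, TimeoutError):
            status = None  # network failures are treated as retryable
        if status is not None and status < 500:
            if 200 <= status < 300:
                return True
            break  # non-2xx below 500 (e.g. 4xx) will not improve on retry
        if attempt < max_attempts - 1:
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    dead_letters.append(payload)
    return False
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Because merchants may receive the same event more than once, each event should carry a stable ID they can deduplicate on.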
📈 Observability
- Success/failure rate by bank/processor
- Avg. processing latency per step
- Webhook delivery delay and retries
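As a rough illustration of the first two metrics, the aggregator below tracks success rate per processor and average latency per step. `ProcessorMetrics` is a hypothetical in-process class. A production system would more likely export counters and histograms to a metrics backend than compute averages in memory.

```python
from collections import defaultdict


class ProcessorMetrics:
    """Hypothetical in-process aggregator (illustration only)."""

    def __init__(self):
        self._outcomes = defaultdict(lambda: {"success": 0, "failure": 0})
        self._latencies_ms = defaultdict(list)

    def record(self, processor, step, succeeded, latency_ms):
        """Record one call to a processor for a given pipeline step."""
        self._outcomes[processor]["success" if succeeded else "failure"] += 1
        self._latencies_ms[step].append(latency_ms)

    def success_rate(self, processor):
        """Fraction of successful calls; 0.0 if the processor is unseen."""
        counts = self._outcomes.get(processor, {"success": 0, "failure": 0})
        total = counts["success"] + counts["failure"]
        return counts["success"] / total if total else 0.0

    def avg_latency_ms(self, step):
        """Mean latency for a step; 0.0 if no samples were recorded."""
        samples = self._latencies_ms.get(step, [])
        return sum(samples) / len(samples) if samples else 0.0
```

Averages hide tail behaviour, so percentile latencies (for example p95 and p99) are usually worth tracking alongside them.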
🧰 Tools/Infra Used
- Tokenization: Vault, HSMs
- DB: PostgreSQL, CockroachDB (strong consistency)
- Messaging: Kafka, RabbitMQ
- Queueing: SQS with DLQs
📌 Final Insight
A payment system must prioritize security, atomicity, and auditability. Idempotency, ledger integrity, and multi-bank routing logic are vital. Use strong transaction isolation and retry-safe APIs.