Flash Sale System

Overview
- Introduction
- Requirements
- Data Model
- API Design
- High Level Design
- Deep Dive 1: Preventing Overselling
- Deep Dive 2: Surviving the Thundering Herd
- Deep Dive 3: Distributed Transactions & Inventory Holds
- Deep Dive 4: Client Communication (Polling vs. WebSockets)
- Complete Architecture
- Additional Discussion Points
Introduction
- A flash sale system (e.g., Amazon Prime Day, limited sneaker drops, concert ticket sales) offers a highly desirable product at a deep discount or limited quantity for a very short period.
- Designing a robust flash sale architecture requires solving the most extreme edge cases in distributed systems: surviving the Thundering Herd (when traffic spikes from near-zero to millions of requests in a single second) while maintaining strict consistency to guarantee that not a single item is oversold.
Requirements
- Functional Requirements
  - View Product: Users can view the flash sale product details and the countdown timer.
  - Purchase Attempt: Users can attempt to purchase the item the exact second the sale begins.
  - Order Status: Users can view the status of their order (e.g., Queued, Payment Pending, Confirmed, Sold Out).
  - Timeouts: If a user secures an item but fails to complete payment within a strict window (e.g., 5 minutes), the item is returned to the pool.
- Non-Functional Requirements
  - Extreme Scalability: Must absorb bursts of 1M+ concurrent requests hitting the "Buy" button within the same second.
  - Strict Consistency: Zero overselling. If there are 1,000 items, at most 1,000 can ever be sold. No exceptions.
  - High Availability: 99.99% uptime for the edge infrastructure and queueing layers; graceful degradation is preferred over a hard crash.
  - Low Latency: < 100ms response time to acknowledge the user's purchase attempt and place them in the fulfillment queue.
Data Model
To handle the contrasting demands of extreme burst writes and strict financial auditing, the design uses a Polyglot Persistence strategy: each store is matched to the one workload it handles best.
- In-Memory Store (Redis Cluster): The core of the flash sale. Used for high-speed, atomic inventory management. A relational database serializes updates to a single hot row behind row-level locks, so it cannot keep up with 1M+ concurrent decrements of the same stock counter. Redis holds the immediate transactional state (item_id -> available_stock).
- Relational DB (PostgreSQL): Serves as the system's "Source of Truth" for completed orders, user data, and financial records. It is shielded from the traffic burst by the message queue and relies heavily on ACID properties to finalize state.
- Message Broker (Apache Kafka): Acts as a durable shock absorber. It decouples the fast, synchronous inventory reservation step from the slower, complex order fulfillment and payment processes.
- CDN (Content Delivery Network): Caches 99.9% of the read traffic (product images, descriptions, JS/CSS). Without a CDN, the read traffic preceding the sale would melt the application servers before the sale even began.
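To make the "atomic inventory management" idea concrete, here is a minimal sketch of the check-and-decrement step in plain Python. It is not the real Redis implementation: the class InventoryStore, the method try_reserve, and the helper simulate_burst are hypothetical names, and a single lock stands in for the fact that Redis runs one command or script to completion before starting the next.

```python
import threading


class InventoryStore:
    """In-memory stand-in for the Redis keys (item_id -> available_stock).

    Assumption: one lock models Redis executing one script at a time.
    """

    def __init__(self, stock):
        self._stock = dict(stock)
        self._lock = threading.Lock()

    def try_reserve(self, item_id, quantity=1):
        """Check and decrement as one atomic step; stock never goes below zero."""
        with self._lock:
            available = self._stock.get(item_id, 0)
            if available < quantity:
                return False  # sold out, or not enough units left
            self._stock[item_id] = available - quantity
            return True

    def available(self, item_id):
        with self._lock:
            return self._stock.get(item_id, 0)


def simulate_burst(store, item_id, buyers):
    """Fire `buyers` concurrent purchase attempts and count how many succeed."""
    outcomes = []
    outcomes_lock = threading.Lock()

    def attempt():
        ok = store.try_reserve(item_id)
        with outcomes_lock:
            outcomes.append(ok)

    threads = [threading.Thread(target=attempt) for _ in range(buyers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(outcomes)


if __name__ == "__main__":
    store = InventoryStore({"sneaker": 100})
    winners = simulate_burst(store, "sneaker", buyers=500)
    print(winners, store.available("sneaker"))
```

The point of the sketch is that the check and the decrement happen inside one critical section. If they were two separate steps, two buyers could both see the last unit as available and both decrement it.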
API Design
For a Flash Sale System, the API must be lightweight, aggressively cached where possible, and enforce strict idempotency to handle panicked user retries.
- Get Sale Details:
  - GET /v1/sales/{sale_id} (Aggressively cached at the CDN edge).
- Checkout / Reserve:
  - POST /v1/sales/{sale_id}/checkout
  - Requires: user_id, product_id, idempotency_key
  - Returns: 202 Accepted with a queue_ticket_id if inventory is reserved, or 409 Conflict / 410 Gone if sold out.
- Poll Status:
  - GET /v1/orders/status/{queue_ticket_id}
  - Returns: Pending, Awaiting Payment, or Failed. Once the status is Awaiting Payment, the response also carries the payment link the client is waiting for.
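The idempotency requirement on the checkout endpoint can be sketched as follows. This is an illustrative, single-threaded model rather than a production handler: the class CheckoutHandler, its in-memory dictionary, and the response shapes are assumptions, and 410 is picked arbitrarily from the two sold-out codes listed above. A real service would need the key lookup and the reservation to be atomic across instances.

```python
import uuid


class CheckoutHandler:
    """Hypothetical handler for POST /v1/sales/{sale_id}/checkout."""

    def __init__(self, stock):
        self.stock = stock
        # (user_id, idempotency_key) -> (status_code, body) of the first attempt
        self.responses = {}

    def checkout(self, user_id, product_id, idempotency_key):
        cache_key = (user_id, idempotency_key)

        # A retry with the same key replays the stored response
        # instead of reserving a second unit.
        if cache_key in self.responses:
            return self.responses[cache_key]

        if self.stock <= 0:
            response = (410, {"error": "sold_out"})
        else:
            self.stock -= 1
            response = (202, {"queue_ticket_id": str(uuid.uuid4())})

        self.responses[cache_key] = response
        return response


if __name__ == "__main__":
    handler = CheckoutHandler(stock=1)
    first = handler.checkout("user-1", "product-1", "key-abc")
    retry = handler.checkout("user-1", "product-1", "key-abc")
    print(first == retry, handler.stock)
```

Scoping the stored response by user_id as well as idempotency_key is a design choice in this sketch: it keeps one user's key from colliding with another's.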
High Level Design
- CDN (Content Delivery Network): Absorbs 99.9% of the pre-sale read traffic as millions of users visit the site. It serves the static HTML, CSS, and the client-side JavaScript, which syncs with the server time via a highly cached endpoint to run the countdown locally.
- API Gateway: Acts as a shield at T-Zero when the countdown hits zero and a massive burst of POST requests arrives. It drops bad actors, validates session tokens, and enforces strict rate limits before passing clean traffic to the backend.
- Order Service (Stateless): Receives the clean traffic and hashes the user's ID to deterministically route each request to a specific shard of the Redis Cluster.
- Redis Cluster (Sharded): Executes an atomic Lua script that instantly checks inventory, decrements it, and places a 5-minute hold on the item.
- Apache Kafka (Event Bus): Acts as the system's shock absorber, receiving a CheckoutEvent published by the Order Service the moment a Redis reservation succeeds.
- Client-Side JavaScript: Receives an instant 202 Accepted and a queue_ticket_id back from the Order Service. The UI displays a loading spinner and immediately begins short-polling a Read-Optimized Redis Cache (via the Order Service) to wait for the payment link.
- Order Processors (Worker Nodes): Consume the successful checkout events from Kafka at a steady, controlled rate. For each event, they request a secure checkout URL from the Payment Gateway, write the PENDING_PAYMENT order into PostgreSQL, and push the checkout URL up to the Read-Optimized Redis Cache.
- Expiration Worker (Cleanup): Acts as a final safety net. If the client is redirected to the payment page but abandons the checkout, the 5-minute hold lapses. This worker detects the expired hold, cancels the PostgreSQL order, and atomically increments the inventory back up on the exact Redis shard it originated from.
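Two steps in the flow above are easy to get subtly wrong: routing a user to a shard deterministically, and returning an expired hold to the same shard it came from. The sketch below models both in memory. Every name in it (shard_for_user, Hold, ShardedInventory, release_expired) is hypothetical, the SHA-256 based routing is an assumption rather than the article's stated hash, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
import hashlib
from dataclasses import dataclass


def shard_for_user(user_id, shard_count):
    """Stable hash, so every service instance picks the same shard for a user."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


@dataclass
class Hold:
    ticket_id: str
    shard: int          # remembered so stock can return to the same shard
    expires_at: float
    paid: bool = False


class ShardedInventory:
    def __init__(self, stock_per_shard):
        self.stock = list(stock_per_shard)
        self.holds = {}

    def reserve(self, ticket_id, user_id, now, hold_seconds=300.0):
        """Take one unit from the user's shard and start the 5-minute hold."""
        shard = shard_for_user(user_id, len(self.stock))
        if self.stock[shard] <= 0:
            return None
        self.stock[shard] -= 1
        hold = Hold(ticket_id, shard, now + hold_seconds)
        self.holds[ticket_id] = hold
        return hold

    def mark_paid(self, ticket_id):
        self.holds[ticket_id].paid = True

    def release_expired(self, now):
        """Return stock for unpaid holds past their deadline."""
        released = []
        for ticket_id, hold in list(self.holds.items()):
            if not hold.paid and hold.expires_at <= now:
                self.stock[hold.shard] += 1
                del self.holds[ticket_id]
                released.append(ticket_id)
        return released
```

One consequence of hashing users to shards is worth noting: a user's shard can run out of stock while other shards still have units left, so a real design needs some answer for that case (for example, retrying on another shard).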
This high-level flow absorbs the load conceptually, but we still need to explain how the system reserves inventory, prevents overselling, and survives the thundering herd. The deep dives that follow cover each in turn.
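As a small warm-up for the thundering herd discussion, the "strict rate limits" enforced at the API Gateway could look something like the token bucket below. The article does not say which algorithm the gateway uses, so the token bucket itself, the class name TokenBucket, and the parameters capacity and refill_per_second are all assumptions made for illustration.

```python
class TokenBucket:
    """Per-client limiter: allows short bursts up to `capacity`,
    then a steady `refill_per_second` requests per second."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self.tokens = float(capacity)
        self.last_refill = now

    def allow(self, now):
        # Top up the bucket for the time elapsed since the last call.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = max(self.last_refill, now)

        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller would answer with HTTP 429 and drop the request


if __name__ == "__main__":
    bucket = TokenBucket(capacity=3, refill_per_second=1.0)
    print([bucket.allow(0.0) for _ in range(5)])
```

In a gateway, one bucket would typically be kept per user or per IP address, so a single client hammering the "Buy" button is throttled without affecting everyone else.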
Deep Dive 1: Preventing Overselling
Deep Dive 2: Surviving the Thundering Herd
Deep Dive 3: Distributed Transactions & Inventory Holds
Deep Dive 4: Client Communication (Polling vs. WebSockets)
Complete Architecture
Additional Discussion Points