Google Maps

Overview
Introduction
- A modern mapping platform like Google Maps provides users with visual map exploration, point-of-interest (POI) discovery, and real-time turn-by-turn navigation.
- Designing a mapping system at global scale means solving hard distributed-systems and algorithmic problems:
- Computing, in milliseconds, the shortest path across a continent containing billions of road segments.
- Securely ingesting millions of concurrent GPS heartbeats to deduce live traffic.
- Efficiently streaming visual map data without draining mobile bandwidth.
Requirements
- Functional Requirements
- Map Rendering: Users must be able to view and smoothly pan/zoom across a global map.
- Route Calculation: Users can request the fastest route between Point A and Point B.
- Live Navigation & ETA: The system must provide turn-by-turn directions and continuously update the Estimated Time of Arrival (ETA) based on live traffic.
- Location Telemetry: The client application must silently send anonymous GPS pings to the server to help crowdsource traffic data.
- Non-Functional Requirements
- Scalability: Support 1 billion+ Monthly Active Users (MAU) and tens of millions of concurrent navigating users globally.
- Low Latency: Route calculation must complete in < 200ms. Map tiles must render instantly.
- High Availability: 99.99% uptime. The routing and navigation paths are mission-critical.
- Bandwidth Efficiency: The system must minimize cellular data usage for mobile clients, especially in areas with poor connectivity.
Data Model
For a system of this scale, relying on a single database paradigm is impractical. We need a polyglot persistence architecture, with each store chosen for one workload: graph traversal, spatial querying, or high-throughput time-series writes.
- In-Memory Graph Store (Custom C++ / Rust): The road network is fundamentally a massive graph (Nodes = Intersections, Edges = Road Segments). Standard graph databases (like Neo4j) rely on disk I/O, which is too slow for sub-200ms continental routing, so the core routing topology is heavily compressed and loaded directly into RAM across a cluster of specialized routing machines.
- Wide-Column Store (Cassandra / ScyllaDB): Used to handle the massive write-throughput of incoming GPS telemetry. It stores historical location data and aggregated traffic speeds (edge weights) over time. Cassandra's masterless architecture ensures high availability for millions of writes per second.
- Object Storage (S3 / GCS) + CDN: Acts as the immutable storage layer for Vector Tiles. The globe's visual data is pre-processed into lightweight files that are heavily cached at the edge.
- Relational Spatial DB (PostgreSQL + PostGIS): Serves as the "Source of Truth" for Points of Interest (businesses, restaurants) and user metadata. PostGIS provides essential spatial querying features (e.g., finding all restaurants within a bounding box), backed by R-tree indexes implemented on top of PostgreSQL's GiST.
- In-Memory Cache (Redis Cluster): Stores the most recent real-time traffic speeds (edge weights) for active road segments, allowing the routing engine to fetch live traffic penalties in single-digit milliseconds.
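To make the in-memory graph store more concrete, here is a minimal sketch that packs a toy road graph into a compressed sparse row (CSR) layout built from flat arrays. The text above says only that the topology is "heavily compressed" and held in RAM. CSR is one common layout for this and is an assumption here, as are the names `build_csr` and `neighbors`. Python is used for readability. A production store would use C++ or Rust, as noted above.

```python
from array import array

def build_csr(num_nodes, edges):
    """Pack (src, dst, travel_seconds) edges into CSR arrays.

    offsets[n]..offsets[n + 1] is the slice of `targets`/`weights`
    that holds the outgoing edges of node n.
    """
    offsets = array("I", [0] * (num_nodes + 1))
    for src, _, _ in edges:
        offsets[src + 1] += 1
    # A prefix sum turns per-node edge counts into start offsets.
    for n in range(num_nodes):
        offsets[n + 1] += offsets[n]

    targets = array("I", [0] * len(edges))
    weights = array("f", [0.0] * len(edges))
    cursor = list(offsets[:-1])  # next free slot for each node
    for src, dst, seconds in edges:
        slot = cursor[src]
        targets[slot] = dst
        weights[slot] = seconds
        cursor[src] += 1
    return offsets, targets, weights

def neighbors(csr, node):
    """Return [(neighbor, travel_seconds), ...] for one node."""
    offsets, targets, weights = csr
    start, end = offsets[node], offsets[node + 1]
    return [(targets[i], weights[i]) for i in range(start, end)]

# Toy network (hypothetical): 4 intersections, 5 one-way road segments.
EDGES = [(0, 1, 30.0), (0, 2, 45.0), (1, 2, 10.0), (2, 3, 20.0), (1, 3, 60.0)]
CSR = build_csr(4, EDGES)
```

Storing edges in three flat arrays avoids per-object overhead and keeps each node's outgoing edges contiguous in memory, which is the property a RAM-resident routing graph benefits from.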
API Design
For a mapping service, we will use a classic RESTful API architecture. While real-time applications often default to WebSockets, for mobile location telemetry we explicitly choose stateless batched HTTP requests. This preserves mobile battery life by allowing the radio to sleep, handles spotty cellular connections gracefully, and ensures our backend ingestion tier can scale horizontally without maintaining millions of open, stateful connections.
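The batching idea can be sketched on the client side as a small buffer that collects GPS pings and flushes them as one compressed request body. This is an illustration under stated assumptions, not the actual client: the class name `TelemetryBatcher`, the JSON field names, the gzip encoding, and the 10-second window are all hypothetical, and `send` stands in for the HTTP POST so the sketch runs without a network.

```python
import gzip
import json

class TelemetryBatcher:
    """Buffers GPS pings and flushes them as one compressed batch."""

    def __init__(self, send, flush_interval_s=10.0):
        self._send = send  # callable taking the request body (bytes)
        self._flush_interval_s = flush_interval_s
        self._pings = []
        self._window_start = None

    def record(self, timestamp_s, lat, lng, heading_deg, speed_mps):
        if self._window_start is None:
            self._window_start = timestamp_s
        self._pings.append(
            {"t": timestamp_s, "lat": lat, "lng": lng,
             "heading": heading_deg, "speed": speed_mps}
        )
        # One request per window lets the radio sleep between flushes.
        if timestamp_s - self._window_start >= self._flush_interval_s:
            self.flush()

    def flush(self):
        if not self._pings:
            return
        payload = json.dumps({"pings": self._pings}).encode("utf-8")
        self._send(gzip.compress(payload))
        self._pings = []
        self._window_start = None

def decode_batch(body):
    """Server-side inverse, included only to check the round trip."""
    return json.loads(gzip.decompress(body).decode("utf-8"))["pings"]
```

Because each flush is an independent request, a dropped connection loses at most one window of pings, and any ingestion server can accept any batch.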
- Get Map Tile: GET /v1/tiles/{z}/{x}/{y}.pbf
- Parameters: Zoom level (z), X coordinate (x), Y coordinate (y).
- Returns: A highly compressed Protocol Buffer (Protobuf) vector tile for client-side GPU rendering.
- Calculate Route: POST /v1/routes
- Payload: Origin coordinates, destination coordinates, and user preferences (e.g., {"avoid_tolls": true}).
- Returns: A polyline geometry for rendering the path, an Estimated Time of Arrival (ETA), and an array of turn-by-turn navigation steps.
- Publish Telemetry: POST /v1/telemetry/heartbeat
- Payload: A compressed array of recent GPS coordinates, headings, and speeds collected locally by the device over the last 10–15 seconds.
- Returns: 202 Accepted (Fire-and-forget; acknowledges receipt instantly so the client can close the connection).
- Search POI: GET /v1/places/search?query={query}&lat={lat}&lng={lng}&radius={radius}
- Parameters: Text query (e.g., "coffee"), current latitude/longitude, and search radius in meters.
- Returns: A JSON array of matching Points of Interest containing their unique IDs, coordinates, and metadata.
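To make the `{z}/{x}/{y}` parameters of the tile endpoint concrete, the sketch below converts a latitude/longitude pair into tile coordinates. It assumes the widely used Web Mercator XYZ ("slippy map") tiling scheme, which this design does not explicitly name. The helper names `lat_lng_to_tile` and `tile_path` are hypothetical.

```python
import math

def lat_lng_to_tile(lat_deg, lng_deg, zoom):
    """Return the (x, y) tile containing a point in the XYZ / Web Mercator scheme."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat_deg)
    x = int((lng_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so lng = 180 or latitudes beyond the Mercator limit stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def tile_path(lat_deg, lng_deg, zoom):
    """Build the request path for the tile covering a point."""
    x, y = lat_lng_to_tile(lat_deg, lng_deg, zoom)
    return f"/v1/tiles/{zoom}/{x}/{y}.pbf"
```

A client would call this for each corner of its viewport to find the range of tiles to request. Since the path depends only on position and zoom, every user looking at the same area requests the same URL, which is what makes the tiles cacheable at the CDN.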
High Level Design
- Mobile Client: Requests map data for its current screen bounding box.
- API Gateway / Edge CDN: Intercepts client requests. Static tile requests rarely hit the backend; they are served directly from the CDN.
- Tile Service: If a tile isn't cached, this service retrieves the vector payload from Object Storage, applies any dynamic localized styling, and returns it to the client.
- Mobile Client: Sends a search query (e.g., "Starbucks").
- Search Service: Translates user queries into spatial bounding box queries against the PostGIS DB.
- Routing Service: Receives the A-to-B request, fetches the graph topology from memory, applies real-time traffic weights from Redis, and calculates the shortest path.
- Telemetry Service: A specialized high-throughput endpoint that blindly accepts incoming GPS payloads from navigating devices.
- Kafka & Stream Processing: Kafka buffers the raw GPS pings. Flink then cleans the data, matches each ping to an actual road segment, and writes the results to Redis for real-time traffic updates and to Cassandra for historical storage.
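The last step of the stream-processing stage, turning map-matched pings into live edge weights, can be sketched as follows. The design above does not say how speeds are aggregated, so the exponentially weighted moving average used here is an assumption, as are the class name `EdgeSpeedAggregator` and the `alpha` parameter. A plain dictionary stands in for Redis, and map matching is assumed to have already produced an `edge_id` for each ping.

```python
class EdgeSpeedAggregator:
    """Keeps a smoothed live speed per road segment (edge)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha    # weight given to the newest observation
        self.live_speed = {}  # edge_id -> speed in m/s (stand-in for Redis)

    def observe(self, edge_id, speed_mps):
        """Fold one map-matched speed sample into the edge's live speed."""
        previous = self.live_speed.get(edge_id)
        if previous is None:
            smoothed = speed_mps
        else:
            smoothed = self.alpha * speed_mps + (1 - self.alpha) * previous
        self.live_speed[edge_id] = smoothed
        return smoothed

    def travel_time_s(self, edge_id, length_m, free_flow_mps):
        """Edge weight for routing; uses free-flow speed when no data exists."""
        speed = self.live_speed.get(edge_id, free_flow_mps)
        return length_m / max(speed, 0.1)  # floor avoids division by zero in jams
```

Smoothing keeps a single slow vehicle (for example, one that is parking) from making a whole segment look congested, while still letting the value track real slowdowns within a few samples.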
While this skeleton logically separates features, it glosses over massive algorithmic and scaling bottlenecks. Simply running Dijkstra's algorithm on a graph of North America would be orders of magnitude too slow for our 200 ms budget, and we have not yet explained in detail how telemetry ingestion will meet our scale requirements. Let's dive into these now so that we meet all of the system's non-functional requirements.
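For reference, here is the unoptimized baseline: plain Dijkstra over an adjacency list, with each edge's free-flow travel time scaled by a live traffic multiplier. It is a sketch of the naive approach that the deep dives aim to improve on, not the production routing algorithm. The function name `shortest_path`, the graph layout, and the `traffic_factor` dictionary (a stand-in for the Redis lookup) are hypothetical.

```python
import heapq

def shortest_path(graph, traffic_factor, source, target):
    """Dijkstra over graph[node] = [(neighbor, edge_id, free_flow_seconds), ...].

    traffic_factor maps edge_id -> multiplier (1.0 = free flow, 2.0 = twice as slow).
    Returns (total_seconds, [nodes]) or (inf, []) if the target is unreachable.
    """
    best = {source: 0.0}
    parent = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for neighbor, edge_id, seconds in graph.get(node, []):
            new_cost = cost + seconds * traffic_factor.get(edge_id, 1.0)
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if target not in best:
        return float("inf"), []
    # Walk parent pointers back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return best[target], path[::-1]

# Toy network (hypothetical): two alternative routes from A to D.
GRAPH = {
    "A": [("B", "ab", 60.0), ("C", "ac", 90.0)],
    "B": [("D", "bd", 60.0)],
    "C": [("D", "cd", 60.0)],
}
```

With no traffic, the route through B wins at 120 seconds. Tripling the travel time on edge `ab` shifts the answer to the route through C at 150 seconds. The search expands nodes outward in all directions until it reaches the target, which is why it scales poorly on a continental graph.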
Deep Dive 1: Spatial Indexing & Vector Tiles
Deep Dive 2: Routing
Deep Dive 3: High-Throughput Telemetry
Complete Architecture
Additional Discussion Points