URL Shortener (Bitly) System Design
Problem statement
We want a service that converts a long URL into a short URL and redirects users back to the original URL with low latency and high reliability.
Core flow (from the diagram):
- User submits a long URL.
- System generates a short code (random/unique).
- Store mapping in database.
- When user clicks short URL: retrieve long URL and redirect.
Nice-to-have:
- Custom alias (e.g. /my-event)
- Link expiration
- Error handling after expiry / invalid code
- Click tracking (total count, maybe more analytics)
Requirements (what matters in production)
- Read-heavy: redirects are much more frequent than creates (often on the order of 1000:1).
- Latency: redirect path should be < 50 milliseconds at the edge/region.
- Correctness: same short code must always map to the same long URL.
- Abuse protection: spam, phishing, and brute-force enumeration.
- Durability: mappings must survive restarts, deployments, cache evictions.
- Observability: track errors, redirect latency, cache hit rate, and CTR.
API design
Create short URL
POST /api/links
Request:
- longUrl (required)
- customAlias (optional)
- expiresAt (optional)
Response:
- shortUrl
- code
- expiresAt
Redirect
GET /:code → 301/302 redirect to longUrl
Notes:
- Use 302 if you expect the target to change or want better control.
- Use 301 only if you want aggressive browser caching (often not desired for analytics/controls).
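To make the API shape concrete, here is a minimal sketch of request validation and the redirect response. The names CreateLinkRequest, validate_create_request, and redirect_response are hypothetical, and the rule that longUrl must be an absolute http(s) URL is an assumption, not something the API above specifies.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlparse


@dataclass
class CreateLinkRequest:
    """Body of POST /api/links (field names mirror longUrl, customAlias, expiresAt)."""
    long_url: str
    custom_alias: Optional[str] = None
    expires_at: Optional[float] = None  # Unix seconds; assumed representation


def validate_create_request(req: CreateLinkRequest, now: float) -> List[str]:
    """Return a list of validation errors; an empty list means the request is acceptable."""
    errors = []
    parsed = urlparse(req.long_url)
    # Assumption: only absolute http(s) URLs are accepted.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("longUrl must be an absolute http(s) URL")
    if req.expires_at is not None and req.expires_at <= now:
        errors.append("expiresAt must be in the future")
    return errors


def redirect_response(long_url: str, permanent: bool = False) -> Tuple[int, Dict[str, str]]:
    """Build (status, headers) for GET /:code; 302 by default, 301 only when requested."""
    status = 301 if permanent else 302
    return status, {"Location": long_url}
```

Defaulting to 302 matches the notes above: the service keeps seeing every click, so analytics and revocation continue to work.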
Data model
Table: links
- code (PK, string)
- long_url (text)
- created_at (timestamp)
- expires_at (timestamp, nullable)
- created_by (nullable)
- is_active (bool, default true)
Table: link_events (optional, for analytics at scale)
- code (index)
- ts (timestamp)
- ip_hash / country / ua (optional)
- referrer (optional)
At high volume, treat analytics as append-only events (queue/stream) and keep redirect path hot.
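As a rough illustration, the two tables could look like this in SQLite. This is a sketch only: the column types, the integer Unix timestamps, the index names, and the open_db helper are assumptions, and a production system would likely use a different database.

```python
import sqlite3

# Column names follow the data model above; types are SQLite approximations.
SCHEMA = """
CREATE TABLE IF NOT EXISTS links (
    code       TEXT PRIMARY KEY,
    long_url   TEXT NOT NULL,
    created_at INTEGER NOT NULL,          -- Unix seconds (assumed representation)
    expires_at INTEGER,                   -- NULL means the link never expires
    created_by TEXT,
    is_active  INTEGER NOT NULL DEFAULT 1
);
CREATE INDEX IF NOT EXISTS idx_links_expires_at ON links (expires_at);

CREATE TABLE IF NOT EXISTS link_events (
    code     TEXT NOT NULL,
    ts       INTEGER NOT NULL,
    ip_hash  TEXT,
    country  TEXT,
    ua       TEXT,
    referrer TEXT
);
CREATE INDEX IF NOT EXISTS idx_link_events_code ON link_events (code);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create both tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```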
Short code generation
We need a short, URL-safe identifier. Common approach: Base62 (0-9, a-z, A-Z).
Option A: Random code (recommended for simplicity)
- Generate random 6–10 chars Base62.
- Insert into DB with unique constraint on code.
- If insert fails due to collision → retry.
Collision probability becomes negligible with enough length:
- Base62 has 62 symbols.
- For length 7: 62^7 ≈ 3.5 × 10^12 possibilities (about 3.5 trillion).
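Option A can be sketched in a few lines. The example below assumes SQLite and a stripped-down links table. The helper names (make_db, random_code, create_link), the retry limit of 5, and the injectable gen parameter are all illustrative choices, not part of the design above.

```python
import secrets
import sqlite3
import string

# Base62 alphabet in the order given above: 0-9, a-z, A-Z.
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def make_db() -> sqlite3.Connection:
    """In-memory database with a minimal links table (demo only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (code TEXT PRIMARY KEY, long_url TEXT NOT NULL)")
    return conn


def random_code(length: int = 7) -> str:
    """Draw each character from a cryptographically secure source."""
    return "".join(secrets.choice(BASE62) for _ in range(length))


def create_link(conn, long_url, length=7, max_attempts=5, gen=random_code) -> str:
    """Insert a new mapping, retrying with a fresh code when the unique constraint fires."""
    for _ in range(max_attempts):
        code = gen(length)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO links (code, long_url) VALUES (?, ?)",
                    (code, long_url),
                )
            return code
        except sqlite3.IntegrityError:
            continue  # collision: try again with a new code
    raise RuntimeError("could not allocate a unique code")
```

For a sense of scale: with roughly one billion codes already stored, a fresh 7-character code hits an existing one with probability about 10^9 / (3.5 × 10^12), or roughly 0.03%, so retries should be rare.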
Option B: ID → Base62 (predictable; needs hardening)
- Use DB auto-increment / Snowflake ID.
- Encode to Base62.
- Pros: no collision, faster writes.
- Cons: predictable, easier to enumerate. Mitigate via:
- adding salt + hashid
- using non-sequential IDs (Snowflake) + extra obfuscation
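For Option B, the encoding step itself is plain base conversion. The sketch below uses the alphabet order given earlier (0-9, a-z, A-Z). The function names are illustrative, and it deliberately does no salting or obfuscation, which also makes the predictability problem easy to see.

```python
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID to its Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return BASE62[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))  # most significant digit first


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in code:
        n = n * 62 + BASE62.index(ch)
    return n
```

With this alphabet, consecutive IDs produce consecutive codes (for example, IDs 61 and 62 encode to "Z" and "10"), which is exactly why sequential IDs are easy to enumerate without extra hardening.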
Redirect path: caching is the game
Redirect is the hottest endpoint. The standard pattern:
- L1 cache: in-process (per instance) for ultra-fast hits (tiny TTL).
- L2 cache: Redis/KeyDB/Upstash (shared) with TTL.
- DB as source of truth.
Flow:
- Receive GET /:code
- Check L1 → if hit, redirect
- Check L2 → if hit, populate L1, redirect
- DB lookup → validate active/expiry → populate caches → redirect
- If not found/expired → 404 or custom error page
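The lookup flow above can be sketched as follows. This is not a production implementation: TTLCache is a small in-memory stand-in used for both layers (a real L2 would be a Redis-like shared store), db_lookup is a hypothetical callable returning a record dict or None, and re-checking expiry on cache hits is an added safety assumption rather than something the flow above requires.

```python
import time


class TTLCache:
    """Tiny in-memory cache with one fixed TTL for all entries (stand-in for L1 and L2)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires, value = item
        if self._clock() >= expires:
            del self._data[key]  # lazily evict stale entries
            return None
        return value

    def set(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)


def is_servable(record, now):
    """A record can be served only if it is active and not past expiresAt."""
    if not record.get("isActive", True):
        return False
    expires_at = record.get("expiresAt")
    return expires_at is None or expires_at > now


def resolve(code, l1, l2, db_lookup, now):
    """Return the long URL for a code, or None when the caller should answer 404."""
    key = f"link:{code}"
    record = l1.get(key)
    if record is None:
        record = l2.get(key)
        if record is None:
            record = db_lookup(code)  # source of truth
            if record is None or not is_servable(record, now):
                return None  # unknown or expired: nothing is cached
            l2.set(key, record)
        l1.set(key, record)
    # Assumption: re-check on every hit so a cached record cannot outlive its expiry.
    return record["longUrl"] if is_servable(record, now) else None
```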
Cache key:
link:{code} → { longUrl, expiresAt, isActive }
TTL strategy:
- If expiresAt exists: set TTL to expiresAt - now (clamped to min/max)
- Else: use a default TTL (e.g. 24h) + refresh on access
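The TTL rule is small enough to write down directly. In this sketch the 1-second minimum, the 24-hour maximum and default, and the convention of returning 0 for "already expired, do not cache" are assumptions chosen for illustration.

```python
def cache_ttl_seconds(expires_at, now, min_ttl=1, max_ttl=86400, default_ttl=86400):
    """Pick a cache TTL in whole seconds for a link record.

    expires_at and now are Unix timestamps in seconds; expires_at may be None.
    Returns 0 when the link has already expired, meaning it should not be cached.
    """
    if expires_at is None:
        # No expiry: use the default; the caller can re-set the TTL on each hit
        # to get the "refresh on access" behaviour.
        return default_ttl
    remaining = expires_at - now
    if remaining <= 0:
        return 0
    return int(max(min_ttl, min(remaining, max_ttl)))
```

Clamping to a maximum keeps far-future expiries from pinning entries in the cache, and clamping to a minimum avoids sub-second TTLs that some stores reject.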
Expiration + error handling
When a link expires:
- Redirect should return 404 (or a branded “Link expired” page)
- Also invalidate caches:
- best-effort delete link:{code} from L2
- L1 naturally expires quickly via TTL
Batch cleanup:
- Cron job to mark expired links inactive (optional)
- Avoid scanning huge tables; use index on expires_at and query by range
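A cleanup job along these lines might look like the sketch below. It assumes SQLite, a links table with code, expires_at (Unix seconds), and is_active columns, and a hypothetical batch size of 1000. The index statement is included so the range condition on expires_at does not require a full table scan.

```python
import sqlite3

# Index that lets the range condition on expires_at avoid a full table scan.
INDEX_SQL = "CREATE INDEX IF NOT EXISTS idx_links_expires_at ON links (expires_at)"


def deactivate_expired(conn: sqlite3.Connection, now: int, batch_size: int = 1000) -> int:
    """Mark expired links inactive in small batches; returns how many rows changed."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                """
                UPDATE links SET is_active = 0
                WHERE code IN (
                    SELECT code FROM links
                    WHERE is_active = 1
                      AND expires_at IS NOT NULL
                      AND expires_at <= ?
                    ORDER BY expires_at
                    LIMIT ?
                )
                """,
                (now, batch_size),
            )
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

Working in batches keeps each transaction short, so the job does not hold locks for long while redirects are being served.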
Custom alias
customAlias is simply a code chosen by the user:
- Validate length, charset (Base62 or a safer subset), and banned words.
- Enforce uniqueness via DB constraint.
- Optionally reserve system routes (api, blog, admin, etc.).
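The validation rules can be sketched as a single function. The 3 to 32 character limit, the decision to allow hyphens (so that an alias like my-event passes), and the placeholder banned-word list are all assumptions. Uniqueness is not checked here because the DB constraint handles it at insert time.

```python
import re

# Routes the service itself needs; taken from the examples above.
RESERVED = {"api", "blog", "admin"}
# Placeholder list; a real deployment would load its own.
BANNED_WORDS = {"badword"}
# Assumption: Base62 plus hyphen, starting with an alphanumeric, 3-32 chars total.
ALIAS_RE = re.compile(r"[0-9A-Za-z][0-9A-Za-z-]{2,31}")


def validate_alias(alias: str):
    """Return None if the alias is acceptable, otherwise a short error message."""
    if not ALIAS_RE.fullmatch(alias):
        return "alias must be 3-32 characters: letters, digits, or hyphens"
    lowered = alias.lower()
    if lowered in RESERVED:
        return "alias is reserved"
    if any(word in lowered for word in BANNED_WORDS):
        return "alias contains a banned word"
    return None
```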
Analytics without slowing redirects
The redirect path should not block on analytics.
Do:
- Emit event to a queue/stream (or fire-and-forget log) with minimal payload.
- Aggregate asynchronously (daily/hourly).
If you only need total clicks:
- Keep a clicks counter in Redis and flush to DB periodically.
- Or use a write-optimized store for counters.
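One way to keep analytics off the hot path is a bounded in-process queue that the redirect handler writes to without ever blocking. The sketch below uses Python's standard queue module as a stand-in for a real queue/stream. The function names and the policy of dropping events when the queue is full are assumptions.

```python
import queue
from collections import Counter


def record_click(q: queue.Queue, code: str, ts: float) -> bool:
    """Enqueue a click event without blocking; returns False if the event was dropped."""
    try:
        q.put_nowait((code, ts))
        return True
    except queue.Full:
        # Assumption: losing an occasional event is better than slowing a redirect.
        return False


def drain_counts(q: queue.Queue, max_items: int = 1000) -> Counter:
    """Pull up to max_items events off the queue and aggregate clicks per code."""
    counts = Counter()
    for _ in range(max_items):
        try:
            code, _ts = q.get_nowait()
        except queue.Empty:
            break
        counts[code] += 1
    return counts
```

A background worker would call drain_counts on a schedule and add the per-code totals to the stored counters, which is the "flush to DB periodically" step.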
Scaling to millions of requests/day
- Stateless app servers behind a load balancer.
- Read replicas for DB if cache miss rate is high.
- Sharding if mapping dataset grows very large:
- shard by code hash
- CDN / edge caching can help for extremely hot codes, but be careful:
- caching redirects may hide revokes/expiry updates.
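Sharding by code hash can be as simple as the sketch below. SHA-256 is an arbitrary choice of stable hash and shard_for is a hypothetical name. Python's built-in hash() is avoided because string hashes are randomized per process, so different app servers would disagree on the shard.

```python
import hashlib


def shard_for(code: str, num_shards: int) -> int:
    """Map a short code to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    # The first 8 bytes are plenty to spread keys evenly across shards.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo sharding moves most keys when num_shards changes; if resharding is expected, a consistent-hashing scheme is a common alternative.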
Security considerations
- Phishing/malware: integrate a URL reputation check (async or on create).
- Rate limiting:
- create endpoint: strict per IP/user
- redirect endpoint: softer limits + bot protection
- Enumeration:
- prefer random codes
- monitor high 404 rates and throttle
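Rate limiting for either endpoint is often built on a token bucket. The following is a single-process sketch with hypothetical names and parameters. A real deployment would keep one bucket per IP or user, typically in a shared store, with a stricter rate for the create endpoint than for redirects.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return False when the caller should be throttled."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that has passed, never beyond capacity.
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The same primitive can back the "monitor high 404 rates and throttle" idea by charging a token for each 404 a client triggers.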
Summary
To build a solid URL shortener:
- Use random Base62 codes + unique constraint + retry.
- Keep redirect fast with multi-layer caching.
- Support custom alias, expiration, and analytics with async pipelines.
- Design for scaling early: stateless services, durable DB, and observable cache behavior.
Written by Vũ Thanh Thiên