URL Shortener (Bitly-style) System Design

Problem statement

We want a service that converts a long URL into a short URL and redirects users back to the original URL with low latency and high reliability.

Core flow (from the diagram):

User submits a long URL.
System generates a short code (random/unique).
Store mapping in database.
When user clicks short URL: retrieve long URL and redirect.


Nice-to-have:

Custom alias (e.g. /my-event)
Link expiration
Error handling after expiry / invalid code
Click tracking (total count, maybe more analytics)

Requirements (what matters in production)

Read-heavy: redirects are far more frequent than creates (a read-to-write ratio around 1000:1 is a common planning assumption).
Latency: redirect path should be < 50 milliseconds at the edge/region.
Correctness: same short code must always map to the same long URL.
Abuse protection: spam, phishing, and brute-force enumeration.
Durability: mappings must survive restarts, deployments, cache evictions.
Observability: track errors, redirect latency, cache hit rate, and CTR.

API design

Create short URL

POST /api/links

Request:

longUrl (required)
customAlias (optional)
expiresAt (optional)

Response:

shortUrl
code
expiresAt
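
A minimal sketch of validating the create request before any code is generated. The class and function names are hypothetical, and the two rules shown (absolute http(s) URL, expiry in the future) are assumptions about what "valid" means here, not a fixed contract.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class CreateLinkRequest:
    # Field names mirror the request body: longUrl, customAlias, expiresAt.
    long_url: str
    custom_alias: Optional[str] = None
    expires_at: Optional[datetime] = None


def validate_create_request(req: CreateLinkRequest, now: datetime) -> List[str]:
    """Return a list of validation errors; an empty list means the request is acceptable."""
    errors = []
    parsed = urlparse(req.long_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("longUrl must be an absolute http(s) URL")
    if req.expires_at is not None and req.expires_at <= now:
        errors.append("expiresAt must be in the future")
    return errors


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(validate_create_request(CreateLinkRequest("https://example.com/docs"), now))
```

Alias validation is a separate concern and is covered in the custom alias section.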

Redirect

GET /:code → 301/302 redirect to longUrl

Notes:

Use 302 if you expect the target to change or want better control.
Use 301 only if you want aggressive browser caching (often not desired for analytics/controls).

Data model

Table: links

code (PK, string)
long_url (text)
created_at (timestamp)
expires_at (timestamp nullable)
created_by (nullable)
is_active (bool, default true)

Table: link_events (optional, for analytics at scale)

code (index)
ts (timestamp)
ip_hash / country / ua (optional)
referrer (optional)

At high volume, treat analytics as append-only events written to a queue/stream, so the redirect path stays fast and never waits on analytics writes.
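
The two tables could be declared roughly as follows. This is a sketch using SQLite through Python's standard library purely for illustration; the column types (timestamps stored as ISO-8601 text, booleans as integers) and the index names are assumptions, and a production database would use its own native types.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS links (
    code        TEXT PRIMARY KEY,            -- short code, unique by definition
    long_url    TEXT NOT NULL,
    created_at  TEXT NOT NULL,               -- ISO-8601 timestamp (assumption)
    expires_at  TEXT,                        -- NULL means "never expires"
    created_by  TEXT,
    is_active   INTEGER NOT NULL DEFAULT 1   -- 1 = true, 0 = false
);
CREATE INDEX IF NOT EXISTS idx_links_expires_at ON links (expires_at);

CREATE TABLE IF NOT EXISTS link_events (
    code      TEXT NOT NULL,
    ts        TEXT NOT NULL,
    ip_hash   TEXT,
    country   TEXT,
    ua        TEXT,
    referrer  TEXT
);
CREATE INDEX IF NOT EXISTS idx_link_events_code ON link_events (code);
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create both tables and their indexes if they do not exist yet."""
    conn.executescript(SCHEMA)


if __name__ == "__main__":
    init_db(sqlite3.connect(":memory:"))
```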

Short code generation

We need a short, URL-safe identifier. Common approach: Base62 (0-9, a-z, A-Z).

Option A: Random code (recommended for simplicity)

Generate a random Base62 string of 6–10 characters.
Insert into DB with unique constraint on code.
If insert fails due to collision → retry.
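
The three steps above can be sketched as follows, assuming a `links` table whose `code` column is the primary key. SQLite stands in for the real database, and `create_link`, `gen`, and `max_attempts` are illustrative names rather than part of any existing API.

```python
import secrets
import sqlite3
import string

# Base62 alphabet: digits, lowercase, uppercase.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def random_code(length: int = 7) -> str:
    """Return a random Base62 string drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_link(conn, long_url, gen=random_code, max_attempts=5):
    """Insert a new mapping, retrying with a fresh code on a uniqueness conflict."""
    for _ in range(max_attempts):
        code = gen()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO links (code, long_url) VALUES (?, ?)",
                    (code, long_url),
                )
            return code
        except sqlite3.IntegrityError:
            continue  # code already taken: try another one
    raise RuntimeError("could not allocate a unique code")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE links (code TEXT PRIMARY KEY, long_url TEXT NOT NULL)")
    print(create_link(conn, "https://example.com/some/long/path"))
```

Letting the database reject duplicates avoids a separate "does this code exist?" read, which would be racy under concurrent creates.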

The chance that a freshly generated code collides with an existing one is (number of stored codes) / 62^length, which becomes negligible with enough length:

Base62 has 62 symbols.
For length 7: 62^7 ≈ 3.5 × 10^12 possibilities (about 3.5 trillion).
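
To put numbers on this, a small calculation helps. The sketch below assumes codes are drawn uniformly at random; the function names are illustrative. It separates two questions: how likely a single insert is to collide, and how likely it is that any collision occurs while creating n codes (the birthday approximation).

```python
import math


def keyspace(length: int, alphabet_size: int = 62) -> int:
    """Number of distinct codes of the given length."""
    return alphabet_size ** length


def single_insert_collision_probability(stored_codes: int, length: int) -> float:
    """Probability that one new random code matches an already stored code."""
    return stored_codes / keyspace(length)


def any_collision_probability(n: int, length: int) -> float:
    """Birthday approximation: chance of at least one collision among n random codes."""
    exponent = -n * (n - 1) / (2 * keyspace(length))
    return -math.expm1(exponent)  # equals 1 - exp(exponent), accurate for small values


if __name__ == "__main__":
    print(keyspace(7))
    print(single_insert_collision_probability(1_000_000_000, 7))
    print(any_collision_probability(1_000_000, 7))
```

With length 7, a single insert against one billion stored codes collides about 0.03% of the time, so retries are rare. The second function shows why the retry is still needed: across a million creates, at least one collision somewhere is far from unlikely.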

Option B: ID → Base62 (predictable; needs hardening)

Use DB auto-increment / Snowflake ID.
Encode to Base62.
Pros: no collision, faster writes.
Cons: predictable, easier to enumerate. Mitigate via:
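- encoding IDs with a salted Hashids-style scheme (this is obfuscation, not encryption)
- using Snowflake IDs, which are time-ordered rather than strictly sequential, plus extra obfuscation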
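
A minimal sketch of the encoding step, assuming the alphabet order 0-9, a-z, A-Z given earlier. It shows only plain Base62 conversion; the salting and obfuscation mentioned above are not included.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


if __name__ == "__main__":
    print(encode_base62(125), decode_base62("21"))
```

Because the mapping is reversible and ordered, consecutive IDs produce consecutive codes, which is exactly the enumeration risk listed under the cons.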

Redirect path: caching is the game

Redirect is the hottest endpoint. The standard pattern:

L1 cache: in-process (per instance) for ultra-fast hits (tiny TTL).
L2 cache: Redis/KeyDB/Upstash (shared) with TTL.
DB as source of truth.

Flow:

Receive GET /:code
Check L1 → if hit, redirect
Check L2 → if hit, populate L1, redirect
DB lookup → validate active/expiry → populate caches → redirect
If not found/expired → 404 or custom error page
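
The lookup flow above can be sketched with in-memory stand-ins. `TTLCache` is a hypothetical helper used here for both tiers (in practice L2 would be a shared store), `db_lookup` is an assumed callable returning a record or None, and timestamps are epoch seconds. The sketch also re-checks expiry on cache hits, which goes slightly beyond the listed steps.

```python
import time


class TTLCache:
    """Tiny dict-backed cache with a fixed TTL; a stand-in for L1 or L2."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires = item
        if self._clock() >= expires:
            del self._data[key]  # lazily evict stale entries
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)


def is_live(record, now):
    """A record is usable if it is active and not past its expiry."""
    expires_at = record["expiresAt"]
    return record["isActive"] and (expires_at is None or expires_at > now)


def resolve(code, l1, l2, db_lookup, now):
    """Return the long URL for a code, or None (caller answers 404)."""
    key = f"link:{code}"
    record = l1.get(key)
    if record is None:
        record = l2.get(key)
        if record is not None:
            l1.set(key, record)  # L2 hit: populate L1
    if record is None:
        record = db_lookup(code)  # both caches missed: source of truth
        if record is None or not is_live(record, now):
            return None
        l2.set(key, record)
        l1.set(key, record)
    return record["longUrl"] if is_live(record, now) else None
```

Unknown codes are not cached in this sketch, so repeated requests for missing codes still reach the database; whether to cache such misses is a separate design decision.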

Cache key:

link:{code} → { longUrl, expiresAt, isActive }

TTL strategy:

If expiresAt exists: set TTL to expiresAt - now (clamped to min/max)
Else: use a default TTL (e.g. 24h) + refresh on access
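
The TTL rule can be written as a small pure function. This is a sketch: the parameter names, the one-second minimum, and the convention of returning 0 for "already expired, do not cache" are assumptions. The "refresh on access" part is not shown; it would mean re-setting the entry with the default TTL on each hit.

```python
from typing import Optional


def cache_ttl_seconds(
    expires_at: Optional[float],
    now: float,
    min_ttl: float = 1.0,
    max_ttl: float = 86400.0,
    default_ttl: float = 86400.0,
) -> float:
    """Compute a cache TTL in seconds from an optional expiry (epoch seconds)."""
    if expires_at is None:
        return default_ttl  # no expiry: fall back to the default (24h here)
    remaining = expires_at - now
    if remaining <= 0:
        return 0.0  # already expired: caller should skip caching
    # Clamp so entries neither churn (tiny TTL) nor outlive max_ttl.
    return max(min_ttl, min(remaining, max_ttl))
```

Note the side effect of the lower clamp: an entry may stay cached up to `min_ttl` past its expiry, so the redirect handler should still compare `expiresAt` with the current time.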

Expiration + error handling

When a link expires:

Redirect should return 404, or 410 Gone to signal that the link existed but is no longer available (or a branded “Link expired” page)
Also invalidate caches:
- best-effort delete link:{code} from L2
- L1 naturally expires quickly via TTL

Batch cleanup:

Cron job to mark expired links inactive (optional)
Avoid scanning huge tables; use index on expires_at and query by range
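
One way to write the cleanup job is a batched range update, sketched here against SQLite. It assumes `expires_at` is stored as ISO-8601 text in a single consistent format (so string comparison matches time order) and that an index on `expires_at` exists; the function name and batch size are illustrative.

```python
import sqlite3


def deactivate_expired(conn: sqlite3.Connection, now_iso: str, batch_size: int = 1000) -> int:
    """Mark expired links inactive in small batches; return how many rows changed."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                """
                UPDATE links SET is_active = 0
                WHERE code IN (
                    SELECT code FROM links
                    WHERE is_active = 1
                      AND expires_at IS NOT NULL
                      AND expires_at <= ?
                    LIMIT ?
                )
                """,
                (now_iso, batch_size),
            )
        if cur.rowcount == 0:
            break  # nothing left to deactivate
        total += cur.rowcount
    return total
```

Small batches keep each transaction short, so the job does not hold locks for long while redirects are being served.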

Custom alias

A customAlias is simply a code chosen by the user:

Validate length, charset (Base62 or a safer subset), and banned words.
Enforce uniqueness via DB constraint.
Optionally reserve system routes (api, blog, admin, etc.).
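
A sketch of the validation step. The allowed charset here is Base62 plus hyphen and underscore, an assumption made so that the earlier `/my-event` example passes; the length bounds, the reserved list, and the banned-word list are placeholders. Uniqueness is still left to the database constraint.

```python
import re
from typing import Optional

# Placeholder lists; a real deployment would maintain its own.
RESERVED = frozenset({"api", "blog", "admin"})
BANNED_WORDS = frozenset({"badword"})

# 3-32 characters, starting with a letter or digit (bounds are assumptions).
ALIAS_RE = re.compile(r"[0-9A-Za-z][0-9A-Za-z_-]{2,31}")


def validate_alias(alias: str) -> Optional[str]:
    """Return an error message, or None if the alias is acceptable."""
    if not ALIAS_RE.fullmatch(alias):
        return "alias must be 3-32 characters: letters, digits, '-' or '_'"
    lowered = alias.lower()
    if lowered in RESERVED:
        return "alias collides with a reserved system route"
    if any(word in lowered for word in BANNED_WORDS):
        return "alias contains a banned word"
    return None


if __name__ == "__main__":
    for candidate in ("my-event", "api", "a", "has space"):
        print(candidate, "->", validate_alias(candidate))
```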

Analytics without slowing redirects

The redirect path should not block on analytics.

Do:

Emit event to a queue/stream (or fire-and-forget log) with minimal payload.
Aggregate asynchronously (daily/hourly).

If you only need total clicks:

Keep a clicks counter in Redis and flush to DB periodically.
Or use a write-optimized store for counters.
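
The counter-and-flush idea can be sketched with an in-process buffer standing in for the Redis counter. `ClickBuffer`, `flush`, and the `persist` callback are hypothetical names; the point is only that the redirect handler does a cheap increment while a background job writes aggregated counts.

```python
import threading
from collections import Counter


class ClickBuffer:
    """Thread-safe in-memory click counter; a stand-in for a shared counter store."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = Counter()

    def record(self, code: str) -> None:
        """Called on the redirect path: a single increment, no I/O."""
        with self._lock:
            self._counts[code] += 1

    def drain(self) -> dict:
        """Atomically take all pending counts and reset the buffer."""
        with self._lock:
            counts, self._counts = self._counts, Counter()
        return dict(counts)


def flush(buffer: ClickBuffer, persist) -> int:
    """Run periodically off the request path; returns how many codes were flushed."""
    counts = buffer.drain()
    if counts:
        persist(counts)  # e.g. one batched UPDATE per flush
    return len(counts)
```

Counts buffered in one process are lost if that process crashes before a flush, which is the usual trade-off for keeping the redirect path free of writes.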

Scaling to millions of requests/day

Stateless app servers behind a load balancer.
Read replicas for DB if cache miss rate is high.
Sharding if mapping dataset grows very large:
- shard by code hash
CDN / edge caching can help for extremely hot codes, but be careful:
- caching redirects may hide revokes/expiry updates.
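
Sharding by code hash can be as simple as the sketch below. The choice of SHA-256 and plain modulo is an assumption for illustration; the function name is hypothetical.

```python
import hashlib


def shard_for(code: str, num_shards: int) -> int:
    """Map a short code to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    print(shard_for("aZ3kQ9x", 8))
```

A stable hash is used instead of Python's built-in `hash()`, which is randomized per process for strings. With plain modulo, changing `num_shards` remaps most codes, so a consistent-hashing scheme is worth considering if the shard count is expected to change.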

Security considerations

Phishing/malware: integrate a URL reputation check (async or on create).
Rate limiting:
- create endpoint: strict per IP/user
- redirect endpoint: softer limits + bot protection
Enumeration:
- prefer random codes
- monitor high 404 rates and throttle
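
For the rate-limiting points above, a token bucket is one common choice. The sketch below is a single-process version with illustrative names; a real deployment would keep one bucket per IP or user, usually in a shared store, and would pick its own rates.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)  # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits within the limit."""
        now = self._clock()
        # Refill based on elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The create endpoint would use a low rate and small capacity, the redirect endpoint a more generous one. The same structure can throttle clients that generate many 404s by charging a higher `cost` for misses.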

Summary

To build a solid URL shortener:

Use random Base62 codes + unique constraint + retry.
Keep redirect fast with multi-layer caching.
Support custom alias, expiration, and analytics with async pipelines.
Design for scaling early: stateless services, durable DB, and observable cache behavior.