How distributed systems divide data into partitions for parallel processing, ordering guarantees, and horizontal scalability
- Asked in ~85% of messaging interviews
- Powers systems at LinkedIn, Uber, Netflix
- Scale: millions of msgs/sec, 1+ trillion msgs/day
TL;DR
Topic partitioning divides a message topic into multiple independent partitions, enabling parallel processing, ordering guarantees per partition, and horizontal scaling. Each partition is an ordered, immutable sequence of messages that can be processed independently.
Visual Overview
TOPIC: user-events
├── Partition 0: [msg1][msg2][msg3][msg4]...     → Consumer A
├── Partition 1: [msg5][msg6][msg7][msg8]...     → Consumer B
├── Partition 2: [msg9][msg10][msg11][msg12]...  → Consumer C
└── Partition 3: [msg13][msg14][msg15][msg16]... → Consumer D
Key Properties:
- Ordered within partition (strict)
- No ordering across partitions
- Independent parallel processing
- Partitions distributed across brokers
Core Explanation
What is Topic Partitioning?
A topic is a logical category of messages (e.g., "user-events", "payment-transactions"). Partitioning splits this topic into multiple ordered logs called partitions, where:
- Each partition is an independent, ordered sequence
- Messages in a partition maintain strict ordering
- Partitions can be processed in parallel
- Partitions are distributed across multiple servers (brokers)
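To make this concrete, here is a minimal sketch of creating a partitioned topic with Kafka's Java AdminClient. The broker address, topic name, and partition/replication counts are illustrative assumptions; the point is that the partition count is declared when the topic is created.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed bootstrap broker of a 3-broker cluster
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "user-events" with 4 partitions and replication factor 3 (matches the diagram above)
            NewTopic topic = new NewTopic("user-events", 4, (short) 3);
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```

Adding partitions later is possible, but existing keys then hash to different partitions than before, which is why the "partition count hard to change later" trade-off comes up below.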
How Partition Assignment Works
When a producer sends a message, it must decide which partition to send to:
// Method 1: Explicit partition (rare)
producer.send(new ProducerRecord<>("user-events",
        partition,  // explicitly target partition 0, 1, 2, ...
        key,
        value));

// Method 2: Key-based hashing (most common)
producer.send(new ProducerRecord<>("user-events",
        userId,     // key determines the partition
        event));
// partition = hash(userId) % num_partitions
// Same userId always goes to the same partition

// Method 3: No key (round-robin / sticky batching)
producer.send(new ProducerRecord<>("user-events",
        event));    // no key: records are spread across partitions
// (older clients round-robin; newer clients batch "stickily" per partition)
Key-based partitioning is the most common approach because it provides:
- Ordering per key: all events for user_123 arrive in order
- Load distribution: the hash function spreads keys evenly across partitions
- Stateful processing: a consumer can maintain per-user state
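The key-to-partition mapping can be sketched as follows. This is a simplified hash-mod illustration, not Kafka's exact default partitioner (which applies a murmur2 hash to the serialized key), but it demonstrates the same property: the same key always maps to the same partition.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class KeyPartitioner {
    // Simplified hash-mod routing: same key -> same partition, every time.
    static int partitionFor(String key, int numPartitions) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(bytes) & 0x7fffffff; // force non-negative
        return hash % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        System.out.println(partitionFor("user_123", numPartitions)); // some partition, e.g. 2
        System.out.println(partitionFor("user_123", numPartitions)); // always the same as above
        System.out.println(partitionFor("user_456", numPartitions)); // may differ
    }
}
```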
Why Partitioning Enables Scale
Single Partition Limits:
Topic (1 partition)
├── [msg1][msg2][msg3][msg4]...
└── Can handle ~10K msgs/sec (single consumer)
Multiple Partitions Scale:
Topic (4 partitions)
├── P0: [msgs]... → Consumer A (10K/sec)
├── P1: [msgs]... → Consumer B (10K/sec)
├── P2: [msgs]... → Consumer C (10K/sec)
└── P3: [msgs]... → Consumer D (10K/sec)
Total: 40K msgs/sec
Scaling Pattern:
- 1 partition = at most 1 active consumer (per consumer group)
- 4 partitions = up to 4 parallel consumers
- 100 partitions = up to 100 parallel consumers
- Throughput scales roughly linearly with partition count (see the sizing sketch below), until broker or network limits kick in
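Picking a partition count usually starts with arithmetic like this sketch. The per-partition ceiling (10K msgs/sec, matching the diagram above) and the headroom factor are assumptions to replace with measured numbers.

```java
public class PartitionSizing {
    // Rough partition count for a target throughput, given an assumed
    // per-partition ceiling and a headroom multiplier for peaks and growth.
    static int partitionsNeeded(double targetMsgsPerSec,
                                double perPartitionMsgsPerSec,
                                double headroomFactor) {
        return (int) Math.ceil((targetMsgsPerSec * headroomFactor) / perPartitionMsgsPerSec);
    }

    public static void main(String[] args) {
        // Example: 1M msgs/sec target, ~10K msgs/sec per partition, 2x headroom -> 200 partitions
        System.out.println(partitionsNeeded(1_000_000, 10_000, 2.0));
    }
}
```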
Ordering Guarantees
Within Partition (Strong Ordering):
Partition 0:
user_123: [login][click][purchase]   ✅ Ordered
Consumer reads: login → click → purchase
Across Partitions (No Ordering):
Partition 0: [user_123: login] at T1
Partition 1: [user_456: click] at T0
Consumer might see:
- user_456: click (T0) FIRST
- user_123: login (T1) SECOND
No guarantee on global order!
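Seen from the consumer side, the guarantee looks like this: within each partition returned by poll(), records arrive in offset order, while the interleaving across partitions is undefined. A minimal sketch with the Java consumer; the broker address and group id are placeholders.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class OrderedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "user-events-processor");     // placeholder
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("user-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                // Within a partition, records come back in offset order;
                // across partitions there is no defined interleaving.
                for (TopicPartition tp : records.partitions()) {
                    for (ConsumerRecord<String, String> rec : records.records(tp)) {
                        System.out.printf("partition=%d offset=%d key=%s%n",
                                tp.partition(), rec.offset(), rec.key());
                    }
                }
            }
        }
    }
}
```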
Tradeoffs
Advantages:
- ✅ Horizontal scalability (add more partitions/consumers)
- ✅ High throughput (parallel processing)
- ✅ Fault isolation (partition failure doesn't affect others)
- ✅ Ordered processing per partition
Disadvantages:
- ❌ No global ordering across the topic
- ❌ Partition count is hard to change later
- ❌ Poor key distribution creates hot partitions
- ❌ Increased operational complexity
Real Systems Using This
Kafka (Apache)
- Implementation: Topic → Partitions → Segments on disk
- Scale: LinkedIn processes 7+ trillion messages/day
- Partition Strategy: Key-based hashing for ordering
- Typical Setup: 10-100 partitions per topic
Amazon Kinesis
- Implementation: Streams → Shards (similar to partitions)
- Scale: Handles millions of events/second
- Partition Strategy: Explicit shard keys
- Typical Setup: Start with 1 shard, scale to thousands
Apache Pulsar
- Implementation: Topics → Partitions → Segments
- Scale: Handles petabytes of data
- Partition Strategy: Key-based + custom routing
- Typical Setup: 100-1000+ partitions for high-scale topics
Comparison Table
| System | Term | Max Throughput/Partition | Partition Limit | Rebalancing |
|---|---|---|---|---|
| Kafka | Partition | ~100 MB/sec | Thousands | Consumer group |
| Kinesis | Shard | 1 MB/sec write | Thousands | Manual split/merge |
| Pulsar | Partition | ~200 MB/sec | Thousands | Automatic |
When to Use Topic Partitioning
✅ Perfect Use Cases
High-Volume Event Streams
Scenario: User activity tracking (1M events/sec)
Solution: 100 partitions, key by userId
Result: Each partition handles 10K/sec, easily scalable
Parallel Data Processing
Scenario: Real-time analytics pipeline
Solution: Partition by eventType or userId
Result: Multiple workers process different partitions in parallel
Ordered Processing by Key
Scenario: Financial transactions per account
Solution: Partition by accountId
Result: All transactions for an account processed in order
❌ When NOT to Use
Need Total Ordering
Problem: Stock trade execution order matters globally
Issue: Partitions break global ordering
Alternative: Single partition + high-performance consumer
Highly Skewed Keys
Problem: 90% of traffic is one celebrity user
Issue: Hot partition, uneven load distribution
Alternative: Sub-partition by timestamp or random suffix
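The "random suffix" alternative can be sketched as key salting: the hot key's traffic fans out over several partitions, at the cost of strict per-key ordering. The topic name and salt-bucket count below are illustrative assumptions.

```java
import java.util.concurrent.ThreadLocalRandom;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SaltedKeys {
    // Spreads one hot key across up to saltBuckets partitions.
    // Trade-off: events for the same user are no longer strictly ordered.
    static String saltedKey(String hotKey, int saltBuckets) {
        int salt = ThreadLocalRandom.current().nextInt(saltBuckets);
        return hotKey + "_" + salt;
    }

    static ProducerRecord<String, String> recordFor(String userId, String event) {
        // e.g. key "celebrity_123_7" instead of "celebrity_123"
        return new ProducerRecord<>("user-events", saltedKey(userId, 10), event);
    }

    public static void main(String[] args) {
        // Two events for the same hot user may now land on different partitions.
        System.out.println(saltedKey("celebrity_123", 10));
        System.out.println(saltedKey("celebrity_123", 10));
    }
}
```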
Small Message Volume
Problem: 100 messages/day
Issue: Overhead of managing partitions not worth it
Alternative: Single partition, simple queue
Interview Application
Common Interview Question 1
Q: "Design a chat system that handles 1 billion messages/day. How would you partition messages?"
Strong Answer:
"I'd partition by conversation_id to maintain message ordering within each conversation. This allows parallel processing across conversations while guaranteeing order within each chat. 1 billion messages/day is roughly 12K messages/sec on average, perhaps 50-100K/sec at peak. With 10 million active conversations, I'd start with 100 partitions (~100K conversations per partition); at roughly 10K msgs/sec per partition that gives about 1M msgs/sec of aggregate capacity, leaving plenty of headroom. As we scale, we can add partitions and rebalance."
Why this is good:
- Identifies the key (conversation_id)
- Explains ordering requirement
- Does capacity math
- Plans for scaling
Common Interview Question 2
Q: "What happens when a partition gets hot (e.g., a celebrity user gets 100x the traffic)?"
Strong Answer:
"Hot partitions are a real problem. Solutions:
- Add a random suffix to the key: celebrity_123_<random> spreads the load across partitions
- Dedicated partition: give the celebrity their own partition with a dedicated consumer (see the partitioner sketch below)
- Application-level caching: reduce duplicate messages
- Re-partition: if possible, change the partition key strategy
Trade-off: we lose per-key ordering if we split the key. For celebrities, eventual consistency might be acceptable since their feed already lags."
Why this is good:
- Shows awareness of real problem
- Multiple solutions with tradeoffs
- Production thinking (celebrity use case)
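The "dedicated partition" option from the answer above can be expressed as a custom producer Partitioner. This is a hedged sketch: the hot key and the reserved partition number are hard-coded assumptions that a real implementation would make configurable.

```java
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Routes one known hot key to a reserved partition and hashes everything else
// across the remaining partitions.
public class HotKeyPartitioner implements Partitioner {
    private static final String HOT_KEY = "celebrity_123";   // assumed hot key
    private static final int RESERVED_PARTITION = 0;         // assumed reserved partition

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (numPartitions <= 1) {
            return 0;
        }
        if (HOT_KEY.equals(key)) {
            return RESERVED_PARTITION;
        }
        if (keyBytes == null) {
            // Unkeyed records: simple random spread (not Kafka's sticky behavior)
            return 1 + ThreadLocalRandom.current().nextInt(numPartitions - 1);
        }
        // Keyed records: hash across the non-reserved partitions
        return 1 + Utils.toPositive(Utils.murmur2(keyBytes)) % (numPartitions - 1);
    }

    @Override
    public void configure(Map<String, ?> configs) {}

    @Override
    public void close() {}
}
```

A producer would opt into this via its partitioner.class configuration; because ordinary keys are hashed over partitions 1..N-1, the reserved partition carries only the hot key's traffic.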
Red Flags to Avoid
- ❌ Claiming you can have total ordering with partitions
- ❌ Not considering the hot partition problem
- ❌ Choosing a partition count without capacity reasoning
- ❌ Forgetting that partition count is hard to change later
Quick Self-Check
Before moving on, can you:
- Explain topic partitioning in 60 seconds?
- Draw a diagram showing partitions and consumers?
- Explain how key-based partitioning works?
- Identify when to use vs NOT use partitioning?
- Understand the ordering guarantees?
- Calculate partition count for a given throughput?
Related Content
Prerequisites
None - this is a foundational concept
Related Concepts
- Consumer Groups - How consumers coordinate across partitions
- Sharding - Similar concept for databases
- Load Balancing - Distribution strategies
Used In Systems
- Real-Time Chat System - Message distribution per conversation
- Analytics Pipeline - Event processing at scale
Explained In Detail
- Kafka Architecture - Topics, Partitions & Segments section (7 minutes)
- Deep dive into Kafka's implementation of partitioning, segment storage, and replication across brokers
Next Recommended: Consumer Groups - Learn how consumers coordinate to process partitions in parallel