How message producers batch records to achieve high throughput by amortizing network overhead and maximizing sequential I/O
70% of performance interviews
Powers systems at LinkedIn (7+ trillion msgs/day)
10-100x throughput improvement
90%+ fewer network requests
TL;DR
Producer batching groups multiple messages together before sending them to the server, amortizing network overhead and maximizing throughput. Instead of sending each message immediately (1 message = 1 network request), batching collects messages for a short time window or until reaching a size threshold, then sends them together in a single request. This technique can improve throughput by 10-100x.
Visual Overview
WITHOUT BATCHING (Naive Approach):
T=0ms:  [Message A] ───▶ Network Request 1
T=5ms:  [Message B] ───▶ Network Request 2
T=8ms:  [Message C] ───▶ Network Request 3
T=12ms: [Message D] ───▶ Network Request 4
Result: 4 network requests, ~50ms total latency
Overhead: 4x network round-trips, 4x TCP overhead
WITH BATCHING (Optimized):
T=0ms:  [Message A] ──┐
T=5ms:  [Message B] ──┤
T=8ms:  [Message C] ──┼──▶ Batch Accumulation
T=12ms: [Message D] ──┘
T=20ms: [Batch: A,B,C,D] ───▶ Single Network Request
Result: 1 network request, ~30ms total latency
Overhead: 1x network round-trip, 4x compression efficiency
BATCH TRIGGERS:
├── Size Threshold: batch.size reached (default 16 KB)
├── Time Threshold: linger.ms elapsed (default 0 ms)
├── Memory Pressure: buffer full, send immediately
└── Explicit Flush: application calls flush()
Core Explanation
What is Producer Batching?
Producer batching is a performance optimization where a message producer accumulates multiple messages in memory before sending them to the server in a single network request.
BATCHING ARCHITECTURE:
Application Thread:
producer.send(message_1) ──┐
producer.send(message_2) ──┤
producer.send(message_3) ──┼──▶ Batch Buffer (per partition)
producer.send(message_4) ──┤                    │
producer.send(message_5) ──┘                    │
                                                ▼
Background Sender Thread:        ┌──────────────┴─────────────┐
                                 │ Wait for trigger:          │
                                 │  - Size >= batch.size      │
                                 │  - Time >= linger.ms       │
                                 │  - Buffer full             │
                                 └──────────────┬─────────────┘
                                                ▼
                                         [Send Batch] ───▶ Server
Key Batching Parameters:
// Batch size threshold (bytes)
batch.size = 16384        // 16 KB default

// Time to wait for the batch to fill (milliseconds)
linger.ms = 0             // Send immediately (default)
linger.ms = 20            // Wait up to 20ms for more messages

// Total memory for all batches
buffer.memory = 33554432  // 32 MB default
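Wired together, these settings look like the minimal runnable sketch below. The broker address localhost:9092 and the events topic are placeholders, and the values use the balanced 32 KB / 20 ms profile discussed later rather than the defaults.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BatchingProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32768); // 32 KB batches
        props.put(ProducerConfig.LINGER_MS_CONFIG, 20);     // wait up to 20ms for more records

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1000; i++) {
                // send() only appends to the per-partition batch buffer;
                // the background sender thread ships batches when a trigger fires.
                producer.send(new ProducerRecord<>("events", Integer.toString(i), "payload-" + i));
            }
            producer.flush(); // force out any partially filled batches
        }
    }
}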
Why Batching Dramatically Improves Performance
Network Overhead Analysis:
SINGLE MESSAGE SEND:
┌───────────────────────────────────────────────┐
│ TCP/IP Header:           40 bytes             │
│ Kafka Protocol Header:  100 bytes             │
│ Message Overhead:        50 bytes             │
│ Actual Message Payload: 200 bytes             │
│ ───────────────────────────────────────────── │
│ Total: 390 bytes                              │
│ Efficiency: 200/390 = 51%                     │
└───────────────────────────────────────────────┘

BATCHED SEND (100 messages):
┌───────────────────────────────────────────────┐
│ TCP/IP Header:           40 bytes (1x)        │
│ Kafka Protocol Header:  100 bytes (1x)        │
│ Message Overhead:        50 × 100 =  5,000    │
│ Actual Message Payload: 200 × 100 = 20,000    │
│ ───────────────────────────────────────────── │
│ Total: 25,140 bytes                           │
│ Efficiency: 20000/25140 = 80%                 │
│ Network Savings: 100x fewer requests          │
└───────────────────────────────────────────────┘

Result: 100x fewer network requests, and ~36% less bandwidth for the same payload!
Throughput Impact:
Scenario: Send 100,000 messages (200 bytes each)
NO BATCHING:
├── Network RTT: 1ms per request
├── Total time: 100,000 × 1ms = 100 seconds
└── Throughput: 1,000 messages/sec

WITH BATCHING (100 msg/batch):
├── Network RTT: 1ms per batch
├── Total time: 1,000 batches × 1ms = 1 second
└── Throughput: 100,000 messages/sec

100x improvement! 🚀
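These figures follow directly from the overhead model above; a back-of-envelope check (the byte sizes are the same illustrative values used in the diagrams, not real protocol constants):

public class BatchingMath {
    public static void main(String[] args) {
        int headers = 40 + 100;     // TCP/IP + protocol headers, paid once per request
        int perMessage = 50 + 200;  // per-message overhead + payload
        int messages = 100;

        int unbatched = messages * (headers + perMessage); // 100 requests: 39,000 bytes
        int batched = headers + messages * perMessage;     // 1 request:   25,140 bytes

        System.out.printf("efficiency: %.0f%% -> %.0f%%%n",
                100.0 * messages * 200 / unbatched,        // 51%
                100.0 * messages * 200 / batched);         // 80%
        // At 1ms of round-trip time per request, 100,000 messages take
        // 100,000ms unbatched vs ~1,000ms in 100-message batches: 100x faster.
    }
}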
Batching Triggers and Tradeoffs
Batch Completion Triggers:
TRIGGER 1: SIZE THRESHOLD REACHED
─────────────────────────────────
Current batch: 31 KB
New message: 2 KB
Total: 33 KB > batch.size (32 KB)
Action: Send batch immediately

TRIGGER 2: TIME THRESHOLD REACHED
─────────────────────────────────
Batch started: T=0ms
Current time: T=20ms >= linger.ms (20ms)
Action: Send batch (even if not full)

TRIGGER 3: MEMORY PRESSURE
─────────────────────────────────
Buffer memory: 64 MB
Used: 62 MB (97% full)
Action: Send oldest batches to free memory

TRIGGER 4: EXPLICIT FLUSH
─────────────────────────────────
Application calls: producer.flush()
Action: Send all pending batches immediately
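Conceptually, the sender thread's decision reduces to a single predicate over these four conditions. A simplified sketch with invented names; it mirrors the triggers above rather than Kafka's actual internals:

// Hypothetical readiness check combining the four triggers above.
final class BatchReadiness {
    boolean shouldSend(long batchBytes, long batchSizeLimit,
                       long batchAgeMs, long lingerMs,
                       boolean bufferExhausted, boolean flushRequested) {
        boolean full = batchBytes >= batchSizeLimit;  // trigger 1: size threshold
        boolean expired = batchAgeMs >= lingerMs;     // trigger 2: time threshold
        return full || expired
                || bufferExhausted                    // trigger 3: memory pressure
                || flushRequested;                    // trigger 4: explicit flush()
    }
}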
The Latency-Throughput Tradeoff:
CONFIGURATION SPECTRUM:
Low Latency (Real-time Systems):
┌──────────────────────────────────┐
│ linger.ms = 0                    │ ← Send immediately
│ batch.size = 16384 (16 KB)       │
│                                  │
│ Latency: ~1-2ms                  │
│ Throughput: ~10K msg/sec         │
│ Use case: Trading, alerts        │
└──────────────────────────────────┘

Balanced (Most Applications):
┌──────────────────────────────────┐
│ linger.ms = 10-20                │ ← Small wait window
│ batch.size = 32768 (32 KB)       │
│                                  │
│ Latency: ~15-25ms                │
│ Throughput: ~50K msg/sec         │
│ Use case: Event streaming        │
└──────────────────────────────────┘

High Throughput (Analytics):
┌──────────────────────────────────┐
│ linger.ms = 50-100               │ ← Longer wait
│ batch.size = 131072 (128 KB)     │
│                                  │
│ Latency: ~60-120ms               │
│ Throughput: ~200K msg/sec        │
│ Use case: Log aggregation        │
└──────────────────────────────────┘
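One way to keep these three profiles at hand is to encode them as presets. The enum below is a hypothetical helper that simply captures the numbers from the boxes above; it is not an official API:

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

enum BatchingProfile {
    LOW_LATENCY(16_384, 0),        // trading, alerts
    BALANCED(32_768, 20),          // event streaming
    HIGH_THROUGHPUT(131_072, 50);  // log aggregation, analytics

    final int batchSize;
    final int lingerMs;

    BatchingProfile(int batchSize, int lingerMs) {
        this.batchSize = batchSize;
        this.lingerMs = lingerMs;
    }

    void apply(Properties props) {
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, batchSize);
        props.put(ProducerConfig.LINGER_MS_CONFIG, lingerMs);
    }
}

Calling BatchingProfile.BALANCED.apply(props) before constructing the producer applies the middle profile.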
Production Configuration Examples
Example 1: High-Throughput Log Ingestion
Properties props = new Properties();
// Optimize for throughput
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072); // 128 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 50); // Wait 50ms
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 268435456); // 256 MB
// Enable compression for better batching
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
// Allow more in-flight requests
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);
// Result: 10x throughput improvement
// Tradeoff: ~60ms added latency
Example 2: Low-Latency Real-Time Events
Properties props = new Properties();
// Optimize for latency
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // 16 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 0); // No wait
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432); // 32 MB
// Minimal compression overhead
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
// Limit in-flight for ordering
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);
// Result: <5ms p99 latency
// Tradeoff: Lower throughput (~20K msg/sec)
Batch Compression and Efficiency
Compression with Batching:
WHY BATCHING IMPROVES COMPRESSION:
Single Message Compression:
Message 1: {"user_id": 123, "event": "click", "timestamp": 1234567890}
Compressed: 58 bytes → 52 bytes (10% savings)
Batched Messages Compression (100 messages):
Original: 5,800 bytes
Compressed (lz4): 1,200 bytes (80% savings!)
Why better compression?
├── Repeated keys: "user_id", "event", "timestamp" appear 100x
├── Similar values: timestamps are sequential
├── Pattern recognition: better with larger data sets
└── Compression dictionary: more effective context

Combined Batching + Compression:
├── Network requests: 100x fewer (batching)
├── Payload size: ~5x smaller (compression)
└── Total efficiency: ~500x improvement!
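The compression effect is easy to reproduce. The sketch below uses the JDK's built-in Deflater as a stand-in for lz4 (which needs a third-party library), with the same repetitive JSON shape as the example above; exact byte counts will differ from the lz4 figures:

import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

public class BatchCompressionDemo {
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.size();
    }

    public static void main(String[] args) {
        String single = "{\"user_id\": 123, \"event\": \"click\", \"timestamp\": 1234567890}";
        StringBuilder batch = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            // Repeated keys and near-sequential timestamps, as in the example above.
            batch.append("{\"user_id\": ").append(100 + i)
                 .append(", \"event\": \"click\", \"timestamp\": ").append(1234567890L + i)
                 .append("}\n");
        }
        System.out.println("single message: " + single.length() + " -> "
                + compressedSize(single.getBytes()) + " bytes");
        System.out.println("100-message batch: " + batch.length() + " -> "
                + compressedSize(batch.toString().getBytes()) + " bytes");
    }
}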
Production Compression Strategy:
// Pick exactly one codec per producer; each props.put below is an alternative.

// LZ4: fast compression, low CPU
// Best for: high-throughput systems with large batches
// Typical: ~2:1 ratio at ~300 MB/sec
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

// Snappy: balanced speed and ratio
// Best for: moderate throughput, balanced CPU usage
// Typical: ~2.3:1 ratio at ~250 MB/sec
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");

// GZIP: best compression, highest CPU
// Best for: network-limited systems, low volume
// Typical: ~3.2:1 ratio at ~50 MB/sec
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "gzip");

// None: no compression
// Best for: already-compressed data (images, video)
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "none");
Memory Management and Buffer Pool
Buffer Pool Architecture:
PRODUCER MEMORY LAYOUT:

Total Buffer: 64 MB (buffer.memory)
┌───────────────────────────────────────────┐
│ Partition 0 Batch: 32 KB (ready)          │ ← Full batch
│ Partition 1 Batch: 28 KB (building)       │ ← Accumulating
│ Partition 2 Batch: 31 KB (ready)          │ ← Full batch
│ Partition 3 Batch: 15 KB (building)       │
│ ...                                       │
│ Free Memory: 10 MB                        │
└───────────────────────────────────────────┘
Memory Exhaustion Behavior:
1. Buffer full (free < new batch size)
2. Block send() call for max.block.ms (default 60s)
3. If still full, throw BufferExhaustedException
4. As batches send, memory freed for new batches
Monitoring:
kafka.producer:type=producer-metrics,name=buffer-available-bytes
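The same gauge can be read in-process through producer.metrics() rather than JMX; a minimal sketch:

import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;

final class ProducerBufferMonitor {
    // Returns buffer-available-bytes, or -1 if the metric is not found.
    static double bufferAvailableBytes(KafkaProducer<?, ?> producer) {
        for (Map.Entry<MetricName, ? extends Metric> e : producer.metrics().entrySet()) {
            if ("producer-metrics".equals(e.getKey().group())
                    && "buffer-available-bytes".equals(e.getKey().name())) {
                return (double) e.getValue().metricValue();
            }
        }
        return -1;
    }
}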
Tradeoffs
Advantages:
- ✅ Massively improved throughput (10-100x)
- ✅ Reduced network overhead (90%+ fewer requests)
- ✅ Better compression efficiency with larger batches
- ✅ Lower CPU usage per message (amortized overhead)
- ✅ Reduced server-side processing load
Disadvantages:
- ❌ Increased latency (messages wait in batch)
- ❌ Higher memory usage (buffering messages)
- ❌ Complexity in tuning (batch.size vs linger.ms)
- ❌ Risk of data loss if producer crashes before send
- ❌ Larger failure blast radius (entire batch fails together)
Real Systems Using This
Apache Kafka
- Implementation: Per-partition batching with configurable size and time thresholds
- Scale: 7+ trillion messages/day at LinkedIn with aggressive batching
- Default Config: 16 KB batch.size, 0 ms linger.ms (conservative)
- Production Config: 64-128 KB batch.size, 20-50ms linger.ms (optimized)
AWS Kinesis
- Implementation: Automatic batching via PutRecords API (up to 500 records)
- Limits: 1 MB/sec write per shard; up to 500 records and 5 MB per PutRecords request
- SDK Behavior: KPL (Kinesis Producer Library) batches automatically
Google Cloud Pub/Sub
- Implementation: Client library batches messages automatically
- Config: Max batch size (1000 messages), max batch bytes (10 MB)
- Optimization: Batching + request compression for efficiency
RabbitMQ
- Implementation: Optional publisher confirms batching
- Config: Manual batching via application-level buffering
- Performance: 10x improvement with batching enabled
When to Use Producer Batching
✅ Perfect Use Cases
High-Volume Event Streaming
Scenario: Ingesting millions of events per second
Why batching: Maximizes network and disk efficiency
Example: Clickstream analytics, IoT sensor data
Config: Large batches (128 KB), medium linger (20-50ms)
Log Aggregation
Scenario: Centralized logging from 1000s of services
Why batching: Reduces load on logging infrastructure
Example: ELK stack ingestion, Splunk forwarding
Config: Large batches (128 KB), high linger (50-100ms)
Bulk Data Migration
Scenario: Moving large datasets between systems
Why batching: Maximum throughput, latency not critical
Example: Database CDC, ETL pipelines
Config: Maximum batches (256 KB), high linger (100ms)
❌ When NOT to Use (or Use Minimal Batching)
Real-Time Alerting
Problem: Critical alerts delayed by batching
Solution: linger.ms=0, small batches (16 KB)
Example: Security alerts, system monitoring
Trading Systems
Problem: Milliseconds matter, batching adds latency
Solution: No batching (linger.ms=0) or very small windows
Example: High-frequency trading, order execution
Request-Response Patterns
Problem: User waiting for immediate response
Solution: Minimal batching, sync sends
Example: API calls, user-facing operations
Interview Application
Common Interview Question 1
Q: "How would you optimize a producer that's sending 100,000 small messages per second, causing high CPU and network usage?"
Strong Answer:
"The issue is likely excessive network overhead from sending each message individually. I'd implement producer batching:
Diagnosis:
- Current: 100K messages/sec × 1 KB each, sent individually → 100K network requests/sec
- Network overhead: ~50% of bandwidth wasted on headers
- CPU overhead: 100K serialize/send operations
Solution:
// Enable aggressive batching
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536);       // 64 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 20);           // 20ms window
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
Result:
- Batching: 100K messages → ~2K batches (50x reduction)
- Compression: 64 KB → ~15 KB per batch (4x savings)
- Network: 98% reduction in requests
- CPU: 95% reduction in overhead
- Added latency: ~20ms (acceptable for most use cases)
Tradeoff: 20ms added latency vs 50x throughput improvement. For log/event streaming, this is optimal."
Why this is good:
- Quantifies the problem
- Provides specific configuration
- Explains each parameter choice
- Analyzes tradeoffs explicitly
- Gives measurable results
Common Interview Question 2
Q: "Your Kafka producer is dropping messages under high load. How would you debug and fix this?"
Strong Answer:
"Message drops under load suggest buffer memory exhaustion. Here's my approach:
Diagnosis Steps:
- Check the JMX metric buffer-available-bytes → likely near 0
- Check logs for BufferExhaustedException
- Check the max.block.ms timeout (default 60s)

Root Cause Analysis:
- Batches accumulating faster than sender thread can send
- Possible causes:
- Network slowness (broker response time)
- Too small buffer.memory for traffic volume
- Inefficient batching (small batches = more sends)
Solutions (in order):
1. Increase buffer memory:
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 268435456); // 256 MB
2. Optimize batching for throughput:
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);      // 128 KB
props.put(ProducerConfig.LINGER_MS_CONFIG, 50);           // Wait for fuller batches
props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
3. Application-level backpressure:
try {
    producer.send(record, (metadata, exception) -> {
        if (exception != null) {
            // Async failure: retry with exponential backoff
        }
    });
} catch (BufferExhaustedException e) {
    // send() blocks up to max.block.ms, then throws when the buffer is full:
    // apply backpressure or shed load (e.g., return 503 to clients)
}
Result: Larger buffer + more efficient batching = 10x capacity improvement"
Why this is good:
- Systematic debugging approach
- Multiple solution layers
- Specific metrics to check
- Code examples
- Explains root cause clearly
Red Flags to Avoid
- ❌ Not understanding latency tradeoff of batching
- ❌ Setting linger.ms without understanding batch.size
- ❌ Not considering memory implications
- ❌ Ignoring compression benefits with batching
- ❌ Not knowing how to measure batching efficiency
Quick Self-Check
Before moving on, can you:
- Explain producer batching in 60 seconds?
- Draw the batching flow from send() to network?
- List all 4 batch trigger conditions?
- Explain the latency-throughput tradeoff?
- Calculate network savings from batching?
- Configure producer for high-throughput vs low-latency?
Related Content
Prerequisites
None - this is a foundational performance concept
Related Concepts
- Log-Based Storage - Batching works well with sequential writes
- Topic Partitioning - Batching is per-partition
Used In Systems
- Distributed Message Queue - Core performance technique
- Event-Driven Architecture - Essential for high throughput
Explained In Detail
- Kafka Producer Mechanics - Implementation details (35 minutes)
Next Recommended: Producer Acknowledgments - Understand reliability guarantees