Producer Acknowledgments

TL;DR

Producer acknowledgments (acks) control when a Kafka producer treats a message as successfully written. The options are acks=0 (no confirmation), acks=1 (the partition leader confirms), and acks=all (the leader confirms once all in-sync replicas have the message). Each step up trades latency for stronger durability, which makes acks the main producer-side setting for balancing performance against data safety.

Visual Overview

Producer Acknowledgments Overview

Core Explanation

What are Producer Acknowledgments?

Producer acknowledgments (acks) control when a Kafka producer considers a write operation successful. This determines:

When the producer receives confirmation that a message is safe
How many replicas must persist the message before that confirmation is sent
The trade-off between latency and durability

Three Levels:

Three Acknowledgment Levels
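
As a rough summary of the three levels, the lookup below records what the producer waits for at each setting. It is an illustrative sketch only: `ACK_LEVELS` and `describeAcks` are hypothetical names, not part of any Kafka client.

```javascript
// Hypothetical summary table of the three ack levels (not a Kafka client API).
const ACK_LEVELS = {
  0: { name: "acks=0", waitsFor: "nothing", durability: "lowest" },
  1: { name: "acks=1", waitsFor: "leader write", durability: "medium" },
  "-1": { name: "acks=all", waitsFor: "all in-sync replicas", durability: "highest" },
};

// Accepts 0, 1, -1, or the string "all" (an alias for -1).
function describeAcks(acks) {
  const key = acks === "all" ? "-1" : String(acks);
  const level = ACK_LEVELS[key];
  if (!level) {
    throw new Error(`Unsupported acks value: ${acks}`);
  }
  return level;
}

console.log(describeAcks("all").waitsFor);
```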

acks=0: No Acknowledgment

Behavior:

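acks=0 Behavior

The producer sends the message and moves on without waiting for any broker response, so it never learns whether the write reached the log.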

When Message Can Be Lost:

acks=0 Message Loss Scenarios

Configuration:

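const producer = kafka.producer();

// KafkaJS takes acks and compression per send() call, not in producer().
// CompressionTypes is imported from "kafkajs".
await producer.send({
  topic: "metrics",
  messages: [{ value: "cpu=80" }],
  acks: 0, // No acknowledgment
  compression: CompressionTypes.GZIP, // Often paired with acks=0 for max throughput
});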

Use Cases:

acks=0 Use Cases

acks=1: Leader Acknowledgment

Behavior:

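acks=1 Behavior

The producer waits until the partition leader has written the message to its own log. Followers replicate it afterwards, outside the acknowledgment path.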

When Message Can Be Lost:

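acks=1 Message Loss Scenario

The leader acknowledges the write and then fails before any follower has replicated it. The newly elected leader never saw the message, so it is lost even though the producer received a success response.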

Configuration:

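const producer = kafka.producer({
  retry: {
    retries: 3, // Retry on failure
  },
});

// acks and timeout are send() options in KafkaJS
await producer.send({
  topic: "user-activity",
  messages: [{ value: "click" }],
  acks: 1, // Leader acknowledgment (KafkaJS itself defaults to -1, i.e. all)
  timeout: 30000, // 30s for the broker to respond
});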

Use Cases:

acks=1 Use Cases

acks=all: Full ISR Acknowledgment

Behavior:

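acks=all Behavior

The leader replies only after every replica currently in the in-sync set has the message, so an acknowledged write survives the loss of the leader.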

In-Sync Replicas (ISR):

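In-Sync Replicas (ISR)

The ISR is the set of replicas, leader included, that are currently caught up with the leader's log. A follower that falls too far behind is dropped from the set until it catches up again.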

min.insync.replicas:

min.insync.replicas Configuration
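
The interaction between acks=all and min.insync.replicas can be sketched as a simple check. This models the broker's decision in simplified form and is not Kafka's actual code; `canAcceptWrite` and its parameter names are hypothetical.

```javascript
// Simplified model: min.insync.replicas is enforced only for acks=all (-1).
function canAcceptWrite({ acks, isrSize, minInsyncReplicas }) {
  if (acks !== -1) {
    return true; // acks=0 and acks=1 do not check the ISR size
  }
  return isrSize >= minInsyncReplicas;
}

// Replication factor 3, min.insync.replicas=2
console.log(canAcceptWrite({ acks: -1, isrSize: 3, minInsyncReplicas: 2 })); // true
console.log(canAcceptWrite({ acks: -1, isrSize: 1, minInsyncReplicas: 2 })); // false
console.log(canAcceptWrite({ acks: 1, isrSize: 1, minInsyncReplicas: 2 })); // true
```

In the second case a real broker would reject the write with NOT_ENOUGH_REPLICAS rather than accept it with weaker durability.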

Configuration:

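const producer = kafka.producer({
  retry: {
    retries: 5,
  },
});

// acks and timeout are send() options in KafkaJS
await producer.send({
  topic: "orders",
  messages: [{ key: "order-456", value: "order payload" }],
  acks: -1, // -1 means "all" (acks=all); this is the KafkaJS default
  timeout: 30000,
});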

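// Topic configuration (set on the topic or broker, not in producer code)
// min.insync.replicas=2    At least 2 in-sync replicas must have the write
// replication factor = 3   Total of 3 replicas, chosen when the topic is created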
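
Because these are topic-level settings, they are applied when the topic is created or altered rather than in producer code. A minimal sketch, assuming the KafkaJS admin client; `buildOrdersTopic` is an illustrative helper, and the partition count of 6 is an arbitrary assumption.

```javascript
// Builds a topic definition in the shape accepted by KafkaJS admin.createTopics().
// The partition count is an assumption for illustration.
function buildOrdersTopic() {
  return {
    topic: "orders",
    numPartitions: 6,
    replicationFactor: 3, // Total of 3 replicas
    configEntries: [
      { name: "min.insync.replicas", value: "2" }, // At least 2 replicas must ack
    ],
  };
}

console.log(JSON.stringify(buildOrdersTopic(), null, 2));
```

With a connected admin client (`const admin = kafka.admin(); await admin.connect();`), the definition would be passed as `await admin.createTopics({ topics: [buildOrdersTopic()] })`.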

Use Cases:

acks=all Use Cases

Real Systems Using Producer Acks

System	Default acks	Typical Config	Rationale
Kafka Streams	acks=all	acks=all, min.insync.replicas=2	State stores require durability
Netflix (Keystone)	acks=1	acks=1, replication=3	High throughput, tolerate rare loss
LinkedIn	acks=all	acks=all, min.insync.replicas=2	Business-critical events
Uber	acks=1	acks=1 (logs), acks=all (trips)	Mixed based on data criticality
Confluent Cloud	acks=all	acks=all, min.insync.replicas=2	Default for safety

Case Study: Kafka at LinkedIn

LinkedIn Kafka Acknowledgment Strategy

When to Use Each Ack Level

acks=0: Fire and Forget

Use When:

acks=0 When to Use

acks=1: Leader Only

Use When:

acks=1 When to Use

acks=all: Full Replication

Use When:

acks=all When to Use

Hybrid Approach

Different Topics, Different Acks:

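// In KafkaJS, acks is chosen per send() call, so one producer can serve all three.
// orderMessages, analyticsMessages, and metricMessages are placeholder arrays.
const producer = kafka.producer();

// Critical orders: acks=all
await producer.send({ topic: "orders", messages: orderMessages, acks: -1, timeout: 30000 });

// Analytics events: acks=1
await producer.send({ topic: "analytics", messages: analyticsMessages, acks: 1, timeout: 10000 });

// Metrics: acks=0
await producer.send({
  topic: "metrics",
  messages: metricMessages,
  acks: 0,
  compression: CompressionTypes.GZIP, // CompressionTypes is imported from "kafkajs"
});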
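
One way to keep this mapping in a single place is a small lookup from topic to send options. The sketch below is an assumption about how an application might organize it: `SEND_PROFILES` and `sendOptionsFor` are hypothetical names, and the topic names are illustrative.

```javascript
// Hypothetical per-topic durability profiles (acks, plus timeout in ms).
const SEND_PROFILES = new Map([
  ["orders", { acks: -1, timeout: 30000 }], // critical: wait for all in-sync replicas
  ["analytics", { acks: 1, timeout: 10000 }], // tolerate rare loss
  ["metrics", { acks: 0 }], // fire and forget
]);

// Unknown topics fall back to the safest setting.
function sendOptionsFor(topic) {
  return SEND_PROFILES.get(topic) ?? { acks: -1, timeout: 30000 };
}

console.log(sendOptionsFor("metrics"));
console.log(sendOptionsFor("unknown-topic"));
```

A caller could then spread the result into the send call, for example `producer.send({ topic, messages, ...sendOptionsFor(topic) })`. Defaulting unknown topics to acks=all means a forgotten entry costs latency rather than data.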

Interview Application

Common Interview Question

Q: “How would you ensure zero data loss in a Kafka-based order processing system?”

Strong Answer:

“To ensure zero data loss for orders, I’d configure producers with acks=all and proper ISR settings:

Configuration:
acks=all (or acks=-1) on the producer
min.insync.replicas=2 on the topic
replication.factor=3 on the topic
retries=MAX_INT (effectively unlimited, bounded by delivery.timeout.ms)
max.in.flight.requests.per.connection=1 (for ordering)
How This Prevents Loss:

acks=all: Producer waits until every in-sync replica has the message before considering the write successful

min.insync.replicas=2: Requires at least 2 replicas (leader + 1 follower) to acknowledge

replication.factor=3: Total of 3 copies across brokers

Result: Message on ≥2 replicas before ACK

Failure Scenarios:

Network failure: Producer retries until successful

Leader failure: Message already on follower (promoted to new leader)

Follower failure: Still have leader + other follower (meets min ISR)

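Leader + Follower fail: The third replica still holds the message if it was in sync, but writes are rejected until the ISR is back to 2

Only lose data if: Every replica holding the message is lost at once (all 3 in a healthy cluster, as few as 2 if the ISR had already shrunk), which is extremely rare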

Trade-offs:

Latency: 20-30ms vs 5-10ms for acks=1

Throughput: Lower (wait for replication)

Availability: May reject writes if ISR < 2

Worth It: For orders where data loss = lost revenue + angry customers

Monitoring: Alert if ISR falls below min.insync.replicas”
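
The failure scenarios in this answer follow from simple arithmetic on the replication settings. The helper below is a hypothetical sketch (`failureTolerance` is not a Kafka API) that derives the two numbers the answer relies on.

```javascript
// Hypothetical helper: derive failure tolerance from topic settings.
function failureTolerance({ replicationFactor, minInsyncReplicas }) {
  if (minInsyncReplicas < 1 || minInsyncReplicas > replicationFactor) {
    throw new RangeError("minInsyncReplicas must be between 1 and replicationFactor");
  }
  return {
    // Brokers that can be down while acks=all writes are still accepted
    writableWithDown: replicationFactor - minInsyncReplicas,
    // Minimum number of replicas holding a message when it is acknowledged
    copiesAtAck: minInsyncReplicas,
  };
}

console.log(failureTolerance({ replicationFactor: 3, minInsyncReplicas: 2 }));
// { writableWithDown: 1, copiesAtAck: 2 }
```

With replication.factor=3 and min.insync.replicas=2, one broker can be down without blocking writes, and every acknowledged message exists on at least two brokers.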

Code Example

Producer with Different Ack Levels

const { Kafka, CompressionTypes } = require("kafkajs");

const kafka = new Kafka({
  clientId: "my-producer",
  brokers: ["kafka1:9092", "kafka2:9092", "kafka3:9092"],
});

// Note: in KafkaJS, acks, timeout, and compression are options of send(),
// not of kafka.producer().

// Configuration 1: acks=0 (Fire and Forget)
async function sendMetrics() {
  const producer = kafka.producer();

  await producer.connect();

  const start = Date.now();
  await producer.send({
    topic: "metrics",
    messages: [{ value: JSON.stringify({ cpu: 80, mem: 60 }) }],
    acks: 0, // No acknowledgment
    compression: CompressionTypes.GZIP,
  });
  const latency = Date.now() - start;

  console.log(`Metrics sent (acks=0): ${latency}ms`);
  // Typical output: 1-2ms
  // Risk: Message may be lost
  await producer.disconnect();
}

// Configuration 2: acks=1 (Leader Acknowledgment)
async function sendUserActivity() {
  const producer = kafka.producer({
    retry: {
      retries: 3,
      initialRetryTime: 100,
    },
  });

  await producer.connect();

  const start = Date.now();
  await producer.send({
    topic: "user-activity",
    acks: 1, // Leader acknowledgment (KafkaJS itself defaults to -1)
    timeout: 30000,
    messages: [
      {
        key: "user-123",
        value: JSON.stringify({ action: "click", page: "/products" }),
      },
    ],
  });
  const latency = Date.now() - start;

  console.log(`Activity sent (acks=1): ${latency}ms`);
  // Typical output: 5-10ms
  // Risk: Lost if leader fails before replication
  await producer.disconnect();
}

// Configuration 3: acks=all (Full ISR Acknowledgment)
async function sendOrder() {
  const producer = kafka.producer({
    retry: {
      retries: Number.MAX_SAFE_INTEGER, // Retry (effectively) forever
      initialRetryTime: 100,
      maxRetryTime: 30000,
    },
    idempotent: true, // Retried sends are not written twice (requires acks=-1)
    maxInFlightRequests: 1, // Preserve ordering
  });

  await producer.connect();

  const start = Date.now();
  try {
    await producer.send({
      topic: "orders", // Topic config: min.insync.replicas=2, replication factor 3
      acks: -1, // acks=all (wait for full ISR)
      timeout: 30000,
      messages: [
        {
          key: "order-456",
          value: JSON.stringify({
            orderId: "456",
            userId: "123",
            total: 99.99,
            items: [{ id: "product-1", qty: 2 }],
          }),
        },
      ],
    });
    const latency = Date.now() - start;

    console.log(`Order sent (acks=all): ${latency}ms`);
    // Typical output: 15-30ms
    // Guarantee: Message is on ≥2 replicas before the acknowledgment
  } catch (error) {
    if (error.type === "NOT_ENOUGH_REPLICAS") {
      // ISR < min.insync.replicas (degraded cluster)
      console.error("Cluster degraded: Not enough in-sync replicas");
      // Alert operations team
      // Queue order for retry
    }
    throw error;
  } finally {
    await producer.disconnect();
  }
}

// Demonstrating latency differences
async function benchmark() {
  console.log("Benchmarking producer acknowledgments...\n");

  await sendMetrics(); // ~1-2ms
  await sendUserActivity(); // ~5-10ms
  await sendOrder(); // ~15-30ms

  // Trade-off: Latency vs Durability
  // acks=0:   Fastest, least safe
  // acks=1:   Balanced (the Kafka Java client default before 3.0)
  // acks=all: Slowest, safest
}

benchmark().catch(console.error);

Error Handling with acks=all

async function sendCriticalData(data) {
  const producer = kafka.producer({
    retry: {
      retries: 5,
      initialRetryTime: 300,
    },
  });

  await producer.connect();

  try {
    await producer.send({
      topic: "critical-data",
      messages: [{ value: JSON.stringify(data) }],
      acks: -1, // acks=all; in KafkaJS this is a send() option
    });

    console.log("Data persisted successfully (acks=all)");
  } catch (error) {
    // Error types to handle:

    if (error.type === "NOT_ENOUGH_REPLICAS") {
      // ISR < min.insync.replicas
      console.error("Not enough in-sync replicas");
      // Action: Alert operations, queue for retry
    }

    if (error.type === "NOT_ENOUGH_REPLICAS_AFTER_APPEND") {
      // Message written to leader, but ISR shrank before replication
      console.error("Replication failed after append");
      // Action: Retry (may be duplicate, use idempotent producer)
    }

    if (error.type === "REQUEST_TIMED_OUT") {
      // Replication took longer than timeout
      console.error("Acknowledgment timeout");
      // Action: Retry (may be duplicate)
    }

    // Store in dead letter queue for manual review
    // (storeInDLQ is an application-defined helper, not part of KafkaJS)
    await storeInDLQ(data, error);
    throw error;
  }
}
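
The three error branches differ mainly in whether a retry can produce a duplicate. A compact way to express that distinction is sketched below; `classifySendError` and the action labels are hypothetical, and the function inspects only the `type` field used in the handler above.

```javascript
// Error types where the leader may already hold the message when the error is returned.
const RETRY_MAY_DUPLICATE = new Set([
  "NOT_ENOUGH_REPLICAS_AFTER_APPEND",
  "REQUEST_TIMED_OUT",
]);

// Hypothetical mapping from broker error type to a handling strategy.
function classifySendError(error) {
  if (error.type === "NOT_ENOUGH_REPLICAS") {
    // Rejected before the append: nothing was written
    return { action: "alert-and-retry", mayDuplicate: false };
  }
  if (RETRY_MAY_DUPLICATE.has(error.type)) {
    return { action: "retry", mayDuplicate: true };
  }
  return { action: "dead-letter", mayDuplicate: false };
}

console.log(classifySendError({ type: "REQUEST_TIMED_OUT" }));
```

When `mayDuplicate` is true, retrying is only safe for consumers that tolerate duplicates or when the idempotent producer is enabled.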

Related Content

Prerequisites:

Leader-Follower Replication - Understanding ISR
Topic Partitioning - Kafka architecture

Related Concepts:

Quorum - ISR plays a quorum-like role, though it is not a majority quorum
Idempotence - Idempotent producer with acks=all
Exactly-Once Semantics - Builds on the idempotent producer (which requires acks=all) plus transactions

Used In Systems:

Kafka (producer acknowledgments)
Pulsar (similar ack levels)
RabbitMQ (publisher confirms)

Explained In Detail:

Kafka Deep Dive - Producer mechanics and acknowledgments

See It In Action

Producer Acknowledgments Explainer - ~75 second animated visual showing acks=0, acks=1, and acks=all

Quick Self-Check

Can explain acks=0/1/all in 60 seconds?
Understand latency vs durability trade-offs?
Know when messages can be lost for each ack level?
Can explain min.insync.replicas and ISR?
Understand acks=all + min.insync.replicas=2 pattern?
Know which ack level to use for different use cases?
