Skip to content

Instantly share code, notes, and snippets.

@rsrini7
Last active December 13, 2025 17:56
Show Gist options
  • Select an option

  • Save rsrini7/d33eaa8b5f9f6d74122e21e2cc489879 to your computer and use it in GitHub Desktop.

Select an option

Save rsrini7/d33eaa8b5f9f6d74122e21e2cc489879 to your computer and use it in GitHub Desktop.
kafka 4.0 features

Apache Kafka 4.0 Details:

  • Cloud-Native Focus: The primary theme of Kafka 4.0 is its continued evolution towards being more cloud-native (from video).

  • ZooKeeper Removal (KIP-500):

    • ZooKeeper has been fully removed and deprecated (from video)
    • KRaft is now the standard for metadata management, offering improved scalability and resilience (from video).
    • This change simplifies cloud deployments by eliminating a component that posed networking challenges (from video).
  • KIP-848: New Consumer Rebalance Protocol

    • "Lightweight Clients, Stable and Rapid Rebalances": Experience significantly faster and more stable consumer group rebalances[cite: 1].
    • Scalability and Efficiency:
      • Broker-driven rebalancing: Shifts coordination from clients to brokers, reducing client overhead[cite: 2].
      • Multi-threaded broker processing: Parallelizes rebalances, handling large groups efficiently[cite: 2, 3].
    • Improved Heartbeat API: Provides more efficient communication between clients and brokers with less network traffic[cite: 3].
    • Simplified Management: Server-side configuration of heartbeats, timeouts, and assignors streamlines consumer setup[cite: 4].
    • Optimized Rebalancing: Incremental, multi-threaded rebalancing allows for faster partition assignment for new consumers[cite: 5].
    • Server Enabled, Client Opt-In:
      • Enabled by default on Kafka 4.0 servers[cite: 6].
      • Clients activate with group.protocol=consumer[cite: 6].
    • Enhanced Monitoring: New metrics provide deeper insights into rebalance performance[cite: 7].
    • This protocol shifts rebalance functionality from the client to the broker (from video)
    • This is beneficial for cloud deployments, providing better control over consumer groups and rebalances (from video).
    • Enhanced metrics are included for better observability during rebalances (from video).
  • KIP-932: Queues for Kafka (early access)

    • What it is:
      • Introduces a new way to consume messages in Kafka, enabling cooperative consumption through "share groups"[cite: 8].
      • Decouples consumers from partitions, so multiple consumers can read the same partition[cite: 9].
      • Designed for queuing use cases, where higher throughput and parallel processing are desired[cite: 9].
    • New Features:
      • KafkaShareConsumer: A new client interface for interacting with share groups[cite: 10].
      • Kafka CLI:
        • kafka-console-share-consumer.sh: For consuming data using a share group[cite: 10].
        • kafka-share-groups.sh: For managing share groups[cite: 11].
      • AdminClient: Includes new methods for managing share groups[cite: 11].
      • Metrics: Share groups have their own metrics (e.g., consumer lag) for monitoring and scaling[cite: 12].
      • New APIs: Introduces ShareGroupHeartbeat, ShareGroupDescribe, ShareFetch, and ShareAcknowledge to support share groups[cite: 13].
    • The goal of this release is to test the new share group functionality[cite: 14].
    • NOT FOR PRODUCTION USE[cite: 15].
    • Undocumented configs necessary in the server.properties file[cite: 15].
    • Cannot upgrade clusters if share groups are enabled[cite: 15].
    • Introduces share groups as an experimental feature, which is the beginning of queue functionality in Kafka (from video).
    • Share groups allow multiple consumers to consume from the same partition (from video).
    • This feature is not for production use and requires an undocumented configuration (from video).
  • KIP-1106: Add Duration-Based Offset Reset Option for Consumer Clients

    • KIP-405 added tiered storage, which made auto.offset.reset=earliest less fun[cite: 16].
    • New duration-based offset reset: by_duration[cite: 16].
    • No change to the existing semantics and default value of auto.offset.reset[cite: 17].
    • KafkaConsumer: Updated auto.offset.reset config option to support new config value of the format by_duration:[cite: 18, 19].
    • This automatically resets the offsets back by configured duration[cite: 19].
    • KafkaShareConsumer: Add support for by_duration: config value for share.auto.offset.reset[cite: 19].
    • Kafka Streams: Deprecate existing Topology.AutoOffsetReset enum, add a new class org.apache.kafka.streams.AutoOffsetReset to capture the reset strategies[cite: 20].
    • Adds the option to reset consumer offsets by duration (from video).
    • This is particularly useful for tiered storage, preventing the need to read through large amounts of old data (from video).
  • KIP-1102: Clients re-bootstrapping based on timeout or error code

    • KIP-899 Limitations: Re-bootstrapping only occurred when no brokers were available[cite: 21].
    • Clients could get stuck with stale metadata if connected to some brokers[cite: 22].
    • KIP-1102 Improvements:
      • Proactive re-bootstrapping: Introduces a metadata refresh timeout (default 5 minutes)[cite: 23].
      • Clients re-bootstrap if no metadata update is received within this time[cite: 24].
      • Server-triggered re-bootstrapping: Brokers/proxies can now return a REBOOTSTRAP_REQUIRED error code, explicitly telling clients to re-bootstrap[cite: 25].
      • Default in 4.0: metadata.recovery.strategy defaults to rebootstrap for new clients, improving resilience out-of-the-box[cite: 26].
    • Improves the bootstrapping process (where clients discover broker metadata) (from video).
    • Proactive and server-triggered re-bootstrapping prevents issues with stale metadata, especially after long client downtimes (from video).
  • KIP-966: Eligible Leader Replicas (part 1)

    • Part 1 (Included in Kafka 4.0): Enhanced Data Safety (No Unclean Leader Election):
      • Introduces Eligible Leader Replicas (ELR): A subset of ISR replicas guaranteed to have complete data up to the high-watermark[cite: 27, 28].
      • ELRs are safe for leader election, preventing data loss[cite: 28].
      • ELR tracking: Managed by the KRaft controller, stored in PartitionRecord, no performance impact[cite: 29].
      • DescribeTopicPartitions API introduced (default in 3.9, so not new in 4.0)[cite: 29, 30].
      • Admin clients now use this API instead of the Metadata API[cite: 30].
      • Requires minimum metadata version 4.0 IV1[cite: 31].
      • Note: ELR is not enabled by default in 4.0. May become default in 4.1[cite: 31].
    • Protects against data loss in specific scenarios involving single in-sync replicas (from video).
    • Introduces "eligible leader replicas" which are guaranteed to have data up to the watermark (from video).
  • KIP-1076 & KIP-1091: Unlocking Kafka Streams Observability

    • Before KIP-1076 & KIP-1091
      • Manual and cumbersome health assessment[cite: 32].
      • Developers had to instrument and export metrics[cite: 32].
      • Operators relied on dashboards for application health[cite: 32].
    • KIP-1076 Highlights
      • Extends KIP-714 (AK 3.7 in Feb, 2024) for Kafka Streams applications[cite: 32].
      • Cluster operators can pull application metrics from the application[cite: 32].
    • KIP-1091 Highlights
      • Adds state metrics for each StreamThread and the client instance itself[cite: 32].
      • Provides detailed visibility into application state[cite: 32].
    • KIP-1076 allows the cluster to pull observability data from Kafka Streams clients (from video).
    • KIP-1091 extends this with specific metadata about stream threads (from video).
  • KIP-1112: Custom Processor Wrapping

    • Before KIP-1112
      • Cross-cutting logic required manual integration into each processor[cite: 33].
      • DSL users lacked a straightforward method to apply these altogether[cite: 33].
      • Resulted in duplicated code and increased maintenance efforts[cite: 33].
    • KIP-1112 Highlights
      • Introduces ProcessorWrapper interface[cite: 33].
      • Inject custom logic around both Processor API and DSL processors seamlessly[cite: 33].
    • Enables custom code to be run automatically within Kafka Streams DSL methods (from video).
    • This facilitates cross-cutting concerns like logging within DSL calls (from video).
  • KIP-1104: Extract ForeignKey from Key in KTable Join

    • Before KIP-1104
      • Foreign key extractor function could only access the record value[cite: 34].
      • Required duplicating the record key into the value if one wants to use the key for FK joins[cite: 34].
      • More work, more storage overhead[cite: 34].
    • KIP-1104 Highlights
      • Foreign keys can now be extracted from both the record key and record value[cite: 34].
      • Simplifies the join process with a more intuitive API[cite: 34].
    • Allows extracting a foreign key from the key of a message, simplifying join operations (from video).
  • KIP-1065: Better Reliability with Automatic Retry Mechanisms

    • Before KIP-1065
      • Persistent errors (e.g., broker unavailable, missing output topic)[cite: 35].
      • Default retry mechanism until timeout without the ability to break the retry loop, even after timeout[cite: 35].
      • Expensive task recovery[cite: 35].
    • KIP-1065 Improvements
      • Enables users to break retry loops[cite: 35].
      • "RETRY" option in ProductionExceptionHandler[cite: 35].
      • Custom handlers for more control: Retry, Fail gracefully, Drop record and continue[cite: 35].
    • Improves logging and handling of errors when output topics are unavailable (from video).
  • Summary of Key Changes:

    • ZooKeeper is replaced by KRaft for metadata management (from video).
    • Queue functionality is in progress, with share groups as the first step (from video).
    • Users are advised to read the release notes for complete details (from video).
@rsrini7
Copy link
Copy Markdown
Author

rsrini7 commented Apr 1, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment