Developer Playground

Kafka Basics

Updated: March 31, 2025

Topics

  • A category or feed name to which records are published
  • Identified by unique names within a Kafka cluster
  • Store messages in various formats (JSON, Avro, Protobuf, text, binary, custom)
  • Split into partitions for distributed data scaling
  • Configured with replication factor for fault tolerance
  • Support configurable retention policies (time/size based)
  • Immutable append-only logs - once written, cannot be modified
  • Names are case-sensitive (alphanumeric, dots, underscores, hyphens)
  • Internal topics: __consumer_offsets and __transaction_state
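
These properties translate directly into the Java AdminClient API. Below is a minimal sketch of topic creation, assuming a broker reachable at localhost:9092; the topic name "orders" and the 7-day retention override are illustrative:

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumes a broker reachable at localhost:9092
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Hypothetical topic "orders": 3 partitions, replication factor 3,
            // with a 7-day time-based retention policy set at the topic level
            NewTopic orders = new NewTopic("orders", 3, (short) 3)
                    .configs(Map.of("retention.ms", "604800000"));
            admin.createTopics(Set.of(orders)).all().get();
        }
    }
}
```
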
[Figure: Kafka Topics and Partitions. Topics A and B are split into partitions; each message within a partition is identified by its offset.]

Partitions & Offsets

  • Topics have one or multiple partitions for parallel processing
  • Each partition is an ordered, immutable sequence of records
  • Messages in a partition are strictly ordered with sequential offsets
  • Offsets are partition-specific identifiers that are immutable
  • Each partition starts with offset 0
  • Default retention: 7 days (configurable by time/size)
  • Oldest messages are removed when retention limits are reached
  • Partitions are distributed across brokers for load balancing
  • Each has a leader broker and zero or more follower brokers
  • Partition count can be increased but not decreased after creation
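
Because the partition count can only grow, increasing it is an explicit admin operation. A sketch with the Java AdminClient, again using the hypothetical "orders" topic:

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitions;

public class IncreasePartitionsExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Grow the hypothetical "orders" topic from 3 to 6 partitions.
            // Existing records stay where they are, but key-based routing
            // changes because keys now hash across 6 partitions.
            admin.createPartitions(Map.of("orders", NewPartitions.increaseTo(6)))
                 .all().get();
        }
    }
}
```

Shrinking is not supported; the only way to reduce partitions is to recreate the topic.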

Producer

  • Write (publish) data to Kafka topics
  • Can specify partition or let Kafka handle assignment
  • Message components: key, value, headers, timestamp
  • Compression options: none (default), gzip, snappy, lz4, zstd
  • Timestamp options: system time (default) or custom
  • Support for retries, idempotence, and exactly-once semantics (since 0.11)

Partitioning strategies:

  • Null key: round-robin distribution
  • Non-null key: consistent hashing (murmur2)
  • Custom partitioning possible
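
For the custom case, a producer can supply its own Partitioner implementation. A hypothetical sketch that pins keys with a "vip-" prefix to partition 0 and otherwise mirrors the default murmur2 hashing:

```java
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Hypothetical partitioner: routes keys starting with "vip-" to partition 0,
// falls back to murmur2 hashing (the default keyed strategy) otherwise.
public class VipFirstPartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (keyBytes == null) {
            return 0; // no key: this sketch just uses partition 0
        }
        if (key instanceof String && ((String) key).startsWith("vip-")) {
            return 0;
        }
        // murmur2 hash of the serialized key, same as the default partitioner
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    @Override
    public void configure(Map<String, ?> configs) {}

    @Override
    public void close() {}
}
```

It is registered through the producer's partitioner.class setting (ProducerConfig.PARTITIONER_CLASS_CONFIG).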

Acknowledgment modes (acks):

  • acks=0: No acknowledgment (fire and forget)
  • acks=1: Leader acknowledgment only (default before Kafka 3.0)
  • acks=all/-1: Full acknowledgment from leader and all in-sync replicas (default since Kafka 3.0)
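
Putting these options together, a minimal producer sketch; the broker address, topic, key, and header values are illustrative:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");               // leader + all in-sync replicas
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");   // none/gzip/snappy/lz4/zstd
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true"); // safe retries, no duplicates

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Non-null key: murmur2 hashing routes every "order-42" record to one partition
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("orders", "order-42", "{\"status\":\"shipped\"}");
            record.headers().add("source", "warehouse-service".getBytes());
            producer.send(record, (metadata, exception) -> {
                if (exception == null) {
                    System.out.printf("wrote to partition %d at offset %d%n",
                            metadata.partition(), metadata.offset());
                }
            });
            producer.flush();
        }
    }
}
```
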
[Figure: Kafka Producers and Consumers. Producers write to a topic's partitions (key-based routing, acks, compression); within a consumer group each partition is read by one consumer, and messages in each partition stay ordered with immutable offsets.]

Consumers

  • Pull (fetch) data from Kafka topics
  • Read messages in exact write order within each partition
  • Use deserializers for various data formats
  • Maintain position by tracking last consumed offset
  • Offset reset policies: earliest, latest, none
  • Configurable fetch settings for throughput vs. latency optimization
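
A minimal consumer sketch showing deserializers, the offset reset policy, and a poll loop; the group id and topic name are placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-readers"); // hypothetical group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // earliest | latest | none

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Records arrive in write order within each partition
                    System.out.printf("p%d@%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```
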

Consumer Groups

  • Organized for parallel processing
  • Each consumer assigned exclusive partitions within a group
  • Dynamic rebalancing when consumers join/leave
  • Consumers sit idle if there are more consumers than partitions
  • Identified by unique group.id
  • Offsets committed to __consumer_offsets topic
  • Managed by a group coordinator broker
  • Partition assignment strategies: Range, RoundRobin, Sticky, CooperativeSticky
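
The assignment strategy is a consumer-side setting. A small sketch selecting the cooperative strategy; the group id and topic name are placeholders:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GroupConfigExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Every consumer sharing this group.id divides the topic's partitions among itself
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-readers");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // CooperativeStickyAssignor rebalances incrementally: unaffected consumers
        // keep their partitions instead of the whole group stopping (eager protocol)
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                CooperativeStickyAssignor.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders")); // joining triggers a group rebalance
        }
    }
}
```
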
[Figure: Kafka Consumer Groups and Delivery Semantics. A 4-partition topic consumed by Group A (4 consumers, 1 partition each) and Group B (2 consumers, 2 partitions each); offsets are stored in the __consumer_offsets topic. Delivery semantics: at least once (default, commit after processing, possible duplicates), at most once (commit on receive, possible data loss), exactly once (Transactions API, no duplicates, no loss).]

Delivery semantics for consumers

At least once (default):

  • Commit after processing
  • May cause duplicates if failure occurs
  • Requires idempotent consumers
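
A sketch of the at-least-once pattern with auto-commit disabled; process() stands in for whatever handler the application provides:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class AtLeastOnceExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-readers");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit manually

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // hypothetical handler; must tolerate duplicates
                }
                // Commit AFTER processing: a crash before this line replays the batch
                consumer.commitSync();
            }
        }
    }

    static void process(ConsumerRecord<String, String> record) {
        System.out.println(record.value());
    }
}
```
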

At most once:

  • Commit on receive, before processing
  • No reprocessing on failure, potential data loss

Exactly once:

  • Via Kafka Transactions API
  • Requires a transactional (idempotent) producer and consumers reading with isolation.level=read_committed
  • Primarily for Kafka-to-Kafka workflows
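
A consume-transform-produce sketch using the Transactions API; the topic names, group id, and transactional id are illustrative:

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class ExactlyOnceExample {
    public static void main(String[] args) {
        Properties cProps = new Properties();
        cProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        cProps.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-etl");
        cProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        cProps.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        cProps.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // skip aborted records

        Properties pProps = new Properties();
        pProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        pProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        pProps.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "orders-etl-1"); // implies idempotence

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pProps)) {
            producer.initTransactions();
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                if (records.isEmpty()) continue;
                producer.beginTransaction();
                try {
                    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                    for (ConsumerRecord<String, String> r : records) {
                        producer.send(new ProducerRecord<>("orders-enriched", r.key(), r.value()));
                        offsets.put(new TopicPartition(r.topic(), r.partition()),
                                    new OffsetAndMetadata(r.offset() + 1));
                    }
                    // Output records and consumed offsets commit atomically
                    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
                    producer.commitTransaction();
                } catch (Exception e) {
                    // A production loop would also rewind the consumer to the
                    // last committed offsets before retrying
                    producer.abortTransaction();
                }
            }
        }
    }
}
```
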

Kafka brokers

  • Distributed system of multiple servers (3-100+)
  • Each identified by integer ID
  • Connect via bootstrap servers
  • Manage partitions, handle requests, manage replication
  • Automatic leadership transfer on failure
  • Controller broker manages administrative operations
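
Connecting through bootstrap servers and listing the cluster is a short AdminClient sketch; the broker addresses are placeholders, and any one reachable broker returns metadata for the whole cluster:

```java
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DescribeClusterResult;
import org.apache.kafka.common.Node;

public class DescribeClusterExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092,localhost:9093");

        try (AdminClient admin = AdminClient.create(props)) {
            DescribeClusterResult cluster = admin.describeCluster();
            for (Node node : cluster.nodes().get()) {
                // Each broker is identified by its integer ID
                System.out.printf("broker id=%d host=%s:%d%n", node.id(), node.host(), node.port());
            }
            System.out.println("controller: " + cluster.controller().get().id());
        }
    }
}
```
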
[Figure: Kafka Brokers and Replication. Three brokers (ids 101-103) hold leader and follower copies of each partition with replication factor 3; for each partition one broker is the leader and the others are followers (ISR), so the cluster tolerates 2 broker failures without data loss. The controller broker manages administrative tasks and leadership elections.]

Topic replication factor

  • Replication factor = number of copies per partition
  • Recommended: factor of 3 for production
  • Must be ≤ number of brokers
  • Set at topic level, can differ between topics
  • With factor N, tolerate N-1 broker failures

Concept of Leader for a Partition

  • One leader per partition handles all reads/writes
  • Followers passively replicate from leader
  • In-Sync Replicas (ISR) = leader + caught-up followers
  • With 3 partitions, RF=3, 3 brokers: each partition has ISR=3
  • Automatic leadership transfer on failure
  • Controller manages elections
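
Leader and ISR assignments can be inspected per partition. A sketch using the AdminClient's describeTopics on the hypothetical "orders" topic (allTopicNames requires Kafka 3.1+ clients):

```java
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class LeaderIsrExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(Set.of("orders"))
                    .allTopicNames().get().get("orders");
            for (TopicPartitionInfo p : desc.partitions()) {
                // One leader node per partition; isr() lists the caught-up replicas
                System.out.printf("partition %d: leader=%d replicas=%s isr=%s%n",
                        p.partition(), p.leader().id(), p.replicas(), p.isr());
            }
        }
    }
}
```
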

Kafka Topic durability

  • Replication factor N tolerates N-1 broker failures
  • Enhanced by min.insync.replicas setting
  • Strongest guarantees: RF=3, min.insync.replicas=2, acks=all
  • Trade-off: higher durability vs. latency/throughput
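
A sketch that applies the min.insync.replicas=2 recommendation to a topic via an incremental config update; the topic name is illustrative:

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class DurabilityConfigExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // With RF=3 and min.insync.replicas=2, acks=all writes succeed only
            // once at least 2 replicas (leader included) have the record, and the
            // topic keeps accepting writes with one broker down.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "orders");
            AlterConfigOp setMinIsr = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, Set.of(setMinIsr))).all().get();
        }
    }
}
```
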
[Figure: ZooKeeper vs KRaft. Before Kafka 4.0, a ZooKeeper ensemble managed metadata, broker coordination, and leader election for the brokers; from 4.0, a KRaft controller quorum provides self-managed consensus and a simplified architecture. Version timeline: KRaft preview in v2.8, supported in v3.0, production-ready in v3.3, ZooKeeper removed in v4.0.]

ZooKeeper

  • Managed metadata and broker coordination (historically)
  • Handled broker registration, configurations, elections
  • Stored consumer offsets in early releases, before they moved to the __consumer_offsets topic
  • Required until Kafka 2.7
  • KRaft mode preview in 2.8, official in 3.0
  • Production-ready in 3.3.0
  • Removed completely in 4.0
  • Typically used 3-5 nodes (7 for large clusters)
  • Leader-follower ensemble with quorum consensus

Related Articles

  • Controlling Processing Rate in Kafka Consumers: learn how to control message processing rates in Kafka consumers for optimized throughput.
  • Base64 Encoding for Message Serialization: useful for encoding binary data in Kafka messages.