Kafka & Distributed Systems

🚀 Why 100% Event Ordering in Kafka is an Illusion

Published: April 19, 2026

When dealing with events in a Microservices Architecture (MSA), we often design our systems around the fundamental principle that "Kafka guarantees order within a partition." However, even if you configure your producer for idempotence (enable.idempotence=true) and restrict your architecture to a single pod publishing to a single partition, achieving 100% perfect ordering from a strict business perspective is nearly impossible.

Here are the primary reasons why this illusion shatters in production environments:

1. Concurrency at the Application Level (Race Conditions)

The most fundamental reason is that the order can already be inverted in the application memory long before the events ever reach the Kafka broker.

  • Multi-threaded Processing: Applications like Spring Boot process incoming API requests asynchronously via a thread pool. A later request can easily overtake an earlier one.
  • OS Scheduling Context Switches: Imagine state change requests A and B for the exact same domain object arriving milliseconds apart. While thread A is executing heavy business logic, the OS scheduler might pause it and grant CPU time to thread B, allowing B to invoke the Kafka send operation first.

Here is a typical Kotlin/Spring Boot snippet highly susceptible to this exact concurrency issue:

// Anti-pattern: Sending events directly inside a business method.
@Service
class OrderService(
    private val orderRepository: OrderRepository,
    private val kafkaTemplate: KafkaTemplate<String, OrderEvent>
) {
    @Transactional
    fun updateOrderStatus(orderId: Long, status: OrderStatus) {
        val order = orderRepository.findById(orderId).orElseThrow()
        order.changeStatus(status)
        
        // ⚠️ RACE CONDITION DANGER
        // Thread might be paused by OS scheduler exactly here.
        // If Thread B executes this method for the same order, it might send() first.
        kafkaTemplate.send("order-topic", orderId.toString(), OrderEvent(order, status))
    }
}
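A common in-process mitigation for this race, sketched below under stated assumptions (the `KeyedSerialExecutor` class and its pool size are illustrative, not part of Spring or Kafka), is to funnel all sends for the same aggregate key through a single-threaded executor, so events for one order are handed to the producer in submission order:

```kotlin
import java.util.concurrent.Executor
import java.util.concurrent.Executors

// Hypothetical helper: all tasks sharing a key run on the same single-threaded
// executor, so sends for one aggregate cannot overtake each other.
class KeyedSerialExecutor(poolSize: Int = 16) {
    private val lanes: Array<Executor> = Array(poolSize) { Executors.newSingleThreadExecutor() }

    fun execute(key: String, task: Runnable) {
        // Non-negative hash picks a stable lane for this key.
        val lane = (key.hashCode() and Int.MAX_VALUE) % lanes.size
        lanes[lane].execute(task)
    }
}
```

updateOrderStatus would then enqueue the publish via execute(orderId.toString()) { kafkaTemplate.send(...) } after the transaction commits. Note this only serializes publishing within a single JVM; it does nothing across multiple pods.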

2. Asynchronous Network Communication & Application-level Retries

Network unpredictability is another major culprit in tangling the sequential order of events.

  • Limitations of Asynchronous Publishing: If a producer asynchronously publishes events A and B sequentially, a momentary network timeout during A's transmission might allow B to traverse the network successfully and reach the broker first.
  • Improper Exception Handling: Suppose the transmission of A fails, and the application attempts a retry via a catch block or a library like @Retryable. If events B and C have already been successfully published in the meantime, the retried event A will be pushed to the very end of the line, landing after C.

// Example: Application-level asynchronous retry silently reversing order
fun publishStatusChanges(events: List<OrderEvent>) {
    events.forEach { event ->
        kafkaTemplate.send("order-topic", event.orderId, event)
            .whenComplete { _, exception ->
                if (exception != null) {
                    // ⚠️ DANGER: If Msg 'A' fails here due to an unstable network,
                    // it triggers this async callback ms or seconds later.
                    // Meanwhile, Msg 'B' might have already succeeded flawlessly.
                    log.warn("Failed to send ${event.orderId}, retrying asynchronously...")
                    retryPublishing(event) // 'A' will now be appended AFTER 'B'
                }
            }
    }
}
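Rather than retrying in an application callback, the producer itself can be configured to retry while preserving order. Below is a sketch using standard Kafka ProducerConfig keys: with idempotence enabled and at most five in-flight requests per connection, the broker keeps retried batches in their original sequence.

```kotlin
import org.apache.kafka.clients.producer.ProducerConfig

val orderingSafeProducerProps = mapOf(
    ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG to true,          // broker dedups on retry
    ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION to 5, // <= 5 preserves order with idempotence
    ProducerConfig.RETRIES_CONFIG to Int.MAX_VALUE,            // retry inside the producer, not the app
    ProducerConfig.ACKS_CONFIG to "all"
)
```

This removes the callback-retry reordering shown above, but it cannot repair events that were already handed to send() in the wrong order by racing threads.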

3. Dynamic Infrastructure Changes in Cloud Environments

In cloud-native environments like Kubernetes (EKS), the infrastructure is relentlessly dynamic.

  • Consumer Rebalancing: When pods scale horizontally (via HPA) or roll out, partition assignments shift. Messages whose offsets were not yet committed get re-polled by newly assigned pods, which can replay and interleave state changes.
  • Partition Scaling: Adding partitions changes the key-hash-to-partition mapping. Successive events for the same userId can suddenly land on a different partition, read by a different consumer with no ordering relationship to the old one.

// Theoretical demonstration of how partition scaling breaks the key-to-partition mapping
// (requires org.apache.kafka.common.utils.Utils from kafka-clients)
fun determineTargetPartition(key: String, partitionCount: Int): Int {
    // Mirrors Kafka's DefaultPartitioner: murmur2 hash made non-negative, then modulo.
    // Kafka uses Utils.toPositive (hash and Int.MAX_VALUE), not Math.abs, which
    // would overflow for Int.MIN_VALUE.
    return (Utils.murmur2(key.toByteArray()) and Int.MAX_VALUE) % partitionCount
}

fun main() {
    val userId = "User_9921"
    
    // Scenario 1: Topic initially has 2 partitions
    println(determineTargetPartition(userId, 2)) // some partition p in [0, 2)
    
    // Scenario 2: Traffic surges, Ops scales the topic to 3 partitions.
    // The same key is now hashed modulo 3 and will, in general, map to a
    // different partition than before, breaking per-key ordering.
    println(determineTargetPartition(userId, 3)) // generally != p
}

💡 Practical Best Practices: Defensive Consumers

Rather than attempting the Sisyphean task of preserving chronological order at the precarious infrastructure layer, it is vastly more robust to design defensive logic at the consumer layer.

Optimistic Locking / Timestamp Versioning

Compare the incoming event's occurredAt timestamp against the record's last-updated time in the database. If the event is older, discard it: stale, out-of-order arrivals are simply filtered out rather than applied.

// Consumer-layer Idempotency using Timestamp/Versioning
@KafkaListener(topics = ["order-topic"])
@Transactional
fun handleOrderEvent(event: OrderEvent) {
    val order = orderRepository.findById(event.orderId).orElseThrow()
    
    // Defensive logic: Completely neutralize out-of-order event risks.
    // If the DB's updated time is strictly newer than the event's creation time, skip.
    if (event.occurredAt.isBefore(order.lastUpdatedAt)) {
        log.warn("Ignored stale/out-of-order event for order ${event.orderId}")
        return
    }
    
    // Proceed with business logic securely...
    order.syncStateFromEvent(event)
}
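The heading also mentions optimistic locking; a version-number variant of the same defense is sketched below (the version fields and applyEvent are assumptions for illustration, not part of the article's model):

```kotlin
// Sketch: version-based stale-event filtering, assuming the event carries a
// monotonically increasing 'version' assigned at the source.
@KafkaListener(topics = ["order-topic"])
@Transactional
fun handleVersionedOrderEvent(event: OrderEvent) {
    val order = orderRepository.findById(event.orderId).orElseThrow()

    // A monotonic version beats wall-clock timestamps, which can tie or
    // drift backwards across hosts.
    if (event.version <= order.version) {
        log.warn("Stale event v${event.version} for order ${event.orderId} (current v${order.version})")
        return
    }
    order.applyEvent(event) // also advances order.version to event.version
}
```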

💡 Attempting 100% Ordering: Transactional Outbox Pattern

If an absolute guarantee is mandatory without entirely destroying throughput, the Transactional Outbox Pattern paired with CDC is the architectural gold standard.

// Transactional Outbox Implementation approach
@Service
class PaymentService(
    private val paymentRepository: PaymentRepository,
    private val outboxRepository: OutboxRepository,
    private val objectMapper: ObjectMapper // needed to serialize the payload below
) {
    @Transactional
    fun processPayment(paymentReq: PaymentRequest) {
        // 1. Process local core business logic securely
        val payment = paymentRepository.save(Payment(paymentReq))
        
        // 2. Save the event within the SAME local DB transaction.
        //    No direct kafkaTemplate.send() happens here.
        outboxRepository.save(OutboxEntity(
            aggregateId = payment.id.toString(),
            aggregateType = "PAYMENT",
            payload = objectMapper.writeValueAsString(PaymentRequestedEvent(payment))
        ))
    }
}
// An external CDC process (e.g. Debezium) tails the DB transaction log and
// emits the outbox rows into Kafka in commit order.

Because the business record and the outbox record are committed atomically in the same database transaction, the application-level sequencing risk is eliminated: ordering is determined by the database commit log rather than by racing threads.
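If a CDC pipeline is not available, the same outbox table can be drained by a polling relay. The sketch below assumes a Spring Data repository with a published flag and an auto-increment id; the derived query name and entity fields are illustrative, not a prescribed schema:

```kotlin
// Sketch: scheduled relay draining the outbox in insertion (commit) order.
@Scheduled(fixedDelay = 500)
@Transactional
fun relayOutbox() {
    // Oldest-first: the auto-increment id reflects commit order.
    outboxRepository.findTop100ByPublishedFalseOrderByIdAsc().forEach { entry ->
        // Blocking send: strict ordering at the cost of throughput.
        kafkaTemplate.send("payment-topic", entry.aggregateId, entry.payload).get()
        entry.published = true
    }
}
```

Only one relay instance may run at a time (or rows must be locked), otherwise concurrent relays reintroduce the very duplication and reordering the outbox was meant to prevent.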

🎯 Conclusion: The Architect's Choice

Ultimately, 100% guaranteed order is not technically impossible; rather, it is a demanding question of: "Is your business truly willing to pay the steep, punitive costs in degraded performance and massive infrastructure complexity to achieve it?"

In the vast majority of practical, high-throughput environments, the most cost-effective and pragmatic compromise is not to make the underlying system more convoluted, but to lean on consumer-side idempotency and discard stale data based on robust versioning or timestamps.
