Apache Kafka® Cheatsheet
Key concepts
What is Apache Kafka?

Apache Kafka is a distributed event streaming platform built around an immutable, append-only commit log. Producers write records to topics; consumers read at their own pace; the log retains data on disk for hours, days, or forever. Unlike a traditional message queue, Kafka does not delete messages on consumption - many independent consumers can replay the same stream.
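The retain-and-replay behaviour described above can be illustrated with a small in-memory model. This is only a sketch: `ToyLog` and `ToyConsumer` are hypothetical names invented for illustration, not part of any Kafka client, and a single Python list stands in for one partition's on-disk log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """Hypothetical in-memory stand-in for one partition: an append-only list."""
    records: list = field(default_factory=list)

    def append(self, value):
        """Append a record and return its offset (its position in the log)."""
        self.records.append(value)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        """Return records starting at offset; reading never deletes anything."""
        return self.records[offset:offset + max_records]


@dataclass
class ToyConsumer:
    """Each consumer tracks its own offset, independent of other consumers."""
    log: ToyLog
    offset: int = 0

    def poll(self, max_records=10):
        """Fetch the next batch and advance only this consumer's offset."""
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Move to a chosen offset, for example 0 to replay from the start."""
        self.offset = offset


log = ToyLog()
for event in ["order-created", "order-paid", "order-shipped"]:
    log.append(event)

fast = ToyConsumer(log)
slow = ToyConsumer(log)

fast_batch = fast.poll()                # reads all three records
slow_batch = slow.poll(max_records=1)   # reads only the first record

fast.seek(0)                            # rewind and replay the same stream
replayed = fast.poll()
```

Because consumption only moves a consumer's own offset, `slow` is unaffected by how far `fast` has read, and rewinding `fast` returns the same records again.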

Kafka scales horizontally by partitioning each topic across brokers and replicating partitions for durability. It is the de facto backbone for event-driven architectures: connecting microservices, feeding data lakes, powering CDC pipelines, and serving as the source-of-truth log for stream processors like Flink.
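A rough sketch of how partitioning and replication fit together, under simplifying assumptions: `partition_for` and `assign_replicas` are hypothetical helpers, `zlib.crc32` stands in for the murmur2 hash that Kafka's default partitioner applies to keyed records, and the round-robin placement is a simplification rather than Kafka's actual assignment algorithm.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition: hash the key, then take the modulo.

    crc32 is a stand-in hash chosen for this sketch; it is not what Kafka uses.
    """
    return zlib.crc32(key) % num_partitions


def assign_replicas(num_partitions, brokers, replication_factor):
    """Place each partition's replicas on distinct brokers, round-robin.

    Simplified illustration only; real placement also considers racks
    and uses a randomized starting broker.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the broker count")
    assignment = {}
    for p in range(num_partitions):
        assignment[p] = [
            brokers[(p + r) % len(brokers)] for r in range(replication_factor)
        ]
    return assignment


brokers = ["broker-1", "broker-2", "broker-3"]
layout = assign_replicas(num_partitions=6, brokers=brokers, replication_factor=2)

# The same key always hashes to the same partition, so records for one key
# stay in order relative to each other.
first = partition_for(b"customer-42", 6)
second = partition_for(b"customer-42", 6)
```

Since a key always lands in the same partition, ordering holds per key, while different keys spread across partitions and brokers for throughput.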

When Kafka is the right fit

Less ideal for: request/response RPC (use gRPC), small low-throughput queues with complex routing (use RabbitMQ), or per-message TTL/priority semantics (Kafka guarantees FIFO ordering only within a partition and has no per-message TTL or priority).

Often replaces: RabbitMQ, ActiveMQ, IBM MQ and other JMS brokers at scale; ad-hoc HTTP webhook fan-out; in-house log shippers feeding a data lake.

Project resources
Releases & stats
Release lines: 4.1.0, 4.0.0, 3.9.x, 3.8.x, 3.7.x, 3.6.x
Latest: 4.1.0 (Apr 2026)
GitHub stars: ~30k
Contributors: ~1,400
ASF top-level since: 2012