Apache Flink is a distributed stream-processing engine built for stateful computations over unbounded and bounded data streams. Unlike batch systems that wait for all data before processing, Flink processes each event as it arrives, typically with latency in the millisecond range, while maintaining persistent state across millions of keys and guaranteeing exactly-once state consistency even after failures.
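Its unified API handles both streaming (continuous, never-ending) and batch (finite, historical) workloads with the same SQL or DataStream code. State is first-class: Flink periodically checkpoints the state of the entire pipeline, together with the source read positions, to durable storage. After a failure it restores the latest checkpoint and replays the input from those positions, so each event affects state exactly once. End-to-end delivery with no loss or duplication in external systems additionally requires replayable sources (such as Kafka) and transactional or idempotent sinks.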
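The checkpoint-and-replay idea can be sketched in a few lines of plain Python. This is a toy, single-process illustration and not Flink's API: the function `run` and its parameters `checkpoint_every` and `fail_at` are hypothetical names invented for this example, and real Flink takes its snapshots asynchronously across many distributed operators. The sketch only shows why snapshotting the source offset together with the keyed state gives the same result after a crash as a run with no crash.

```python
import copy
from collections import defaultdict


def run(events, checkpoint_every=3, fail_at=None):
    """Count events per key, optionally crashing once at index `fail_at`.

    Toy model only: `events` stands in for a replayable source and the
    `checkpoint` variable stands in for durable storage.
    """
    state = defaultdict(int)   # keyed state: key -> running count
    checkpoint = (0, {})       # (source offset, state snapshot)
    offset = 0
    failed = False

    while offset < len(events):
        if fail_at is not None and offset == fail_at and not failed:
            # Simulated crash: drop in-memory state, restore the last
            # snapshot, and rewind the source to the saved offset.
            failed = True
            offset, snapshot = checkpoint
            state = defaultdict(int, copy.deepcopy(snapshot))
            continue

        key = events[offset]
        state[key] += 1        # process one event as it arrives
        offset += 1

        if offset % checkpoint_every == 0:
            # Snapshot the offset and the state together, so they
            # always describe the same point in the stream.
            checkpoint = (offset, copy.deepcopy(dict(state)))

    return dict(state)


events = ["a", "b", "a", "c", "a", "b", "c", "a"]
clean = run(events)                 # no failure
recovered = run(events, fail_at=5)  # crash before the sixth event
print(clean == recovered)
```

Events processed between the last checkpoint and the crash are processed a second time after recovery, but their first effect on state was discarded with the restore, so each event is counted once in the final result. The same reasoning is why output written to an external system before the crash needs a transactional or idempotent sink to avoid duplicates.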
Less ideal for: purely ad-hoc SQL queries (use Trino/Athena), simple periodic batch reports with no streaming requirement (use Spark), or workloads that need sub-millisecond latency (use an in-memory database).
Often replaces: Apache Spark Streaming / Structured Streaming, Apache Storm, Apache Samza, or hand-rolled Kafka consumer apps with bespoke state management.