A semantic but important point: Kafka is not a queue, it's a distributed append-only log. I deal with so many people who think it's a super-scalable replacement for an MQ, and that's the wrong way to think about it.
Yes, but the practical reality is that it can be used in exactly the same way as a queue, and you can make it work just as well as any MQ-based system. I know this because I moved a system from RabbitMQ to Kafka for additional scalability requirements, and it worked perfectly.
So sure "technically" it's not a queue, but in reality its used as a queue for 1000s of companies around the world for huge production workloads which no MQ system can support.
> you can make it work just as well as any MQ-based system
you really can't. getting per-message acks, dynamically scaling competing consumers without having to repartition while retaining ordering, etc. requires a ton of hacks: client-side tracking, building your own storage on top of offset metadata (sketched below), and so on. and you still won't have all of the features actual message queues provide.
to make it worse, there is very little public work/discussion on this, so you'll be on your own figuring out all of the quirks. the only notable example is https://github.com/confluentinc/parallel-consumer which is effectively abandoned
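for a taste, here's roughly what the offset-metadata hack looks like with the plain Java client (topic, group, and the "45,47" encoding are all made up; note the metadata string is size-limited broker-side):

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class PerMessageAckHack {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "per-message-ack-demo"); // hypothetical group
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("orders", 0); // hypothetical topic
            consumer.assign(List.of(tp));

            // Commit a watermark offset while smuggling a client-defined
            // record of individually acked offsets *beyond* the watermark
            // into the metadata string. The broker stores this but never
            // interprets it; decoding it and skipping already-processed
            // records after a restart is entirely your problem.
            String ackedPastWatermark = "45,47"; // made-up encoding
            consumer.commitSync(Map.of(tp, new OffsetAndMetadata(42L, ackedPastWatermark)));
        }
    }
}
```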
There is a whole KIP, in “preview” as of Kafka 4.1, to handle this use case natively: https://cwiki.apache.org/confluence/plugins/servlet/mobile?c...
Note: I haven’t had a chance to test it out in anger personally yet.
yep, i mentioned that in another thread below. very excited to see it added, but it will be some time until it's production ready and the full feature set is implemented
I don't think "while retaining ordering" is something you want to include here, since you can't guarantee processing order for any MQ system without serializing the consumption down to a single consumer.
absolute order, you're correct. but several systems support demultiplexing an interleaved stream by some key so that order is retained for all messages in each key space (pulsar key_shared subscriptions, azure service bus sessions, etc). you're still serializing consumption / have an exclusive consumer for each key, but with global parallelism.
here's the equivalent in parallel-consumer: https://github.com/confluentinc/parallel-consumer?tab=readme...
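the underlying idea fits in a few lines, though. a conceptual sketch (not the parallel-consumer API; all names invented): hash each record's key onto a single-threaded lane, so per-key order survives while distinct keys run in parallel.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Conceptual sketch of key-ordered parallelism (NOT the parallel-consumer
// API): records with the same key always land on the same single-threaded
// lane, preserving per-key order, while different keys run concurrently.
class KeyOrderedDispatcher {
    private final ExecutorService[] lanes;

    KeyOrderedDispatcher(int laneCount) {
        lanes = new ExecutorService[laneCount];
        for (int i = 0; i < laneCount; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    void dispatch(String key, Runnable handler) {
        // floorMod guards against negative hashCodes.
        lanes[Math.floorMod(key.hashCode(), lanes.length)].submit(handler);
    }
}
```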
Don't you always need a database after reading events from Kafka, for deduplication?
So the competing solutions are: PostgreSQL or Kafka+PostgreSQL
Kafka does provide some extras there: handling load spikes, serving more clients than PG can handle natively, and resilience to some DB downtime. But is it worth the complexity? In most cases, no.
Actually, you can avoid having a separate DB! You can build a materialized view of the data using [KTables](https://developer.confluent.io/courses/kafka-streams/ktable/) or use [interactive queries](https://developer.confluent.io/courses/kafka-streams/interac...). The "table" is built up from a backing Kafka topic, so you don't need to maintain another datastore if the data view you want is entirely derived from one or more Kafka topics.
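Roughly, with Kafka Streams (application id, topic, store name, and the queried key below are all placeholders):

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

import java.util.Properties;

public class ProfilesView {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("application.id", "profiles-view");
        props.put("bootstrap.servers", "localhost:9092");
        props.put("default.key.serde",
            "org.apache.kafka.common.serialization.Serdes$StringSerde");
        props.put("default.value.serde",
            "org.apache.kafka.common.serialization.Serdes$StringSerde");

        StreamsBuilder builder = new StreamsBuilder();
        // Materialize the topic into a local, queryable key-value store.
        // The "table" is rebuilt from the backing topic on restart, so
        // there is no separate datastore to maintain.
        builder.table("user-profiles", Materialized.as("profiles-store"));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Interactive query against the local state (valid once RUNNING).
        ReadOnlyKeyValueStore<String, String> store = streams.store(
            StoreQueryParameters.fromNameAndType("profiles-store",
                QueryableStoreTypes.keyValueStore()));
        System.out.println(store.get("user-42"));
    }
}
```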
It is a *very bad* replacement for an MQ system, for the simple reason that you can't quickly and effortlessly scale consumers in and out.
Why can't you? In my experience, scaling Kafka was far easier than scaling our RabbitMQ cluster. We started running into issues when our RabbitMQ cluster hit 25k TPS, our Kafka cluster of equivalent resources didn't break a sweat at 500k TPS.
Scaling consumers, not throughput. And it’s both directions (in and out, not just out).
How did you handle re-partitioning and rebalancing every time you scaled your cluster in or out?
What sort of systems do you work on to require this kind of traffic volume? I've worked on one project that I'd consider relatively high volume (UK Post Office Horizon Online) and we were only targeting 500 TPS.
Think extremely popular multiplayer videogames. We used these systems for all the backend infra, logins, logouts, chat messages, purchases.
We often had millions of players online at a given moment which means lots of transactions!
Is it true that a message from a queue will disappear after it is consumed successfully? If so, how do you make Kafka topics work as queues?
It "disappears" in the sense that the Consumer-Group that read/committed that message (event) will never see it again. It doesn't "disappear" in the sense that a new Consumer-Group can be started in a way that will get that message, or you can reset your Consumer-Group's offset to re-consume it.
Think about this for a second. Kafka offsets are a thing, consumer groups are a thing. It's trivial to ensure that each message is delivered to only one consumer if that's what you want. Consumer groups track their offset and then commit it; the message stays in Kafka, but it won't be read again.
This IMO is better behaviour than RabbitMQ, since you can always re-read messages after they have been processed, whereas with MQ systems the message is generally marked for deletion and asynchronously deleted.
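A minimal sketch of that pattern with the plain Java client (broker address, group, and topic are placeholders). Within a group, each partition has exactly one owning consumer, so each record is processed by one member, and the commit acts as the "ack":

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class OrderWorker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "order-workers");   // hypothetical group
        props.put("enable.auto.commit", "false"); // commit = explicit "ack"
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records =
                    consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.println(record.value()); // stand-in for real work
                }
                // Commit the consumed offsets: these records won't be
                // redelivered to this group, but they remain in the log.
                consumer.commitSync();
            }
        }
    }
}
```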
> It's trivial to ensure that each message is delivered to only one consumer
Exactly-once delivery is one of the hardest distributed systems problems. If you’ve “trivially” solved it, please show us your solution.
> It's trivial to ensure that each message is delivered to only one consumer if that's what you want. Consumer groups track their offset and then commit it; the message stays in Kafka, but it won't be read again. This IMO is better behaviour than RabbitMQ
The trivial solution is to use Kafka. They're clearly saying that Kafka makes it trivial, not that it's trivial to solve from scratch.
What the parent poster described isn’t what makes Kafka’s “exactly once” semantics work. That comes from an idempotency token associated with each publication, which turns “at-least-once” semantics into effectively “exactly once” via deduplication.
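On the producer side this is a config flag; the broker then dedupes retried batches by (producer id, sequence number). A sketch (topic and payload invented; this alone gives exactly-once into the log, not end-to-end):

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class IdempotentSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Each batch carries a (producer id, sequence number) token;
        // the broker drops duplicates caused by retries.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all"); // required by idempotence
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("payments", "txn-123", "{...}"));
        }
    }
}
```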
> better behaviour than RabbitMQ, since you can always re-read messages after they have been processed
I can imagine it: a billion-dollar transaction accidentally gets processed by ten thousand client nodes due to a client-app synchronization bug, and the company rethinks its dumb data-dumper server strategy... news at 11.
To be fair, any (immutable) data structure that includes the creation timestamp can be a queue. It might not be a good queue, but it can be used as one.
On this note... has anyone here used object storage directly as a queue? How did it go?
You can make a bucket immutable, and entries have timestamps. I don't think any cloud provider makes claims about the accuracy or monotonicity of these timestamps, so you would merely get an ordering, not necessarily the ordering in which things truly occurred. But I have use cases where that is fine.
I believe with a clever naming scheme and cooperating clients it could be made to work.
I once used object storage as a queue; you can implement queue semantics at the application level, with one object per entry.
But the application was fairly low volume in data and usage, so eventual consistency and capacity were not an issue. And since timestamp monotonicity is not guaranteed when multiple clients upload at the same time, a unique ID was given to each client at startup and included in entry names to guarantee uniqueness. Metadata and prefixes were used to indicate the state of an object during processing.
Not ideal, but it was cheaper than a DB or a dedicated MQ. The application did not last, but I would try the approach again if it suited the situation.
The application I'm interested in is a log-based artifact registry. Volume would be very low. Much more important is the immutability and durability of the log.
I was thinking that writes could be indexed/prefixed into timestamp buckets according to the client's local time. This can't be trusted, of course. But the application consumers could detect and reject any writes whose upload timestamp exceeds a fixed delta from the timestamp bucket it was uploaded to. That allows for arbitrary seeking to any point on the log.
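A sketch of that scheme (all names and the skew bound are invented): writers derive the object key prefix from their local clock, and readers reject objects whose store-reported upload time strays too far from the bucket they claim.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Sketch of a bucketed-key scheme for using object storage as a log
// (all names invented).
class TimestampBuckets {
    static final DateTimeFormatter BUCKET =
        DateTimeFormatter.ofPattern("yyyy/MM/dd/HHmm").withZone(ZoneOffset.UTC);
    static final Duration MAX_SKEW = Duration.ofMinutes(5); // assumed bound

    // Writer side: key derived from the client's (untrusted) local clock,
    // e.g. "log/2024/06/01/1230/client-7/0001".
    static String objectKey(Instant localNow, String clientId, long seq) {
        return "log/" + BUCKET.format(localNow) + "/" + clientId + "/"
            + String.format("%04d", seq);
    }

    // Reader side: uploadedAt comes from the store's own object metadata;
    // reject entries whose claimed bucket is too far from the upload time.
    static boolean acceptable(Instant bucketStart, Instant uploadedAt) {
        return Duration.between(bucketStart, uploadedAt).abs()
                       .compareTo(MAX_SKEW) <= 0;
    }
}
```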
Do you mean this in the sense that listeners don't remove messages, as one would expect from a queue data structure?
Well, it's impractical to handle messages individually in Kafka; it's designed around committing offsets for entire batches (since it's a distributed append-only log). You can still do it, but the performance will be no better than an SQL database.
That is the major difference: clients track their read offsets rather than the structure removing messages. There aren't really "listeners" in the pub-sub sense?
> There aren't really "listeners" in the pub-sub sense?
They are still "listeners"; it's just that events aren't pushed to the listener. Instead, the API is designed for sheer throughput.
Exactly. There's no concept in Kafka (yet...) of "acking" or DLQs. Kafka is very good at what it does by being deliberately stupid: it knows nothing about your messages, or about who has consumed them and who hasn't.
That was all deliberately pushed onto consumers to manage, in order to achieve scale.
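A dead-letter queue, for instance, has to be hand-rolled on the consumer side, roughly like this (topic name and the failure policy are made up):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Hand-rolled dead-letter handling: the broker has no DLQ machinery, so
// the consumer republishes failed records to a dead-letter topic itself,
// then lets the normal offset commit move past them so the partition
// isn't blocked.
class DlqHandler {
    private final KafkaProducer<String, String> producer;

    DlqHandler(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    void handle(ConsumerRecord<String, String> record) {
        try {
            System.out.println(record.value()); // stand-in for real processing
        } catch (RuntimeException e) {
            producer.send(new ProducerRecord<>("orders.dlq", // hypothetical topic
                record.key(), record.value()));
        }
    }
}
```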
Kafka is the MongoDB of sequential storage: built for webscale and then widely adopted on the strength of impressive single-metric numbers, without regard for actual fitness for purpose in smaller operations. Fortunately, it was always at least reliable enough.
I believe RabbitMQ is much more balanced and is closer to what people expect from a high level queueing/pubsub system.
What do you mean, "no concept...of acking"? During consumption, you must either auto-commit offsets or manually commit them. If you don't, you'll get the same events over and over again.
Or you can just store your next offset in a DB and tell the consumer to read from that offset. The Kafka offset-storage topic is a convenient implementation of the most common usage pattern, but it's not mandatory to use; and again, the broker doesn't do anything with that information when serving you data.
Acking in an MQ is very different.
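A sketch of that external-offset pattern (`loadOffsetFromDb` is a hypothetical helper over your own storage):

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

import java.util.List;

class DbOffsetResume {
    // Hypothetical: reads this partition's next offset from your own DB.
    static long loadOffsetFromDb(TopicPartition tp) {
        return 0L; // stand-in
    }

    static void resume(KafkaConsumer<String, String> consumer) {
        TopicPartition tp = new TopicPartition("orders", 0); // hypothetical topic
        consumer.assign(List.of(tp));
        // Ignore __consumer_offsets entirely; the position comes from the DB.
        consumer.seek(tp, loadOffsetFromDb(tp));
    }
}
```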
Yup, that's a fair and important distinction. Kafka comes with very little of the ergonomics that make MQs good at what they do; in exchange you get blistering scale, for when blistering scale is the main problem.