Greetings,

In reviewing the mailing list archives, I see various threads which state that ensuring "exactly once" delivery requires deduplication by the consumer. For example:

> "Exactly-once requires coordination between consumers, or idempotency,
> even when there is just a single queue. The consumer, broker or network
> may die during the transmission of the ack for a message, thus causing
> retransmission of the message (which the consumer has already seen and
> processed) at a later point."
> http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2009-July/004237.html

In the case of competing consumers which pull messages from the same queue, this will require some sort of shared state between consumers to de-duplicate messages (assuming the consumers are not idempotent). Our application is using RabbitMQ to distribute tasks across multiple workers residing on different servers, which adds to the cost of sharing state between the workers.

Another message in the email archive mentions that "You can guarantee exactly-once delivery if you use transactions, durable queues and exchanges, and persistent messages, but only as long as any failing node eventually recovers."

As I understand it, the transaction only affects the publishing of the message into RabbitMQ and prevents the message from being queued until the transaction is committed. If this is correct, I don't understand how the transaction prevents a duplicate message in the previously mentioned scenarios that cause a retransmission. Can anybody clarify?

On a more practical level: What's the recommended way to deal with the potential of duplicate messages? What do people generally do? Is this a rare enough edge case that most people just ignore it?

Thanks,

Mike

_______________________________________________
rabbitmq-discuss mailing list
[hidden email]
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
Hi Mike,
On Tue, Aug 03, 2010 at 04:43:56AM -0400, Mike Petrusis wrote:
> In reviewing the mailing list archives, I see various threads which state
> that ensuring "exactly once" delivery requires deduplication by the
> consumer. [...]
>
> Another message in the email archive mentions that "You can guarantee
> exactly-once delivery if you use transactions, durable queues and
> exchanges, and persistent messages, but only as long as any failing node
> eventually recovers."

All the above is sort of wrong. You can never *guarantee* exactly once. (There's always some argument about whether receiving message duplicates but relying on idempotency is achieving exactly once. I don't feel it does, and why should become clearer further on...)

The problem is publishers. If the server on which RabbitMQ is running crashes after committing a transaction containing publishes, it's possible the commit-ok message may get lost. The publishers then still think they need to republish, so they wait until the broker comes back up and then republish.
This can happen an infinite number of times: the publishers connect, start a transaction, publish messages, commit the transaction, and then the commit-ok gets lost, so the publishers repeat the process.

As a result, on the clients, you need to detect duplicates. Now this is a real barrier to making all operations idempotent. The problem is that you never know how many copies of a message there will be. Thus you never know when it's safe to remove messages from your dedup cache. Things like Redis apparently have the means to delete entries after an amount of time, which would at least allow you to avoid the database eating up all the RAM in the universe, but there's still the possibility that after an entry's been deleted, another duplicate will come along which you now won't detect as a duplicate.

This isn't just a problem with RabbitMQ - in any messaging system, if any message can be lost, you cannot achieve exactly-once semantics. The best you can hope for is a probability with a large number of 9s that you will be able to detect all the duplicates. But that's the best you can achieve.

Scaling horizontally is thus more tricky because, as you say, you may now have multiple consumers which each receive one copy of a message. Thus the dedup database would have to be distributed. With high message rates, this might well become prohibitive because of the amount of network traffic due to transactions between the consumers.

> What's the recommended way to deal with the potential of duplicate messages?

Currently, there is no "recommended" way. If you have a single consumer, it's quite easy - something like Tokyo Cabinet should be more than sufficiently performant. For multiple consumers, you're currently going to have to look at some sort of distributed database.

> Is this a rare enough edge case that most people just ignore it?

No idea.
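To make the expiry risk concrete, here is a minimal sketch of such a time-bounded dedup cache (purely illustrative - the class and method names are invented, and a real deployment would use Redis or similar as mentioned above):

```python
import time

class DedupCache:
    """Dedup cache whose entries expire after `ttl` seconds.

    Expiry bounds memory use, but reopens a window: a duplicate
    arriving after its entry has expired is no longer detected.
    """

    def __init__(self, ttl):
        self.ttl = ttl
        self.seen = {}  # msg_id -> time first seen

    def is_duplicate(self, msg_id, now=None):
        now = time.time() if now is None else now
        # Drop expired entries so the cache stays bounded.
        self.seen = {m: t for m, t in self.seen.items() if now - t < self.ttl}
        if msg_id in self.seen:
            return True
        self.seen[msg_id] = now
        return False
```

A duplicate arriving within the TTL is caught; one arriving later slips through - which is exactly why no finite cache can turn at-least-once into exactly-once.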
But one way of making your life easier is for the producer to send slightly different messages on every republish (they would still obviously need to have the same msg id). That way, if you receive a msg with "republish count" == 0, you know it's the first copy, so you can insert asynchronously into your shared database and then act on the message. You only need to do a query on the database whenever you receive a msg with "republish count" > 0 - thus you can tune your database for inserts and hopefully save some work: the common case will be the first copy, and lookups will be exceedingly rare.

The question then is: if you've received a msg with republish count > 0 but there are no entries in the database, what do you do? It shouldn't have overtaken the first publish (though if consumers disconnected without acking, or requeued messages, it could have), but you need some sort of synchronisation between all the consumers to ensure none are in the process of adding to the database - it all gets a bit hairy at this point. Thus if your message rate is low, you're much safer doing the insert and select on every message. If that's too expensive, you're going to have to think very hard indeed about how to avoid races between different consumers thinking they're both/all responsible for acting on the same message.

This stuff isn't easy.

Matthew
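In outline, Matthew's republish-count scheme might look like this (illustrative Python; `store` stands in for the shared dedup database, and all names are invented):

```python
def handle(msg, store):
    """Dispatch one delivery under the republish-count scheme.

    `msg` is a dict with "id" and "republish_count"; `store` is the
    shared set of already-seen message ids (e.g. a distributed DB).
    """
    if msg["republish_count"] == 0:
        # First copy: record the id (can be an async insert) and act.
        store.add(msg["id"])
        return "act"
    if msg["id"] in store:
        # A republished copy of a message someone already handled.
        return "duplicate"
    # Republished, but not in the store: the hairy case - another
    # consumer may still be mid-insert, so synchronisation is needed.
    return "uncertain"
```

The "uncertain" branch is the one that requires the synchronisation (or the insert-and-select on every message) discussed above.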
Matthew,
an excellent response and thank you for it! Yes, difficult it is! It raises a somewhat philosophical discussion around where the onus is placed in terms of guaranteeing such things as 'guaranteed once', i.e., on the client side or on the server side? The JMS standard offers guaranteed once, whereby the onus is on the server (JMS implementation) and not on the client.
What I am trying to say is that, in my opinion, client programs should be as 'simple' as possible, with the servers doing all the hard work. This is what the JMS standard forces on implementors and, perhaps to a lesser extent today, so does AMQP.
Note: the word 'server' is horribly overloaded these days. It is used here to indicate the software with which clients, producers and consumers, communicate. Oh well, off to librabbitMQ and some example programs written in COBOL...
Cheers, John
On Thu, Aug 5, 2010 at 13:22, Matthew Sackman <[hidden email]> wrote:

--
---
John Apps
(49) 171 869 1813
John Apps wrote:
> The JMS standard offers guaranteed once

What exactly do they mean by that? In particular, how do they deal with duplicates? Do they report failure, or silently let a dup through in certain situations? If you could point me to the part of the spec that sets out the JMS resolution of these issues, that'd be really useful.

Tony
In reply to this post by Matthew Sackman-3
Matthew Sackman wrote:
> As a result, on the clients, you need to detect duplicates. Now this is
> really a barrier to making all operations idempotent. The problem is
> that you never know how many copies of a message there will be. Thus you
> never know when it's safe to remove messages from your dedup cache.

The other piece of this is time-to-live (TTL). Given a finite-length dedup cache and a message TTL, you can detect and report failure. (And if the ack travels upstream to the publisher, you can report failures at the send end, too.) Without the TTL, you have silent dups on rare occasions.

Tony
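Tony's combination of a finite dedup cache and a message TTL could be sketched like this (illustrative only; names invented). Once an id may be expired from the cache, any late copy is rejected as expired - a detected, reportable failure rather than a silent dup:

```python
import time

class TTLDedupConsumer:
    """Consumer-side dedup where every message carries its publish
    time and a TTL, so the cache need only remember ids for one TTL
    window."""

    def __init__(self):
        self.seen = {}  # msg_id -> expiry time (published_at + ttl)

    def handle(self, msg_id, published_at, ttl, now=None):
        now = time.time() if now is None else now
        # The cache stays finite: once a message's TTL has passed, its
        # id can be forgotten, because any later copy is rejected below.
        self.seen = {m: exp for m, exp in self.seen.items() if exp > now}
        if now > published_at + ttl:
            return "expired"    # detected and reportable, not silent
        if msg_id in self.seen:
            return "duplicate"
        self.seen[msg_id] = published_at + ttl
        return "processed"
```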
In reply to this post by Tony Garnock-Jones-5
Tony Garnock-Jones <[hidden email]> writes:
> John Apps wrote:
>> The JMS standard offers guaranteed once
>
> What exactly do they mean by that? In particular, how do they deal
> with duplicates? Do they report failure, or silently let a dup through
> in certain situations?

As an API spec, it's quite easy for JMS to mandate something apparently impossible, without hinting at how it might actually be implemented.

Most of the spec says that the PERSISTENT delivery mode gives "once-and-only-once" delivery. But section 4.4.13 (of JMS 1.1) admits that there are a number of caveats to this. So it's really "once-and-only-once-except-in-some-corner-cases".

I think the wrinkle that might prevent us saying that RabbitMQ gives the same guarantees is on the publishing side. The caveats in JMS all seem to apply only to the consuming side. But what happens with an AMQP producer if the connection gets dropped before a tx.commit-ok gets back to the client? In that case the client has to re-publish, leading to a potential dup. This can be avoided by a de-dup filter on published messages in the broker. I don't know if JMS brokers really go to such lengths.

David

--
David Wragg
Staff Engineer, RabbitMQ
SpringSource, a division of VMware
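The retry loop David describes - where a lost tx.commit-ok is indistinguishable from a lost publish, so the client must republish - can be illustrated like this (hypothetical helper names, not a real client-library API):

```python
class LostAckError(Exception):
    """Raised when the commit-ok never arrives (connection dropped)."""

def publish_with_retry(send_and_commit, message, max_attempts=5):
    """Publish until a commit-ok is seen; returns the attempt count.

    If the ack is lost *after* the broker committed, every retry
    creates another copy of the message - the source of the dups.
    """
    for attempt in range(max_attempts):
        try:
            send_and_commit(message)
            return attempt + 1
        except LostAckError:
            continue  # commit may or may not have landed; must retry
    raise RuntimeError("gave up; message state unknown")
```

From the client's side this loop is the only safe behaviour; only a broker-side de-dup filter (or consumer-side dedup) can remove the resulting copies.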
In reply to this post by Tony Garnock-Jones-5
> John Apps wrote:
>> The JMS standard offers guaranteed once
>
> What exactly do they mean by that? In particular, how do they deal with
> duplicates? Do they report failure, or silently let a dup through in
> certain situations?

For consumers, JMS has client ack mode: the application acknowledges messages, and the server must not resend a message that has been acknowledged. A failure in the connection may result in the server resending a message which the application thinks it has acknowledged. The spec suggests: "Since such clients cannot know for certain if a particular message has been acknowledged, they must be prepared for redelivery of the last consumed message." I.e., the client application has to have an idempotency barrier.

For producers, duplicate publishing is simply prohibited. As for failure modes: "A message that is redelivered due to session recovery is not considered a duplicate message."

So JMS cannot magically do "exactly once" any more than anything else.

--Michael
On Thu, Aug 05, 2010 at 03:17:28PM +0100, Michael Bridgen wrote:
> For producers, duplicate publishing is simply prohibited.

So that seems to suggest that every message is universally unique? If this is correct, whose responsibility is it to add GUIDs (or some such) to every message? Does the client library do that automatically?

Matthew
The JMS provider sets the message id. It is supposed to be unique enough to be used for a "historical repository", but the scope of uniqueness is left to the provider. It is recommended that it should be at least unique for a given "installation". I don't think this helps on the publisher side since, as you pointed out, the notification of the completion of the publish might not make it back to the producer.

JMS requires the provider to set the redelivered flag (and optionally the delivery count field) if it thinks the message has been given to the application before. The application may or may not have seen it, but this flag can be used to trigger the check for a duplicate by the application. The use of unique message ids helps on this end.

Tony Menges
VMware, Inc.

On 8/5/10 7:25 AM, "Matthew Sackman" <[hidden email]> wrote:
> So that seems to suggest that every message is universally unique? [...]
In reply to this post by Tony Garnock-Jones-5
From my possibly naive understanding of the spec, it means quite simply that a message is guaranteed to be delivered once and only once; but I somehow do not think that is quite what you were asking?
The nice part about JMS is that it is only an API spec and says nothing about implementation.
I would have to look into the spec to see what the answer is to the question: "...how do they deal with duplicates..." etc. If I find the time, I shall be happy to look at the odd JMS implementation and see what the various vendors do in cases such as that in question.
What I do know is that one can specify notification for when a message with "guaranteed delivery" simply cannot be delivered, for whatever reason. This can be to the client or, more likely, as a message from the 'server' to those that want to know.
A relatively unknown product called Reliable Transaction Router (RTR), architected and developed long ago by DEC and still maintained and developed by HP, warns when it considers that a message *may* be a duplicate, i.e., may have been delivered previously. This is also the case when messages are being 'replayed' after a server has been brought down and is now receiving messages which flowed through the network whilst it was down.

There is much discussion around the word "guaranteed", the objection being that nothing can be "guaranteed". Of course it cannot, but if we take things to that extent, we may as well give up right away!
On Thu, Aug 5, 2010 at 15:48, Tony Garnock-Jones <[hidden email]> wrote:
--
---
John Apps
(49) 171 869 1813
In reply to this post by Tony Menges
On Thu, Aug 05, 2010 at 09:16:14AM -0700, Tony Menges wrote:
> The JMS provider sets the message id. It is supposed to be unique enough
> to be used for a "historical repository" but the scope of uniqueness is
> left to the provider. [...]
>
> JMS requires the provider to set the redelivered flag (and optionally
> the delivery count) field if it thinks the message has been given to the
> application before. [...]

Ahh, interesting. It would thus seem that JMS requires slightly more of the producer when publishing messages (more logic is required in the client library there) and AMQP possibly requires more at the consumer side.

Matthew
Thanks all for the input. I've got a better understanding of the issues now and it sounds like the issue is the same regardless of the use of transactions.
Matthew's idea of having producers add a "republish count" to the messages is a good suggestion to optimize the de-duplication of messages, but it only helps for messages resent by a producer. Can messages get duplicated while they are propagating in the broker? If duplicates are produced in the broker, they will have the same "republish count" and this method won't work.
On Thu, Aug 05, 2010 at 10:28:17PM -0400, Mike Petrusis wrote:
> Can messages get duplicated while they are propagating in the broker? If
> duplicates are produced in the broker they will have the same "republish
> count" and this method won't work.

Well, a message that is sent to an exchange which then results in the message going to several queues will obviously be duplicated. But presumably in that case your consumers, consuming from the different queues, would be doing different tasks with the messages - hence the need for the different queues in the first place. That aside, no: within a queue, Rabbit does not arbitrarily duplicate messages.

Matthew
In reply to this post by David Wragg-4
On 05/08/10 15:16, David Wragg wrote:
> I think the wrinkle that might prevent us saying that RabbitMQ gives the
> same guarantees is on the publishing side. The caveats in JMS all seem
> to apply only to the consuming side. But what happens with an AMQP
> producer if the connection gets dropped before a tx.commit-ok gets back
> to the client? In that case the client has to re-publish, leading to a
> potential dup. This can be avoided by a de-dup filter on published
> messages in the broker. I don't know if JMS brokers really go to such
> lengths.

HornetQ does duplicate detection on the server side, to get around the "lost commit-ok problem" and give us as near as possible once-and-only-once, from the publisher to the server at least.

The way we do it in HornetQ is we have a well defined header key "_HQ_DUP_ID". The client can set this with some unique value of its choice before sending (e.g. a GUID). When the server receives the message, if the _HQ_DUP_ID header is set, it looks up the value in its cache, and if it's seen it before it ignores it. The cache can optionally be persisted.
On the client side, the producer can resend the message/transaction if it does not receive a confirmation-ok, so it effectively makes sends/commits idempotent.

--
Sent from my BBC Micro Model B

Tim Fox
JBoss HornetQ - putting the buzz in messaging
http://hornetq.org
http://hornetq.blogspot.com/
http://twitter.com/hornetq
irc://irc.freenode.net:6667#hornetq
[hidden email]
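The broker-side filter Tim describes might be sketched like this (illustrative Python, not HornetQ's actual code; bounding the cache by evicting the oldest ids is just one possible policy, and any bound reopens a window for very late duplicates):

```python
from collections import OrderedDict

class DupIdFilter:
    """Broker-side duplicate filter keyed on a client-supplied dup id.

    Keeps at most `max_entries` ids, evicting the oldest first.
    """

    def __init__(self, max_entries=10000):
        self.max_entries = max_entries
        self.cache = OrderedDict()  # dup_id -> None, oldest first

    def accept(self, headers):
        """Return True if the message should be enqueued."""
        dup_id = headers.get("_HQ_DUP_ID")
        if dup_id is None:
            return True             # no dup id set: no filtering
        if dup_id in self.cache:
            return False            # seen before: ignore the copy
        self.cache[dup_id] = None
        if len(self.cache) > self.max_entries:
            self.cache.popitem(last=False)  # evict the oldest id
        return True
```

Note how eviction answers Matthew's next question at a cost: once an id has been evicted, a late republish of that message is accepted again.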
On Fri, Aug 06, 2010 at 10:43:56PM +0100, Tim Fox wrote:
> The way we do it in HornetQ is we have a well defined header key
> "_HQ_DUP_ID". The client can set this with some unique value of it's
> choice before sending (e.g. a GUID). When the server receives the
> message if the _HQ_DUP_ID header is set, it looks up the value in
> it's cache, and if it's seen it before it ignores it. The cache can
> optionally be persisted.

How do you prevent the cache from growing without bound?

Matthew
On Sat, Aug 7, 2010 at 13:50, Matthew Sackman <[hidden email]> wrote:
> How do you prevent the cache from growing without bound?

That's really like the piece-of-string question, no? Of course it can fill up, as can the DB where things are persisted for those cases where messages cannot be delivered.
Having a unique ID in every message is not something new, and not restricted to messaging, of course. It is simply a very good idea! TCP/IP claims to be a 'reliable' transport... the problem with that is that packets get 'lost' or 'dropped' or simply die of 'old age'. Similar, but more complex, problems exist with queuing.
What has not been touched on in this little discussion so far is the question of transactions, and I do not mean those in the 0.9.1 spec, but those described in the 1.0 spec. Here again, JMS is leading the way with something which in my mind is as necessary as guaranteed once (or at least once). Updating DBs from queues and posting the results of those updates to queues should be atomic; and if I want my debit/credit to happen once rather than many times or not at all, then a combination of transactions and guaranteed delivery becomes very attractive both to the designer and the developers. Yes, ACID comes to mind here...and it is indeed what I am referring to.
It is great to participate in conversations of this nature - thank you for putting up with my sometimes oblique ramblings :-)

---
John Apps
(49) 171 869 1813
John,
John Apps wrote:
> That's really like the piece of string question, no? Of course it can
> fill up, as can the DB where things are persisted for those cases where
> messages cannot be delivered.
> Having a unique ID in every message is not something new and not
> restricted to messaging, of course. It is simply a very good idea!

I believe Matthew was simply trying to point out that many of the supposed guarantees of messaging systems are a lot softer than most people think. In reality a "guarantee" is little more than an increase in the probability that the right thing will happen. Coming clean about that is going to be important for cloud computing to succeed - improving the probabilities does come at a price, and for systems at massive scales the cost/benefit calculations look quite different.

So, for example, using publisher-supplied message ids for de-duping simply does not scale. Think what a genuine cloud messaging system would have to do to handle the case where a producer injects the same message first at a node in Australia and then in New York.

> What has not been touched on in this little discussion so far is the
> question of transactions

Similar considerations apply here. XA in the cloud? Hmmm.

Regards,

Matthias.
Matthias Radestock wrote:
> So, for example, using publisher-supplied message ids for de-duping
> simply does not scale. Think what a genuine cloud messaging system would
> have to do to handle the case where a producer injects the same message
> first in a node in Australia and then in New York.

What is the problem you're thinking of? Would a setup like the following cope?

- publishers choose a message ID
- publishers choose a TTL
- receivers dedup based on message ID
- receiver's dedup buffer is expired by (some factor of) TTL
- each delivery contains an address to which the ACK should be routed

Tony
Tony,
Tony Garnock-Jones wrote:
> Would a setup like the following cope?
>
> - publishers choose a message ID
> - publishers choose a TTL
> - receivers dedup based on message ID
> - receiver's dedup buffer is expired by (some factor of) TTL
> - each delivery contains an address to which the ACK should be routed

That's end-to-end dedup you are thinking of. Nothing wrong with that, and it doesn't require the broker to do/know anything. The context of the discussion here was a "broker dedups publishes" feature.

Matthias.
In reply to this post by Matthew Sackman-3
On Sat, Aug 7, 2010 at 12:50 PM, Matthew Sackman <[hidden email]> wrote:
> On Fri, Aug 06, 2010 at 10:43:56PM +0100, Tim Fox wrote:
>> The way we do it in HornetQ is we have a well defined header key
>> "_HQ_DUP_ID". [...] The cache can optionally be persisted.
>
> How do you prevent the cache from growing without bound?

AFAIK the normal approach with this system is to bound it arbitrarily.