Confirm consistent hash exchange behavior

SteveO
Greetings-

I wanted to get some clarification on CHX hash key routing to a queue. Let's say I have Queue1 and Queue2 bound with equal weight to a CHX. I produce a message with the routing key "Car.Color.Blue" to the CHX and it is routed to Queue2. All is well. I delete Queue2. I produce another message with the routing key "Car.Color.Blue" to the CHX and it is now routed to Queue1. All is well. I re-create and bind Queue2 to the CHX. I produce another message with the routing key "Car.Color.Blue" to the CHX and it is routed back to Queue2. That's the part that seemed odd.

The design requirement I have is that all messages with the same routing key be present in only one queue. I can't meet that requirement if a queue is deleted and re-created with the same name. So I'd either have to avoid doing that or find out whether this is expected behavior.

The setup I have is this:

P ---> TopicExchange --->CHX ---> Q1, Q2

Producers send messages with routing keys like "Car.Color.<some color>".
The CHX is bound to the TopicExchange with a routing key of "Car.Color.#".
Queues 1 and 2 are bound to the CHX with equal weight.
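
For reference, the setup above could be declared roughly like this. It is only a sketch: it assumes a pika-style channel object is passed in (connecting to the broker is left out), that the consistent hash exchange plugin is enabled so the "x-consistent-hash" type exists, and the function name `declare_topology` is made up for illustration.

```python
def declare_topology(channel, topic_exchange="TopicExchange", chx="CHX",
                     queues=("Q1", "Q2"), weight="1"):
    """Declare P -> TopicExchange -> CHX -> queues on a pika-style channel."""
    channel.exchange_declare(exchange=topic_exchange, exchange_type="topic")
    # "x-consistent-hash" is the type provided by the consistent hash
    # exchange plugin; the declaration fails if the plugin is not enabled.
    channel.exchange_declare(exchange=chx, exchange_type="x-consistent-hash")
    # Exchange-to-exchange binding: everything under Car.Color goes to the CHX.
    channel.exchange_bind(destination=chx, source=topic_exchange,
                          routing_key="Car.Color.#")
    for queue in queues:
        channel.queue_declare(queue=queue)
        # For this exchange type the binding key is the weight, written as a
        # numeric string; using the same string for all queues gives equal weight.
        channel.queue_bind(queue=queue, exchange=chx, routing_key=weight)
```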

I am on RabbitMQ 3.0.1.

Thanks.

s

Re: Confirm consistent hash exchange behavior

francois

On 2013-03-14 at 13:07, SteveO wrote:

> Greetings-
>
> I wanted to get some clarification on CHX hash key routing to a queue. Let's
> say I have Queue1 and Queue2 bound with equal weight to a CHX. I produce a
> message with the routing key "Car.Color.Blue" to the CHX and it is routed to
> Queue2. All is well. I delete Queue2. I produce another message with the
> routing key "Car.Color.Blue" to the CHX and it is now routed to Queue1. All
> is well. I re-create and bind Queue2 to the CHX. I produce another message
> with the routing key "Car.Color.Blue" to the CHX and it is routed back to
> Queue2. That's the part that seemed odd.
I find it odd that you find it odd. My expectations were exactly as you described above: the routing decision is taken in light of the bound queues at the time the routing decision is taken.

You expected that once a routing key was routed to a queue, it continued routing to that specific queue until the queue is unbound?
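
To make that concrete, here is a toy hash ring in which the route depends only on the routing key and on whichever queues are bound at that moment. It is not the plugin's actual algorithm: the class `HashRing`, the use of md5, and deriving the ring points from the queue name are all assumptions made for the illustration.

```python
import bisect
import hashlib


def _hash(text):
    # Any stable hash works for the illustration; md5 is an arbitrary choice.
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring: each bound queue owns a set of points."""

    def __init__(self, points_per_weight=100):
        self.points_per_weight = points_per_weight
        self._points = []  # sorted list of (hash, queue_name)

    def bind(self, queue, weight=1):
        for i in range(weight * self.points_per_weight):
            bisect.insort(self._points, (_hash(f"{queue}#{i}"), queue))

    def unbind(self, queue):
        self._points = [p for p in self._points if p[1] != queue]

    def route(self, routing_key):
        if not self._points:
            return None
        # First point at or after the key's hash, wrapping around the ring.
        idx = bisect.bisect_left(self._points, (_hash(routing_key), ""))
        return self._points[idx % len(self._points)][1]


ring = HashRing()
ring.bind("Queue1")
ring.bind("Queue2")
before = ring.route("Car.Color.Blue")
ring.unbind("Queue2")                  # delete Queue2
during = ring.route("Car.Color.Blue")  # only Queue1 is left to choose from
ring.bind("Queue2")                    # re-create and re-bind under the same name
after = ring.route("Car.Color.Blue")
print(before, during, after)
```

In this model nothing is remembered between messages, so `before` and `after` are always equal: the ring after the re-bind is identical to the ring before the delete.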

Bye,
François
_______________________________________________
rabbitmq-discuss mailing list
[hidden email]
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

Re: Confirm consistent hash exchange behavior

SteveO
"You expected that once a routing key was routed to a queue, it continued routing to that specific queue until the queue is unbound? "

Yeah, not sure why, but that is what I expected. I probably started thinking that because that's what I wanted it to do :)

We wanted to guarantee that at any given time, messages would only be delivered to one queue even if the bindings to other queues changed - like other queues being added. Processing messages in order is important for us and if we create a condition where messages exist in multiple queues, we can't guarantee that order. Over time, if we create and bind more queues to the same CHX, then we could create that condition - so we don't want to do that.

If we run into an issue where we need to add more queues, it sounds like with a slightly more sophisticated producer and routing scheme, we could just add another CHX. Something like this...

                             |--->CHX1--->Q1, Q2
P ---> TopicExchange
                             |--->CHX2--->Q1, Q2

Thanks.

s

Re: Confirm consistent hash exchange behavior

Simon MacMullen-2
If we were to build the CHX this way, wouldn't that mean that as soon as
you bound the first queue to it, all messages would have to get routed
to that queue - and therefore subsequent bindings would have no effect
since we can't change any routing?

Cheers, Simon

--
Simon MacMullen
RabbitMQ, VMware

Re: Confirm consistent hash exchange behavior

SteveO
Subsequent bindings would have no effect on routing keys that have already been seen. When a message with a new routing key arrives, new bindings (or even existing ones) could be used for the new routing key.

Once I looked beyond the specific requirements of our application and thought about distribution, I realized that the behavior I seek breaks the nature of the CHX. I understand why it's not doing what I want. So I guess what I am looking at is some crazy feature or a completely different exchange altogether.

Here are the high level requirements:
1. Need to scale
2. Need to process messages with the same routing key in order

Here is what is highly desired:
1. Elastic scaling of queues
2. No back-channel consumer/producer communication

A single queue with multiple consumers gives me req1, but not req2. A single queue with a single consumer gives me req2, but not req1. By way of a CHX, multiple queues and multiple consumers gets me both req1 and req2. However with the CHX, if I accomplish scaling up by adding another queue and consumer, I can run into situations where I can't guarantee req2. That's because I can get messages with the same routing key in more than one queue. For an alternative method of scale, I can accomplish req1 and req2 by adding another CHX. However, with our application that would require some back-channel communication between producers and consumers which is highly undesirable.

A type of "pick and stick" method could work well for us. Our application is processing messages for events that span a finite duration of time. These events could last a few minutes, a few hours or a few days. Messages pertaining to a specific event are signified by a GUID in the routing key. Once the event is done, messages are consumed and processed, the routing key for that event is never seen again. I just can't trust the timeliness of my consumers, and I have to guarantee message processing order. That's why I have to ensure that messages with the same routing key only exist in one queue at a time.
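
A rough sketch of what "pick and stick" could mean, written as plain in-memory Python rather than as an exchange. The class `StickyRouter` and its method names are hypothetical, and the modulo-over-bound-queues pick is just a stand-in for whatever hashing a real exchange would use.

```python
import hashlib


class StickyRouter:
    """Remember the queue chosen for each routing key ("pick and stick")."""

    def __init__(self):
        self.queues = []    # currently bound queues
        self.assigned = {}  # routing key -> queue chosen when first seen

    def bind(self, queue):
        self.queues.append(queue)

    def route(self, routing_key):
        if routing_key not in self.assigned:
            # Pick: hash only over the queues that are bound right now.
            digest = hashlib.md5(routing_key.encode("utf-8")).hexdigest()
            index = int(digest, 16) % len(self.queues)
            self.assigned[routing_key] = self.queues[index]
        # Stick: later binding changes do not move a key already seen.
        return self.assigned[routing_key]

    def finish(self, routing_key):
        """Forget a key once its event is done."""
        self.assigned.pop(routing_key, None)


router = StickyRouter()
router.bind("Q1")
router.bind("Q2")
first = router.route("Event.guid-A")
router.bind("Q3")                      # scale up while the event is in flight
later = router.route("Event.guid-A")   # still the queue picked earlier
print(first, later)
```

The sketch does not cover what should happen to assigned keys when their queue is unbound, and the `assigned` map grows until `finish` is called for each key.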

I suspect I could have discussed this situation much better if I had simply stated that rather than questioning the CHX : )

Is this a crazy feature request for a switch to turn this on in the CHX?
Am I looking at a custom exchange?
Something else?

Thanks again.

s

Re: Confirm consistent hash exchange behavior

Alvaro Videla-2
Hi,

I think this solution can address your problem:

- Create queues 1, 2, .... N
- Bind queues to the CHX
- publish the queue names to a queue called "lock" for example.

Start consumer A, and make it consume _just one message_ from queue "lock", in ack mode. That message will contain the queue name from where that consumer needs to get messages. 

Then you can start consumer B and do the same, so you get the name of the next queue where to get the messages.

Because you are using the CH exchange, messages with a certain routing key will always end up in the same queue. Therefore you can also process messages in order.

If you need to stop one of those consumers, they need to basic_reject(requeue=true) the message they first got from the "lock" queue so another consumer can use that queue. If one of those consumers crashes then the message with the queue name will get requeued. NOTE: you never ack any message from the "lock" queue.
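
The claim and release steps might look something like this. It is a sketch that assumes a pika-style channel object; the helper names `claim_queue` and `release_queue` are made up, and it uses `basic_get` to fetch a single message, which is a simplification of consuming with a prefetch of one.

```python
def claim_queue(channel, lock_queue="lock"):
    """Take one queue name from the lock queue and leave it unacknowledged."""
    method, _properties, body = channel.basic_get(queue=lock_queue, auto_ack=False)
    if method is None:
        return None  # nothing left in the lock queue: every work queue is claimed
    # Keep the delivery tag; it is needed to hand the queue name back later.
    return method.delivery_tag, body.decode("utf-8")


def release_queue(channel, delivery_tag):
    """Hand the queue name back so another consumer can claim it."""
    # Never ack: an ack would remove the queue name from "lock" for good.
    channel.basic_reject(delivery_tag=delivery_tag, requeue=True)
```

A consumer would call `claim_queue` once at startup, consume from the returned work queue on the same channel, and call `release_queue` on a clean shutdown. If it crashes instead, the unacknowledged lock message is requeued by the broker when the channel closes.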

Does this help?

Regards,

Alvaro



Re: Confirm consistent hash exchange behavior

Matthias Radestock-3
On 15/03/13 21:56, Alvaro Videla wrote:
> - Create queues 1, 2, .... N
> - Bind queues to the CHX
> - publish the queue names to a queue called "lock" for example.
> [etc]

With that approach the number of consumers must be exactly the same as
the number of queues; if you have fewer then some messages won't get
consumed, and if you have more then some consumers won't process any
messages.

Furthermore, you cannot add/remove queues since that will cause some
subsequent messages to get routed to different queues than prior to the
queue addition/removal. They will then be picked up by different consumers.

Matthias.

Re: Confirm consistent hash exchange behavior

Brett Cameron
Could you possibly get around this by extending Alvaro's idea a little bit, and have consumers do a basic.reject with requeue=true on messages (queue names) consumed from the "lock" queue?

Re: Confirm consistent hash exchange behavior

Matthias Radestock-3
On 15/03/13 22:49, Brett Cameron wrote:
> Could you possibly get around this by extending Alvaro's idea a little
> bit, and have consumers do a basic.reject with requeue=true on messages
> (queue names) consumed from the "lock" queue?

The whole point of the "lock" queue is to ensure that at most one
consumer consumes from a given queue. Requeueing messages there, and
thus releasing the lock, would break that.

Matthias.

Re: Confirm consistent hash exchange behavior

SteveO
Thanks all for the feedback.

I'm not certain how to handle my basic "elastic scale up" scenario. That would be where I have 4 queues bound to the CHX, and my consumers aren't keeping up. I add a 5th without removing any of the other 4. I think that is a condition where I could wind up with messages with the same routing key in multiple queues.
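
The 4-to-5 scale-up can be simulated with a toy hash ring to see how many keys would change queue. This is only an illustration under assumed details (md5, 100 points per queue, made-up key names), not the plugin's real hashing.

```python
import bisect
import hashlib


def _hash(text):
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


def build_ring(queues, points=100):
    """Return a sorted list of (hash, queue) points for the given queues."""
    return sorted((_hash(f"{q}#{i}"), q) for q in queues for i in range(points))


def route(ring, routing_key):
    # First point at or after the key's hash, wrapping around the ring.
    idx = bisect.bisect_left(ring, (_hash(routing_key), ""))
    return ring[idx % len(ring)][1]


keys = [f"Event.{n}" for n in range(1000)]  # stand-ins for GUID routing keys
four = build_ring(["Q1", "Q2", "Q3", "Q4"])
five = build_ring(["Q1", "Q2", "Q3", "Q4", "Q5"])

# Keys whose queue differs once Q5 is bound: (old queue, new queue).
moved = {k: (route(four, k), route(five, k))
         for k in keys if route(four, k) != route(five, k)}
print(f"{len(moved)} of {len(keys)} keys change queue when Q5 is bound")
```

Every key in `moved` goes to Q5 and nowhere else, which is the point of consistent hashing. But any of those keys that still has messages sitting in its old queue now exists in two queues, which is exactly the ordering problem described above.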

Our application will have producers sending messages with new GUIDs in the routing key throughout the course of any given day. The frequency of just how many new GUIDs over a set duration of time will vary throughout the course of the day. Business hours would have more new GUIDs than off-business hours, for example. That's why I thought it would be in my best interest to have a mechanism for scaling up and down. That just gets really tricky when I have that "must process everything in order" requirement.

s

Re: Confirm consistent hash exchange behavior

Simon MacMullen-2
Thanks for explaining your use case!

Of course, the whole point of the hashing part of the CHX is to ensure
that the broker does not need to maintain a routing key -> queue
mapping. It sounds like your case would require that. Which is not an
inconceivable thing to do, but it would mean we would need to think
about how that mapping can be prevented from constituting a memory leak.

I assume that you see new GUIDs quite frequently; is there a point at
which you know you won't see a certain GUID again? Or can you say
something like "if I haven't seen a certain GUID for 5 min, we can
forget about it?"
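
As a sketch of how such a mapping could be kept from growing without bound, entries can carry a last-seen time and be dropped after an idle period. The class `ExpiringAssignments` is hypothetical, and the 300 second default simply mirrors the "5 min" example.

```python
import time


class ExpiringAssignments:
    """Routing key -> queue map whose entries expire after an idle period."""

    def __init__(self, idle_seconds=300.0, clock=time.monotonic):
        self.idle_seconds = idle_seconds
        self.clock = clock
        self._entries = {}  # routing key -> (queue, last_seen)

    def get(self, routing_key):
        """Return the remembered queue and refresh its timer, or None."""
        entry = self._entries.get(routing_key)
        now = self.clock()
        if entry is None or now - entry[1] > self.idle_seconds:
            self._entries.pop(routing_key, None)
            return None
        self._entries[routing_key] = (entry[0], now)
        return entry[0]

    def put(self, routing_key, queue):
        self._entries[routing_key] = (queue, self.clock())

    def purge(self):
        """Drop every idle entry; called periodically to bound memory."""
        now = self.clock()
        stale = [key for key, (_queue, seen) in self._entries.items()
                 if now - seen > self.idle_seconds]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

The `clock` parameter is there only so the idle timeout can be exercised without waiting; a key that goes quiet for longer than the timeout and then reappears would be treated as new and could land on a different queue.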

Cheers, Simon

Re: Confirm consistent hash exchange behavior

Alvaro Videla-2
Why then not just have one queue per GUID and use the anon exchange to do the routing? Then remove the queue when the GUID is not needed anymore.
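
Roughly, that could look like the following on the producer side. It is a sketch assuming a pika-style channel object; the helper names are made up, and it declares the queue on every publish for simplicity.

```python
def publish_for_guid(channel, guid, body):
    """Publish to a queue named after the GUID via the default exchange."""
    # Re-declaring with the same arguments is harmless, so the first message
    # for a GUID creates the queue and later ones reuse it.
    channel.queue_declare(queue=guid)
    # The default (anonymous) exchange has the empty name and routes a
    # message to the queue whose name equals the routing key.
    channel.basic_publish(exchange="", routing_key=guid, body=body)


def retire_guid(channel, guid):
    """Remove the queue once the event is finished."""
    # if_empty=True asks the broker to refuse the delete while messages remain.
    channel.queue_delete(queue=guid, if_empty=True)
```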

Or are we talking here about mapping several GUIDs to the same queue?

BTW, a use case similar to this one is why I proposed the interval exchange.

Regards,

Alvaro



Re: Confirm consistent hash exchange behavior

Simon MacMullen-2
On 18/03/2013 10:42AM, Alvaro Videla wrote:
> Why then not just have one queue per GUID and use the anon exchange to
> do the routing? Then remove the queue when the GUID is not needed anymore.

I am assuming the use case involves large numbers of GUIDs with a few
messages each. Keeping track of all those queues and ensuring consumers
each get their share would not be fun.

> BTW, for a use case similar to this one is why I proposed the interval
> exchange.

But don't you have the same problem as with the CHX? When you want to
add a new queue, you have to rearrange the intervals somehow in order
for the new queue to receive any work... at which point you just broke
per-GUID message ordering.

Cheers, Simon

Re: Confirm consistent hash exchange behavior

SteveO
Simon-

Earlier you had asked...

> I assume that you see new GUIDs quite frequently; is there a point at
> which you know you won't see a certain GUID again? Or can you say
> something like "if I haven't seen a certain GUID for 5 min, we can
> forget about it?"

Yes, that assumption could be made. I'd have to work through some use cases to see what the timer should be. I suspect that making it configurable would keep it flexible enough for others who find this useful. Thanks.

s

Re: Confirm consistent hash exchange behavior

SteveO
I was just curious: if I wanted to accomplish the requirements mentioned earlier, would I be looking at writing a custom exchange, or was there any consideration of this as a feature request? The exact requirements felt pretty specific to our needs, so I didn't know whether others would find it useful or not.

I was just trying to identify scope of work. Thanks!

Re: Confirm consistent hash exchange behavior

Simon MacMullen-2
On 25/03/13 16:23, SteveO wrote:
> I was just curious if I wanted to accomplish the requirements mentioned
> earlier, would I be looking at writing a custom exchange, or was there any
> consideration of this as a feature request. The exact requirements felt
> pretty specific to our needs, I didn't know if others found it useful or
> not.

It's being considered as a feature request; I suspect there are others
around with similar needs.

Having said that, we're focussing on getting RabbitMQ 3.1 ready to go at
the moment, so I wouldn't expect anything to happen imminently.

Cheers, Simon

--
Simon MacMullen
RabbitMQ, VMware