Mirrored Queues Down Behaviour

Mirrored Queues Down Behaviour

Ben Murphy
I've noticed that when rabbitmq nodes become unreachable the gm
module can behave very strangely.

It looks like the server is trying to fetch its own record in the view,
but this record no longer exists, so it crashes. Its view according to
the state at the start of the function is:

'{{0,<2908.114.8>}, {0,<0.19014.9>}, {1,<23021.2099.0>}}' and its
process is '{0,<0.19014.9>}'. However, it ends up trying to look itself
up in a view that only contains this member: '{1,<23021.2099.0>}'
(https://github.com/rabbitmq/rabbitmq-server/blob/rabbitmq_v3_2_3/src/gm.erl#L1245).
I notice that the state of the view is also stored in mnesia. I think
this state is shared across the cluster but I'm not sure.

If this is true then it might be possible for a node A to consider a
gm process on node B down and then push this state to node A via
mnesia. Then when node A gets the down message for node B and tries to
update its state it will try and fetch itself from the view and will
crash.
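
To make the failure mode concrete, here is a minimal sketch of the lookup
that the stack trace below points at (orddict:fetch called from
gm:check_neighbours). It is not the gm code itself. The module name, the
function names and the atoms standing in for pids and view_member records
are all hypothetical, chosen so that the sketch runs in a plain Erlang shell.
It only shows that orddict:fetch/2 fails with function_clause when the key
is missing, while orddict:find/2 returns 'error' instead.

```erlang
-module(view_lookup_sketch).
-export([demo/0, fetch_self/2, find_self/2]).

%% Hypothetical stand-ins: member ids are {Version, Name} tuples with
%% atoms in place of pids, and values are atoms in place of
%% view_member records.

%% Same call shape as in the crash report: orddict:fetch/2 has no
%% clause for a missing key, so it fails with function_clause.
fetch_self(Self, View) ->
    orddict:fetch(Self, View).

%% Defensive variant: orddict:find/2 returns 'error' when the key is
%% absent, so the caller can decide what to do.
find_self(Self, View) ->
    case orddict:find(Self, View) of
        {ok, Member} -> {ok, Member};
        error        -> not_in_view
    end.

demo() ->
    Self = {0, node_local},
    %% The view as the process last saw it: three members.
    FullView = orddict:from_list([{{0, node_b}, member_b},
                                  {Self, member_self},
                                  {{1, node_c}, member_c}]),
    %% Assumed scenario: a newer view from which Self was already removed.
    PrunedView = orddict:from_list([{{1, node_c}, member_c}]),
    member_self = fetch_self(Self, FullView),
    not_in_view = find_self(Self, PrunedView),
    %% Looking Self up in the pruned view reproduces the error class.
    Crash = try fetch_self(Self, PrunedView)
            catch error:Reason -> Reason
            end,
    {Crash, find_self(Self, FullView)}.
```

Calling view_lookup_sketch:demo() exercises both paths. Whether gm should
tolerate a view that no longer contains the local member, or should treat
that as fatal, is a separate design question that this sketch does not try
to answer.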

** Reason for termination ==
** {function_clause,
       [{orddict,fetch,
            [{0,<0.19014.9>},
             [{{1,<23021.2099.0>},
               {view_member,
                   {1,<23021.2099.0>},
                   [{0,<2908.114.8>}],
                   {1,<23021.2099.0>},
                   {1,<23021.2099.0>}}}]],
            [{file,"orddict.erl"},{line,72}]},
        {gm,check_neighbours,1,[]},
        {gm,handle_info,2,[]},
        {gen_server2,handle_msg,2,[]},
        {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}

=ERROR REPORT==== 22-May-2014::15:12:18 ===
** Generic server <0.19014.9> terminating
** Last message in was {'DOWN',#Ref<0.0.195.189462>,process,<2908.114.8>,
                               noconnection}
** When Server state == {state,
                            {0,<0.19014.9>},
                            {{0,<2908.114.8>},#Ref<0.0.195.189462>},
                            {{1,<23021.2099.0>},#Ref<0.0.1718.177207>},
                            {resource,<<"/">>,queue,
                                <<"retry_queue_600_mo_service_12">>},
                            rabbit_mirror_queue_coordinator,
                            {2,
                             [{{0,<2908.114.8>},
                               {view_member,
                                   {0,<2908.114.8>},
                                   [],
                                   {1,<23021.2099.0>},
                                   {0,<0.19014.9>}}},
                              {{0,<0.19014.9>},
                               {view_member,
                                   {0,<0.19014.9>},
                                   [],
                                   {0,<2908.114.8>},
                                   {1,<23021.2099.0>}}},
                              {{1,<23021.2099.0>},
                               {view_member,
                                   {1,<23021.2099.0>},
                                   [],
                                   {0,<0.19014.9>},
                                   {0,<2908.114.8>}}}]},
                            2,
                            [{{0,<2908.114.8>},{member,{[],[]},0,0}},
                             {{0,<0.19014.9>},{member,{[],[]},2,2}},
                             {{1,<23021.2099.0>},{member,{[],[]},0,0}}],
                            [<0.19013.9>],
                            {[],[]},
                            [],0,undefined,
                            #Fun<rabbit_misc.execute_mnesia_transaction.1>}
_______________________________________________
rabbitmq-discuss mailing list
[hidden email]
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

Re: Mirrored Queues Down Behaviour

Matthias Radestock-3
On 23/05/14 13:21, Ben Murphy wrote:
> I've noticed that when rabbitmq nodes become unreachable the gm
> module can behave very strangely.

Please try the latest rabbit version. There are a number of bugs we have
fixed in that area.

Matthias.