rabbitmq cluster startup problem

Ramesh Natarajan
We are currently running a 10-node rabbitmq cluster on RHEL 6.5 (64-bit). We use auto-configuration for the cluster: the first node has no cluster_nodes specified in rabbitmq.config, and all other nodes have just the first node specified in cluster_nodes. I bring up the first node and then the rest of the nodes. The cluster is set up correctly and things seem to work fine.
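
For reference, the configuration described above would look roughly like the sketch below on nodes 2 through 10. The node name 'rabbit@node1' is a placeholder for the first node, and the tuple form of cluster_nodes assumes a RabbitMQ 3.x release; the exact RabbitMQ version in use is not stated here, and older 2.x releases took a plain list of node names instead.

```erlang
%% rabbitmq.config on nodes 2..10 (sketch only).
%% 'rabbit@node1' is a hypothetical name standing in for the first node.
%% The first node itself omits the cluster_nodes entry entirely.
[
  {rabbit, [
    %% Assumed RabbitMQ 3.x form: {[Nodes], NodeType}.
    %% Assumed 2.x form would be: {cluster_nodes, ['rabbit@node1']}
    {cluster_nodes, {['rabbit@node1'], disc}}
  ]}
].
```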

However, occasionally when the nodes reboot, startup hangs at "Starting rabbitmq-cluster". It seems to hang forever and never times out. In some cases we have left the system for a couple of hours and it still did not time out, which suggests a deadlock or something similar. Resetting a node in the hung state sometimes recovers it and sometimes doesn't.

The strange part is I cannot reproduce this at will but it happens nevertheless.

Has anyone seen this behavior?

Is specifying cluster_nodes the way I described the correct way to do so?

I would appreciate any suggestions on how to deal with this issue.

                        {os_mon,"CPO  CXC 138 46","2.2.7"},
                        {xmerl,"XML parser","1.2.10"},
                        {mnesia,"MNESIA  CXC 138 12","4.5"},
                        {sasl,"SASL  CXC 138 11","2.1.10"},
                        {stdlib,"ERTS  CXC 138 10","1.17.5"},
                        {kernel,"ERTS  CXC 138 10","2.14.5"}]},
 {erlang_version,"Erlang R14B04 (erts-5.8.5) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:30] [kernel-poll:true]\n"},

rabbitmq-discuss mailing list has moved to https://groups.google.com/forum/#!forum/rabbitmq-users,
please subscribe to the new list!
