Excessive memory consumption of one server in cluster setup

Matthias Reik
Hi,

Two days ago I upgraded our RabbitMQ cluster (2 machines running in HA mode) from 2.8.1 to 2.8.5,
mainly to get those OOD fixes in, since we experienced those exact issues.

The upgrade went very smoothly, but at some point one of the machines (server2) started to allocate more
and more memory (even though all queues are more or less at 0 with almost no outstanding acks).

server1 uses ~200 MB
server2 (at the point where I took it down) used ~6 GB

I ran rabbitmqctl report, but it didn't give me any insights.
I ran rabbitmqctl eval 'erlang:memory().', but that didn't tell me much more (I will attach the output at the end).

I found people with similar problems:
http://grokbase.com/t/rabbitmq/rabbitmq-discuss/1223qcx3gg/rabbitmq-memory-usage-in-two-node-cluster
but that's a while back so many things might have changed since then. Also the memory difference was
rather minimal, whereas here the difference is _very_ significant, especially since the node with less load
has the increased memory footprint.

I can upgrade to 2.8.6 (unfortunately I upgraded just before it was released :-(), but I only want to do that if
there is some hope that the problem is solved.
I can bring server2 back online and try to investigate what is consuming that much memory, but my
RabbitMQ/Erlang knowledge is not good enough, so I am reaching out for some help.

So any help would be much appreciated.

Thx
Matthias

Our setup is something like the following:
      2 servers exclusively running RabbitMQ on CentOS 5.x (high watermark ~22 GB),
            - both with web-console enabled
            - both defined as disk nodes
            - both running RabbitMQ 2.8.5 on Erlang R15B01 (after the upgrade, Erlang was already at R15 before)
    10 queues configured with mirroring
      3 queues configured (without mirroring) only on server1 with a TTL
    Most consumers are connecting to server1, server2 only in case of failover

We get about 1k messages/sec (with peaks much higher than that) into the system, and each message is
passed through several queues for processing.

-bash-3.2$ sbin/rabbitmqctl eval 'erlang:memory().'
[{total,5445584424},
 {processes,2184155418},
 {processes_used,2184122352},
 {system,3261429006},
 {atom,703377},
 {atom_used,678425},
 {binary,3216386480},
 {code,17978535},
 {ets,4142048}]
...done.
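
One way to read the erlang:memory() figures above is to express each category as a share of total. The following is a minimal Python sketch, not part of the original report: the helper names parse_memory and shares are made up for illustration, and the numbers are copied verbatim from the output above.

```python
import re

# Output of `rabbitmqctl eval 'erlang:memory().'` as pasted above.
RAW = """
[{total,5445584424},
 {processes,2184155418},
 {processes_used,2184122352},
 {system,3261429006},
 {atom,703377},
 {atom_used,678425},
 {binary,3216386480},
 {code,17978535},
 {ets,4142048}]
"""


def parse_memory(raw):
    """Turn the {key,bytes} tuples of the Erlang term into a dict (hypothetical helper)."""
    return {key: int(value) for key, value in re.findall(r"\{(\w+),(\d+)\}", raw)}


def shares(mem):
    """Fraction of `total` taken by each category; categories overlap (hypothetical helper)."""
    total = mem["total"]
    return {key: value / total for key, value in mem.items() if key != "total"}


mem = parse_memory(RAW)

# erlang:memory/0 documents total as the sum of processes and system.
assert mem["processes"] + mem["system"] == mem["total"]

# Print categories from largest to smallest share of total.
for name, frac in sorted(shares(mem).items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {mem[name] / 2**20:10.1f} MiB  {frac:6.1%}")
```

With these figures, binary comes out at roughly 59% of total and processes at roughly 40%, while code, ets and atom are negligible. That only narrows the search to binaries and process heaps; it does not say which processes are holding them.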

PS: Sorry for any possible spam (caused by a mailbox mix-up) :-(

_______________________________________________
rabbitmq-discuss mailing list
[hidden email]
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

Re: Excessive memory consumption of one server in cluster setup

Francesco Mazzoli-2
Hi Matthias,

Can you please post the logs and `rabbitmqctl report'?  It's almost impossible
to help you without those.

Could you also try to reproduce the issue with fewer things going on (e.g. only
one queue) and describe in detail what you're doing to trigger the problem?
Thanks.

At Fri, 24 Aug 2012 17:30:44 +0200,
Matthias Reik wrote:
> I can upgrade to 2.8.6 (unfortunately I upgraded just before it was released
> :-(), but I only want to do that if there is some hope that the problem is
> solved.

There weren't any HA-related leaks fixed in 2.8.6, so I don't think upgrading
will fix the problem.  However it would make our work much easier, so please
upgrade if you can.

--
Francesco * Often in error, never in doubt

Re: Excessive memory consumption of one server in cluster setup

Matthias Reik
Thx Francesco,

I sent you the files directly (I don't want to make them public, even though they don't contain
anything super secret).

I hope that's ok

Cheers
Matthias



Re: Excessive memory consumption of one server in cluster setup

Francesco Mazzoli-2
Thanks.  I'm going home, and Monday is bank holiday in the UK, so I'll have a
look at this on Tuesday.

--
Francesco * Often in error, never in doubt

Re: Excessive memory consumption of one server in cluster setup

Francesco Mazzoli-2
At Fri, 24 Aug 2012 17:10:33 +0100,
Francesco Mazzoli wrote:
>
> Thanks.  I'm going home, and Monday is bank holiday in the UK, so I'll have a
> look at this on Tuesday.

Oh well, I didn't realise that you posted that message in the mailing list.  As
you say, they shouldn't contain anything secret :).

--
Francesco * Often in error, never in doubt

Fwd: Excessive memory consumption of one server in cluster setup

Matthias Reik
In reply to this post by Francesco Mazzoli-2
Now also posting to the list...

Hi Francesco,

I don't want to publish the rabbitmqctl report publicly... I hope it's OK that I send it to you directly.
Unfortunately the log file is 35 MB... so I stripped out everything before the "interesting" part.

So far I have never seen this behavior on any of our other RabbitMQ instances (but that was the
first one we upgraded to 2.8.5)... and that one is a production one... so running with a smaller
test setup is not really possible :-(

Regarding upgrading:
Currently one of the queues is very full (due to the backlog created by the slowed-down RabbitMQ,
a side effect of the excessive memory consumption), so I don't want to do the upgrade
right now, but by Monday the queue should be worked down and then I can upgrade. I can then
also try to have both servers working together again and see whether the problem still exists.

Thanks for taking the time to look into this. If there is anything more I can provide, let me
know.

Cheers
Matthias



Attachments:
rabbitmq_report_20120823.txt (162K)
rabbit@mbuzz-queue-002_shortened.log (50K)