We recently upgraded RabbitMQ from 3.0.4 to 3.1.1 after noticing two bug fixes in 3.1.0 related to our RabbitMQ deployment:
  * 25524 fix memory leak in mirror queue slave with many short-lived publishing channels
  * 25290 fix per-queue memory leak recording stats for mirror queue slaves

However, in our case, it seems that the management plugin may still have a memory leak. We have a monitoring agent that hits the REST API to gather information about the broker (number of queues, queue depth, etc.). With the monitoring agent running and making requests against the API, memory consumption steadily increased; when we stopped the agent, memory consumption in the management plugin leveled off.

Here are a couple of graphs detailing memory consumption in the broker (the figures are parsed from rabbitmqctl report). The first graph shows the ebb and flow of memory consumption in a number of components; the second shows just consumption by the management plugin. You can see pretty clearly where we stopped the monitoring agent at 1:20.

https://dl.dropboxusercontent.com/u/7022167/Screenshots/n-np6obt-m9f.png
https://dl.dropboxusercontent.com/u/7022167/Screenshots/an6dpup33xvx.png

We have two clustered brokers, both running RabbitMQ 3.1.1 on Erlang R14B-04.1. There are typically around 200 queues, about 20 of which are mirrored. There are generally about 200 consumers. Messages are rarely queued and most queues typically sit idle.

I'll be happy to provide any further diagnostic information.

Thanks!
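For reference, the agent's polling loop amounts to something like the sketch below. The host, credentials, endpoints, and interval are illustrative placeholders rather than our production configuration; the real agent is more involved.

    import time
    import requests  # assumes the 'requests' HTTP library is installed

    API = "http://localhost:15672/api"   # management API base URL (placeholder host/port)
    AUTH = ("guest", "guest")            # placeholder credentials

    while True:
        # Broker-wide counters (totals, message rates, etc.)
        overview = requests.get(API + "/overview", auth=AUTH).json()
        # Per-queue details: name, messages_ready, consumers, memory, ...
        queues = requests.get(API + "/queues", auth=AUTH).json()
        print(len(queues), sum(q.get("messages_ready", 0) for q in queues))
        time.sleep(30)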
Hi. Thanks for the report.
My first guess is that garbage collection for the management DB process is happening too slowly. Can you invoke:

    $ rabbitmqctl eval 'P=global:whereis_name(rabbit_mgmt_db),M1=process_info(P, memory),garbage_collect(P),M2=process_info(P, memory),{M1,M2,rabbit_vm:memory()}.'

and post the results?

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
Hi Simon, Thanks for getting back to me. I'll need to restart our monitor and give it some time to leak the memory. I'll let you know the results sometime later today. One thing I failed to mention in my initial report: whenever we restarted one of our services, the queues they were using would get cleaned up (we have them set to auto-delete) and redeclared. Whenever we did that, we would see the memory consumption of the management DB fall off sharply before starting to rise again.
Let me know if you'd like any further information in the meantime. Best, Travis
On 17/06/13 15:45, Travis Mehlinger wrote:
> One thing I failed to mention in my initial report: whenever we restarted one of our services, the queues they were using would get cleaned up (we have them set to auto-delete) and redeclared. Whenever we did that, we would see the memory consumption of the management DB fall off sharply before starting to rise again.

That is presumably because the historical data the management plugin has been retaining for those queues got thrown away. If you don't want to retain this data at all, change the configuration as documented here:

http://www.rabbitmq.com/management.html#sample-retention

However, I (currently) don't believe it's this historical data you are seeing as "leaking", since making queries should not cause any more of it to be retained.

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
Hi Simon, I flipped our monitor back on and let Rabbit consume some additional memory. Invoking the garbage collector had no impact. Let me know what further information you'd like to see and I'll be happy to provide it.
Thanks, Travis
Hi Simon,

I have more information for you. It turns out I hadn't fully understood the interaction causing this to happen.

Aside from their regular communication, our services also declare a queue bound on # to an exchange, which we use for collecting the stats the services store internally. In addition to hitting the REST API for information about the broker, the monitor also opens a connection/channel, declares an anonymous queue for itself, then sends a message indicating to our services that they should respond with their statistics. The services then send a message with a routing key that will direct the response onto the queue declared by the monitor. This happens every five seconds.
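In rough terms, one collection cycle from the monitor looks like the sketch below. It uses pika purely for illustration, the exchange and routing-key names are made up, and the real code differs in the details.

    import pika

    def collect_stats_once():
        conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
        channel = conn.channel()

        # Illustrative name; the services bind their stats queues on "#" to this
        # exchange (declared here only so the sketch is self-contained).
        channel.exchange_declare(exchange="service.stats", exchange_type="topic")

        # Anonymous, exclusive queue for the replies; the broker names it and
        # deletes it when this connection closes.
        result = channel.queue_declare(queue="", exclusive=True)
        reply_queue = result.method.queue
        channel.queue_bind(queue=reply_queue, exchange="service.stats",
                           routing_key=reply_queue)

        # Ask the services to publish their internal statistics; they reply with
        # a routing key that directs the response onto the queue declared above.
        channel.basic_publish(exchange="service.stats",
                              routing_key="stats.request",
                              body="send stats",
                              properties=pika.BasicProperties(reply_to=reply_queue))

        # ... consume the replies for a short while, then tear everything down ...
        conn.close()

    # The monitor repeats a cycle like this every five seconds.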
It appears that this is in fact responsible for memory consumption growing out of control. If I disable that aspect of monitoring and leave the REST API monitor up, memory consumption stays level.
The problem seems reminiscent of the issues described in this mailing list thread: http://rabbitmq.1065348.n5.nabble.com/RabbitMQ-Queues-memory-leak-td25813.html. However, the queues we declare for stats collection are *not* mirrored.
Hope that helps narrow things down. :)

Best, Travis
In reply to this post by Travis Mehlinger
Hi,
I just wanted to mention that we are seeing the same behavior, but we don't have any mirrored queues. We have had to disable the management interface, and the memory usage gradually dropped.

We are using 3.1.1 on R13B04 on some machines, and a few others with Erlang R16B.

I looked into the code, and it creates some ets tables in the rabbit_mgmt_db module. The first snippet is memory usage with the management plugin:

    {memory,
     [{total,1417822136},
      {connection_procs,37788152},
      {queue_procs,6659896},
      {plugins,686592},
      {other_proc,11236576},
      {mnesia,855184},
      {mgmt_db,935311000},
      {msg_index,7976224},
      {other_ets,70559680},
      {binary,13529704},
      {code,19001963},
      {atom,1601817},
      {other_system,312615348}]}

This second snippet is after disabling the management plugin and restarting one of the nodes:

    {memory,
     [{total,190412936},
      {connection_procs,34512200},
      {queue_procs,8970352},
      {plugins,0},
      {other_proc,9246776},
      {mnesia,794776},
      {mgmt_db,0},
      {msg_index,1650736},
      {other_ets,6406656},
      {binary,63363448},
      {code,16232973},
      {atom,594537},
      {other_system,48640482}]}

You'll notice that the memory used by mgmt_db is now 0 and other_system is around 48MB, while before mgmt_db was over 935MB and other_system over 300MB. Unfortunately, I don't have any growth trends for the DB size, as we have had to disable management on all nodes and we weren't tracking this memory usage.

I'm not very familiar with RabbitMQ, but if I have some time I will try and dig more deeply into it and run some tests.

- Joe
Some things to look at that may let you re-enable the management plugin:
The management DB keeps historical data for queues / connections etc. If you have a large number of these things then tweaking the retention policies may help:

http://www.rabbitmq.com/management.html#sample-retention

If that does not help, then setting:

    {rabbitmq_management, [{db_module, rabbit_mgmt_old_db}]}

in your config will revert to the 3.0.x DB implementation.

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
In reply to this post by Travis Mehlinger
Hi. So I assume your monitoring code is not actually leaking those
queues - they are getting deleted, I assume? How? (Are they auto-delete, exclusive, x-expires, deleted manually?)

If so, can you run:

    rabbitmqctl eval '[{ets:info(T,size), ets:info(T,memory)} || T <- lists:sort(ets:all()), rabbit_mgmt_db <- [ets:info(T, name)]].'

periodically? This will output a list of tuples showing the rows and bytes per table for each table in the mgmt DB. Do these increase?

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
Hi Simon,

We declare those queues as exclusive, so they're getting cleaned up automatically.

I ran the command you gave periodically over the course of the last two hours. The row count and total size in the highlighted line are definitely growing unchecked. All other values hovered closely around what you see in the gist.

https://gist.github.com/tmehlinger/0c9a9a0d5fe1d31c8f6d#file-gistfile1-txt-L9
Thanks, Travis
OK, that definitely looks like a leak. Could you also give me the output
from:

    rabbitmqctl eval '{[{T, ets:info(T,size), ets:info(T,memory)} || T <- lists:sort(ets:all()), rabbit_mgmt_db <- [ets:info(T, name)]], sys:get_status(global:whereis_name(rabbit_mgmt_db))}.'

to make sure I'm clear on which table is leaking.

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
Hi Simon, Sure thing, here you go: https://gist.github.com/tmehlinger/158bb003fe3d64618a4e
Best, Travis
In reply to this post by Travis Mehlinger
On 18/06/13 16:11, Travis Mehlinger wrote:
> We declare those queues as exclusive so they're getting cleaned up automatically.

I have found a plausible candidate for the leak. But it's dependent on having a long-lived channel which declares and deletes lots of short-lived queues. We keep some information on the deleted queues until the channel is closed. Could your monitoring tool be doing that? Obviously it would have to be deleting queues explicitly, not relying on them being exclusive.

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
Hi Simon, We aren't doing anything like that. Whenever one of our services starts (which are based on Kombu, if that helps), it plucks a connection from its internal pool, creates a channel on that connection, then binds on its request queue, which hangs around until the service stops. The only thing that deviates from this pattern is our monitor, which connects and disconnects fairly rapidly, and uses exclusive queues.
That said, it's entirely possible that we have something floating around that I don't know about that fits the behavior you described. I'll keep digging and see what I can turn up. In the meantime, let me know if there's any more information I can collect from Rabbit.
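For my own reference while digging, the behavior you described would look roughly like this on the client side; a minimal pika sketch with made-up queue names, not something I'm suggesting our code actually does:

    import pika

    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = conn.channel()  # one long-lived channel...

    for i in range(100000):
        # ...declaring and then explicitly deleting many short-lived queues on it.
        name = "probe-%d" % i  # hypothetical queue names
        channel.queue_declare(queue=name)
        # ... do something with the queue ...
        channel.queue_delete(queue=name)
        # Per Simon's description above, some information about each deleted
        # queue is retained until this channel is closed, so memory grows
        # with the loop count.

    channel.close()  # only at this point would that per-queue state be released
    conn.close()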
Thanks, Travis
Thanks.
Could you run

    rabbitmqctl eval '{_,_,_,[_,_,_,_,[_,_,{_,[{_,S}]}]]}=sys:get_status(global:whereis_name(rabbit_mgmt_db)),T=element(5,S),ets:foldl(fun(E,_)->io:format("~p~n~n",[E])end,[],T).'

please? This will dump the entire leaky table to standard output, so you would want to redirect it to a file.

If you could also accompany that with "rabbitmqctl report" so I can see what actually exists at that time, then I can at least see what is leaking.

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
Also, I have a patched version of the management plugin which fixes the
leak I found today:

http://www.rabbitmq.com/releases/plugins/v3.1.1/rabbitmq_management-3.1.1-leakfix1.ez

(see http://www.rabbitmq.com/plugins.html#installing-plugins for where to put it; it replaces rabbitmq_management-3.1.1.ez)

If you are able to test this version, that would also be great.

Cheers, Simon

--
Simon MacMullen
RabbitMQ, Pivotal
I won't be able to test it in the same production environment but should be able to reproduce the behavior in a test environment. I'm travelling tomorrow afternoon, so if you don't hear from me by end of business UK time tomorrow, I'll definitely have something for you Monday.

Thanks so much for all your help. :)

Best, Travis
In reply to this post by Simon MacMullen-2
Simon MacMullen <simon@...> writes:

> Also, I have a patched version of the management plugin which fixes the leak I found today:
> http://www.rabbitmq.com/releases/plugins/v3.1.1/rabbitmq_management-3.1.1-leakfix1.ez
> If you are able to test this version, that would also be great.

Hi Simon.

We have the same issue here and we will apply your patch and run some tests. As soon as we have something we will reply here.

Thanks!
Rodolfo Penha:
> We have the same issue here and we will apply your patch and run some tests.
> As soon as we have something we will reply here.

Rodolfo,

Note that the issue is resolved in RabbitMQ 3.1.3. Consider upgrading instead of applying the patch and running a custom build.

--
MK
Ok!
Upgrading to 3.1.3 now!

Thanks!