YAMLP (Yet Another Memory Leak Problem)

Hi guys,

I’m still having serious trouble with 3.7.0 beta (3.6.4 had the same kind of problem).

The server usually handles 400 clients during work hours. I’ve seen bigger deployments in other posts, so this should be fine.

We have a severe memory leak: the JVM climbs up to the 2096 MB Java memory limit.

The same thing appears to happen with just a few clients (35 at the time of testing).

Clients are essentially Spark/Pidgin.

The server runs Debian Squeeze (Linux/x64) with Sun JRE/JDK 6, LDAP authentication, and database (MySQL) groups.

We have about three dozen groups shared among the ~350 users (enterprise groups).

It is connected to another Openfire 3.6.4 server which seems to work correctly (~50 clients, Windows 2K3 x64/SQLite).

I was able to take a heap dump and parse it with Eclipse Memory Analyzer (a very good tool, by the way).

Here are parts of the results (~40 clients, tested after a fresh start to get the smallest possible memory footprint):

One instance of “java.util.TaskQueue” loaded by “&lt;system class loader&gt;” occupies 776 748 384 (90,98%) bytes. The instance is referenced by org.jivesoftware.openfire.pep.PEPService @ 0x7f3b4699f6b0, loaded by “org.jivesoftware.openfire.starter.JiveClassLoader @ 0x7f3b11cf5ab8”. The memory is accumulated in one instance of “java.util.TimerTask[]” loaded by “&lt;system class loader&gt;”.

Keywords

java.util.TaskQueue
java.util.TimerTask[]
org.jivesoftware.openfire.starter.JiveClassLoader @ 0x7f3b11cf5ab8

Class Name                                                    Shallow Heap   Retained Heap   Percentage
java.util.TaskQueue @ 0x7f3b12708d88                                    32     776 748 384       90,98%
java.util.TimerTask[2048] @ 0x7f3b3b254c60                          16 408     776 748 352       90,98%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b367e0ff8             88         671 344        0,08%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b362a8ff8             88         671 344        0,08%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b2f38d1a8             88         671 344        0,08%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b32428918             88         671 344        0,08%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b31c67928             88         671 344        0,08%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b1aeb8d00             88         671 344        0,08%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b171c1328             88         671 344        0,08%
org.jivesoftware.openfire.pep.PEPService$1 @ 0x7f3b1c628250             88         671 344        0,08%

(truncated)

Label                                        Number Of Objects   Used Heap Size   Retained Heap Size
org.jivesoftware.openfire.pep.PEPService$1               1 157          101 816          776 731 944

The warn log reports a lot of these (~1 per minute):

2010.10.21 23:39:54 Cache Roster was full, shrinked to 90% in 0ms

Is it a bug?

Is there a workaround? (A property value, perhaps?)

Is there a patch?

Hi,

can you upload the heap dump somewhere?

The “Openfire Properties” and “How to configure Openfire’s caches” documents should help you increase the cache settings.

With log4j you should be able to hide these warnings, but I think you had better increase the cache. A cache that fills up every minute is likely bad for performance, so I prefer to keep logging this, as a full cache can cause serious performance issues.
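For example, something like this (just a sketch; these are the property keys as I remember them from the cache documentation, so please double-check them there, and the right size depends on your deployment):

# Admin Console > System Properties; cache sizes are in bytes.
# Assumption: the "Roster" cache is configured via the
# cache.username2roster.* keys, as listed in the cache documentation.
cache.username2roster.size=20971520
# Entry lifetime in milliseconds (6 hours here).
cache.username2roster.maxLifetime=21600000

After changing the values you can watch the Cache Summary page in the Admin Console to see whether the cache still fills up.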

LG

Hi,

I could upload the dump, but it contains sensitive data, so that will be the last-resort solution.

Looking further into the dump, it seems that one user running Empathy holds most of the TimerTask slots. Each slot occupies 650~800 KB, whereas other users have just one slot.

I’ve looked at the code and did not understand how this is possible; each user seems to be limited to one instance of PEPService. In practice, this is not the case (the user is the same for each PEPService instance):

(screenshot: heap dump excerpt showing multiple PEPService instances, all belonging to the same user)
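If I read the dump correctly, this looks like the classic java.util.Timer leak. Here is a minimal standalone sketch of what I think happens (an assumption on my side: each PEPService schedules an anonymous TimerTask on a shared Timer, which would match the PEPService$1 entries above):

import java.util.Timer;
import java.util.TimerTask;

public class PepLeakSketch {

    // One shared timer; a Timer keeps all scheduled tasks in its TaskQueue.
    private static final Timer TIMER = new Timer("pubsub-maintenance", true);

    // Stand-in for PEPService (assumption: it schedules a purge task on construction).
    static class Service {
        Service() {
            // The anonymous TimerTask captures "Service.this", so the
            // TaskQueue retains every Service ever constructed until the
            // task is cancelled -- which never happens here.
            TIMER.schedule(new TimerTask() {
                public void run() {
                    // periodic purge of published items would go here
                }
            }, 60000, 60000);
        }
    }

    public static void main(String[] args) {
        // One new Service per packet with an unknown "to" JID; the dump
        // showed 1 157 retained PEPService$1 tasks built up this way.
        for (int i = 0; i < 1157; i++) {
            new Service();
        }
    }
}

If that is the mechanism, every extra PEPService instance for the same user pins its whole retained subgraph (~670 KB each here) through the timer.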

As a workaround, I have disabled PEP (xmpp.pep.enabled=false) as recommended by the community, and everything is fine now.

I’ve seen posts about this problem, and the bug has been closed for 3.6.4. I think it needs to be reopened and fixed for good.

Hi,

OF-82 exists in 3.6.4 and is fixed in 3.7.0-beta. So I’d say it’s not another memory leak but the one already described in the announcement.

LG

As written in the post, the problem concerns both 3.6.4 and 3.7.0 beta (the version we are using, whose code I’ve examined and for which I’ve proposed a patch).

My SVN client points to http://svn.igniterealtime.org/svn/repos/openfire/tags/openfire_3_7_0_beta


Hi,

ok, so it’s still not fixed. It would be interesting to know which packets the Empathy client is sending. I wonder whether it has something to do with the “jidTo”: as soon as no service exists for it, a new PEPService is created:

final String jidTo = packet.getTo().toBareJID();
final PEPService pepService = pepServiceManager.getPEPService(jidTo);
if (pepService != null) {
    pepServiceManager.process(pepService, packet);
} else {
    // No service is registered for this JID, so a brand-new PEPService
    // is created for every such packet before it is processed.
    PEPService dummyService = new PEPService(XMPPServer.getInstance(), senderJID.toBareJID());
    pepServiceManager.process(dummyService, packet);
}
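If that is the cause, a possible direction for a patch might be to release the dummy service right after processing. This is only a sketch: shutdown() is a hypothetical cleanup hook; the real fix would have to cancel whatever TimerTask (PEPService$1) the service registers:

PEPService dummyService = new PEPService(XMPPServer.getInstance(), senderJID.toBareJID());
try {
    pepServiceManager.process(dummyService, packet);
} finally {
    // Hypothetical: cancel the service's scheduled purge task so it
    // does not pile up in the Timer's TaskQueue.
    dummyService.shutdown();
}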

LG

I’m trying to get those packets to help you.

I asked the affected user to disable SSL. I can capture his packets using tcpdump, but then the bug is not triggered.

When SSL is used, the server receives many more packets, but I’m unable to capture them. I’ve tried ssldump, but it crashes after decoding a few packets.

Is there a way to capture those packets with Openfire?


Hi,

there is the audit log function (http://openfire:9090/audit-policy.jsp) which allows you to log the packets.
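If you prefer system properties over the policy page, the audit settings can (as far as I remember the documentation, so please verify in your Admin Console) be controlled like this:

# Master switch for the packet audit log.
xmpp.audit.active=true
# PEP/pubsub traffic is IQ-based, so IQ packets are the interesting ones.
xmpp.audit.iq=true
# Presence may be relevant too; message bodies can be left out since
# they may contain sensitive data.
xmpp.audit.presence=true
xmpp.audit.message=false

The audit files contain the raw packets, so they should show exactly what Empathy sends.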

LG