Openfire 3.6.4 memory leak with Empathy

Reported to Empathy mailing list now too.

Open source can be frustrating sometimes. I’ve now reported this in Ubuntu Launchpad (try upstream), GNOME Bugzilla (try upstream) and now Freedesktop.org. I wish there were only one place to do this.

http://lists.freedesktop.org/archives/telepathy/2009-October/003941.html

If anyone can help the guys at Telepathy with the inner workings of the server, that’d be really useful. Java isn’t my strong point.

mikeycmccarthy wrote:

(…)
Does anyone know of an Openfire installation being used in any large-ish scale production environment?

Nimbuzz currently uses Openfire. Earlier this year, one of their clients passed the 10 million download mark.

I’ve created a new JIRA issue (OF-70) to track this.

I wouldn’t be surprised if this was introduced when JM-1066 was fixed. Is anyone able/willing to test a patch, if I provide it as a diff?

(said diff can be found in the JIRA issue)

I will apply this patch and see how it goes. If it fixes the memory leak it will be very obvious in our environment.

Thanks for looking into this issue,

Daniel

I have applied this patch to our live server and we’ll know during peak time in about 12 hours if it fixes the issue.

Thanks

I had to roll this back again.

The patch resulted in everyone getting logged out after about 5 minutes of being connected.

There must be more to it?

Thanks

My patch disconnected all idle clients (clients that had not been sent any data for a while). As you’ve found, that’s not the best of solutions. Instead, the code should detect write timeouts. Luckily, MINA appears to offer that functionality.
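To make the difference between the two approaches concrete, here is a minimal sketch in plain Java. It does not use MINA's API and is not the code from the patch: the `WriteTimeoutSketch` class, its `Connection` bookkeeping and the limits in `main` are all hypothetical, chosen only to show why an idle check also hits healthy clients while a write-timeout check targets connections whose writes never complete.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class WriteTimeoutSketch {

    /** What a periodic check concludes about one connection. */
    enum Verdict { HEALTHY, IDLE, WRITE_TIMED_OUT }

    /** Hypothetical per-connection bookkeeping; not a MINA or Openfire class. */
    static final class Connection {
        // Start times of writes that were queued but have not completed yet.
        private final Deque<Long> pendingWriteStarts = new ArrayDeque<>();
        private long lastActivityMillis;

        Connection(long nowMillis) {
            this.lastActivityMillis = nowMillis;
        }

        /** The server queued data for this client. */
        void writeQueued(long nowMillis) {
            pendingWriteStarts.addLast(nowMillis);
        }

        /** The oldest queued write was flushed to the socket. */
        void writeCompleted(long nowMillis) {
            pendingWriteStarts.pollFirst();
            lastActivityMillis = nowMillis;
        }

        Verdict check(long nowMillis, long idleLimitMillis, long writeLimitMillis) {
            Long oldest = pendingWriteStarts.peekFirst();
            // A write that has been pending too long suggests a dead peer.
            if (oldest != null && nowMillis - oldest > writeLimitMillis) {
                return Verdict.WRITE_TIMED_OUT;
            }
            // Nothing pending and nothing sent for a while: merely quiet.
            if (oldest == null && nowMillis - lastActivityMillis > idleLimitMillis) {
                return Verdict.IDLE;
            }
            return Verdict.HEALTHY;
        }
    }

    public static void main(String[] args) {
        long idleLimit = 300_000L;  // assumed: 5 minutes
        long writeLimit = 5_000L;   // assumed: 5 seconds
        long now = 600_000L;

        // Connected, but simply has not been sent anything for 10 minutes.
        Connection quiet = new Connection(0L);

        // Data was queued 10 seconds ago and never flushed (cable pulled).
        Connection stalled = new Connection(0L);
        stalled.writeQueued(590_000L);

        System.out.println("quiet:   " + quiet.check(now, idleLimit, writeLimit));
        System.out.println("stalled: " + stalled.check(now, idleLimit, writeLimit));
    }
}
```

With those numbers the quiet client is reported as `IDLE` and the stalled one as `WRITE_TIMED_OUT`. Dropping connections on `IDLE` is roughly what the first patch did; dropping them only on `WRITE_TIMED_OUT` leaves quiet but healthy clients alone.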

I’ve modified the patch to detect write timeouts. I haven’t been able to test this yet, but could one of you give it a try?

The patch can be found in JIRA issue OF-70.

I will test this new patch and let you know how we go.

Thanks

We ran the patch during peak time, but unfortunately we are still leaking memory. It’s very frustrating.

That said, it doesn’t mean your patch hasn’t fixed other issues; it just hasn’t solved the issue we are experiencing.

And this latest patch hasn’t caused any new problems, which is good.

It would be good if some others with memory leaks could also test this.

Thanks

Daniel

Hi Guus

I’m OK to test it, but I just don’t know how to apply the patch. Would you mind explaining to me how? Thanks!

We’re testing now. To apply the patch, simply check out the Openfire source, add the relevant line (or use Eclipse -> Patch -> Apply), then build the Openfire jar using the Ant script.

Just as an aside, what we’re planning to do at some point is pull a network cable on our load test clients and see how Openfire deals with the sudden disconnection. If the patch works, it presumably shouldn’t run out of memory.

PS - where in SVN is the tag of Openfire 3.6.4??!

Oops, that’s already too complicated for me. I’ve never compiled a Java application, so I guess I will need to read some documentation first.

Not quite sure how I’d do it but I can send you the jar with the changes compiled into it. You’d only need to put this into the libs directory of your Openfire server and restart.

That would be much easier for me indeed.

I don’t know if it matters, but it’s on 32-bit Windows.

I will send you my email address by private message.

I got the file, thanks. I tried it on a virtual machine with a freshly installed server, but I don’t see any difference: I can’t reproduce any memory leak for the moment. Only our real server seems to have problems. I don’t want to test it with too many people connected, so if this VM can’t help, I will try the patch one of these evenings.

Would you be able to install/run Ubuntu 9.10 RC: http://releases.ubuntu.com/releases/9.10/ and try connecting to your server from Empathy? (I’m sure even the LiveCD would be OK.)

This was the probable cause of our memory issues.

I was! I know it’s the problem, but there’s no way to reproduce the leak on this new server. As soon as I connect an Empathy client to the main server and disconnect it, though, it happens. I will try again tomorrow on the VM with more clients.