Openfire 3.6.4 memory leak with Empathy

I will test this new patch and let you know how we go.

Thanks

We ran the patch during peak time, but unfortunately we are still leaking memory. It's very frustrating.

That said, it doesn't mean your patch hasn't fixed other issues; it just hasn't solved the one we are experiencing.

And this latest patch hasn't caused any new problems, which is good.

It would be good if some others with memory leaks could also test this.

Thanks

Daniel

Hi Guus

I'm OK to test it, but I just don't know how to apply the patch. Would you mind explaining how? Thanks!

We're testing now. To run the patch, simply check out the Openfire source, add the relevant line by hand (or use Eclipse -> Patch -> Apply), then build the openfire jar using the Ant script.
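Roughly, the steps look like this (just a sketch from memory; the exact repository URL, patch file name and Ant target may differ on your machine):

    svn checkout http://svn.igniterealtime.org/svn/repos/openfire/trunk openfire   # source checkout (URL from memory)
    cd openfire
    patch -p0 < memory-leak.patch   # the patch from this thread; file name is just an example
    cd build
    ant openfire                    # builds the openfire jar (target name from memory; plain "ant" builds the default target)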

Just as an aside, what we're planning to do at some point is pull a network cable on our load test clients and see how Openfire deals with the sudden disconnection. If the patch works, presumably it shouldn't run out of memory.

PS - where in SVN is the tag of Openfire 3.6.4??!

Oops, that's already too complicated for me. I've never compiled a Java application, so I guess I will need to read some documentation first.

Not quite sure how I'd do it otherwise, but I can send you the jar with the changes compiled into it. You'd only need to put it into the lib directory of your Openfire server and restart.
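On a Linux install that boils down to something like this (paths assumed; on Windows the lib folder usually lives under the installation directory, e.g. C:\Program Files\Openfire\lib):

    /etc/init.d/openfire stop
    cp openfire.jar /opt/openfire/lib/openfire.jar   # overwrite the existing jar (keep a backup of the original)
    /etc/init.d/openfire start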

That would be much easier for me indeed.

I don't know if it matters, but it's on 32-bit Windows.

I will send you my email address by private message.

I got the file, thanks. I tried it on a virtual machine with a freshly installed server, but I don't see any difference; I can't trigger the memory leak there for the moment… Only our real server seems to have problems. I don't want to test with too many people connected, so if this VM can't help, I will try the patch one of these evenings.

Would you be able to install/run Ubuntu 9.10RC: http://releases.ubuntu.com/releases/9.10/ and try connecting to your server from Empathy? (I'm sure even running from the LiveCD would be OK.)

This was the probable cause of our memory issues.

I was! I know it's the problem, but there's no way to reproduce the leak on this new server. As soon as I connect an Empathy client to the main server and disconnect it, though, it happens. I will try again tomorrow on the VM with more clients.

mikeycmccarthy wrote:

Just as an aside, what we’re planning to do at some point is pull a network cable on our load test clients and see how Openfire deals with the sudden disconnection.

Speaking of sudden disconnections: there are more complaints about those than about memory leaks. Such broken sessions stay online and keep updating their last activity (somehow), which confuses users a lot when their contacts are not replying while being shown as online all day, even though their laptops have been away from the desk for hours.

Hi Daniel,

Sorry to hear the patch didn't solve your issue. The patch was based on the description that mikeycmccarthy (or rather, the guy on his team) gave. Either I got the fix wrong, or you're suffering from another problem.

I’ve got the feeling that several issues are being discussed in this thread. Things appear to get mixed up a bit. Perhaps we should discuss the issue that you’re experiencing one-on-one. Could you send me a message or chat to me offline? You’ll find my contact details in my profile.

Hello Francois,

Thanks for helping us out! I'm a bit confused. Did the problem disappear in the environment where you are testing the patch, or does the problem simply not occur there, even without the patch?

Hey mikeycmccarthy,

Any news on the test? Pulling the network cable should be detected by Openfire, even without the patch. It doesn't hurt to give it a try, though.
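If I remember correctly, that detection is driven by Openfire's idle-connection settings; these system properties (names as in 3.6.x, from memory, so please double-check them in the admin console) control when a dead client connection is dropped:

    xmpp.client.idle = 360000       # disconnect clients idle for more than 6 minutes (value in milliseconds)
    xmpp.client.idle.ping = true    # ping the client first and only drop the session if the ping goes unanswered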

There doesn't appear to be a tag for 3.6.4, but there's a branch: http://www.igniterealtime.org/fisheye/browse/svn-org/openfire/branches/openfire_3_6_4

I guess I was not clear:

- We have a problem when we connect an Empathy client to our production server (60 users, mainly Pidgin + Spark). Once we connect/disconnect an Empathy client, the server becomes unusable. It's not clear yet whether it's a memory leak or something else, but the processor goes to 100% and the machine is unusable.

- I have installed a new server on a VM to test the patch, but I can't reproduce the problem there: the fresh install works fine with an Empathy client, so I don't know whether this patch solves my problem. The only way to test is on the production server, but I won't do that during the day, to avoid disturbing the users.

Hi Francois,

On first glance, your problem doesn’t look like a memory leak. These problems usually take some time to develop into a real issue - they don’t usually pop up immediately after one client connects or disconnects.

Perhaps you could install the monitoring plugin described in the blog post New Openfire monitoring plugin. This will give you a bit more detail on the overall health of your environment.
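Installing it is straightforward: either upload the plugin jar on the Plugins page of the admin console, or drop it into the plugins directory and Openfire will pick it up automatically, e.g. (path assumed for a default Linux install):

    cp monitoring.jar /opt/openfire/plugins/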

Are there any entries in the logs around the time that the client causing the instability connects?

I didn’t check the logs. I will do more tests when I have some time, and I will use the monitor plugin. Thanks a lot for your help!

Hi Guus,

We started our load test at 2 pm yesterday using the patched version of Openfire. I was going to try the disconnect when I got in this morning, but Openfire died at about 2 am, 12 hours later.

The test gradually ramps up to 4000 people chatting on the server, across 5 rooms, with a throughput of about 3.6 chats/second. Heap memory rises slowly until it's just over 1 GB at 1:45 am, then it suddenly spikes and Openfire dies.

I have lots of graphs, all the logs, etc., and would really appreciate it if we could go through these together. I'm a bit worried that the Openfire load test stats on this website only cover a short period of testing. Do you find you need to restart Openfire often for Nimbuzz, and what is the typical load?

Many thanks

Michael

Bummer. I had hoped we’d tackled this thing.

You can send all of the raw data (graphs, logs, etc.) to my private address (see my profile). I'm also interested in your test environment. Could you send me the details of that too, please?
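If you can reproduce the crash, a heap dump taken at the moment the JVM dies would also be very useful. Roughly, I'd add something like the following to the Openfire JVM options (standard Sun 1.6 flags; the heap size and paths are just examples):

    -Xmx1024m
    -XX:+HeapDumpOnOutOfMemoryError
    -XX:HeapDumpPath=/tmp/openfire-heap.hprof
    -verbose:gc -Xloggc:/tmp/openfire-gc.log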

Non-disclosure forbids me from being exact, but the number of users that Nimbuzz processes is a lot, lot higher than what you're processing. They do have occasional problems, but restarts are typically required after a few weeks, not hours.