Pidgin randomly disconnecting from Openfire

Hello,

A handful of our users are reporting random disconnects from our Openfire server. The common theme seems to be that they’re all using Pidgin. This happened in Openfire 3.5.2 and still happens in the newest version, Openfire 3.6.0. Here’s the stack trace I see in the logs when someone gets disconnected:

java.lang.IllegalArgumentException: IQ must be of type ‘set’ or ‘get’. Original IQ:
at org.xmpp.packet.IQ.createResultIQ(IQ.java:355)
at org.jivesoftware.openfire.IQRouter.route(IQRouter.java:104)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:68)
at org.jivesoftware.openfire.net.StanzaHandler.processIQ(StanzaHandler.java:311)
at org.jivesoftware.openfire.net.ClientStanzaHandler.processIQ(ClientStanzaHandler.java:79)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:276)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:175)
at org.jivesoftware.openfire.nio.ConnectionHandler.messageReceived(ConnectionHandler.java:133)
at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:570)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.common.IoFilterAdapter.messageReceived(IoFilterAdapter.java:80)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:58)
at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:185)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.executor.ExecutorFilter.processEvent(ExecutorFilter.java:239)
at org.apache.mina.filter.executor.ExecutorFilter$ProcessEventsRunnable.run(ExecutorFilter.java:283)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Unknown Source)

Any ideas?

Hi,

I routinely see that error with my pidgin users, but I don’t believe it produces a disconnection. Are you sure about that association?

Wanna try setting xmpp.client.idle to -1 on the Server settings of the admin console and see if that helps your disconnects?

daryl

Hi Daryl,

You’re right. This error does not appear to be producing the disconnection. I made this change, but the disconnects are still happening. I’ve been working closely with a user trying to figure out the source of the disconnects. He got disconnected, I went to the logs, and saw a couple of these errors, but not for the user in question. He got dropped at 2008.08.28 16:57:41, which is the same time that these errors appeared in the logs.

Again, this only happens for pidgin, and it also seems like all of the users are getting disconnected at the same time…

Compression is turned off and xmpp.client.idle is -1… we do have TLS enabled… that’s the only other thing I can think of?

Is this an Openfire issue or something with our network? Any ideas?

Hi,

The server config setting will only take effect for new connections. Are you still seeing disconnects after clients newly connect after the config setting change?

Otherwise, please consider starting one of your pidgin clients with the -d flag and see if you can trace what is happening.

daryl

Hi Daryl,

Your suggestions, along with the new release of Pidgin, seem to have substantially improved this issue for us. However, we’re still having users report disconnects every once in a while. Sometimes there aren’t any errors or warnings or anything when someone drops. Is there some kind of interoperability issue between Pidgin and Openfire?

Hi,

There are some issues that can cause trouble with Pidgin and Openfire. In fact, you will find my name a lot in the Pidgin bug trac referencing some of these problems. Invariably, the Pidgin devels point their finger at Openfire for causing the troubles and the openfire devels are more keen on supporting issues with spark[web]. I don’t believe the pidgin devels test against openfire for various reasons, so what’s my point…

If you have troubles, try to open up the XML debug console on Pidgin and see if you can catch the disconnect in action. Putting openfire into debug mode may help as well…

I have an install base of 2900 users, with most of them running Pidgin, but for the most part, things work just great! If they continue to look bad for you, check out other clients, like Spark (hi winsrev ). Thankfully, there are many quality options out there.

HTH,

daryl

amullen wrote:

Hi Daryl,

Your suggestions, along with the new release of Pidgin

You mean, you already tried 2.5.1? Cause i saw in the changelog:

* Avoid disconnecting from XMPP servers on parse errors that are non-fatal.

Yes, apparently this is still happening under 2.5.1. I’ve asked the users who are having these issues to start debugging their Pidgin sessions and send me the logs when they crash, but I’ve yet to receive one.

Daryl,

I had a user send me one of his log files. Here are some of the errors in the logs:

  1. jabber: XML parser error for JabberStream 01DB3DF8: Domain 1, code 5, level 3: Extra content at the end of the document

  2. jabber: Recv (ssl)(316):

  3. jabber: XML parser error for JabberStream 01DB4048: Domain 1, code 100, level 1: xmlns: URI vcard-temp is not absolute

The 1st error seems to be causing pidgin to crash. I don’t have enough data about the 2nd error, but it seems to be related to connection issues with our chat rooms. The 3rd error seems harmless, everything still runs fine even though pidgin spits out those errors.
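For anyone triaging a Pidgin debug log like this, here is a minimal sketch that tallies the XML parser errors by code and level. The regular expression is modelled only on the three lines quoted above, so other Pidgin versions may need a different pattern, and the function name is just illustrative. The level numbers appear to follow libxml2's error levels (1 = warning, 2 = error, 3 = fatal), which would fit the observation that the first error is the damaging one and the third is harmless.

```python
import re
from collections import Counter

# Pattern modelled on the log lines quoted in this thread (an assumption:
# other Pidgin versions may format these messages differently).
PARSER_ERROR = re.compile(
    r"jabber: XML parser error for JabberStream (?P<stream>[0-9A-Fa-f]+): "
    r"Domain (?P<domain>\d+), code (?P<code>\d+), level (?P<level>\d+): "
    r"(?P<message>.*)"
)


def summarize_parser_errors(lines):
    """Count XML parser errors, keyed by (code, level, message)."""
    counts = Counter()
    for line in lines:
        match = PARSER_ERROR.search(line)
        if match:
            key = (
                int(match.group("code")),
                int(match.group("level")),
                match.group("message").strip(),
            )
            counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "jabber: XML parser error for JabberStream 01DB3DF8: Domain 1, "
        "code 5, level 3: Extra content at the end of the document",
        "jabber: Recv (ssl)(316):",
        "jabber: XML parser error for JabberStream 01DB4048: Domain 1, "
        "code 100, level 1: xmlns: URI vcard-temp is not absolute",
    ]
    for (code, level, message), n in summarize_parser_errors(sample).most_common():
        print(f"{n}x code={code} level={level}: {message}")
```

To run it over a real log, pass an open file instead of the sample list, for example `summarize_parser_errors(open("debug.log", encoding="utf-8", errors="replace"))`, where `debug.log` is whatever file the user saved.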

Any ideas?

Same issue here with Pidgin, random disconnects after upgrading the version of Pidgin to 2.5.2 from 2.3.1. The old version works fine but the new version disconnects. It looks like the main change in Pidgin was the way it handles SSL certificates. Any ideas?

Openfire: 3.6.4

Pidgin: 2.5.8

I have xmpp.client.idle set to -1 in my server settings as well. When the disconnects occur they do not hit all users at once, but tons of users get disconnected and eventually reconnect… sometimes dozens of times. I have TLS enabled as well.

Debug from Openfire while the disconnect dance was going on:

2009.08.17 08:23:03 ConnectionHandler: Closing connection that has been idle: org.jivesoftware.openfire.nio.NIOConnection@11a1774 MINA Session: (SOCKET, R: /10.135.200.149:1249, L: /10.135.210.50:5222, S: 0.0.0.0/0.0.0.0:5222)

2009.08.17 08:24:05 309638 (01/05/00) - #48 registered a statement as closed which wasn’t known to be open. This could happen if you close a statement twice.
2009.08.17 08:24:05 309638 (01/05/00) - Connection #48 tested: OK
2009.08.17 08:24:12 ConnectionHandler: Closing connection that has been idle: org.jivesoftware.openfire.nio.NIOConnection@1dbbfb8 MINA Session: (SOCKET, R: /10.135.210.12:1139, L: /10.135.210.50:5222, S: 0.0.0.0/0.0.0.0:5222)
2009.08.17 08:24:12 309638 (01/05/00) - Connection #49 tested: OK
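Since matching disconnect times against the log has come up several times in this thread, here is a rough sketch that pulls the "Closing connection that has been idle" entries out of a debug log and lists the time and remote address of each one. The line format is assumed from the excerpt above only, and the function names are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Format assumed from the debug excerpt in this thread; "R:" is taken to be
# the remote (client) address of the MINA session.
IDLE_CLOSE = re.compile(
    r"^(?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}) ConnectionHandler: "
    r"Closing connection that has been idle: .*?R: /(?P<ip>[\d.]+):(?P<port>\d+)"
)


def idle_closes(lines):
    """Return (timestamp, client_ip, client_port) for each idle close."""
    events = []
    for line in lines:
        match = IDLE_CLOSE.search(line)
        if match:
            when = datetime.strptime(match.group("ts"), "%Y.%m.%d %H:%M:%S")
            events.append((when, match.group("ip"), int(match.group("port"))))
    return events


def closes_near(events, reported, window_seconds=60):
    """Keep only the idle closes within window_seconds of a reported drop."""
    window = timedelta(seconds=window_seconds)
    return [event for event in events if abs(event[0] - reported) <= window]


if __name__ == "__main__":
    sample = [
        "2009.08.17 08:23:03 ConnectionHandler: Closing connection that has "
        "been idle: org.jivesoftware.openfire.nio.NIOConnection@11a1774 MINA "
        "Session: (SOCKET, R: /10.135.200.149:1249, L: /10.135.210.50:5222, "
        "S: 0.0.0.0/0.0.0.0:5222)",
        "2009.08.17 08:24:05 309638 (01/05/00) - Connection #48 tested: OK",
    ]
    for when, ip, port in idle_closes(sample):
        print(when.isoformat(sep=" "), ip, port)
```

If the idle closes line up with the times users report being dropped, that points at the server's idle timeout rather than at a parse error on the client.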

Hello,

I’m also familiar with this kind of problem with Openfire + Pidgin. At the company where I work, we deployed an Openfire 3.6.4 server and all of our users connect via Pidgin. After a while we noticed that the server randomly closes connections. I’ve tried changing the configuration of Openfire and Pidgin, and also tried managing traffic to the server, but nothing helped. After some research I found this ticket on the Pidgin website: http://developer.pidgin.im/ticket/10767

I think everything is explained in it.

Just out of curiosity, is this Openfire server on a VM or bare metal? Mine is on a VM… not sure if there is an underlying issue there. My issue also seems to happen regardless of whether iptables is enabled or disabled. With the latest Pidgin, 2.6.6, I can confirm this issue still happens at random.

Pidgin 2.7.0 has corrected the issue. We are no longer seeing users disconnect/reconnect all day.

Yes, the Pidgin developers finally managed to implement keepalive packets. This is a bit late now, but you could have eliminated this issue by modifying the idle settings: http://www.igniterealtime.org/community/docs/DOC-2053

I tried putting the keepalives setting to -1 as mentioned in a few posts. Made no difference, Pidgin would still disconnect.

Then, maybe it was something else that they’ve fixed.

Can’t really say, because I’ve already moved to an ejabberd-2.1.3 server and the problem has gone: no random disconnections. I think the problem is not in the Pidgin client but in Openfire. The Pidgin devs also claim that Openfire uses a strange keepalive policy. Openfire requires keepalive packets to be sent at a specific time interval, regardless of whether any packets are being transmitted between client and server. Pidgin, on the other hand, sends keepalives only when there is no traffic to or from the server. So if, for example, one or more contacts on your list change status often, Pidgin sees traffic from the server and won’t send a keepalive packet; as a result, Openfire decides that because no keepalive was sent the session is dead, and closes the connection. IMO Pidgin’s policy is quite logical and bandwidth-saving. Imagine Openfire with a million users logged in at once, all sending keepalive packets simultaneously… Looks like packet flooding to me.
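To make the mismatch described above concrete, here is a toy simulation. It is only a sketch of the behaviour as described in this post, not of Openfire's or Pidgin's actual code: it assumes the server's idle timer is reset only by packets coming from the client, and that an "idle_only" client treats any traffic in either direction as proof the link is alive. All names and numbers are invented for illustration.

```python
def simulate(policy, duration=600, server_idle_timeout=120,
             keepalive_interval=60, server_push_interval=30):
    """Return the second at which the server drops the client, or None.

    policy "fixed":     client sends a keepalive every keepalive_interval.
    policy "idle_only": client sends a keepalive only after
                        keepalive_interval seconds with no traffic at all.
    """
    if policy not in ("fixed", "idle_only"):
        raise ValueError(f"unknown policy: {policy}")

    last_from_client = 0  # what the (assumed) server idle timer watches
    last_any_traffic = 0  # what an "idle_only" client watches
    last_keepalive = 0

    for now in range(1, duration + 1):
        # The server pushes something, e.g. a contact's presence change.
        if now % server_push_interval == 0:
            last_any_traffic = now

        if policy == "fixed":
            send = now - last_keepalive >= keepalive_interval
        else:
            send = now - last_any_traffic >= keepalive_interval

        if send:
            last_keepalive = last_from_client = last_any_traffic = now

        # The server gives up if the client itself has been silent too long.
        if now - last_from_client >= server_idle_timeout:
            return now
    return None


if __name__ == "__main__":
    print("idle_only, busy roster:", simulate("idle_only"))   # 120
    print("fixed, busy roster:    ", simulate("fixed"))       # None
    print("idle_only, quiet roster:",
          simulate("idle_only", server_push_interval=10_000)) # None
```

Under these assumptions the "idle_only" client is dropped only when the roster is busy, which would explain why the disconnects look random and why they hit users with chatty contact lists first.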

PS: sorry for my english, haven’t used it for a while

I have several users using Pidgin 2.7.0, Adium, and Pandion.

Most of my disconnects are with Pandion, but I am seeing them come from all the different clients.

I have NOT changed the keepalive settings to -1, because most of the posts I have seen say that it doesn’t make any difference.

Also, we do not want to have idle connections sitting around on our servers.

I did increase the idle timeout to 30 minutes, with no change.

I am using Openfire 3.6.4.

I don’t have any messages coming through in error.log.

I have messages in warn and info, but I can’t decide if they are actually saying anything about disconnects.

There have been many disconnects lately, and none of the log messages seem to line up with the actual disconnects.

I have seen many posts about this issue, but no one ever seems to find an answer.

Update Pidgin to 2.7.1 to fix its issue. I have not tested Adium much.

On Jun 30, 2010 2:53 PM, “bennett.lain” <webmaster@igniterealtime.org