We have a handful of users reporting random disconnects from our Openfire server. The common theme seems to be that they are all using Pidgin. This happened in both Openfire 3.5.2 and the newest version, Openfire 3.6.0. Here’s the stack trace I see in the logs when someone gets disconnected:
java.lang.IllegalArgumentException: IQ must be of type ‘set’ or ‘get’. Original IQ:
at org.xmpp.packet.IQ.createResultIQ(IQ.java:355)
at org.jivesoftware.openfire.IQRouter.route(IQRouter.java:104)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:68)
at org.jivesoftware.openfire.net.StanzaHandler.processIQ(StanzaHandler.java:311)
at org.jivesoftware.openfire.net.ClientStanzaHandler.processIQ(ClientStanzaHandler.java:79)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:276)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:175)
at org.jivesoftware.openfire.nio.ConnectionHandler.messageReceived(ConnectionHandler.java:133)
at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:570)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.common.IoFilterAdapter.messageReceived(IoFilterAdapter.java:80)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:58)
at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:185)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.executor.ExecutorFilter.processEvent(ExecutorFilter.java:239)
at org.apache.mina.filter.executor.ExecutorFilter$ProcessEventsRunnable.run(ExecutorFilter.java:283)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Unknown Source)
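For anyone reading the trace: the exception message suggests the server tried to build a reply to an IQ stanza whose type was not 'get' or 'set' (i.e. a 'result' or 'error'). Here is a minimal sketch of a guard of that shape. The class, enum and method names are stand-ins made up for illustration; this is not the actual org.xmpp.packet.IQ code.

```java
// Hypothetical stand-in types; NOT the real org.xmpp.packet.IQ implementation.
final class ResultIqGuardSketch {

    // The four IQ types defined by XMPP.
    enum Type { get, set, result, error }

    // Builds a bare result stanza for a request, but only if the request
    // is one that expects a reply ('get' or 'set').
    static String createResult(Type requestType, String id) {
        if (requestType != Type.get && requestType != Type.set) {
            throw new IllegalArgumentException("IQ must be of type 'set' or 'get'.");
        }
        return "<iq type=\"result\" id=\"" + id + "\"/>";
    }

    public static void main(String[] args) {
        // A 'get' request may be answered.
        System.out.println(createResult(Type.get, "a1"));
        // A 'result' stanza must not be answered again; the guard rejects it.
        try {
            createResult(Type.result, "a2");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If that reading is right, the error would be triggered by an unexpected 'result' or 'error' stanza arriving from a client, not necessarily by the disconnect itself.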
You’re right. This error does not appear to be causing the disconnections. I made this change, but the disconnects are still happening. I’ve been working closely with a user to figure out the source of the disconnects. He got disconnected, I went to the logs, and saw a couple of these errors, but not for the user in question. He got dropped at 2008.08.28 16:57:41, which is the same time these errors appeared in the logs.
Again, this only happens for pidgin, and it also seems like all of the users are getting disconnected at the same time…
Compression is turned off and xmpp.client.idle is -1… we do have TLS enabled… that’s the only other thing I can think of?
Is this an Openfire issue or something with our network? Any ideas?
The server config setting will only take effect for new connections. Are you still seeing disconnects after clients newly connect after the config setting change?
Otherwise, please consider starting one of your pidgin clients with the -d flag and see if you can trace what is happening.
Your suggestions, along with the new release of Pidgin, seem to have substantially improved this issue for us. However, we’re still having users report disconnects every once in a while. Sometimes there aren’t any errors or warnings or anything when someone drops. Is there some kind of interoperability issue between Pidgin and Openfire?
There are some issues that can cause trouble with Pidgin and Openfire. In fact, you will find my name a lot in the Pidgin bug trac referencing some of these problems. Invariably, the Pidgin devels point their finger at Openfire for causing the troubles and the openfire devels are more keen on supporting issues with spark[web]. I don’t believe the pidgin devels test against openfire for various reasons, so what’s my point…
If you have troubles, try to open up the XML debug console on Pidgin and see if you can catch the disconnect in action. Putting openfire into debug mode may help as well…
I have an install base of 2900 users, with most of them running Pidgin, but for the most part, things work just great! If they continue to look bad for you, check out other clients, like Spark (hi winsrev). Thankfully, there are many quality options out there.
Yes, apparently this is still happening under 2.5.1. I’ve asked the users who are having these issues to start debugging their Pidgin sessions and send me the logs when they crash, but I’ve yet to receive one.
I had a user send me one of his log files. Here are some of the errors in the logs:
jabber: XML parser error for JabberStream 01DB3DF8: Domain 1, code 5, level 3: Extra content at the end of the document
jabber: Recv (ssl)(316):
jabber: XML parser error for JabberStream 01DB4048: Domain 1, code 100, level 1: xmlns: URI vcard-temp is not absolute
The 1st error seems to be causing pidgin to crash. I don’t have enough data about the 2nd error, but it seems to be related to connection issues with our chat rooms. The 3rd error seems harmless, everything still runs fine even though pidgin spits out those errors.
Same issue here with Pidgin, random disconnects after upgrading the version of Pidgin to 2.5.2 from 2.3.1. The old version works fine but the new version disconnects. It looks like the main change in Pidgin was the way it handles SSL certificates. Any ideas?
I have the xmpp.idle set to -1 for my server settings as well. When the disconnects occur, they do not hit all users at once, but tons of users get disconnected and eventually reconnect… sometimes dozens of times. I have TLS enabled as well.
Debug from Openfire while the disconnect dance was going on:
2009.08.17 08:23:03 ConnectionHandler: Closing connection that has been idle: org.jivesoftware.openfire.nio.NIOConnection@11a1774 MINA Session: (SOCKET, R: /10.135.200.149:1249, L: /10.135.210.50:5222, S: 0.0.0.0/0.0.0.0:5222)
2009.08.17 08:24:05 309638 (01/05/00) - #48 registered a statement as closed which wasn’t known to be open. This could happen if you close a statement twice.
2009.08.17 08:24:05 309638 (01/05/00) - Connection #48 tested: OK
2009.08.17 08:24:12 ConnectionHandler: Closing connection that has been idle: org.jivesoftware.openfire.nio.NIOConnection@1dbbfb8 MINA Session: (SOCKET, R: /10.135.210.12:1139, L: /10.135.210.50:5222, S: 0.0.0.0/0.0.0.0:5222)
2009.08.17 08:24:12 309638 (01/05/00) - Connection #49 tested: OK
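To match these idle closes against the users who report drops, it can help to pull just the timestamp and remote address out of the debug log. Here is a minimal sketch, assuming the log lines look exactly like the ones pasted above; the class and method names are made up for illustration and the pattern may need adjusting for other Openfire versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the regex is written against the log format quoted in this thread.
final class IdleCloseScan {

    // Group 1: timestamp, group 2: remote (client) IP, group 3: remote port.
    private static final Pattern IDLE_CLOSE = Pattern.compile(
        "^(\\d{4}\\.\\d{2}\\.\\d{2} \\d{2}:\\d{2}:\\d{2}) ConnectionHandler: "
        + "Closing connection that has been idle:.*?R: /([0-9.]+):(\\d+)");

    // Returns "timestamp clientIp" for every idle-close line; other lines are skipped.
    static List<String> idleCloses(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = IDLE_CLOSE.matcher(line);
            if (m.find()) {
                out.add(m.group(1) + " " + m.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "2009.08.17 08:23:03 ConnectionHandler: Closing connection that has been idle: "
                + "org.jivesoftware.openfire.nio.NIOConnection@11a1774 MINA Session: "
                + "(SOCKET, R: /10.135.200.149:1249, L: /10.135.210.50:5222, S: 0.0.0.0/0.0.0.0:5222)",
            "2009.08.17 08:24:05 309638 (01/05/00) - Connection #48 tested: OK");
        for (String hit : idleCloses(sample)) {
            System.out.println(hit);
        }
    }
}
```

Run against the two sample lines, this prints only the idle close from 10.135.200.149; the database pool message is ignored.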
I’m also familiar with this kind of problem with Openfire + Pidgin. At the company where I work, we deployed an Openfire 3.6.4 server, and all of our users connect via Pidgin. After a while we noticed that the server randomly closes connections. I tried changing the configuration of Openfire and Pidgin, and also tried to manage traffic to the server, but nothing helped. After some research I found this ticket on the Pidgin website: http://developer.pidgin.im/ticket/10767
Just out of curiosity, is this Openfire server on a VM or bare metal? Mine is on a VM… not sure if there is an underlying issue there. My issue also seems to happen regardless of whether iptables is enabled or disabled. With the latest Pidgin, 2.6.6, I can confirm this issue still happens at random.
Yes, the Pidgin developers finally managed to implement keepalive packets. This is a bit late now, but you could have eliminated this issue by modifying the idle settings: http://www.igniterealtime.org/community/docs/DOC-2053
Can’t really say, because I’ve already moved to an ejabberd-2.1.3 server and the problem has gone: no random disconnections. I think the problem lies not with the Pidgin client but with Openfire. The Pidgin devs also claim that Openfire uses a strange keepalive policy. Openfire requires keepalive packets to be sent at a specific time interval, regardless of whether there is any other packet transmission between client and server. Pidgin, on the other hand, sends keepalives only when there is no traffic to or from the server. So if, for example, one or more contacts on our list change status often, Pidgin sees that there is traffic from the server and won’t send a keepalive packet; as a result, Openfire decides that, since no keepalive was sent, the session is dead and closes the connection. IMO Pidgin’s policy is quite logical and bandwidth-saving. Imagine Openfire with 1KK (a million) users logged in at once, all sending keepalive packets simultaneously… Looks like packet flooding to me.
PS: Sorry for my English, I haven’t used it for a while.
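The keepalive mismatch described above can be illustrated with a toy simulation. This is only a sketch of the two policies as the poster describes them, not of the real Openfire or Pidgin code: the server drops a session when the client has sent nothing for a fixed time, while the client sends a keepalive only after the line has been quiet in both directions. The class name, parameters and the numbers (360 s server timeout, 30 s client interval) are made up for illustration and are not the defaults of either product.

```java
// Toy model of the two keepalive policies described in this thread; all values are illustrative.
final class KeepaliveSketch {

    // Returns the second at which the server drops the session,
    // or -1 if the session survives until 'horizon' seconds.
    static int simulate(int serverIdleTimeout, int clientKeepaliveInterval,
                        int serverTrafficPeriod, int horizon) {
        int lastClientSend = 0; // last time the client sent anything to the server
        int lastAnyTraffic = 0; // last time the client saw traffic in either direction
        for (int t = 1; t <= horizon; t++) {
            // The server pushes something (e.g. a presence update) every serverTrafficPeriod seconds.
            if (serverTrafficPeriod > 0 && t % serverTrafficPeriod == 0) {
                lastAnyTraffic = t;
            }
            // Client policy: send a keepalive only after a quiet period in both directions.
            if (t - lastAnyTraffic >= clientKeepaliveInterval) {
                lastClientSend = t;
                lastAnyTraffic = t;
            }
            // Server policy: drop the session if the client itself has been silent too long.
            if (t - lastClientSend >= serverIdleTimeout) {
                return t;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Quiet roster: no server traffic, so the client keeps sending keepalives.
        System.out.println(simulate(360, 30, 0, 3600));  // prints -1 (session survives)
        // Busy roster: a presence update every 10 s keeps the client from ever sending a keepalive.
        System.out.println(simulate(360, 30, 10, 3600)); // prints 360 (dropped as idle)
    }
}
```

In this model the busy session is the one that gets dropped, which would match the reports of seemingly random disconnects for users with active contact lists, assuming the poster's description of both policies is accurate.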