BOSH connection not stable

My BOSH connection keeps disconnecting from Openfire. It looks like Openfire itself is dropping it.

2016.10.10 19:15:33 org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Internal server error
java.lang.NullPointerException
    at org.jivesoftware.openfire.muc.spi.LocalMUCRoom.sendPrivatePacket(LocalMUCRoom.java:1077)
    at org.jivesoftware.openfire.muc.spi.LocalMUCUser.process(LocalMUCUser.java:293)
    at org.jivesoftware.openfire.muc.spi.LocalMUCUser.process(LocalMUCUser.java:197)
    at org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl.processPacket(MultiUserChatServiceImpl.java:352)
    at org.jivesoftware.openfire.component.InternalComponentManager$RoutableComponents.process(InternalComponentManager.java:606)
    at org.jivesoftware.openfire.spi.RoutingTableImpl.routeToComponent(RoutingTableImpl.java:406)
    at org.jivesoftware.openfire.spi.RoutingTableImpl.routePacket(RoutingTableImpl.java:248)
    at org.jivesoftware.openfire.MessageRouter.route(MessageRouter.java:139)
    at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:83)
    at org.jivesoftware.openfire.net.StanzaHandler.processMessage(StanzaHandler.java:378)
    at org.jivesoftware.openfire.net.ClientStanzaHandler.processMessage(ClientStanzaHandler.java:113)
    at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:232)
    at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:199)
    at org.jivesoftware.openfire.nio.ConnectionHandler.messageReceived(ConnectionHandler.java:181)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain$TailFilter.messageReceived(DefaultIoFilterChain.java:690)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
    at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:109)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
    at org.apache.mina.filter.codec.ProtocolCodecFilter$ProtocolDecoderOutputImpl.flush(ProtocolCodecFilter.java:407)
    at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:236)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:417)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1200(DefaultIoFilterChain.java:47)
    at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:765)
    at org.apache.mina.core.filterchain.IoFilterEvent.fire(IoFilterEvent.java:74)
    at org.apache.mina.core.session.IoEvent.run(IoEvent.java:63)
    at org.apache.mina.filter.executor.OrderedThreadPoolExecutor$Worker.runTask(OrderedThreadPoolExecutor.java:769)
    at org.apache.mina.filter.executor.OrderedThreadPoolExecutor$Worker.runTasks(OrderedThreadPoolExecutor.java:761)
    at org.apache.mina.filter.executor.OrderedThreadPoolExecutor$Worker.run(OrderedThreadPoolExecutor.java:703)
    at java.lang.Thread.run(Thread.java:745)

Hi Munkhuu,

I’ve occasionally seen something similar on 4.0.1. Have you tried running HEAD to see whether it behaves differently?

The key seems to be that, under certain pathological network conditions, the BOSH connection timeout path doesn’t add its response to the list of previous response bodies. In some cases the client then re-requests the RID of that timed-out connection. When the server sees a RID that isn’t a “new RID” but has no stored response for it, it throws an exception, which closes the session.

The next incoming request from the client (with the next valid RID) is then made against a session that has already been closed, and you see the trace you have above (404 - Invalid SID value).
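To make that sequence concrete, here is a minimal Java sketch of the behaviour as I understand it. It is a toy model, not Openfire's code: the class BoshSessionSketch, its request method, the storeTimeoutBodies flag and the response strings are all made up for illustration, and the RID handling is much simpler than what XEP-0124 actually allows. With the flag off, the timeout response is never remembered, so a re-request of that RID closes the session. With the flag on, the stored body is simply replayed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of one BOSH session. NOT Openfire's HttpSession.
final class BoshSessionSketch {
    static final String TIMEOUT_BODY = "<body/>";              // empty body sent when a held request times out
    static final String UNEXPECTED_RID = "unexpected RID";      // stand-in for the exception path
    static final String INVALID_SID = "404 - Invalid SID value"; // wording taken from the post above

    // Responses already sent, keyed by RID, kept so they can be replayed.
    private final Map<Long, String> sentBodies = new HashMap<>();
    private final boolean storeTimeoutBodies; // false models the suspected bug
    private long lastRid;
    private boolean closed;

    BoshSessionSketch(long initialRid, boolean storeTimeoutBodies) {
        this.lastRid = initialRid;
        this.storeTimeoutBodies = storeTimeoutBodies;
    }

    boolean isClosed() {
        return closed;
    }

    // Handles one client request; timesOut says whether the held request expired.
    String request(long rid, boolean timesOut) {
        if (closed) {
            return INVALID_SID; // the session is gone, so the SID is no longer known
        }
        if (rid == lastRid + 1) { // a "new RID"
            lastRid = rid;
            String body = timesOut ? TIMEOUT_BODY : "<body>payload " + rid + "</body>";
            if (!timesOut || storeTimeoutBodies) {
                sentBodies.put(rid, body);
            }
            return body;
        }
        // Not a new RID: treat it as a re-request and replay the stored body if there is one.
        String previous = sentBodies.get(rid);
        if (previous != null) {
            return previous;
        }
        closed = true; // no stored response for this RID: the session is closed
        return UNEXPECTED_RID;
    }

    public static void main(String[] args) {
        for (boolean store : new boolean[] {false, true}) {
            BoshSessionSketch session = new BoshSessionSketch(0, store);
            session.request(1, false); // answered normally
            session.request(2, true);  // held request times out
            System.out.println("storeTimeoutBodies=" + store);
            System.out.println("  re-request of RID 2: " + session.request(2, false));
            System.out.println("  next RID 3: " + session.request(3, false));
        }
    }
}
```

The only difference between the two runs in main is whether the timeout body is stored, which is the change I would expect a patch to make.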

Personally, I find it hard to reproduce this without using something like a 3G connection or an international backbone link, where network packets can arrive out of order or be delayed or dropped.

I’m looking into it a little and will try and provide a patch to cope with this edge case soon.

(Openfire Devs, this seems to be the onTimeout code in the AsyncListener created in HttpSession.java: the “deliverBody” response sent on timeout is never stored in the sentElements array, so any re-request of that RID takes the “unexpected RID” path, which in turn closes the session.)

Regards

Dan