Openfire on Amazon: degradation with lost connections, and clients cannot connect after a restart

Hi all.

First of all, sorry for my English.

We have deployed an Openfire XMPP server on Amazon, initially as a standalone installation.

In our production environment we are running Openfire 3.10.0 with these plugins:

  • Clustering Plugin 1.3.0 (Jive Software)
  • DB Access 1.1.0 (Daniel Henninger)
  • Email Listener 1.1.0 (Jive Software)
  • Monitoring Service 1.4.2 (Jive Software)
  • Presence Service 1.6.0 (Jive Software)
  • Search 1.6.0 (Ryan Graham)
  • User Import Export 2.4.0 (Ryan Graham)
  • User Service 2.0.2 (Roman Soldatow, Justin Hunt)

The infrastructure is:

  • One m3.xlarge server, where Openfire is running
  • One database:
    • SQL Server SE 10.50.2789.0.v1
    • Instance class: db.m1.large
    • Storage type: Magnetic
    • Fixed IOPS: disabled
    • Storage: 300 GB

We currently have two major problems:

  • Openfire suddenly degrades and loses client connections (from approx. 6.6k down to approx. 2k)
  • After a restart, it does not accept connections from all of the existing clients (6.6k)

To mitigate this we preemptively restart the server every 15 days, but every time we run into the concurrency problem: not all clients manage to reconnect.

We believe that the current instance has a theoretical limit of about 15k connections. Is that correct?
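One bound that can be checked independently of the instance size: on Linux, every client socket consumes a file descriptor, so the per-process open-file limit caps the number of connections. Below is a minimal sketch for reading that limit, assuming Openfire runs on a Linux host and its process ID is known. The function names are hypothetical helpers and are not part of Openfire.

```python
from pathlib import Path


def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' limits from /proc/<pid>/limits text.

    Each value is an int, or None when the kernel reports 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units.
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' line found")


def open_file_limits(pid):
    """Read the open-file limits of a running process (Linux only)."""
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())
```

Calling `open_file_limits(pid)` with the PID of the Openfire Java process would show whether the soft limit is anywhere near the number of connections we expect. This is only one possible ceiling and says nothing about where the 15k figure comes from.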

In addition, to prepare for the expected growth, we tried deploying version 3.11.0 to production using Hazelcast for clustering with 2 nodes (2 * m3.xlarge). The problem is that this setup is less stable than the current standalone one: the last time we tried it, it ran for eight hours and then went down.

We would like to know whether anybody has experienced similar problems on the Amazon platform or with this volume of connections (currently close to 7k).

Thanks in advance.

Best regards.

There are some issues with 3.10.0 and 3.10.1 that were resolved in the latest release, 3.10.2.

Thanks for your answer, speedy.

This is one of the options we are analysing, but it requires migrating from the Coherence clustering plugin to the Hazelcast plugin, and we need to validate that change.

To give more information: Openfire went down again yesterday. These are the log entries we got when it started to degrade:

2015.08.26 20:44:35 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@36576063 MINA Session: (SOCKET, R: /10.11.15.20:53634, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:37 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@2b53bd4a MINA Session: (SOCKET, R: /10.11.15.20:53638, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:40 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@49627284 MINA Session: (SOCKET, R: /10.11.24.48:32205, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:41 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@46cf7ec3 MINA Session: (SOCKET, R: /10.11.24.48:34535, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:44 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@34a52497 MINA Session: (SOCKET, R: /10.11.15.20:53650, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:44 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@46055081 MINA Session: (SOCKET, R: /10.11.15.20:53647, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:47 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@31ef5e5e MINA Session: (SOCKET, R: /10.11.15.20:53645, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:47 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@5a1c86ee MINA Session: (SOCKET, R: /10.11.15.20:53658, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:48 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@6dea17a MINA Session: (SOCKET, R: /10.11.15.20:53649, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:49 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@175d6060 MINA Session: (SOCKET, R: /10.11.24.48:34537, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)

2015.08.26 20:44:51 - WARN - org.jivesoftware.openfire.nio.NIOConnection - No ACK was received when sending stanza to: org.jivesoftware.openfire.nio.NIOConnection@48cfe519 MINA Session: (SOCKET, R: /10.11.15.20:53642, L: /10.11.2.207:5222, S: 0.0.0.0/0.0.0.0:5222)
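
To see which remote addresses these warnings come from, the entries can be tallied with a short script. This is a minimal sketch in Python, assuming the log keeps the format shown above. The regular expression and the function name are my own and are not part of Openfire.

```python
import re
from collections import Counter

# Matches the remote ("R:") address of a MINA session,
# e.g. "R: /10.11.15.20:53634".
REMOTE_RE = re.compile(r"R: /(\d+\.\d+\.\d+\.\d+):(\d+)")


def count_no_ack_by_remote(lines):
    """Count 'No ACK was received' warnings per remote IP address."""
    counts = Counter()
    for line in lines:
        if "No ACK was received" not in line:
            continue  # ignore unrelated log entries
        match = REMOTE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing the lines of the log file to `count_no_ack_by_remote` returns a `Counter` keyed by IP address. For the excerpt above it gives 8 warnings for 10.11.15.20 and 3 for 10.11.24.48, so all of them come from just two remote addresses.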

Thanks.