Openfire 4.0.1: steadily growing number of unidentified sockets

We use Openfire 4.0.1 with BOSH only.

Currently ~5000 users are created, and ~2500 of them are online at a time.

OS: Red Hat. RAM given to Openfire: 10 GB.

While users are online, the number of open files for the Openfire process keeps growing. With lsof we can see a large number of entries marked "can't identify protocol", and that number keeps increasing.

The max open files limit for the Openfire process is 100k.
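
For anyone who wants to track this over time, here is a minimal sketch (not something Openfire ships) that reads lsof output from standard input and counts how many lines carry the "can't identify protocol" marker versus everything else. The class name LsofSummary and the two category labels are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class LsofSummary {
    // Text lsof prints in the NAME column when it cannot tell a socket's protocol.
    static final String UNIDENTIFIED = "can't identify protocol";

    /** Counts lsof output lines in two buckets: "unidentified" sockets and "other". */
    static Map<String, Integer> summarize(List<String> lsofLines) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("unidentified", 0);
        counts.put("other", 0);
        for (String line : lsofLines) {
            // Skip empty lines and the lsof header row.
            if (line.trim().isEmpty() || line.startsWith("COMMAND")) {
                continue;
            }
            String key = line.contains(UNIDENTIFIED) ? "unidentified" : "other";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Reads whatever lsof printed, one entry per line, from standard input.
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            List<String> lines = in.lines().collect(Collectors.toList());
            System.out.println(summarize(lines));
        }
    }
}
```

It could be fed with something like `lsof -p <pid> | java LsofSummary.java`, run every few minutes, to see whether only the unidentified sockets grow or all descriptors do.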

lsof list is attached
lsof.txt.zip (159505 Bytes)

Added a new lsof result with 22623 open files. When I first posted it was 19169.

When it reaches the max open files limit we need to restart Openfire. But that's not a solution.

Does anyone have the same problem, or maybe know how to resolve it?
lsof2.zip (178336 Bytes)

Updated to 4.0.2, but the problem still exists.

We have restarted the servers 5 times since this morning. Every time it crashes with “too many opened files”, even with the limit at 100k.

We use only BOSH, with strophe.js.

Are you restarting the operating system or just Openfire? Have you re-execed the shell you are using after making the limit changes? What do you currently have set in /etc/security/limits.conf ?
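
One way to check whether the new limits actually reached the running process is to read /proc/<pid>/limits on Linux. A rough sketch follows; the class name ProcLimits is invented here, and it assumes the usual layout of that file (limit name, then soft and hard values, then units).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class ProcLimits {
    /** Returns {soft, hard} for the named limit row, or null if the row is absent. */
    static String[] parseLimit(List<String> limitsLines, String name) {
        for (String line : limitsLines) {
            if (line.startsWith(name)) {
                // After the name come whitespace-separated columns: soft, hard, units.
                String[] cols = line.substring(name.length()).trim().split("\\s+");
                if (cols.length >= 2) {
                    return new String[] { cols[0], cols[1] };
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Pass the Openfire PID as the first argument; defaults to this JVM itself.
        String pid = args.length > 0 ? args[0] : "self";
        List<String> lines = Files.readAllLines(Paths.get("/proc", pid, "limits"));
        for (String name : new String[] { "Max open files", "Max processes" }) {
            String[] v = parseLimit(lines, name);
            System.out.println(name + ": "
                    + (v == null ? "not found" : "soft=" + v[0] + " hard=" + v[1]));
        }
    }
}
```

If the values printed for the Openfire PID differ from what is in /etc/security/limits.conf, the process was probably started from a session that never picked up the change.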

Ouch. I don’t have an immediate answer for you, sorry.

Is there any clue in the log files?

When you navigate in the admin console (particularly the ‘sessions’ tab), does anything seem off? (for example, is the pagination wrong, are you missing sessions, users, something like that?)

Yes, the servers were restarted after setting the limits. We also raised max user processes to 95k.

Will monitor tomorrow.

We figured out a relationship: when "can not create native thread" shows up in error.log, the number of open files increases fast. Today we raised max user processes (soft and hard limit) to 95k. Will monitor tomorrow.

When you navigate in the admin console (particularly the ‘sessions’ tab), does anything seem off? (for example, is the pagination wrong, are you missing sessions, users, something like that?)
In the sessions tab everything seems OK. Sometimes we get a duplicated session for one user.

Right after we increased the max process limit we got the error again. Added a screenshot of JVisualVM from one of the servers with 4 GB Xmx. Look at the number of threads. error.log and all.log are also attached.

error.log.zip (5900 Bytes)
all.log.zip (18625 Bytes)

The log is full of warnings related to Hazelcast, which is what we use for clustering (@Tom Evans does this ring any bell?). Are you indeed running an Openfire cluster? If not, try removing the hazelcast / clustering plugin.

In case you are running a cluster: do you have the opportunity to run on a single node instead (shut every other node down, and remove the cluster plugin)? A load of 2500 concurrent users is typically easily handled by one server (especially one with the specifications that you gave).

Since Monday we have turned off clustering and are using just one server, with the cluster option disabled. The Hazelcast plugin is not removed.

We have some good news, but not enough to reach the goal. Just now we got errors again. In JConsole we can see about 68k Jetty-QTP-BOSH threads. I think that's not good…

Try removing the plugin completely (and restart Openfire to be sure, if you can).

Are you sure that you have 68,000 live threads (your JConsole shows only a couple hundred threads)? You cannot base that on the thread name alone, especially not when your server is in a loop of trying to create a thread (and generate a new name), but failing to do so.
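
To illustrate the difference between live threads and threads ever started, here is a small sketch using the standard java.lang.management API, which exposes the live, peak, and total-started thread counts of a JVM. As written it inspects its own JVM only; pointing it at Openfire would need a JMX connection, which is left out here. The class name LiveThreadCount is invented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class LiveThreadCount {
    /** Returns {live, peak, totalStarted} thread counts for the current JVM. */
    static long[] snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return new long[] {
            mx.getThreadCount(),            // threads alive right now
            mx.getPeakThreadCount(),        // highest live count seen so far
            mx.getTotalStartedThreadCount() // every thread started since JVM start
        };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("live=" + s[0] + " peak=" + s[1] + " totalStarted=" + s[2]);
    }
}
```

A large "total started" next to a small "live" value means threads are being created and discarded, not that they all exist at once.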

It would be very odd that you have so many threads from that thread pool. The size of that pool is limited, by default to 200. You can override this default by setting the httpbind.client.processing.threads property.

What is the difference between

xmpp.httpbind.worker.threads

and

xmpp.client.processing.threads?

We set these values:

xmpp.client.processing.threads=32

xmpp.httpbind.worker.threads=400

client is used for TCP connections (non-BOSH connections to port 5222), httpbind is used for BOSH.

But, again, are you sure that you actually have 68k concurrent threads? I think this is a false lead.

I think I was mistaken: the numbers in the thread names are just an incrementing value, not the number of live threads.
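
A tiny demonstration of that effect, using plain java.util.concurrent classes and not Jetty's QueuedThreadPool: threads are named with an incrementing counter, each one finishes before the next starts, so the suffix climbs while the live count stays at zero. The prefix "demo-pool-" and the class name are hypothetical.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadNameCounter {
    static final String PREFIX = "demo-pool-"; // hypothetical thread name prefix

    /** Counts threads that are alive right now and whose name starts with the prefix. */
    static long liveWithPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(prefix))
                .count();
    }

    /**
     * Starts n short-lived threads one after another.
     * Returns {highest name suffix handed out, live threads with the prefix afterwards}.
     */
    static long[] churn(int n) throws InterruptedException {
        AtomicInteger created = new AtomicInteger();
        ThreadFactory factory = r -> new Thread(r, PREFIX + created.incrementAndGet());
        for (int i = 0; i < n; i++) {
            Thread t = factory.newThread(() -> { });
            t.start();
            t.join(); // wait until this thread has terminated before creating the next
        }
        return new long[] { created.get(), liveWithPrefix(PREFIX) };
    }

    public static void main(String[] args) throws InterruptedException {
        long[] r = churn(5);
        System.out.println("highest suffix=" + r[0] + " live=" + r[1]);
    }
}
```

So a thread called something-68000 only says that many names were handed out over time, which matches the point made earlier about not trusting the name alone.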

And what is the difference between httpbind-worker and Jetty-QTP-BOSH? In the sources I found:

final int processingThreads = JiveGlobals.getIntProperty(HTTP_BIND_THREADS, HTTP_BIND_THREADS_DEFAULT);

final QueuedThreadPool tp = new QueuedThreadPool(processingThreads);

tp.setName("Jetty-QTP-BOSH");

And as you said, HTTP_BIND_THREADS=xmpp.httpbind.worker.threads is the number of TCP connections that Openfire+Jetty can accept.

Did you tune RHEL? Tuning the OS kernel may be a good idea to make sure that it is not the limit.

E.g.: "Linux TCP/IP tuning for scalability" or the "Red Hat Enterprise Linux Network Performance Tuning Guide".

xmpp.httpbind.client.requests.wait

What does this parameter mean?

If we set it to 30 seconds, will it close the connections of a user who has closed the browser but is still shown as online?

We are now getting some errors:

2016.04.11 12:13:50 org.eclipse.jetty.server.HttpChannel - /http-bind/
java.lang.IllegalStateException: state=EOF
    at org.eclipse.jetty.server.HttpInput.setReadListener(HttpInput.java:376)
    at org.jivesoftware.openfire.http.HttpBindServlet.doPost(HttpBindServlet.java:152)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at org.jivesoftware.openfire.http.HttpBindServlet.service(HttpBindServlet.java:116)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:189)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handleAsync(Server.java:549)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:348)
    at org.eclipse.jetty.server.HttpChannel.run(HttpChannel.java:262)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)

and also

2016.04.11 12:19:12 org.eclipse.jetty.server.HttpInput - java.lang.OutOfMemoryError: unable to create new native thread
2016.04.11 12:19:12 org.jivesoftware.openfire.http.HttpBindServlet - Error reading request data from [192.168.72.24]
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1357)
    at org.jivesoftware.openfire.http.HttpSessionManager.execute(HttpSessionManager.java:370)
    at org.jivesoftware.openfire.http.HttpSession$HttpPacketSender.init(HttpSession.java:1275)
    at org.jivesoftware.openfire.http.HttpSession$HttpPacketSender.access$100(HttpSession.java:1262)
    at org.jivesoftware.openfire.http.HttpSession.forwardRequest(HttpSession.java:591)
    at org.jivesoftware.openfire.http.HttpBindServlet.handleSessionRequest(HttpBindServlet.java:250)
    at org.jivesoftware.openfire.http.HttpBindServlet.processContent(HttpBindServlet.java:205)
    at org.jivesoftware.openfire.http.HttpBindServlet$ReadListenerImpl.onAllDataRead(HttpBindServlet.java:423)
    at org.eclipse.jetty.server.HttpInput.run(HttpInput.java:443)
    at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1175)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:355)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
warn.log.1.zip (18046 Bytes)