REST and Sessions link stop working after > 80k online users

The REST API and the Sessions link in the Openfire admin console stop working once more than about 80k users are online.

I tested this on Openfire versions prior to v4 and saw the same issue. I am currently on the latest version, 4.0.3, and it has the issue as well.

The threshold is not exactly 80k users, but somewhere between 80k and 90k.

I am using 4 tsung servers to load test this 2-node cluster. The issue appears after approximately 80k users are online. Even when all 110k users are online and the CPU load has dropped to zero, I cannot use REST, and clicking the Sessions link in Openfire times out. I’ll also add that even with 20k users or fewer, the Sessions link is terribly slow to respond and often times out.

When I start dropping clients from the tsung servers, REST and the Sessions link start working again as soon as the count falls below the 80k mark. I’ve also tested without the Hazelcast plugin on a single server and never have this issue there: the Sessions link responds very fast and REST works the entire time.

Update: I used a single tsung server to load test Node 1 of the 2 node cluster. I left Node 2 alone with no connections. I logged into Node 2’s OpenFire web portal to see if clicking on the Sessions link made any difference and it did not. It timed out, even after all users were logged in and idle.

@Tom Evans

I am unsure why you tagged Tom on your post. Please refrain from doing that…

Are you HTTP proxying your requests to the Openfire REST plugin, or going directly to the Openfire web console port?

Tom is the one who works with the Hazelcast plugin, isn’t he? This appears to be a Hazelcast issue, so I thought he should be tagged.

I have a webserver which runs a php script every time I visit that particular webpage or refresh it. It uses CURL to send the REST request to the OpenFire server. My webserver also uses REST to modify users and rosters the same way. Here is what the code looks like:

<?php
// Shared secret key configured for the REST API plugin
$secret = 'MyPass';
$url = "http://myopenfireserver:9090/plugins/restapi/v1/system/statistics/sessions";

// Stream context options: send the secret in the Authorization header
$options = array(
    'http' => array(
        'method' => "GET",
        'header' => "Authorization: " . $secret . "\r\n"
    )
);

$context = stream_context_create($options);
$response = file_get_contents($url, false, $context);
echo $response;
?>
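
Since the symptom is a timeout, a variant with an explicit timeout and a timer may help tell "slow" apart from "not answering at all". This is only a rough sketch: the helper names build_options and timed_get and the 10-second value are made up for illustration and are not part of the script above.

```php
<?php
// Hypothetical helper: same request options as above, plus an explicit timeout.
function build_options(string $secret, float $timeoutSeconds): array
{
    return array(
        'http' => array(
            'method'  => "GET",
            'header'  => "Authorization: " . $secret . "\r\n",
            'timeout' => $timeoutSeconds, // read timeout, in seconds
        ),
    );
}

// Hypothetical helper: returns array(body or false, elapsed seconds).
function timed_get(string $url, array $options): array
{
    $start = microtime(true);
    // @ hides the PHP warning; a failure shows up as a false body instead
    $body = @file_get_contents($url, false, stream_context_create($options));
    return array($body, microtime(true) - $start);
}
```

Calling timed_get($url, build_options($secret, 10.0)) and logging the elapsed time next to the current online-user count would show roughly where responses start to degrade.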

Update:

I’m testing MySQL instead of MSSQL and also making some changes to the Tsung server config files. This may have something to do with this problem.

It appears this was an issue with the tsung server settings when load balancing. I’m not sure why a single Openfire server has no issues while a cluster does. Previously I had each tsung server set to log in 32,000 users, and that is when the Sessions link and REST API would quit working. I lowered each tsung server to 26,000 users and added a fifth tsung server, so I now have 5 tsung servers load testing at 26,000 each = 130k users online with no issues.
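
For reference, the totals for the two setups can be checked with a quick calculation. This is just a sketch: total_users is a made-up helper name, and the first figure assumes all four original tsung servers used the 32,000 setting.

```php
<?php
// Hypothetical helper: configured total users across all tsung servers.
function total_users(int $servers, int $usersPerServer): int
{
    return $servers * $usersPerServer;
}

echo total_users(4, 32000), "\n"; // 128000 (old setup, assuming 4 x 32,000)
echo total_users(5, 26000), "\n"; // 130000 (new setup, 5 x 26,000)
```

The configured totals come out almost the same, which would suggest that the per-server login count mattered more than the overall number of users.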

Also, I never used more than around 4GB RAM on each of the cluster servers.