Introducing Hazelcast ... a new way to cluster Openfire!

I tried #2, but the folder is erased and recreated on each restart. Copying a configuration file inside the JAR isn’t very promising either. I should probably take a closer look at #3, but it still looks like the configuration file must be placed inside a JAR (or not, if the plugin can add a folder to the classpath). Maybe it makes more sense to patch Openfire to add the conf folder to the plugins’ classpath? It would certainly be much cleaner than placing configuration files inside archives.

Edit: It seems Hazelcast already offers an alternative: point it at the configuration file through the hazelcast.config system property. Unfortunately, the clustering plugin bypasses this and goes directly to ClasspathXmlConfig.

The plugin is enabled on both OF nodes and they use the same DB: the list of users is the same on both servers.

Any other ideas?

Edit: it’s better without French in the message ^^

Not sure what ‘plugin is enabled’ means, but I was talking about going into OF’s console (the one that runs on ports 9094/9095) and making sure clustering is enabled. Simply copying the JAR into the plugins folder won’t do it.

Don’t worry, I understand a little French.

I meant that I uploaded the JAR file and then enabled clustering in the “Clustering” tab (by default, the OF admin panel runs on ports 9090 and 9091).

Check nohup.out. You should find something like this:

INFO: [10.46.37.118]:5701 [OF37-cluster]

Members [1] {

Member [AA.BB.CC.DD]:5701 this

}

If nodes can see each other, there should be more than one member in the group.
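If the log is long, the membership check described above can be scripted. The following is only a sketch: it assumes the membership block is printed in exactly the format shown in this thread ("Members [N] {", one "Member [host]:port" line per node, then "}"), and the function name last_member_list is made up for illustration.

```python
import re

# Header of a membership block as it appears in this thread, e.g. "Members [2] {".
MEMBERS_HEADER = re.compile(r"Members \[(\d+)\] \{")
# One entry of the block, e.g. "Member [192.168.1.11]:5701 this".
MEMBER_LINE = re.compile(r"Member \[([^\]]+)\]:(\d+)( this)?")


def last_member_list(log_text):
    """Return the members of the last 'Members [N] {...}' block in log_text.

    Each member is a (host, port, is_local) tuple, where is_local is True
    for the entry marked "this". An empty list means no block was found.
    """
    members = []
    in_block = False
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if MEMBERS_HEADER.match(line):
            # A newer block replaces whatever was collected before.
            members = []
            in_block = True
        elif in_block and line == "}":
            in_block = False
        elif in_block:
            entry = MEMBER_LINE.match(line)
            if entry:
                members.append(
                    (entry.group(1), int(entry.group(2)), entry.group(3) is not None)
                )
    return members
```

Feed it the contents of nohup.out (or whatever file captures stdout). A result with a single entry means the node only sees itself.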

I can’t find this file.

Strange. If you start Openfire with service openfire start, this file gets created under /opt/openfire/logs.

Alternatively, the info we’re looking for should be on stdout. Check whether you have this redirected.

I installed the .deb provided by the OF team.

I’ve got the logs in /var/log/openfire: debug.log, error.log, info.log and warn.log. But none of these files has an entry like that.

Where is stdout?

Well, stdout is the console. On Fedora/CentOS/RHEL, it goes to nohup.out. But I don’t know about Debian. Maybe you should remove --quiet from the start command line?

Hi,

I have been looking for a clustering plugin for some time now, and it was really good news to see this post the other day.

2 questions:

I have 2 Openfire servers with exactly the same setup. I have installed the clustering plugin on both and I see the local node on each, but I don’t see a second node on either of them!

What else do I need to configure to see the other Openfire node?

Currently they use the same database that is pointing to the same data node in a MySQL Cluster Replication (synchronized - Galera).

At some point, though, I want each Openfire server to point to a different data node, so that if one data node fails I am still in business.

Is this kind of database cluster setup possible with the clustering plugin?

I really appreciate any feedback.

Thank you!

To ensure that the member nodes in an Openfire cluster are able to find each other, try using TCP-based discovery in lieu of the default UDP/multicast configuration (described above) and see if you get a better result. The upside of this approach is a reliable point-to-point communication path between servers; the main drawback of using TCP is the need for static configuration of the well-known cluster member(s).
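Before switching to TCP-based discovery, it can help to confirm that each node can open a plain TCP connection to the other. Here is a minimal sketch, assuming the cluster port is 5701 as shown in the log output in this thread; the helper name can_reach is made up for illustration. A successful connection only shows that the port is open and not firewalled, not that Hazelcast has actually joined the cluster.

```python
import socket


def can_reach(host, port=5701, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    5701 is the port that appears in the member list in this thread;
    adjust it if your nodes listen elsewhere.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

Run it on each node against the other node’s address, for example can_reach("192.168.1.12") from 192.168.1.11. If it returns False in either direction, look at firewalls and bind addresses first.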

As for your database cluster, in theory it should work just fine, but I personally have not used a multi-master setup for MySQL. Perhaps you could give it a whirl and report back with your findings.

Please also note that the new Hazelcast clustering plugin is only compatible with Openfire 3.7.2 Beta (not yet released) and newer. If you would like to try it using an older Openfire, try using this custom build which has been back-ported by Dele (one of our helpful @community advocates).

I ran into a critical bug while using the Hazelcast-based cluster.

I always get the following exception, after which the Openfire server hangs: the process still exists, but it does not work.

I use Openfire 3.7.2 beta with Hazelcast 2.3.1 (Hazelcast 2.4 still has this error).

2012.10.29 14:57:47 org.jivesoftware.util.cache.CacheFactory - Hazelcast Instance is not active!
java.lang.IllegalStateException: Hazelcast Instance is not active!
    at com.hazelcast.impl.FactoryImpl.initialChecks(FactoryImpl.java:711)
    at com.hazelcast.impl.MProxyImpl.beforeCall(MProxyImpl.java:102)
    at com.hazelcast.impl.MProxyImpl.access$000(MProxyImpl.java:49)
    at com.hazelcast.impl.MProxyImpl$DynamicInvoker.invoke(MProxyImpl.java:64)
    at $Proxy0.getLocalMapStats(Unknown Source)
    at com.hazelcast.impl.MProxyImpl.getLocalMapStats(MProxyImpl.java:258)
    at com.jivesoftware.util.cache.ClusteredCache.getCacheSize(ClusteredCache.java:140)
    at org.jivesoftware.util.cache.CacheWrapper.getCacheSize(CacheWrapper.java:73)
    at com.jivesoftware.util.cache.ClusteredCacheFactory.updateCacheStats(ClusteredCacheFactory.java:344)
    at org.jivesoftware.util.cache.CacheFactory$1.run(CacheFactory.java:636)

Can anybody give me some advice?

I managed to see stdout via a script I created.

Both nodes can see each other:

INFO: [192.168.1.11]:5701 [openfire]

Members [2] {

Member [192.168.1.11]:5701 this

Member [192.168.1.12]:5701

}

Any clue about my first problem, the fact that I can’t see online contacts when the plugin is enabled (even with only one server)?

Strange, once I got the two nodes seeing each other, users, sessions, MUC rooms, all started working correctly. What does the openfire console say? Does it list both servers?

Does that mean that if the clustering plugin is enabled, we need at least two nodes for OF to work?

We have successfully tested our cluster using a single node, so that should not be an issue. Do you have any other plugins installed that may be conflicting with the cluster’s cache configuration?

I don’t know what happened, but OF crashed, and then each time I tried to go to the admin panel it asked me to set up OF. So I installed OF from scratch, and now the clustering plugin is the only plugin installed and enabled.

No, I guess it can work with a single node, too (you obviously have two nodes now, as indicated in the output). I meant that when everything is ok, you can see both nodes in the OF’s console. That’s a good way of checking whether everything is in order or not. It is possible hazelcast instances can see each other, but OF didn’t connect them properly, for some reason.

Based on this exception, it appears that the cluster failed to start, but the cache statistics thread started anyway. Were there any other errors reported in the console or error logs? My guess is that you have a configuration issue, perhaps short on memory, or maybe a classpath conflict. Are there any other plugins deployed in your test cluster?

I can see both nodes in the OF admin console. I can join MUC rooms from other servers but I can’t see anyone in them, and the same goes for my roster: I can only see contacts from my own server.

Maybe all connections go to one server because of a DNS setting?

I think the “master” doesn’t share connections with the other node when it receives one.
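One way to test the DNS theory is to list the addresses a client would get for the server name. The sketch below makes several assumptions: the hostname in the usage note is a placeholder, 5222 is used because it is the standard XMPP client port, and the function name resolve_addresses is made up for illustration. It only looks at ordinary address records; many XMPP clients look up SRV records first, which this does not query.

```python
import socket


def resolve_addresses(hostname, port=5222, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses that hostname resolves to.

    The resolver argument exists so the lookup can be replaced in tests;
    by default the system resolver is used through socket.getaddrinfo.
    """
    infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); the IP
    # address is the first element of sockaddr for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

For example, resolve_addresses("chat.example.org") with your own server name in place of the placeholder. If only one node’s address comes back, every client that relies on that name will land on the same node.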