8716 Views 36 Replies Latest reply: Jan 13, 2010 8:44 AM by Guus der Kinderen
ddiggler Bronze 17 posts since
Nov 11, 2009

Dec 16, 2009 1:09 PM

Openfire service crashing daily

The service will run for a day, maybe two, then stop with the error message below.  The error is identical to the one in post http://www.igniterealtime.org/community/message/198731#198731.  I posted a reply in that thread 6 days ago and got no response, so I figured I'd try creating my own thread.  I also posted a thread at http://kraken.blathersource.org/node/266.  I have a feeling the problems may be related.

 

We are running Openfire 3.6.4 w/ 1.6.0_17 JVM

Windows Server 2003 SP2 (Virtual Server in VMware ESX)

Dual 2.3 GHz procs / 3 GB of RAM

-Xms1280m -Xmx1280m  (You can see all arguments in the error msg below.  We tried 256/256, 512/512, 1024/1024, 1024/1536 and 512/1024 as recommended in the linked article, with the same problems.)

 

Plugins:

Client Control: Only allow Spark client

Kraken IM Gateway 1.1.2: We control IM Gateway access through groups.  Only 170 users are allowed to use it.

Monitoring Service:  Full archiving is enabled.  User to User and group chats

Red5:  Used for a video conferencing test (3 users) and where JWCHAT (~200 JWCHAT users) is set up.  HTTP Binding is enabled for this.

Search: Used for easy searching...duh

User Import Export:  Used for migration of users from the old IM solution

 

 

We have 3000 accounts w/ a max of 1500 concurrent users.  We average 1300 to 1400 concurrent users.

 

Running SQL 2005 DB (on a dedicated SQL server)

Microsoft SQL Server Management Studio  9.00.4035.00
Microsoft Analysis Services Client Tools 2005.090.4035.00
Microsoft Data Access Components (MDAC)  2000.086.3959.00 (srv03_sp2_rtm.070216-1710)
Microsoft MSXML    2.6 3.0 6.0
Microsoft Internet Explorer  7.0.5730.13
Microsoft .NET Framework  2.0.50727.3082
Operating System    5.2.3790

 

Error Message...I excluded the process list due to length.  If this will help, I can post it.  If anyone has a solution, I am in desperate need of help. Everything worked great during our testing of 400-500 concurrent users.

 

#
# An unexpected error has been detected by Java Runtime Environment:
#
# java.lang.OutOfMemoryError: requested 835968 bytes for Chunk::new. Out of swap space?
#
#  Internal Error (414C4C4F434154494F4E0E43505000C7), pid=3188, tid=3168
#
# Java VM: Java HotSpot(TM) Server VM (1.6.0_03-b05 mixed mode)
# If you would like to submit a bug report, please visit:
http://java.sun.com/webapps/bugreport/crash.jsp
#

---------------  T H R E A D  ---------------

Current thread (0x4801f800):  JavaThread "CompilerThread1" daemon [_thread_in_native,]

Stack: [0x483b0000,0x48400000)
[error occurred during error reporting, step 110, id 0xc0000005]


Current CompileTask:
C2:2692      org.jivesoftware.openfire.plugin.SearchPlugin.replyDataFormResult(Ljava/util/Collection;Lorg/xmpp/packet/IQ;)Lorg/xmpp/packet/IQ; (416 bytes)

 

VM Arguments:
jvm_args: -Dexe4j.isInstall4j=true -Dexe4j.isService=true -Dexe4j.moduleName=C:\Program Files\Openfire\bin\openfire-service.exe -Dexe4j.processCommFile=C:\WINDOWS\TEMP\e4j_p3188.tmp -Dexe4j.tempDir= -Dexe4j.unextractedPosition=0 -Dexe4j.consoleCodepage=cp0 -Xrs -Xms1024m -Xmx1024m
java_command: <unknown>
Launcher Type: generic

Environment Variables:
PATH=C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;c:\program files\openfire\jre\bin
OS=Windows_NT
PROCESSOR_IDENTIFIER=x86 Family 6 Model 15 Stepping 8, GenuineIntel

 

---------------  S Y S T E M  ---------------

OS: Windows Server 2003 family Build 3790 Service Pack 2

CPU:total 2 (4 cores per cpu, 1 threads per core) family 6 model 15 stepping 7, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3

Memory: 4k page, physical 2096584k(1255816k free), swap 4194303k(4194303k free)

vm_info: Java HotSpot(TM) Server VM (1.6.0_03-b05) for windows-x86, built on Sep 24 2007 22:20:35 by "java_re" with unknown MS VC++:1310

 

 

 

We get the errors posted below for a few hours, then the Openfire admin console stops responding for 30-45 minutes, and then the service crashes.

 

------------------------------------------------------------------error------------------------------------------------------------

2009.12.16 11:36:00 [org.jivesoftware.openfire.nio.ConnectionHandler.exceptionCaught(ConnectionHandler.java:110)
]
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Unknown Source)
at net.kano.joscar.flap.AsynchronousFlapProcessor.<init>(AsynchronousFlapProcessor.java:29)
at net.kano.joscar.flap.ClientFlapConn.init(ClientFlapConn.java:82)
at net.kano.joscar.flap.ClientFlapConn.<init>(ClientFlapConn.java:73)
at net.sf.kraken.protocols.oscar.AbstractFlapConnection.<init>(AbstractFlapConnection.java:102)
at net.sf.kraken.protocols.oscar.LoginConnection.<init>(LoginConnection.java:41)
at net.sf.kraken.protocols.oscar.OSCARSession.logIn(OSCARSession.java:120)
at net.sf.kraken.protocols.oscar.OSCARTransport.registrationLoggedIn(OSCARTransport.java:95)
at net.sf.kraken.BaseTransport.processPacket(BaseTransport.java:398)
at net.sf.kraken.BaseTransport.processPacket(BaseTransport.java:199)
at org.jivesoftware.openfire.component.InternalComponentManager$RoutableComponents.process(InternalComponentManager.java:619)
at org.jivesoftware.openfire.spi.RoutingTableImpl.routePacket(RoutingTableImpl.java:260)
at org.jivesoftware.openfire.PresenceRouter.handle(PresenceRouter.java:164)
at org.jivesoftware.openfire.PresenceRouter.route(PresenceRouter.java:70)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:76)
at org.jivesoftware.openfire.net.StanzaHandler.processPresence(StanzaHandler.java:337)
at org.jivesoftware.openfire.net.ClientStanzaHandler.processPresence(ClientStanzaHandler.java:85)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:254)
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:176)
at org.jivesoftware.openfire.nio.ConnectionHandler.messageReceived(ConnectionHandler.java:133)
at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:570)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.common.IoFilterAdapter.messageReceived(IoFilterAdapter.java:80)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:58)
at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:185)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.executor.ExecutorFilter.processEvent(ExecutorFilter.java:239)
at org.apache.mina.filter.executor.ExecutorFilter$ProcessEventsRunnable.run(ExecutorFilter.java:283)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Unknown Source)

2009.12.16 11:35:51 [org.jivesoftware.util.log.util.CommonsLogFactory$1.error(CommonsLogFactory.java:92)
] error sending msg: MSG 59 U 92
====================
     Chunk Debug   
====================
MIME-Version: 1.0
Content-Type: text/x-msmsgscontrol
TypingUser: <user>@hotmail.com

 

====================
Binary Chunk Debug
====================
00000000h: 4D 49 4D 45 2D 56 65 72 73 69 6F 6E 3A 20 31 2E ; MIME-Version: 1.
00000010h: 30 0D 0A 43 6F 6E 74 65 6E 74 2D 54 79 70 65 3A ; 0..Content-Type:
00000020h: 20 74 65 78 74 2F 78 2D 6D 73 6D 73 67 73 63 6F ;  text/x-msmsgsco
00000030h: 6E 74 72 6F 6C 0D 0A 54 79 70 69 6E 67 55 73 65 ; ntrol..TypingUse
00000040h: 72 3A 20 6A 72 61 31 39 37 36 40 68 6F 74 6D 61 ; r: <user>@hotma
00000050h: 69 6C 2E 63 6F 6D 0D 0A 0D 0A 0D 0A             ; il.com......

java.net.SocketException: Software caused connection abort: socket write error
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.ByteArrayOutputStream.writeTo(Unknown Source)
at net.sf.jml.net.Session.sendMessage(Session.java:511)
at net.sf.jml.net.Session.access$1300(Session.java:30)
at net.sf.jml.net.Session$MsgSender.run(Session.java:476)
at java.lang.Thread.run(Unknown Source)

------------------------------------------------------------------error------------------------------------------------------------

  • LG KeyContributor 6,132 posts since
    Dec 13, 2005
    Dec 19, 2009 10:12 AM (in response to ddiggler)
    Re: Openfire service crashing daily

    Hi,

     

    your "unable to create new native thread" error shows that the Java heap (Xmx) is not your problem. You may want to try setting "-XX:ThreadStackSize=128" to leave more room for native threads.

     

    LG
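
As a rough illustration of why the stack size matters, here is a back-of-the-envelope sketch in Java. It assumes a 32-bit Windows process with the default 2 GB of user address space; the 256 MB reserved for "everything else" in the process is a made-up placeholder, and the class and method names are hypothetical. It only shows the direction of the effect, not exact limits for this server.

```java
/** Rough estimate of how many thread stacks fit in a 32-bit process. Hypothetical helper, not part of Openfire. */
final class ThreadBudget {

    /**
     * @param addressSpaceMb user address space of the process (assumed 2048 for 32-bit Windows)
     * @param heapMb         Java heap reserved via -Xmx
     * @param otherMb        assumed space for JVM code, non-heap areas and native buffers (placeholder)
     * @param stackKb        stack size reserved per thread
     * @return approximate number of thread stacks that fit into the remaining address space
     */
    static long estimateMaxThreads(long addressSpaceMb, long heapMb, long otherMb, long stackKb) {
        long freeKb = (addressSpaceMb - heapMb - otherMb) * 1024;
        return freeKb <= 0 ? 0 : freeKb / stackKb;
    }

    public static void main(String[] args) {
        // Heap of 1280 MB as in the original post; the stack sizes are example values.
        System.out.println("512k stacks: " + estimateMaxThreads(2048, 1280, 256, 512));
        System.out.println("128k stacks: " + estimateMaxThreads(2048, 1280, 256, 128));
    }
}
```

With these assumed numbers, going from 512k to 128k stacks raises the estimate from about 1000 to about 4000 threads. Lowering -Xmx frees address space in the same way, which is why a larger heap can make this particular error worse rather than better.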

      • vidgizmo Bronze 1 posts since
        Dec 18, 2009
        Dec 22, 2009 12:45 PM (in response to ddiggler)
        Re: Openfire service crashing daily

        We resolved this issue by telling people to not use Empathy/Telepathy based IM clients for now (which are apparently the default with the latest release of Ubuntu Linux). Since doing this we've been running w/o problem for almost a week.

         

        I'm monitoring threads in the Empathy/Telepathy and Openfire forums to watch for a fix, but none yet.

         

        This search shows lots of threads in the forums related to this topic:

        http://www.igniterealtime.org/community/search.jspa?peopleEnabled=true&userID=&containerType=&container=&q=empathy

          • sixthring KeyContributor 3,797 posts since
            Apr 2, 2007
            Dec 22, 2009 3:15 PM (in response to ddiggler)
            Re: Openfire service crashing daily

            I would suggest disabling some of the plugins you are running.  They may be the cause of your issues.  Then enable them one at a time.  Or remove them one at a time.  Start with removing client control.

          • LG KeyContributor 6,132 posts since
            Dec 13, 2005
            Dec 23, 2009 12:33 PM (in response to ddiggler)
            Re: Openfire service crashing daily

            Hi,

             

            the thread stack size applies to every thread, and 128k may be too small; you'll get Java errors in your log file if that is the case. Xss should be the same as ThreadStackSize and thus sets the native stack size, while Xoss should set the Java stack size for every thread. As your problem is with the native thread count, setting ThreadStackSize should be enough.

            You may want to create some stack traces while the server is running; there you may be able to see an increasing number of Java threads (every Java thread can be mapped to a native thread). Maybe the thread names help you to identify which part of Openfire or which plugin starts these threads.

             

            LG
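
A minimal sketch of the "look at the thread names" idea, in Java. It counts the live threads of the JVM it runs in, grouped by name with a trailing counter stripped, so it would have to run inside the Openfire process (for example from a small test plugin, which is an assumption here). The class name and the grouping rule are hypothetical; a thread dump taken with an external tool gives the same information.

```java
import java.util.Map;
import java.util.TreeMap;

/** Groups live threads by name so that a growing group stands out. Hypothetical helper. */
final class ThreadNameHistogram {

    /** Strips a trailing counter such as "-12" or " 7" so that "Timer-12" and "Timer-3" fall into one group. */
    static String groupKey(String threadName) {
        return threadName.replaceAll("[-_ ]?\\d+$", "");
    }

    /** Counts the live threads of this JVM per name group. */
    static Map<String, Integer> snapshot() {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            String key = groupKey(t.getName());
            Integer old = counts.get(key);
            counts.put(key, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Print "count <tab> group", one group per line.
        for (Map.Entry<String, Integer> e : snapshot().entrySet()) {
            System.out.println(e.getValue() + "\t" + e.getKey());
        }
    }
}
```

Taking such a snapshot every hour or so and comparing the counts should show which group keeps growing.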

              • Guus der Kinderen KeyContributor 771 posts since
                Sep 8, 2005
                Dec 30, 2009 1:11 PM (in response to ddiggler)
                Re: Openfire service crashing daily

                Would you mind loading a java-monitor plugin in your setup? That will give us more information, and will possibly allow you to anticipate a bit.

                 

                More information on java-monitor in this blogpost: New Openfire monitoring plugin

              • Guus der Kinderen KeyContributor 771 posts since
                Sep 8, 2005
                Dec 30, 2009 1:22 PM (in response to ddiggler)
                Re: Openfire service crashing daily

                Although you wrote that you're using Java 1.6.0_17, your crash report says otherwise:

                 

                Java HotSpot(TM) Server VM (1.6.0_03-b05) for windows-x86, built on Sep 24 2007 22:20:35 by "java_re" with unknown MS VC++:1310

                 

                03 is pretty old. It shouldn't hurt to update.

      • Guus der Kinderen KeyContributor 771 posts since
        Sep 8, 2005
        Dec 31, 2009 11:29 AM (in response to ddiggler)
        Re: Openfire service crashing daily

        The reference to swap space surprised me. I hadn't seen that before. I've done a bit of googling, and found this:

        3.1.4 Detail Message: request <size> bytes for <reason>. Out of swap space?

        The detail message request <size> bytes for <reason>. Out of swap space? appears to be an OutOfMemoryError. However, the HotSpot VM code reports this apparent exception when an allocation from the native heap failed and the native heap might be close to exhaustion. The message indicates the size (in bytes) of the request that failed and the reason for the memory request. In most cases the <reason> part of the message is the name of a source module reporting the allocation failure, although in some cases it indicates a reason.

        When this error message is thrown, the VM invokes the fatal error handling mechanism, that is, it generates a fatal error log file, which contains useful information about the thread, process, and system at the time of the crash. In the case of native heap exhaustion, the heap memory and memory map information in the log can be useful. See Appendix C, Fatal Error Log for detailed information about this file.

        If this type of OutOfMemoryError is thrown, you might need to use troubleshooting utilities on the operating system to diagnose the issue further. See 2.16 Operating-System-Specific Tools.

        The problem might not be related to the application, for example:

        • The operating system is configured with insufficient swap space.

        • Another process on the system is consuming all memory resources.

        If neither of the above issues is the cause, then it is possible that the application failed due to a native leak, for example, if application or library code is continuously allocating memory but is not releasing it to the operating system.

        I've found this at http://java.sun.com/javase/6/webnotes/trouble/TSG-VM/html/memleaks.html#gbyvj

         

        Did you rule out causes outside of Openfire?

          • sixthring KeyContributor 3,797 posts since
            Apr 2, 2007
            Jan 4, 2010 9:25 AM (in response to ddiggler)
            Re: Openfire service crashing daily

            Since your hands are tied as to disabling the possibly offending software, I do not know what help you think we can provide.

          • Guus der Kinderen KeyContributor 771 posts since
            Sep 8, 2005
            Jan 4, 2010 9:39 AM (in response to ddiggler)
            Re: Openfire service crashing daily

            Interesting - there's a substantial spike on all graphs at the end. Can you explain this with user patterns (did everyone come into the office?) or could this indicate the point where your problem starts?

             

            It's hard to tell for sure (as there's not a full history worth of data) but it looks like the number of threads is growing out of control (see thread1.jpg). I advise you to make regular thread dumps, and compare them.

             

            HeapMem.jpg shows that the JVM is using only a fraction of the memory that you've made available to it. You can turn this back a notch or two, if you want. You'd expect more of a sawtooth-like pattern in this graph, which would be just fine.
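
For reference, the numbers behind a graph like HeapMem.jpg can also be read directly from the JVM. The following is a minimal Java sketch using the standard java.lang.management API; the class name is hypothetical, and it reports on the JVM it runs in, so it would need to run inside the Openfire process to say anything about Openfire.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Prints how much of the configured heap is actually in use. Hypothetical helper. */
final class HeapHeadroom {

    /** Percentage of the maximum heap currently in use, or -1 if the maximum is undefined. */
    static double usedPercent(long usedBytes, long maxBytes) {
        return maxBytes <= 0 ? -1.0 : 100.0 * usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // Shift by 20 bits to convert bytes to megabytes.
        System.out.printf("heap used=%d MB, committed=%d MB, max=%d MB (%.1f%% of max)%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20,
                usedPercent(heap.getUsed(), heap.getMax()));
    }
}
```

If the used percentage stays low even right before a crash, that supports the view that the Java heap is not what is running out.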

          • Guus der Kinderen KeyContributor 771 posts since
            Sep 8, 2005
            Jan 4, 2010 10:19 AM (in response to ddiggler)
            Re: Openfire service crashing daily

            I've found this whitepaper, in which you might find VMware-specific hints, tips and/or clues: http://www.vmware.com/resources/techresources/1087

          • LG KeyContributor 6,132 posts since
            Dec 13, 2005
            Jan 4, 2010 10:58 AM (in response to ddiggler)
            Re: Openfire service crashing daily

            Hi,

             

            looking at Stanza1.jpg, I wonder whether the users are really exchanging 1500 presence packets per minute/second. I wonder whether this is an Openfire issue caused by one client. The client may be hard to identify; anyhow, "Server Settings", "Message Audit Policy" allows one to audit presence packets to verify that these packets are fine.

             

            LG

            • LG KeyContributor 6,132 posts since
              Dec 13, 2005
              Jan 5, 2010 8:52 AM (in response to ddiggler)
              Re: Openfire service crashing daily

              Hi,

               

              I assume that you did restart your server at 2010.01.04-16:xx (y.m.d:H:M) and that at 2010.01.05-14:xx a "Full GC" did occur. Thread1.jpg shows that the number of "runnable" and "timed waiting" threads does not change while the number of "waiting" threads drops, likely caused by the "Full GC".


              Attaching VisualVM to your Openfire process allows you to get stacktraces and heapdumps, so you really may want to install it on a client machine and connect it to Openfire.

               

              LG
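
The "runnable", "waiting" and "timed waiting" series in a graph like Thread1.jpg correspond to values of java.lang.Thread.State. As a minimal sketch (hypothetical class name, standard java.lang.management API), the same counts can be produced in Java for the JVM the code runs in:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Counts the live threads of this JVM per Thread.State. Hypothetical helper. */
final class ThreadStateCounter {

    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<Thread.State, Integer>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            Integer old = counts.get(info.getThreadState());
            counts.put(info.getThreadState(), old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByState());
    }
}
```

A "waiting" count that only drops after a full GC would fit the observation above; VisualVM shows the same states without any custom code.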

            • LG KeyContributor 6,132 posts since
              Dec 13, 2005
              Jan 5, 2010 12:24 PM (in response to ddiggler)
              Re: Openfire service crashing daily

              Hi,

               

              you may want to monitor the VM size of your process with "pslist -m 1234" where 1234 should be your Openfire pid. "pslist -t 1234" could also be interesting but I think that it is a VM memory problem.

               

              LG

            • Guus der Kinderen KeyContributor 771 posts since
              Sep 8, 2005
              Jan 5, 2010 1:02 PM (in response to ddiggler)
              Re: Openfire service crashing daily

              The thread usage pattern is a bit odd, as others also mentioned. This might indicate a problem, but might also simply be a result of objects that linger for an extremely long time, awaiting finalization. As you assigned a lot more memory to Openfire than it needs, it can take a long, long time before stuff is cleaned out.

               

              I'd be interested in the thread dumps, as I've mentioned earlier. That might give us a clue as to what all those 'extra' threads are about.

               

              For now, I'm still going with a problem that relates to the operating system itself. Any luck preparing a non-virtual host yet?

                • Guus der Kinderen KeyContributor 771 posts since
                  Sep 8, 2005
                  Jan 5, 2010 1:50 PM (in response to ddiggler)
                  Re: Openfire service crashing daily

                  As I replied to you by mail:

                   

                  This wasn't exactly what I was looking for, but it does give me some information. At first sight, there appear to be too many Timer threads, which appear to relate to the library that implements the Yahoo Messenger protocol. Any chance that you'd be able to run a couple of days with the Yahoo gateway disabled?
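
To illustrate how Timer threads can pile up in general: every java.util.Timer instance starts its own background thread, and that thread keeps running until cancel() is called or the Timer becomes unreachable and is eventually collected. The sketch below is a generic Java demonstration with hypothetical names; it is not taken from the Yahoo library, and whether that library behaves this way is an assumption that the thread dumps would have to confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Timer;

/** Shows that one Timer per session means one extra thread per session. Hypothetical demo. */
final class TimerThreadDemo {

    /** Counts live threads whose name starts with the given prefix. */
    static int countThreads(String prefix) {
        int n = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(prefix)) {
                n++;
            }
        }
        return n;
    }

    /** Creates one Timer per simulated session; each Timer starts its own thread in its constructor. */
    static List<Timer> openSessions(String prefix, int sessions) {
        List<Timer> timers = new ArrayList<Timer>();
        for (int i = 0; i < sessions; i++) {
            timers.add(new Timer(prefix + i, true)); // daemon thread, so the demo can exit
        }
        return timers;
    }

    public static void main(String[] args) {
        List<Timer> timers = openSessions("demo-session-timer-", 5);
        System.out.println("timer threads after login: " + countThreads("demo-session-timer-"));
        for (Timer t : timers) {
            t.cancel(); // without this, the thread stays alive while the Timer is still referenced
        }
    }
}
```

If a gateway session creates a Timer at login and never cancels it at logout, the thread count would grow with every login, which would match a steadily rising thread graph.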
