Mirth Connect / MIRTH-4266

SFTP File Reader gets stuck in Polling state. Must be halted and restarted.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.3, 3.0.3, 3.1.1, 3.2.2, 3.3.2, 3.4.2, 3.5.2
    • Fix Version/s: 3.6.0
    • Component/s: Server
    • Labels: None

      Description

      See forum post https://www.mirthcorp.com/community/forums/showthread.php?t=11371 for the history of this issue.

      All of the seemingly unrelated errors have to do with the server terminating the connection. My thread dump showed that the Jsch session thread was not running when the channel was stuck polling.

      The channel gets stuck after 8 occurrences of these errors; org.apache.commons.pool2.impl.GenericObjectPoolConfig.DEFAULT_MAX_TOTAL is also 8.

      The issue appears to be in the releaseConnection method of com.mirth.connect.connectors.file.FileConnector. If the FileSystemConnection is not connected, it neither returns it to the pool nor invalidates it.
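
      Presumably the fix is for releaseConnection to invalidate a dead connection rather than silently drop it. The rough sketch below is not the actual Mirth source: the exact method signature, the isConnected() check, and how the per-connection pool is obtained are assumptions based on the description above; only the commons-pool2 returnObject/invalidateObject calls are the real pool API.

      // Hedged sketch of the suspected fix; signatures in FileConnector may differ.
      void releaseConnection(FileSystemConnection connection,
              GenericObjectPool<FileSystemConnection> pool) throws Exception {
          if (connection == null) {
              return;
          }
          if (connection.isConnected()) {
              // Healthy connection: return it to the pool for reuse.
              pool.returnObject(connection);
          } else {
              // Dead connection: invalidate it so the pool releases the permit and
              // can create a replacement, instead of leaking one of its
              // DEFAULT_MAX_TOTAL (8) slots.
              pool.invalidateObject(connection);
          }
      }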

      I was able to verify this with the following code:

      // Look up the deployed channel that is stuck polling
      var donkey = com.mirth.connect.donkey.server.Donkey.getInstance();
      var stuckChannel = '9113f3c3-2024-424e-8916-592ec983a822';
      var channel = donkey.getDeployedChannels().get(stuckChannel);
      var sourceConnector = channel.getSourceConnector();

      // Use reflection to reach the private fileConnector field and its connection pools
      var f = sourceConnector.getClass().getDeclaredField("fileConnector");
      f.setAccessible(true);
      var fileConnector = f.get(sourceConnector);
      f = fileConnector.getClass().getDeclaredField("pools");
      f.setAccessible(true);
      var pools = f.get(fileConnector);
      var pool = pools.values().iterator().next();

      // Log the pool counters for the stuck connector
      logger.info(JSON.stringify({
          pool: {
              getNumIdle: pool.getNumIdle(),
              getNumActive: pool.getNumActive(),
              getNumWaiters: pool.getNumWaiters()
          }
      }));
      

      This logged the following when my channel was stuck:

      INFO  (deploy:?): {"pool":{"getNumIdle":0,"getNumActive":8,"getNumWaiters":1}}
      

      I was able to reproduce this in a test environment by setting up an SFTP server with artificial network delay. Once the file reader started reading a file, I would kill the SFTP server to sever the connection and then restart it. After 8 iterations of this, the channel was stuck in a polling state.
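
      To illustrate why the 8th leaked connection is fatal, here is a minimal standalone commons-pool2 sketch (plain Java, not Mirth code) in which 8 borrowed objects are never returned or invalidated. Mirth uses the default maxWait of -1, so the 9th borrow parks forever; the sketch sets a 1-second timeout only so it exits instead of hanging.

      import java.util.NoSuchElementException;

      import org.apache.commons.pool2.BasePooledObjectFactory;
      import org.apache.commons.pool2.PooledObject;
      import org.apache.commons.pool2.impl.DefaultPooledObject;
      import org.apache.commons.pool2.impl.GenericObjectPool;

      public class PoolExhaustionDemo {
          public static void main(String[] args) throws Exception {
              GenericObjectPool<Object> pool =
                      new GenericObjectPool<>(new BasePooledObjectFactory<Object>() {
                          @Override
                          public Object create() {
                              return new Object(); // stand-in for a FileSystemConnection
                          }

                          @Override
                          public PooledObject<Object> wrap(Object obj) {
                              return new DefaultPooledObject<>(obj);
                          }
                      });
              pool.setMaxTotal(8);         // GenericObjectPoolConfig.DEFAULT_MAX_TOTAL
              pool.setMaxWaitMillis(1000); // Mirth uses the default -1 (wait forever)

              // Borrow 8 connections and "lose" them, as releaseConnection does when
              // the connection is no longer connected.
              for (int i = 0; i < 8; i++) {
                  pool.borrowObject();
              }
              System.out.println("active=" + pool.getNumActive() + ", idle=" + pool.getNumIdle());

              try {
                  pool.borrowObject(); // 9th borrow: pool is exhausted
              } catch (NoSuchElementException e) {
                  // With maxWait of -1 this call would park in LinkedBlockingDeque.takeFirst(),
                  // matching the stack trace in the comments below.
                  System.out.println("Pool exhausted: " + e);
              }
          }
      }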

        Activity

        Kirby Knight added a comment - edited

        Thanks Tony! I would like to add that I also saw this issue in 3.5.0.

        Nick Rupley added a comment -

        Surefire way to reproduce:

        Set up an SFTP container:

        docker pull atmoz/sftp

        Create a File Writer to write to it:

        Set a breakpoint at FileDispatcher:166, right after the line

        fileSystemConnection = fileConnector.getConnection(fileSystemOptions);



        Now, do this 9 times:

        1. Start up a container:
          docker run -p 2222:22 -d atmoz/sftp user:pass:::upload
        2. Send a message, verify dispatcher is suspended at the breakpoint
        3. Kill the container:
          docker kill $(docker ps -q)
        4. Resume the dispatcher thread

        On the 9th time, the breakpoint won't be reached, and instead your File Writer will hang until you halt it. Verify with a thread dump:

        "File Writer Process Thread on file writer (d9a94b15-59a0-4f9e-b474-c8d53c868705), Destination 1 (1)" #62 prio=5 os_prio=31 tid=0x00007feaa5743000 nid=0x8d07 waiting on condition [0x000070000ba42000]
           java.lang.Thread.State: WAITING (parking)
        	at sun.misc.Unsafe.park(Native Method)
        	- parking to wait for  <0x00000007bdfbe728> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        	at org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:583)
        	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:442)
        	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
        	at com.mirth.connect.connectors.file.FileConnector.getConnection(FileConnector.java:174)
        	at com.mirth.connect.connectors.file.FileDispatcher.send(FileDispatcher.java:165)
        	at com.mirth.connect.donkey.server.channel.DestinationConnector.handleSend(DestinationConnector.java:822)
        	at com.mirth.connect.donkey.server.channel.DestinationConnector.process(DestinationConnector.java:476)
        	at com.mirth.connect.donkey.server.channel.DestinationChain.doCall(DestinationChain.java:121)
        	at com.mirth.connect.donkey.server.channel.DestinationChain.call(DestinationChain.java:63)
        	at com.mirth.connect.donkey.server.channel.Channel.process(Channel.java:1729)
        	at com.mirth.connect.donkey.server.channel.Channel.dispatchRawMessage(Channel.java:1191)
        	at com.mirth.connect.donkey.server.channel.SourceConnector.dispatchRawMessage(SourceConnector.java:192)
        	at com.mirth.connect.server.controllers.DonkeyEngineController.dispatchRawMessage(DonkeyEngineController.java:1071)
        	at com.mirth.connect.server.api.servlets.MessageServlet.processMessage(MessageServlet.java:94)

          People

          • Assignee: Nick Rupley
          • Reporter: Tony Germano
          • Votes: 1
          • Watchers: 3
