Professional OPC
Development Tools

Random disconnect and deadlock when reading items in a loop.

08 Oct 2015 10:33 #3606 by support
Regarding your question about sync/async: the API of EasyDAClient is synchronous anyway. That is, when you call EasyDAClient.ReadXXXXX, your code blocks until the read results are available (or until an error is detected), no matter whether the underlying OPC read is sync or async.

And, the second part of the reasoning I have explained already: Async operations have an implementation that is much more complicated on both the server and client side - and prone to deadlocks, if there is an implementation bug.

08 Oct 2015 08:18 #3602 by sebpinski
Equally interesting is that the issue is solved instantly by a restart of the Windows service. So having a brand new EasyDAClient removes the deadlock.

08 Oct 2015 08:04 #3601 by sebpinski
Based on what you've said regarding the null-ness of returned DAVtqResults and their properties, it's fairly safe to assume that the multiple and single reads both throw the same error (I can confirm officially at a later time).

1) The machine is dedicated to the OPC server. It runs Windows Server 2012 and has 64GB of memory and 2x4 cores. The OPC server is running as a High Priority process.

2) There are actually two separate clients connecting to the OPC server, and the other one (which doesn't use OPC Labs) is unaffected.

At the moment, the code is doing everything asynchronously, so based on your reasoning, why would there be blocking? All subscriptions are asynchronous and all reads have been forced to be asynchronous. I therefore don't understand why I should modify the code to use synchronous reads and split it across two EasyDAClients, one for synchronous methods and one for asynchronous methods.

The point here is that the code does not recover once the failed reads start occurring. The OPC server doesn't appear to be under any additional load, the other client's reads are succeeding, and after hours of waiting there is nothing to suggest that the client will recover.

07 Oct 2015 08:32 #3595 by support
Thank you for the details. With regard to the null-ness of certain properties, this is how it should work:

- The returned DAVtqResult should never be null.
- If DAVtqResult.Exception is null, the result indicates success, and DAVtqResult.Vtq will never be null.
- DAVtq.Quality is never null.
- DAVtq.Value can be null if there is no data value - e.g. with a bad quality, but in other cases as well.
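
A minimal sketch of consuming read results under the rules above. It only touches the members named in this thread (DAVtqResult.Exception, DAVtqResult.Vtq, DAVtq.Quality, DAVtq.Value). The class and method names are hypothetical, and the namespaces in the using directives are an assumption that may differ between QuickOPC versions.

```csharp
using System;
using OpcLabs.EasyOpc.DataAccess;                // assumption: namespace of DAVtq
using OpcLabs.EasyOpc.DataAccess.OperationModel; // assumption: namespace of DAVtqResult; varies by version

static class ReadResultHandling // hypothetical helper, not part of QuickOPC
{
    // Processes the array returned by a multiple-item read.
    public static void Process(DAVtqResult[] results)
    {
        foreach (DAVtqResult result in results)
        {
            // A non-null Exception means this element failed; Vtq must not be used.
            if (result.Exception != null)
            {
                Console.WriteLine("Read failed: " + result.Exception.Message);
                continue;
            }

            // On success Vtq and Vtq.Quality are non-null, but Value may still be null.
            DAVtq vtq = result.Vtq;
            object value = vtq.Value;
            Console.WriteLine("Value: " + (value ?? "(no data value)") + ", quality: " + vtq.Quality);
        }
    }
}
```

Checking Exception first, before dereferencing Vtq, is what avoids a null reference exception on failed elements.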

The "Read not completed" is a timeout that indicates that the operation was not completed in time. Possible reasons include:

1) It simply takes the OPC server quite long to perform the Read - longer than the timeout we have. The solution in this case is either to improve the server's performance, or increase the timeout on the client side.

2) The OPC server completely blocks inside the Read, and never returns.

3) An internal bug in QuickOPC which makes it appear that the OPC server has blocked (as in #2), while in reality it has not.

At this moment I cannot tell which of the above is the case here. Because you have hinted that this issue is related to the fact that the code combines both Subscriptions and Reads, this makes me think of some kind of deadlock (either on the server or the client side, #2 or #3), which is far more likely to happen when the two are combined.

Can you modify your code so that it uses separate instances of EasyDAClient for subscriptions and for reads, AND set their .Isolated property to True right after creation? This will cause the reads and subscriptions to be performed on separate connections, too. And, at the same time, please switch to Synchronous reads (because async reads communicate back via callbacks, similarly to subscriptions).
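
A rough sketch of that setup, for illustration only. The OpcClients class and its members are hypothetical names. The Isolated property is written as referred to above, and the Mode parameter names are taken from the configuration in sebpinski's post. DAReadWriteMethod.Synchronous is an assumption made by analogy with the Asynchronous member used there.

```csharp
using OpcLabs.EasyOpc.DataAccess;
using OpcLabs.EasyOpc.DataAccess.Engine; // DAReadWriteMethod, as in the posted configuration

static class OpcClients // hypothetical holder class
{
    // One client per purpose, so that reads and subscriptions use separate connections.
    public static readonly EasyDAClient ReadClient = CreateReadClient();
    public static readonly EasyDAClient SubscriptionClient = CreateIsolatedClient();

    static EasyDAClient CreateIsolatedClient()
    {
        var client = new EasyDAClient();
        client.Isolated = true; // set right after creation, before any OPC operation
        return client;
    }

    static EasyDAClient CreateReadClient()
    {
        EasyDAClient client = CreateIsolatedClient();
        // Assumption: Synchronous is the counterpart of DAReadWriteMethod.Asynchronous.
        client.InstanceParameters.Mode.DesiredMethod = DAReadWriteMethod.Synchronous;
        client.InstanceParameters.Mode.AllowSynchronousMethod = true;
        client.InstanceParameters.Mode.AllowAsynchronousMethod = false;
        return client;
    }
}
```

The timed read loop would then use OpcClients.ReadClient only, and the subscription code OpcClients.SubscriptionClient only.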

Best regards

06 Oct 2015 09:34 #3594 by sebpinski
Thanks for the response. I've reviewed the error logging for this issue.

We actually have two threads: one reads multiple items on a timed loop, the other subscribes to multiple tags. When the subscription fires, we do some single-item reads, and these fail before the multiple-item read later in the subscription code can be reached.

We only instantiate a single instance of the EasyDAClient that is used throughout the code.

When the deadlock happens:

- The timed thread does a ReadMultipleItems, and we actually get a null reference exception when processing the returned DAVtqResult. I'll update the code to handle this better, but one of the following must be null in order for this to occur:
  1. DAVtqResult
  2. DAVtqResult.Vtq
  3. DAVtqResult.Vtq.Quality

- The code fired after the subscription always has the following error when attempting a single read:

An OPC operation failure with error code -1073430509 (0xC004C013) occurred, originating from 'OpcLabs.EasyOpcRaw.DataAccess.RawEasyDAClient'. The inner exception contains details about the problem.

Read not completed. This error indicates that it could not be verified that the requested read operation was completed during the timeout period. It is possible that the read operation will actually succeed or fail, but later. Increase the timeout period if you want to obtain positive or negative indication of the operation outcome. Other reason for this error may be that under heavy loads, topic request or response queue is overflowing. Check the event log for queue overflow errors (if event logging is supported by the product and enabled).

05 Oct 2015 16:24 #3593 by support
We couldn't address precisely this problem, because it was not reproducible. But in the meantime, we have fixed other issues that could have been related.

I'd like to be sure about the symptoms. Are you saying that the EasyDAClient.ReadMultipleItems calls, from some moment on, start to return a non-null object in the .Exception property of the elements of the result array? Or, are you using a different method call, or are the symptoms different?

If there is an exception thrown or contained in .Exception, please post its details.

Best regards

05 Oct 2015 08:01 #3589 by sebpinski
I realise that this is a fairly old thread, but could support please confirm whether this is still an ongoing issue or whether it was resolved?

I'm using version 5.34.274.1 in a Windows service reading from a TOPServer V5.12.142.0.

My configuration includes the following:
// Prevent creation of subscriptions on read
EasyDAClient.SharedParameters.Topic.SlowdownWeight = 0.0f;
EasyDAClient.SharedParameters.Topic.SpeedupWeight = 0.0f;
 
_opcClient = new EasyDAClient();
// Force asynchronous read/writes
_opcClient.InstanceParameters.Mode.DesiredMethod = OpcLabs.EasyOpc.DataAccess.Engine.DAReadWriteMethod.Asynchronous;
_opcClient.InstanceParameters.Mode.AllowAsynchronousMethod = true;
_opcClient.InstanceParameters.Mode.AllowSynchronousMethod = false;
 
// Reduce timeouts on Async methods
_opcClient.InstanceParameters.Timeouts.ReadItem = 15000;
_opcClient.InstanceParameters.Timeouts.WriteItem = 15000;
 
// Prevent creation of subscriptions on read
_opcClient.InstanceParameters.UpdateRates.ReadAutomatic = Timeout.Infinite;
_opcClient.InstanceParameters.UpdateRates.WriteAutomatic = Timeout.Infinite;

I'm hitting the exact same issue: the service runs 24/7, at some point this error begins to occur, and the only way to resolve it is to restart the service.

I was previously doing single reads, but the issue would start within a matter of hours. After switching to multiple reads it now takes a few days to occur. Looking at the TOP Server, there do not appear to be any overflow errors as suggested in the inner exception. Could you please assist?

02 May 2013 08:46 #1309 by Jarimatti
I agree that the repro is the hard part: this only happens rarely and I also can't reproduce it reliably. Our software runs basically 24 hours a day, 7 days a week. In this case the software was running for weeks before this happened. And at some sites there is no problem at all, even with the same OPC server manufacturer (Siemens OPC server).

I'm cyclically reading multiple items sequentially with single ReadItem calls and sleep for e.g. 5 seconds between read cycles (read items, sleep, then read again).

I'll inspect the EasyDAClient.ClientMode.Isolated property, thanks for the tip.

30 Apr 2013 16:04 #1308 by support
There can be two reasons for this error: a legitimate one (blocking by the OPC server), and then "the problem", a bug in the QuickOPC component. We had a similar report in the past, but were unable to reproduce the behavior. The repro is, actually, the only problem here: if we could reproduce it, we would be able to determine the reason, and if it is on our side, fix it.

It is new to learn that this can happen with just a single-item, repeated read. Perhaps we should attempt to reproduce again, and set up a really long lasting test to see if it happens.

With regard to the app domain: It is worth trying, with a "but": The background processing in QuickOPC is written in C++, and is driven by process-scoped, not appdomain-scoped, objects. You can, however, try to set the EasyDAClient.ClientMode.Isolated property to 'true', before calling any OPC operations on it. This will cause it to use some (but not all) separate objects in the background.

29 Apr 2013 13:14 #1307 by Jarimatti
Hi,

We have a problem similar to the one described in "QuickOPC.COM Random disconnect from OPC server", but with QuickOPC.NET.

We're using QuickOPC Classic (.NET) 5.12, build 1333.1.

The exception log is:
2013-04-24 20:02:33.7513 DEBUG OPCPlugin.OPCPlugin Reading OPC server 'OPC.SimaticNET' tag 'S7:[xxx]M41.0' / 'xxx.M41.0' = False {System.Boolean} @24.4.2013 17:02:33; Good GoodNonspecific LimitOk (192) 
2013-04-24 20:03:33.7525 ERROR OPCPlugin.OPCPlugin Worker task has failed, trying restart after 5 seconds. System.AggregateException: One or more errors occurred. ---> OpcLabs.EasyOpc.OpcException: OPC operation failure. ---> System.Runtime.InteropServices.COMException: Read not completed. This error indicates that it could not be verified that the requested read operation was completed during the timeout period. It is possible that the read operation will actually succeed or fail, but later. Increase the timeout period if you want to obtain positive or negative indication of the operation outcome. Other reason for this error may be that under  heavy loads, topic request or response queue is overflowing. Check the event log for queue overflow errors (if event logging is supported by the product and enabled). 
   --- End of inner exception stack trace ---
   at OpcLabs.EasyOpc.OperationResult.ThrowIfFailed()
   at OpcLabs.EasyOpc.DataAccess.EasyDAClient.ReadItem(ServerDescriptor serverDescriptor, DAItemDescriptor itemDescriptor)
   at OpcLabs.EasyOpc.DataAccess.EasyDAClient.ReadItem(String machineName, String serverClass, String itemId, VarType dataType)
   at OpcLabs.EasyOpc.DataAccess.EasyDAClient.ReadItem(String machineName, String serverClass, String itemId)
...
   at System.Threading.Tasks.Task.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
   --- End of inner exception stack trace ---
---> (Inner Exception #0) OpcLabs.EasyOpc.OpcException: OPC operation failure. ---> System.Runtime.InteropServices.COMException: Read not completed. This error indicates that it could not be verified that the requested read operation was completed during the timeout period. It is possible that the read operation will actually succeed or fail, but later. Increase the timeout period if you want to obtain positive or negative indication of the operation outcome. Other reason for this error may be that under  heavy loads, topic request or response queue is overflowing. Check the event log for queue overflow errors (if event logging is supported by the product and enabled). 
   --- End of inner exception stack trace ---
   at OpcLabs.EasyOpc.OperationResult.ThrowIfFailed()
   at OpcLabs.EasyOpc.DataAccess.EasyDAClient.ReadItem(ServerDescriptor serverDescriptor, DAItemDescriptor itemDescriptor)
   at OpcLabs.EasyOpc.DataAccess.EasyDAClient.ReadItem(String machineName, String serverClass, String itemId, VarType dataType)
   at OpcLabs.EasyOpc.DataAccess.EasyDAClient.ReadItem(String machineName, String serverClass, String itemId)
...
   at System.Threading.Tasks.Task.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()<---

Basically we just read single items repeatedly in a loop. The issue occurs rarely, but when it does, the library throws
an OPC exception. Currently our code tries to restart the whole subsystem, which lives in a separate task but in the same
application domain. The same exception gets raised again, but after some retries the ReadItem call just deadlocks. CPU usage
seems to be normal.

Only an application restart solves the problem. As a workaround I'm planning to run the OPC client code in a separate application domain: do you think this would work?

I just saw this at a client site today, where the OPC reading task had been stuck for 5 days.
