Unable to reconnect to OPC Server

28 Mar 2019 12:22 #7282 by veda
Hi again

Did some initial tests, and it seems like the new build solved the issue. Will do some more tests to verify.

27 Mar 2019 11:16 #7279 by support
Hello,

Please download the latest build of QuickOPC 2018.3 (should be 5.54.1133.1 or later) from our Web site or from NuGet, rebuild your project and retest.
See my previous post for more information on the "internal error" and the fix.

Thank you for reporting this issue.
Best regards

25 Mar 2019 10:17 #7271 by support
I could not reproduce the internal error, but based on the provided call stack I think I can see a situation in which it can appear (it is timing-dependent, which explains why it may be hard to reproduce).

There should be a new build with a fix for this soon. I will let you know here.

Note, however, that the internal error (our bug) happened inside the handling for some "normal" error. With the information at hand, I cannot determine whether fixing the internal error will also fix the reported problem - i.e. the inability to reconnect. If it does not, we will need to re-do the logs etc. and continue with the analysis, but even then it will be a step forward.

Regards

24 Mar 2019 14:02 #7268 by support
From: V.
Sent: Thursday, March 21, 2019 3:04 PM
To: Z.
Subject: RE: Additional information to Online forum post #7200 - www.opclabs.com/forum/ua-connections/2616-unable-to-reconnect-to-opc-server#7200

Here is some new information. “VisualStudioCallStack.txt” is just a copy/paste of the call stack in VS.

Rider gave a lot more information. Below are some screenshots. Looks like it is the collection returned from UAEngineBase.BuildActivityEventReport that is modified during the ForEach loop… Hopefully you can figure out the problem based on this.
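
For readers unfamiliar with this failure mode, here is a minimal sketch of how "Collection was modified; enumeration operation may not execute" arises. It is a single-threaded simplification and does not reproduce the QuickOPC internals; the class and method names are hypothetical, and the list contents are only borrowed from the log for flavor.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; it only illustrates the generic List<T> behavior.
static class CollectionModifiedDemo
{
    // Returns true if modifying the list during enumeration throws InvalidOperationException.
    public static bool EnumerateWhileModifying()
    {
        var events = new List<string> { "GetEndpoints Called", "Channel 0 in Connecting state" };
        try
        {
            foreach (string entry in events)
            {
                // Simulates other code appending to the same list while it is being enumerated.
                events.Add("late event");
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            // The enumerator detected that the list changed after enumeration started.
            return true;
        }
    }

    // Enumerates a private copy, so additions to the original list cannot break the loop.
    public static int EnumerateSnapshot()
    {
        var events = new List<string> { "GetEndpoints Called", "Channel 0 in Connecting state" };
        int visited = 0;
        foreach (string entry in events.ToList())
        {
            events.Add("late event");
            visited++;
        }
        return visited;
    }

    static void Main()
    {
        Console.WriteLine(EnumerateWhileModifying());
        Console.WriteLine(EnumerateSnapshot());
    }
}
```

In a multi-threaded situation the modification comes from another thread, which is why such an error is timing-dependent. There, taking the snapshot would itself need to happen under the same lock that guards the writers, or a concurrent collection would be needed.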

If you would like me to do a quick test on a modified test program to verify a potential fix, just let me know.

Best regards
...
[there were more emails exchanged in between]

File Attachment:

File Name: VisualStud...tack.txt
File Size: 6 KB

19 Mar 2019 18:58 #7232 by support
Hello.
The minidump gave better results - I was able to see the call stacks etc. Unfortunately, it still did not reveal the cause of the issue.

But, when going through the information you have provided, I realized that you were right - the problem comes from the "INTERNAL ERROR..." that is in the log. I simply overlooked it; I thought you were referring to a different error (my fault). So, things are now clearer. Such an internal error should never occur, and it can easily prevent further reconnections etc.

In this case, the internal error comes from an unexpected InvalidOperationException. According to Microsoft recommendations, this is one of the exceptions that should never happen in a correct program, because the calling code can always check the preconditions for any operation (they are/should be documented), and simply not make the call in a case that would lead to this exception. This probably means that there is a bug either in our code, or in some of the libraries we are using. We know that in some cases the OPC Foundation stack/SDK (which we are using) throws unwanted exceptions, so this may (or may not) be the case.

This exception occurs on a worker thread, and that is why it does not manifest itself as an exception to your code; it is only logged.

I would like to ask you to obtain a call stack of the InvalidOperationException, and send it to us. It should be doable by running the code under the debugger, and setting it so that it breaks into the debugger *when thrown* (and not just when unhandled, which is the default) - in Visual Studio, this is in the Exception Settings window. Can you do that? The only problem I can foresee is if there were multiple exceptions like that, some of them handled fine, and the one that causes the internal error came only later. In that case, debugging would still be possible but more demanding. But hopefully that won't be the case.
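
If running under a debugger is not practical, one possible alternative (not something proposed in this thread, just a sketch) is to log the call stack from code at the moment the exception is thrown, using the standard .NET `AppDomain.FirstChanceException` event. The class name, the `Captured` list and the demo method below are hypothetical; in a real program the captured text would go to your own log.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.ExceptionServices;

// Hypothetical helper; only AppDomain.FirstChanceException and Environment.StackTrace are real .NET APIs.
static class FirstChanceDemo
{
    public static readonly List<string> Captured = new List<string>();
    private static bool _installed;

    // Hooks the first-chance notification once per process.
    public static void Install()
    {
        if (_installed) return;
        AppDomain.CurrentDomain.FirstChanceException += OnFirstChance;
        _installed = true;
    }

    private static void OnFirstChance(object sender, FirstChanceExceptionEventArgs e)
    {
        // Also matches derived types such as ObjectDisposedException.
        if (e.Exception is InvalidOperationException)
        {
            // The handler runs on the throwing thread before the stack is unwound,
            // so Environment.StackTrace includes the callers of the throw site.
            string text = e.Exception.Message + Environment.NewLine + Environment.StackTrace;
            lock (Captured) Captured.Add(text);
        }
    }

    // Throws and handles one exception, then reports how many were captured so far.
    public static int ProvokeHandledException()
    {
        Install();
        try { throw new InvalidOperationException("demo"); }
        catch (InvalidOperationException) { /* handled, yet still seen as first-chance */ }
        lock (Captured) return Captured.Count;
    }

    static void Main()
    {
        Console.WriteLine(ProvokeHandledException());
        Console.WriteLine(Captured[0]);
    }
}
```

The handler sees every matching exception, including ones that are later handled normally, so the same caveat about multiple exceptions applies here as with the debugger. The handler itself should stay simple and must not throw.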

Thank you, and best regards

18 Mar 2019 08:39 #7225 by support
From: Z.
Sent: Thursday, March 14, 2019 1:12 PM
To: V.
Subject: RE: Additional information to Online forum post #7200 - www.opclabs.com/forum/ua-connections/2616-unable-to-reconnect-to-opc-server#7200

Hello.

Unfortunately, the Threads.txt does not contain the information I was looking for. I wanted to get the call stacks, but there is just an [External Code] placeholder. This may have to do with the fact that it runs under .NET Core.
There is a chance that the minidump will work better. Can you please upload it to [address hidden by the forum's spam protection], into the ‘exchange’ folder, and then let me know?

The exception at 09:11:55.583 is just a natural consequence of the disconnection; there is nothing wrong with it per se.

Thank you
Z

[in reply to this, the minidump has been uploaded to our FTP server]

18 Mar 2019 08:38 #7224 by support
From: V.
Sent: Tuesday, March 12, 2019 10:11 AM
To: Z.
Subject: RE: Additional information to Online forum post #7200 - www.opclabs.com/forum/ua-connections/2616-unable-to-reconnect-to-opc-server#7200

Forgot to add the timeline for ReconnectProblemLog.txt:
09.03: Connected, all OK
09.04: Disconnected VPN to servers
09.11: Reconnected VPN. The client did not reconnect.

As you can see, the logs go on for at least 20 minutes after it should have reconnected.


Best regards,

18 Mar 2019 08:37 #7223 by support
[adding parts of email conversation so that we have things in one place]
[The two files that were attached to the email are not posted here]

From: V.
Sent: Tuesday, March 12, 2019 10:08 AM
To: Z.
Subject: Additional information to Online forum post #7200 - www.opclabs.com/forum/ua-connections/2616-unable-to-reconnect-to-opc-server#7200

Hi

Attached are two files: ReconnectProblemLog.txt and Threads.txt.

Threads.txt is a copy of the running threads based on the instructions provided in alternative 2 ( kb.opclabs.com/Troubleshooting_program_hangs ). I also have a minidump, but that is 500 MB, so I need FTP or similar to upload that file. But in general it does not seem like the client application is hanging.

ReconnectProblemLog.txt is the same output as provided earlier.
[Warning] -> Output from EasyUAClient.LogEntry without any filter (Note: The application connects to 5 OPC servers)
[Error] -> Exception text from DataChangeNotification and ServerConditionChanged when called with endpointdescriptor to the server with connection issues (192.168.23.249).

Note that the result of this test was a connection issue with several servers, but not all. Could this be related to the exception in your program logged at 09:11:55.583?


Best regards,

12 Mar 2019 08:51 #7200 by veda
First of all... You are of course right about the 5 log items. I forgot that it was a static call. My observation is still valid though with regard to duplicate calls to ServerConditionChanged for the same connection state, but it is not relevant for the issue under discussion.

I have reproduced the issue once more and removed the filters from the log. I will send the complete log by email together with the thread list (alternative 2 in the link you provided). I also have a minidump, but that is 500 MB, so I need some place to upload it if required.

One thing I noticed in the log which might be related to the problem:

Warning(3201): The status subscription for an OPC-UA session on endpoint URL "opc.tcp://192.168.23.211:4845" is in failure. Further such warnings on this session will not be logged.
OPC-UA service result - Error establishing a connection: BadNotConnected = BadNotConnected.
---- SERVICE RESULT ----
Status Code: {BadNotConnected} = 0x808A0000 (2156527616)
Description: Error establishing a connection: BadNotConnected
Status Code: {BadNotConnected} = 0x808A0000 (2156527616)
Description: BadNotConnected
Additional Info: <ExceptionTrace>
---- REMARKS ----
Some possible causes of this error are that the OPC-UA server is not running, or is not configured to listen on the specified port. Also, the network connection may be broken (cable unplugged?).

+ The SDK action called was "DiscoveryClient.GetEndpoints".
+ Following (5) events were gathered during the action on activity ID [115], in the order of first occurrence:
SDK trace: GetEndpoints Called. RequestHandle=1, PendingRequestCount=1
SDK trace: Channel 0 in Connecting state.
[83] Exception: {Opc.Ua.ServiceResultException} BadNotConnected
[50] Exception: {Opc.Ua.ServiceResultException} BadNotConnected
SDK trace: GetEndpoints Completed. RequestHandle=1, PendingRequestCount=0, StatusCode=Bad
+ Events starting with activity ID in [] may not necessarily be related to the current action.
+ The error occurred when preselecting the endpoint for discovery URL "opc.tcp://192.168.23.211:4845".
2019-03-12 09:11:55.587 +01:00 [Warning] Error(1): INTERNAL ERROR. The OPC-UA engine might be in an unstable state.
An exception of type 'System.InvalidOperationException' from source 'System.Private.CoreLib' has occurred in OPC-UA guarded operation 'SessionConnector.ConnectFunc'. The exception descend follows.
(1) {System.InvalidOperationException} System.Private.CoreLib(ThrowInvalidOperationException_InvalidOperation_EnumFailedVersion) -> Collection was modified; enumeration operation may not execute.

11 Mar 2019 16:50 - 11 Mar 2019 16:50 #7199 by support
Thank you for all the answers. The reason I asked about the debugger was that it may influence the behavior, but I do not think there is a relation to the issue you mentioned. Repeating events or log entries generated after at least several seconds are not necessarily wrong; they may be indications of ongoing reconnection attempts. But the precise 5-time repetition of each LogEntry is not normal; I suspect you forgot that it is a static event, and hooked to it multiple times? (If not, let me know, I'd like to investigate, but would need some code that shows it, because it's not happening in our tests). But it is not a major issue.
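
To illustrate the static-event pitfall in isolation, here is a minimal sketch. `FakeEngine` and `ServerConnection` are hypothetical stand-ins and not QuickOPC types; the point is only that a static event has one shared subscriber list, so subscribing once per server object yields one handler call per object for every raised entry.

```csharp
using System;

// Hypothetical stand-in for a component that exposes a static event.
static class FakeEngine
{
    public static event Action<string> LogEntry;
    public static void Raise(string message) => LogEntry?.Invoke(message);
    public static void Reset() => LogEntry = null; // for the demo only
}

// Hypothetical per-server wrapper that (mistakenly) subscribes in its constructor.
class ServerConnection
{
    public static int CallCount;
    public ServerConnection() { FakeEngine.LogEntry += OnLogEntry; }
    private void OnLogEntry(string message) { CallCount++; }
}

static class StaticEventDemo
{
    // Creates five connections, raises a single log entry, and returns the number of handler calls.
    public static int RaiseOnceWithFiveSubscribers()
    {
        FakeEngine.Reset();
        ServerConnection.CallCount = 0;
        var connections = new ServerConnection[5];
        for (int i = 0; i < connections.Length; i++)
            connections[i] = new ServerConnection();

        FakeEngine.Raise("one log entry");
        return ServerConnection.CallCount;
    }

    static void Main()
    {
        Console.WriteLine(RaiseOnceWithFiveSubscribers());
    }
}
```

Subscribing once at application start-up, rather than once per connection object, avoids the repetition.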

In general, although you have filtered out parts of the logs with good intent, it is complicating the analysis in the end. Yes, it is good to reduce the problem to minimal size - but by reducing the scenario itself, and not by reducing the information available about it - that can always be done afterwards. For example, I might have been able to tell whether there was a "block" related to certain activity within our component by looking at what happens on other connections (things are not completely independent inside!). But if that information is filtered out, I cannot tell whether the activity is blocked just for the one problematic server.

Anyway, I was thinking about what the most efficient next step is. I do not want to force too much work on you (such as Wireshark + extended tracing etc.) too soon, unless necessary. So, I think the next step should be to determine whether there is a deadlock inside the component which would cause it not to reconnect. You do not need to collect any new logs, but instead a snapshot of the program state, taken when it has already stopped reconnecting to the server. For instructions, please see kb.opclabs.com/Troubleshooting_program_hangs . These instructions were written for .NET Framework, and things may differ somewhat under .NET Core - I am not sure. If you run into problems, let me know.

Best regards
Last edit: 11 Mar 2019 16:50 by support.

Moderators: support