Forum: MatriX RSS
Lightbarrier #16
Member since Jul 2014 · 52 posts
Group memberships: Members
In reply to post ID 9372
Quote by Alex:
try a lower interval, because what your logs showed is that the disconnect appeared at exactly 2 minutes or a bit below. I would try 60 seconds for a test.

The BOSH logs from the very beginning of your session would also be very helpful. There we can see some of the BOSH configuration of your server, like the inactivity value setting.

I'll update the client to ping every 60 seconds and log the BOSH payloads. Once that test is completed I'll post the results.

Quote by Alex:
  • Your stacktrace also mentions the SslStream class. Have you made some tests without HTTPS to exclude problems on the SSL layer?

I've thought of that myself; I'm currently running a test with just HTTP to see if the error still arises.

Quote by Alex:
  • Do you have HTTP proxies between the client and the servers?

I don't know of any proxies; however, I don't manage the server, so it's hard for me to know for sure. It'll probably take me a little time to find out.

Quote by Alex:
  • Is there a firewall? Have you tried to disable it? Many firewalls hook into sockets and SSL and can cause strange behavior.

There's a firewall; however, we have a firewall rule to open up the port which the chat server uses for HTTPS. Also, the client is connecting locally, so it wouldn't have the usual troubles of a remote connection.

On another note my most recent test has retrieved a different error from previous tests which you'll see in the attachment.

For this test I used System.Threading.Timer to trigger the ping instead of System.Timers.Timer, so it wouldn't have to synchronize itself with other threads. From the test it seemed to have worked, as I didn't see the timer stop for six minutes like it has in previous tests. However, the client did receive a different error directly from the chat server, which I've noted below.

  12/23/2015 12:55:20 PM - RECV: <stream:error xmlns:stream="http://etherx.jabber.org/streams">
   <conflict xmlns="urn:ietf:params:xml:ns:xmpp-streams" />

Do you know what this would mean? I can see that MatriX raises the closed event afterwards, and that my client isn't the one asking for the chat connection to be closed. For now I assume it is related to the OnError output shown below.

  12/23/2015 12:55:20 PM - Matrix.OnError(object sender, ExceptionEventArgs error)
  12/23/2015 12:55:20 PM - error.Exception.GetType(): Matrix.Net.BoshException
  12/23/2015 12:55:20 PM - error.Exception.Message: BoshException
  12/23/2015 12:55:20 PM - error.Exception.GetBaseException().GetType(): System.Net.WebException
  12/23/2015 12:55:20 PM - error.Exception.GetBaseException().Message: The remote server returned an error: (404) Not Found.

This error strikes me as odd, since the only thing the client is doing is pinging the chat server, so it's not like it's requesting something out of the ordinary unless the chat server lost the connection for some reason.

Finally, I added the stack trace to MatriX's OnClose event. I'm not sure if this will help with anything, but I thought it might show whether the call is coming from some odd location.

Thanks for your assistance.
The author has attached one file to this post:
Chat Trace Client - 12_23_10_22_45.txt 77.8 kBytes
This post was edited 2 times, last on 2015-12-23, 22:27 by Lightbarrier.
Lightbarrier #17
Member since Jul 2014 · 52 posts
Group memberships: Members
Quote by Alex:
Your stacktrace also mentions the SslStream class. Have you made some tests without HTTPS to exclude problems on the SSL layer?

Just finished the test using HTTP instead, with the client still pinging at two-minute intervals, and it still errors out. Interestingly enough, this test was also done with System.Timers.Timer, and there was a delay between the last ping and when the error occurred. I'm still not quite sure why that pause would occur, though.

I'm starting to run out of ideas. However, I've attached the trace log from the HTTP test in case you're interested in looking at it.

Presently I'm running the test which'll log the BOSH payloads with the one minute timer interval, and I'll let you know how that goes.

Thanks for your assistance.
The author has attached one file to this post:
Chat Trace Client - 12_23_13_53_11.txt 44.9 kBytes
Alex #18
Member since Feb 2003 · 4327 posts · Location: Germany
Group memberships: Administrators, Members
In reply to post #16
Quote by Lightbarrier:
  12/23/2015 12:55:20 PM - RECV: <stream:error xmlns:stream="http://etherx.jabber.org/streams">
   <conflict xmlns="urn:ietf:params:xml:ns:xmpp-streams" />

Usually this means that another session for this user and the same resource connected to the server. Then the server kicks the previous session.

Alex
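
As a rough illustration of the explanation above, the condition inside a stream error can be read programmatically. The sketch below is in Python rather than .NET and does not use MatriX; the function name stream_error_condition is made up for this example, and the closing </stream:error> tag is assumed because the quoted log excerpt stops before it.

```python
import xml.etree.ElementTree as ET

STREAMS_NS = "http://etherx.jabber.org/streams"
ERROR_NS = "urn:ietf:params:xml:ns:xmpp-streams"


def stream_error_condition(xml_text):
    """Return the condition name of a <stream:error/> element, or None."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{STREAMS_NS}}}error":
        return None
    for child in root:
        # ElementTree tags look like "{namespace}localname".
        if child.tag.startswith(f"{{{ERROR_NS}}}"):
            name = child.tag.split("}", 1)[1]
            if name != "text":  # <text/> is a description, not a condition
                return name
    return None


logged = (
    '<stream:error xmlns:stream="http://etherx.jabber.org/streams">'
    '<conflict xmlns="urn:ietf:params:xml:ns:xmpp-streams" />'
    "</stream:error>"  # closing tag assumed; the log excerpt is cut off
)
print(stream_error_condition(logged))  # prints: conflict
```

If the condition is conflict, it is probably worth checking whether a second client instance is logging in with the same user and resource before reconnecting automatically, since two such sessions can keep kicking each other.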
Lightbarrier #19
Member since Jul 2014 · 52 posts
Group memberships: Members
In reply to post ID 9372
Quote by Alex:
try a lower interval, because what your logs showed is that the disconnect appeared at exactly 2 minutes or a bit below. I would try 60 seconds for a test.

The BOSH logs from the very beginning of your session would also be very helpful. There we can see some of the BOSH configuration of your server, like the inactivity value setting.

I've attached the BOSH log where we managed to reproduce our issue. I also updated the client so it pings every 54 seconds instead of every two minutes, and it's still experiencing the bug. The last thing I see in the BOSH log is the text below, which is interesting since the BOSH error occurs six minutes afterwards and no events occur in between.

  12/29/2015 1:24:15 PM - SEND BOSH: <body xmlns:xmpp="urn:xmpp:xbosh" xmlns:stream="http://etherx.jabber.org/streams" rid="1271447334" key="f6533fd7bd5d2b4f971d178aff8870a3569130ff" sid="1c81982e" to="nycwsgw4vm" xmlns="http://jabber.org/protocol/httpbind" />

I've also attached the log for the "OnSendXml" and "OnReceiveXml" events in the post below as I couldn't post more than two files.

Hopefully this info will give you a lead into what's going wrong, because I'm not sure what else I can test.

Thanks for your assistance.
The author has attached one file to this post:
Chat Trace BOSH - 12_29_12_20_34.txt 87.4 kBytes
This post was edited on 2015-12-29, 20:10 by Lightbarrier.
Lightbarrier #20
Member since Jul 2014 · 52 posts
Group memberships: Members
Quote by Lightbarrier:
I've also attached the log for the "OnSendXml" and "OnReceiveXml" events in the post below as I couldn't post more than two files.

Here it is. Let me know if there's anything else I can give you.
The author has attached one file to this post:
Chat Trace Client - 12_29_12_20_34.txt 63.8 kBytes
Alex #21
Member since Feb 2003 · 4327 posts · Location: Germany
Group memberships: Administrators, Members
The 6 minutes roughly match the wait value (wait="300", i.e. 300 seconds) in the session creation response your server sends back:

  <body xmlns="http://jabber.org/protocol/httpbind" xmlns:stream="http://etherx.jabber.org/streams" from="nycwsgw4vm" authid="1c81982e" sid="1c81982e" secure="true" requests="2" inactivity="30" polling="5" wait="300" hold="1" ack="1271447179" maxpause="300" ver="1.6">

You can pass a lower value in the OnCreateBoshSession event. This should help to detect network failures faster.
The Exception is still the same, WebException in the .NET Framework. I don't see anything wrong in MatriX.

In your last log you ping the server fine at 1:24:14 and get a result. Then after 6 minutes the exception occurs, but there is also no timer event from your pings in between. To me it looks like something locks up in your app.

Alex
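
To make the timing attributes in that session response easier to compare with the observed delay, here is a small sketch that pulls them out and converts them. It is Python, not the .NET client, and does not touch MatriX; the helper name bosh_timings is invented for this example, and the element is written as self-closing because the logged line has no closing tag. The comments on what wait and inactivity mean follow my reading of XEP-0124.

```python
import xml.etree.ElementTree as ET

HTTPBIND_NS = "http://jabber.org/protocol/httpbind"
TIMING_ATTRS = ("wait", "inactivity", "polling", "hold", "requests")


def bosh_timings(body_xml):
    """Extract the numeric session attributes from a BOSH <body/> element."""
    body = ET.fromstring(body_xml)
    if body.tag != f"{{{HTTPBIND_NS}}}body":
        raise ValueError("not a BOSH <body/> element")
    return {name: int(body.attrib[name])
            for name in TIMING_ATTRS if name in body.attrib}


# Attribute values copied from the session response quoted above;
# the trailing "/>" is an assumption so the snippet parses on its own.
session_response = (
    '<body xmlns="http://jabber.org/protocol/httpbind" '
    'xmlns:stream="http://etherx.jabber.org/streams" '
    'from="nycwsgw4vm" authid="1c81982e" sid="1c81982e" secure="true" '
    'requests="2" inactivity="30" polling="5" wait="300" hold="1" '
    'ack="1271447179" maxpause="300" ver="1.6" />'
)

t = bosh_timings(session_response)
# wait: longest time in seconds the server may hold a request open.
print(t["wait"] / 60)                # prints: 5.0
# inactivity: longest time in seconds without any request from the client.
print(t["wait"] + t["inactivity"])   # prints: 330
```

With these values a held request can stay silent for up to 300 seconds (5 minutes), and 330 seconds if the inactivity window is added on top, which is in the same range as the roughly 6 minutes seen in the logs.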
Lightbarrier #22
Member since Jul 2014 · 52 posts
Group memberships: Members
From task manager I've noticed that the client's handles keep going up the longer I try to keep the connection running.

Example:
Time       Handles:
7:01pm          463
7:01pm          489
7:04pm          502
7:09pm          549
7:12pm          612
7:16pm          660
7:20pm          699
7:26pm          775
7:30pm          823
7:35pm          867
7:38pm          911

This would suggest to me that there's a memory leak. I've looked over my code, but I can't find an instance where it dynamically allocates memory that needs to be disposed or dynamically registers events that it doesn't release. Presently my client subscribes to all MatriX events before it calls Open; afterwards the client has no need to subscribe to any more events, and they'll be released when the client is closed.

I've noticed that the handles seem to go up when the client calls MatriX to ping the server. I realize this may be a long shot, but would it be possible that MatriX isn't releasing memory that it no longer needs?

Thanks for your assistance.
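
For reference, the handle counts in the table above can be summarised with a few lines of arithmetic. This is a standalone Python sketch over the numbers as posted; the helper minutes_since_midnight is made up for the example, and the result only describes the trend in this one sample, not its cause.

```python
# Samples copied from the Task Manager table above: (time, handle count).
samples = [
    ("7:01pm", 463), ("7:01pm", 489), ("7:04pm", 502), ("7:09pm", 549),
    ("7:12pm", 612), ("7:16pm", 660), ("7:20pm", 699), ("7:26pm", 775),
    ("7:30pm", 823), ("7:35pm", 867), ("7:38pm", 911),
]


def minutes_since_midnight(label):
    """Convert a label such as '7:01pm' to minutes since midnight."""
    clock, suffix = label[:-2], label[-2:].lower()
    hours, minutes = map(int, clock.split(":"))
    hours = hours % 12 + (12 if suffix == "pm" else 0)
    return hours * 60 + minutes


elapsed = (minutes_since_midnight(samples[-1][0])
           - minutes_since_midnight(samples[0][0]))
growth = samples[-1][1] - samples[0][1]
# True if the count never dropped between two consecutive samples.
never_dropped = all(b[1] > a[1] for a, b in zip(samples, samples[1:]))

print(elapsed, growth)             # prints: 37 448
print(round(growth / elapsed, 1))  # prints: 12.1
print(never_dropped)               # prints: True
```

That works out to about 12 handles per minute over 37 minutes with no drop between samples. A longer run would be needed to see whether the count eventually falls back.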
Alex #23
Member since Feb 2003 · 4327 posts · Location: Germany
Group memberships: Administrators, Members
Memory and handles going up for a while and then later down is nothing you should be worried about. This is how the .NET runtime works until the GC runs. There are no known issues related to memory in MatriX.
Also, this is not related to your problem. If there were a memory leak you would get out-of-memory exceptions. Your connection drops for some other reason.

Alex
Lightbarrier #24
Member since Jul 2014 · 52 posts
Group memberships: Members
We appear to have resolved the issue for the environment that we were testing on. The problem was that the machine's time was jumping ahead periodically, which was tricking either MatriX or the chat server into thinking that the session had been lost. We've since corrected the machine's time and the connections haven't been dropped since. I'll let you know if we continue experiencing problems on other environments that we can't resolve ourselves.

Thanks for all your time and assistance.
Alex #25
Member since Feb 2003 · 4327 posts · Location: Germany
Group memberships: Administrators, Members
Great to hear that it's working now and that the problem was not inside of MatriX.

For BOSH, MatriX needs to consider timestamps, for inactivity for example. So a change of the system time could cause the problems you have seen.
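
A generic illustration of why a clock jump can look like a lost session: an inactivity check based on wall-clock time reports a timeout as soon as the system time is moved forward, while one based on a monotonic clock does not. The sketch below is Python and is not how MatriX or the chat server are known to be implemented; InactivityTracker and FakeClocks are hypothetical names used only for this simulation.

```python
import time


class InactivityTracker:
    """Reports expiry when more than limit_seconds pass without touch()."""

    def __init__(self, limit_seconds, clock):
        self._limit = limit_seconds
        self._clock = clock  # any zero-argument function returning seconds
        self._last = clock()

    def touch(self):
        self._last = self._clock()

    def expired(self):
        return self._clock() - self._last > self._limit


class FakeClocks:
    """Simulated machine: wall time can be adjusted, monotonic time cannot."""

    def __init__(self):
        self.elapsed = 0.0
        self.wall_offset = 1_450_000_000.0  # arbitrary starting epoch value

    def advance(self, seconds):      # real time passing
        self.elapsed += seconds

    def adjust_wall(self, seconds):  # someone or something sets the clock
        self.wall_offset += seconds

    def wall(self):
        return self.wall_offset + self.elapsed

    def monotonic(self):
        return self.elapsed


clocks = FakeClocks()
by_wall = InactivityTracker(30, clocks.wall)
by_mono = InactivityTracker(30, clocks.monotonic)

clocks.advance(10)       # only 10 seconds really pass
clocks.adjust_wall(600)  # but the system time jumps ahead 10 minutes

print(by_wall.expired(), by_mono.expired())  # prints: True False

# Outside the simulation the monotonic variant would be built like this:
real_tracker = InactivityTracker(30, time.monotonic)
```

In .NET the comparable choice would be measuring intervals with Stopwatch rather than with differences of DateTime.Now, although whether that applies to the code involved here is not something this thread establishes.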