Forum: MatriX RSS
DLowndes #1
Member since Apr 2013 · 1 post
Group memberships: Members
Subject: Race condition in SynchronousConnect/EndConnect
While investigating communication issues in our product's use of the MatriX code, I noticed a race condition between SynchronousConnect and EndConnect (at least in V1.4.4.0 of the source code, which is the version we have).

SynchronousConnect has a 5-second timeout. If that expires (which it does when there's no connection), it closes the socket and sets the socket member variable to null.

EndConnect will then try to use that socket object, resulting in an ArgumentNullException. The exception is caught by an exception handler and doesn't seem to cause any obvious subsequent issues.

Why does SynchronousConnect have a timeout value when EndConnect always gets called (eventually)?

I note that the MS documentation for Socket.BeginConnect http://msdn.microsoft.com/en-us/library/tad07yt6.aspx says "At the very minimum, you must pass the Socket to BeginConnect through the state parameter." - which isn't done in the version of the MatriX code we have. Presumably the intention is that the socket object can be obtained in EndConnect via ar.AsyncState?
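
To illustrate what I mean, here is a minimal sketch of passing the socket through the state parameter and reading it back from ar.AsyncState in the callback. The class and member names (ConnectStateSketch, Start, OnConnect, Completed, Connected) are made up for this post and are not taken from the MatriX source.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical illustration only; not the MatriX ClientSocket code.
public sealed class ConnectStateSketch
{
    // Signalled once the connect callback has run, whatever the outcome.
    public readonly ManualResetEvent Completed = new ManualResetEvent(false);
    public volatile bool Connected;

    public Socket Start(IPEndPoint endPoint)
    {
        var socket = new Socket(endPoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
        // The socket travels with the IAsyncResult as the state object,
        // so the callback does not have to read a member field.
        socket.BeginConnect(endPoint, OnConnect, socket);
        return socket;
    }

    private void OnConnect(IAsyncResult ar)
    {
        // Still a valid reference even if some member field was set to null meanwhile.
        var socket = (Socket)ar.AsyncState;
        try
        {
            socket.EndConnect(ar);
            Connected = true;
        }
        catch (ObjectDisposedException)
        {
            // The socket was closed (for example by a timeout) before the connect completed.
        }
        catch (SocketException)
        {
            // The connection attempt itself failed.
        }
        finally
        {
            Completed.Set();
        }
    }
}
```

With this shape the callback never dereferences a null field; a socket closed by the timeout path would surface as an ObjectDisposedException (or a SocketException, depending on the runtime) from EndConnect.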

Also, since ClientSocket has a Socket member (which is IDisposable), shouldn't ClientSocket also be IDisposable?
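
Something along these lines is what I have in mind. This is only a sketch with a made-up class (OwningSocketSketch) standing in for ClientSocket; I'm assuming the class is sealed and owns its socket exclusively, so no finalizer is needed.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical stand-in for a class that owns a Socket; not the actual MatriX ClientSocket.
public sealed class OwningSocketSketch : IDisposable
{
    private Socket _socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    private bool _disposed;

    public bool IsDisposed { get { return _disposed; } }

    public void Dispose()
    {
        if (_disposed) return;   // Dispose must be safe to call more than once
        _disposed = true;
        if (_socket != null)
        {
            _socket.Close();     // Close releases the resources held by the socket
            _socket = null;
        }
    }
}
```

Callers could then wrap the object in a using block so the socket is released deterministically.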

Have there been any changes to this code in the latest version to resolve comms issues?
Alex #2
Member since Feb 2003 · 4322 posts · Location: Germany
Group memberships: Administrators, Members
Quote by DLowndes:
Why does SynchronousConnect have a timeout value when EndConnect always gets called (eventually)?
Because the .NET sockets have no configurable timeout for a connection attempt, and the default timeout is very high (30 seconds, if I remember correctly). Many of our users run a cluster of XMPP servers for load balancing and fail-over, and sometimes there are several IP addresses per host in DNS. When one or more servers are down and MatriX has to try several IP addresses, it can take a very long time with a default timeout of 30 seconds until we get a socket connection to an XMPP host.
For this reason we implemented our own "timeout timer" to abort a connection attempt to a given IP address after 5 seconds. This has not caused any problems in the past. We planned to make this value configurable, but nobody has asked for it yet.
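
For readers following along, a per-address timeout of this kind can be sketched roughly as below. This is not the MatriX implementation; the names (TimedConnectSketch, ConnectFirstAvailable, timeoutMs) are hypothetical, and it waits on the IAsyncResult's wait handle instead of using a callback, so there is no separate EndConnect callback that could touch a socket after it has been closed.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

// Hypothetical illustration of a per-address connect timeout; not the MatriX code.
public static class TimedConnectSketch
{
    // Tries each endpoint in turn, giving up on one after timeoutMs.
    // Returns a connected Socket, or null if no endpoint could be reached.
    public static Socket ConnectFirstAvailable(IPEndPoint[] endPoints, int timeoutMs)
    {
        foreach (IPEndPoint endPoint in endPoints)
        {
            var socket = new Socket(endPoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
            try
            {
                IAsyncResult ar = socket.BeginConnect(endPoint, null, null);
                if (ar.AsyncWaitHandle.WaitOne(timeoutMs))
                {
                    socket.EndConnect(ar);   // throws SocketException if the attempt failed
                    return socket;
                }
                // Timed out: fall through and close this socket.
            }
            catch (SocketException)
            {
                // Connection refused or unreachable: move on to the next endpoint.
            }
            socket.Close();                  // aborts a still-pending attempt
        }
        return null;
    }
}
```

A caller would pass the resolved addresses of the host together with the desired timeout, for example 5000 for the 5 seconds mentioned above.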

The current version is 1.5.3.3. We have made some improvements to the socket code since the version you are running, so I suggest that you first try the latest code or builds. If the problem still persists, we need a test case for debugging. You can also contact me directly by email.

Alex