Re: beacon still dies at 1.0
Hi, John -- Thanks very much for this information.
This helps confirm some things we had suspected. We're looking at/working
on the TCP socket code now, and expect to have a much happier update
available soon.
Mitch
At 09:59 AM 8/27/2004 -0500, John Kristoff wrote:
> I was able to grab a tcpdump of a failure scenario. I ran tcpdump as
> follows:
>
> tcpdump -i hme0 -s 1500 -w beacon.cap host 141.142.98.209 and tcp
>
> In my last test it didn't take my Beacon client long to die. Sometimes
> it takes a few hours, other times a few minutes. I don't think I've
> seen it last more than a couple of days. I'll make note of the general
> conversation and then include details at the time of the failure.
>
> Client sets up TCP connection to 141.142.98.209:tcp:10004. The 3-way
> handshake occurs normally. My client then PSH's to the beacon server
> the so-called authentication string (line 2702 in the current beacon
> script). The data is as follows:
>
> beacon.dast.nlanr.net|233.4.200.23|10002|1.0
>
> In addition, the string terminates with a LF (0x0a) at the end.
>
> Then my client PSH's my client info and other beacons my client sees
> over a few TCP segments. The last beacons I see are sent in a final
> FIN/PSH/ACK segment, which prompts the server to send a FIN/ACK of
> its own. I ACK the server's FIN and the session is closed.
>
> That is when things are all working properly. My last test got to the
> third report (180 seconds into it) when it encountered a problem. A
> 3-way handshake was set up, my client sent the initial authentication
> string and the server responded with a couple of RST's. The data in
> the authentication string looks the same as it does above. The only
> additional piece of data that I can think of that may help identify the
> problem is that my source TCP port on the failed connection was 54382.
>
> Other than that I don't think I see a problem from my perspective and
> I would have to point the finger at the server side. I know that is
> not very helpful, but perhaps this at least rules out other problems
> (e.g. port filtering getting in the way).
>
> Mitch, furthermore, I noticed you changed the 'close($socket)' line to help
> hack around the failed TCP session problem, but you did not include the
> following:
>
> use POSIX qw(:errno_h);
> $SIG{PIPE} = 'IGNORE';
>
> We need that too in order for the script not to crash and burn when
> it tries to write to a socket (all those 'print $socket' lines) that has
> been unexpectedly closed due to what appears to be the mysterious RSTs
> from the server. See my earlier post for the patch details.
>
> John
>
>
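For anyone following along, here is a minimal sketch of what the two
lines John quotes buy us. It is not the actual patch: safe_send() is a
hypothetical helper, the socketpair() only stands in for the real TCP
connection to the server, and it uses syswrite() rather than the
script's buffered 'print $socket' so the error shows up immediately.
With SIGPIPE ignored, a write to a socket the peer has already torn
down fails with EPIPE (or ECONNRESET) instead of killing the process.

```perl
use strict;
use warnings;
use POSIX qw(:errno_h);   # provides EPIPE, ECONNRESET
use Socket;

# Without this, the first write to a dead socket kills the process.
$SIG{PIPE} = 'IGNORE';

# Hypothetical helper (not in the beacon script): returns 1 if the
# write succeeded, 0 if the peer has gone away, and dies otherwise.
sub safe_send {
    my ($sock, $data) = @_;
    my $written = syswrite($sock, $data);
    return 1 if defined $written;
    return 0 if $! == EPIPE || $! == ECONNRESET;
    die "unexpected write error: $!";
}

# Stand-in for the server connection: a local socket pair whose far
# end is closed before we write, roughly what an RST leaves us with.
socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
close($server);

my $ok = safe_send($client, "beacon.dast.nlanr.net|233.4.200.23|10002|1.0\n");
print $ok ? "sent\n" : "peer closed the connection; skipping this report\n";
```

With buffered print, the failure may only surface when the handle is
flushed or closed, so the return values of print and close are worth
checking there as well.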
--
Mitch Kutzko | mitch@dast.nlanr.net | mitch@ncsa.uiuc.edu | 217-333-1199
Project: http://dast.nlanr.net/ | Personal: http://hobbes.ncsa.uiuc.edu/