Re: large numbers of sockets in CLOSE_WAIT state?


-----BEGIN PGP SIGNED MESSAGE-----


On Jul 28, 2006, at 18:15, Eli Dart wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi all,

I just set up a beacon central server on Solaris 10. Large numbers of
sockets are stuck in CLOSE_WAIT state, all involving the beacon
server talking to itself on TCP port 10004. The number is growing
slowly, at a rate of approximately one socket every 3 minutes (see the
quick-and-dirty script output below).


This seems to appear only on the central server (though I don't have a
Solaris 10 client just now).

My experience with Solaris 9 and Mac OS is:

1) The central server code has lots of TCP checksum failures, so the TCP state gets funky (both OSes).
2) The client stays up for only a few hours; I ran it from a shell script that respawned it whenever it exited (Solaris).
3) The central server keeps displaying old and bad data if it gets no new data in over the broken TCP connections (both OSes).
4) Most of the time the data on the "local_loss.html" page is accurate even when the central server's is not, but not always (both OSes).
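
The respawn wrapper mentioned in item 2 could look something like the following. This is only a minimal sketch in POSIX sh, not the script that was actually used: the function name `respawn`, the restart limit, and the delay between runs are assumptions made for illustration, and the client command line is left for the caller to supply.

```sh
#!/bin/sh
# respawn MAX DELAY CMD [ARGS...]
# Run CMD up to MAX times in total, starting it again each time it exits
# (MAX=0 means keep restarting forever). Sleeps DELAY seconds after each
# exit so that a client which dies immediately does not spin the CPU.
# Note: plain sh has no local variables, so these names are global.
respawn() {
    max=$1
    delay=$2
    shift 2
    runs=0
    while [ "$max" -eq 0 ] || [ "$runs" -lt "$max" ]; do
        status=0
        "$@" || status=$?          # run the command, remember how it exited
        runs=$((runs + 1))
        echo "$(date): '$1' exited with status $status (run $runs)" >&2
        sleep "$delay"
    done
}
```

For example, `respawn 0 10 /path/to/client` would restart a (hypothetical) client binary forever, ten seconds after each exit. Logging the exit status to stderr at least leaves a record of how often the client is dying.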


For my local campus I finally gave up on beacon and switched to dbeacon instead, as I had only Solaris and Mac OS clients and servers. If you want to run the NLANR beacon, I suggest sticking to Linux, where it mostly works most of the time. The caveat with dbeacon is that there is no central server that gets a unicast update, so you can't tell who is trying to use the beacon when it's not working. That is not a problem on my campus, where I run all the beacon clients and know what is supposed to be there.

The NLANR beacon is no longer under development, so there's not much hope for bug fixes.


Thoughts?

		--eli


dart@beacon % while ( 1 )
while? set count = `netstat -an | grep 10004 | grep CLOSE_WAIT | wc -l`
while? echo -n "$count "
while? date
while? sleep 60
while? end
60 Fri Jul 28 15:21:02 PDT 2006
60 Fri Jul 28 15:22:02 PDT 2006
61 Fri Jul 28 15:23:02 PDT 2006
61 Fri Jul 28 15:24:02 PDT 2006
61 Fri Jul 28 15:25:02 PDT 2006
62 Fri Jul 28 15:26:03 PDT 2006
62 Fri Jul 28 15:27:03 PDT 2006
62 Fri Jul 28 15:28:03 PDT 2006
63 Fri Jul 28 15:29:03 PDT 2006
63 Fri Jul 28 15:30:03 PDT 2006
63 Fri Jul 28 15:31:03 PDT 2006
64 Fri Jul 28 15:32:03 PDT 2006
64 Fri Jul 28 15:33:03 PDT 2006
64 Fri Jul 28 15:34:04 PDT 2006
65 Fri Jul 28 15:35:04 PDT 2006
65 Fri Jul 28 15:36:04 PDT 2006
65 Fri Jul 28 15:37:04 PDT 2006
66 Fri Jul 28 15:38:04 PDT 2006
66 Fri Jul 28 15:39:04 PDT 2006
66 Fri Jul 28 15:40:04 PDT 2006
67 Fri Jul 28 15:41:05 PDT 2006
67 Fri Jul 28 15:42:05 PDT 2006
67 Fri Jul 28 15:43:05 PDT 2006
68 Fri Jul 28 15:44:05 PDT 2006
68 Fri Jul 28 15:45:05 PDT 2006
68 Fri Jul 28 15:46:05 PDT 2006
69 Fri Jul 28 15:47:05 PDT 2006
69 Fri Jul 28 15:48:06 PDT 2006
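
The `grep 10004` in the loop above matches that string anywhere on the line, so it would also count a socket whose port merely contains those digits. A somewhat stricter count can be sketched as below. This is an illustration, not part of the original script: the function name `count_close_wait` is made up, and it assumes `netstat -an` prints the state in the last column and writes addresses as `host.port` (Solaris, BSD) or `host:port` (Linux). On Solaris the legacy `/usr/bin/awk` may need to be swapped for `nawk`.

```sh
#!/bin/sh
# count_close_wait PORT
# Reads `netstat -an` style output on stdin and prints how many sockets
# are in CLOSE_WAIT with PORT as the port of either endpoint.
count_close_wait() {
    awk -v port="$1" '
        $NF == "CLOSE_WAIT" {
            # Look at every column except the state itself; an address
            # column ends in ".PORT" or ":PORT", plain numbers do not.
            for (i = 1; i < NF; i++) {
                n = split($i, parts, /[.:]/)
                if (n > 1 && parts[n] == port) { count++; break }
            }
        }
        END { print count + 0 }
    '
}
```

It would be used as `netstat -an | count_close_wait 10004` in place of the `grep | grep | wc -l` pipeline.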




- --
Eli Dart Office: (510) 486-5629
ESnet Network Engineering Group Fax: (510) 486-6712
Lawrence Berkeley National Laboratory
PGP Key fingerprint = C970 F8D3 CFDD 8FFF 5486 343A 2D31 4478 5F82 B2B3
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.4 (FreeBSD)


iD8DBQFEypqcLTFEeF+CsrMRAismAJ9dl4+v93bA+g6864jM2A7ov4LHzQCdGtng
D+H31LIhJgFmK83bPtU+MmM=
=gYAf
-----END PGP SIGNATURE-----



- -----
- -debbie
Debbie Fligor, n9dn
Network Engineer, CITES, Univ. of Il
email: fligor@uiuc.edu <http://www.uiuc.edu/ph/www/fligor>
"Every keystroke can be monitored. And the computers never forget."



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iQCVAwUBRNNUgJEN6XnnHVONAQGi1AP9HhP+jaK0cSLwKIKKvSjmXjJx+75PuPzE
ocIRB7/QS6HUwBtnwJemVh99Wvr87xBEKBPDyS2LaktsSvNGG4uIXRPnrWMmNQLk
EFvVqyB7zWzoQx7UPEdn2Ym6EExI5pdMUOkdoXDfyNpRoei/lvbYAmHRjR/tvSjP
j+I2+UxT48g=
=0v5i
-----END PGP SIGNATURE-----


