Re: beacon still dies at 1.0
Hi, Debbie -- We're chasing a couple of problems in the Beacon right now.
One of them is the handling of a badly formed Beacon hostname/IP address,
which is what the "UNKN" is referring to. I'm looking at that now, and
should have it under control shortly.
The other is a TCP connection bug, which is proving harder to pin down;
we haven't yet figured out what's really going on there.
I know you're not running Linux boxen, but let me explain what we're doing
here, in the hope that you can map it to whatever the parallel structures
are for FreeBSD and Solaris.
For a Redhat system, there's a table of open connections that the kernel
monitors, called ip_conntrack. It resides in:
/proc/net/ip_conntrack
We have a shell script to munge the content of that file down to a count of
open connections on the various ports. Here's that script, again, for
Redhat Linux 9.1 or Fedora Core 2:
cat /proc/net/ip_conntrack | sed 's,^.*sport.,,g' | awk '{print $1}' |
    sort | uniq -c | sort -n
If you can get some variation of this running, please let me know what it
returns for ports 10002, 10003, and most especially 10004, which is where
we're seeing a problem.
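If it helps, a variation that narrows the output to just those three ports might look something like the sketch below. This is not what we run here. The function name count_beacon_ports is made up for illustration, and it assumes the same ip_conntrack line format as the pipeline above, where the last "sport=" on each line is the one the greedy sed pattern leaves behind.

```shell
# Hypothetical helper (name invented for this example): reads
# ip_conntrack-style lines on stdin and prints "count port" for the three
# Beacon ports mentioned above, sorted by port number.
count_beacon_ports() {
    # The greedy ^.* strips everything up to and including the LAST
    # "sport=" on the line, so the port number becomes the first field.
    sed 's,^.*sport=,,' |
        awk '$1 == 10002 || $1 == 10003 || $1 == 10004 { n[$1]++ }
             END { for (p in n) print n[p], p }' |
        sort -k2,2n
}

# Only meaningful on a Linux box that exposes the conntrack table.
if [ -r /proc/net/ip_conntrack ]; then
    count_beacon_ports < /proc/net/ip_conntrack
fi
```

A port with no tracked connections simply doesn't appear in the output. On FreeBSD or Solaris the input would have to come from whatever the local equivalent of that table is, and the sed pattern would need adjusting to match its layout.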
As for the "Blind Beacons" output, Blind Beacons are Beacons that are
reporting properly to the Central Server (i.e., everything's working as it
should), but those particular Beacons only see themselves, and are only
seen by themselves -- they do not see, nor are they seen by, any other
Beacons. I added the Blind Beacon code just as a way to clean up the
Central Server HTML matrix.
Finally, I notice that aitsbuoy1, uic-node3-buoy, and uisbuoy1 at:
http://dclmr-buoy.gw.uiuc.edu/beacon/beacon_info.html
are indicating that their hostnames aren't resolving correctly to FQDNs.
That would indicate boxes that are either misconfigured, or only partially
configured as far as hostname resolution goes. Don't know that that's a
huge problem, or even related to the rest of what you're seeing, but it
might be related to the "UNKN" error you're seeing. (I'll know more here in
a bit.) You might want to check that out, too, just in case.
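One rough way to check that on each box is sketched below. It is only a sketch under a couple of assumptions: that you have a POSIX-style shell such as ksh or bash, and that getent is installed and uses the same resolver configuration the Beacon does. The looks_like_fqdn function is a made-up helper for this example, not something in the Beacon code.

```shell
# Hypothetical helper: succeeds if the name contains at least one dot with
# non-empty text on both sides, i.e. it at least looks fully qualified.
# (A name ending in a trailing root dot is rejected, purely for simplicity.)
looks_like_fqdn() {
    case "$1" in
        .*|*.|*..*) return 1 ;;   # leading dot, trailing dot, or empty label
        *.*)        return 0 ;;   # at least one interior dot
        *)          return 1 ;;   # bare short name or empty string
    esac
}

# Assumption: getent prints "address canonical-name [aliases...]", so the
# second column is what this box's own name resolves to.
name=$(getent hosts "$(uname -n)" 2>/dev/null | awk '{print $2; exit}')
if looks_like_fqdn "$name"; then
    echo "resolves to FQDN: $name"
else
    echo "not fully qualified: ${name:-<no answer>}"
fi
```

A bare short name such as aitsbuoy1 in the second column, or no answer at all, would point to the partial hostname configuration described above.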
Mitch
At 02:11 PM 8/25/2004 -0500, debbie fligor wrote:
> well, I updated my mac beacons to 1.0-0 and it's still exiting, but
> since no one else is reporting problems, I thought it was only a Mac
> thing.
>
> FWIW, here's the error I had on my Mac this morning:
> 350 faranth> Use of uninitialized value in scalar chop at ./beacon line
> 1876.
> Use of uninitialized value in concatenation (.) or string at ./beacon
> line 1897.
> HOST_LOOKUP delete failed 0x2910cce4, UNKN,
>
>
> so i installed the 1.0-0 version on 10 solaris boxes (our monitoring
> buoys) yesterday and this morning, and I can't keep them up. some of
> them seem to have more problems than others, I've had to restart some
> of them more than once, and some not yet at all.
>
> if anyone has any ideas
> <http://dclmr-buoy.gw.uiuc.edu/beacon/central_loss.html> for my
> server and other info (this is our on-campus/eventually-multi-campus
> monitoring, hence our own page & group). there should be 5 "good"
> and 5 "blind" beacons. they're all pretty much the same:
>
> 100 hab-buoy> uname -a
> SunOS hab-buoy 5.8 Generic_108528-15 sun4u sparc SUNW,UltraAX-i2
>
>
>
> I can't find anything in anything that's syslogging.
> --
>
> -debbie
> Debbie Fligor, n9dn Network Engineer, CITES, Univ. of Il
> email: fligor@uiuc.edu <http://www.uiuc.edu/ph/www/fligor>
> "Every keystroke can be monitored. And the computers never forget."
>
>
--
Mitch Kutzko | mitch@dast.nlanr.net | mitch@ncsa.uiuc.edu | 217-333-1199
Project: http://dast.nlanr.net/ | Personal: http://hobbes.ncsa.uiuc.edu/