Re: Trying to get beacon 1.1-0 working


Hi, Håvard -- We're looking into this now.  We'll keep you posted on what
we find out, and if/when we have a patch available.

Thanks for your report!

Mitch

At 10:41 PM 8/25/2005 +0200, Havard Eidnes wrote:
> Hi,
> 
> I'm trying to get the beacon software, version 1.1-0 working on
> my own central server running NetBSD 2.0_STABLE, and I'm finding
> that the beacon script appears to make some non-portable
> assumptions about the host's TCP stack.
> 
> For a long time I had the problem that the beacon server rejected all
> (!) the TCP reports.  After a bit of digging, I found the reason.
> In my small setup (http://beacon.nordu.net/), the clients typically
> end up sending the reports in *two* TCP segments -- the first
> contains the "line 0" identity of the central beacon server, the
> second segment contains the beacon info about the sender, and then
> the reports for the other beacons it sees.  The time between the
> first and second TCP segments can be considerable -- upwards of
> 25ms does not appear to be uncommon.  Clients in my case are
> NetBSD, FreeBSD, and Solaris.
> 
> This particular behaviour appears to interact quite badly with the
> following piece of code:
> 
>           while (defined ($line = <$fh>) && ($line ne $ENDMESSAGE)) {
>               push(@lines, $line);
> 
> What happens is that these two TCP segments end up as two separate
> sets of lines.  The first set of lines consists of a single line, so
> it validates as being sent to the correct beacon server, but the
> report itself is otherwise empty.  When the beacon server comes
> around to process the second TCP segment, it rejects the report
> because the first line does not match the beacon centralserver/
> group/port/version line (it was already processed in the first
> round).
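
A minimal sketch of one way the server side could tolerate a report that
arrives in several TCP segments: keep a buffer per connection and only
hand the lines on once the end marker has been seen.  The names
feed_report and %state, and the "END\n" value used for $ENDMESSAGE, are
assumptions for illustration and are not taken from the beacon source.

```perl
use strict;
use warnings;

# Hypothetical end-of-report marker; the real value of $ENDMESSAGE is
# not shown in this message.
my $ENDMESSAGE = "END\n";

# Append newly received bytes to a per-connection buffer.  Returns a
# reference to the list of report lines once the end marker has been
# seen, or undef while the report is still incomplete.
sub feed_report {
    my ($state, $chunk) = @_;
    $state->{buf} .= $chunk;

    # Move every complete line out of the buffer.
    while ($state->{buf} =~ s/\A([^\n]*\n)//) {
        my $line = $1;
        if ($line eq $ENDMESSAGE) {
            my $lines = $state->{lines};
            $state->{lines} = [];
            return $lines;
        }
        push @{ $state->{lines} }, $line;
    }
    return;    # need more data from the socket
}

# Two chunks standing in for the two TCP segments described above.
my %state  = (buf => '', lines => []);
my $first  = feed_report(\%state, "server group port version\n");
my $second = feed_report(\%state, "sender info\npeer report\nEND\n");
```

Here $first is undefined because the report is not yet complete, and
$second holds all three lines, with the identity line still first.  In a
real server the chunks would come from a read on the client socket
whenever it becomes readable, so nothing has to sleep while waiting.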
> 
> It appears that placement of constructs such as
> 
>   my $oldfh = select($client);
>   $| = 0;
>   select($oldfh); 
> 
> and
> 
>   $oldfh = select($client);
>   $| = 1;
>   select($oldfh);
> 
> before and after the first and last $client print statements in
> send_tcp_report() makes the data (in my case, with few participants)
> fit in a single segment.  However, I can see the same problem
> cropping up when the number of participants grows, as one can then
> no longer rely on the data fitting in a single TCP segment, though I
> have no observations about what would happen in that case.
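
As an alternative to toggling $| around the print statements, the client
could assemble the whole report in memory and hand it to the kernel in
one go.  This is only a sketch: build_report and write_all are
hypothetical helpers, the pipe stands in for the $client socket, and a
single write still does not guarantee a single TCP segment, so the
receiver cannot rely on it.

```perl
use strict;
use warnings;

# Join the identity line, the body lines and the end marker into one
# string so that the report is not written piecemeal.
sub build_report {
    my ($header, $end, @body) = @_;
    return join '', $header, @body, $end;
}

# Write the whole string with syswrite, looping over short writes.
# Do not mix this with buffered print on the same handle.
sub write_all {
    my ($fh, $data) = @_;
    my $off = 0;
    while ($off < length $data) {
        my $n = syswrite($fh, $data, length($data) - $off, $off);
        die "write failed: $!" unless defined $n;
        $off += $n;
    }
    return $off;
}

# Demonstration over a local pipe standing in for the client socket.
pipe(my $reader, my $writer) or die "pipe: $!";
my $report = build_report("server group port version\n", "END\n",
                          "sender info\n", "peer report\n");
my $sent = write_all($writer, $report);
close $writer;
my $received = do { local $/; <$reader> };
```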
> 
> In order to inter-operate with the unmodified 1.1-0 clients others
> have installed to participate in "my" group, I also have an ugly
> workaround for the server part of the code which in my local copy
> presently looks like this:
> 
>           my $count = 0;
>           $line = "";
>           while ($count < 100 && ($line ne $ENDMESSAGE)) {
>                 $line = <$fh>;
>                 if (!defined($line)) {  
>                     usleep(20000); 
>                     $count++;
>                     $line="";
>                     next;
>                 }
>                 if ($DEBUG>2) {
>                     if ($count > 0) {
>                         printf("Re-read %d times\n", $count);
>                     }   
>                 }
>                 $count = 0;
>                 if ($line ne $ENDMESSAGE) {
>                     if ($DEBUG>4) {
>                           printf("Adding line: %s", $line);
>                     }
>                     push(@lines, $line);
>                 }
>           }
>           if ($DEBUG>4) {
>               printf("Processing %d lines\n", scalar(@lines));
>           }
> 
> In my current setup, this code often ends up reporting "Re-read"
> values of up towards 10, and this appears to result in bad receive
> stats for the multicast data at the central server, since it is most
> probably dropping the UDP packets while processing the TCP-received
> data.
> 
> It seems to me that it would probably have been better to send the
> unicast reports using unicast UDP with application-level framing
> rather than TCP.  That way, the central beacon would have a fighting
> chance of participating on a reasonably level playing field in
> processing the multicast packets, instead of being bogged down and
> unresponsive while working around the above problem.
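
To make "application-level framing" concrete, here is a rough sketch of
how a report could be split into self-describing datagrams and put back
together.  The header layout (report id, fragment index, fragment count)
and the names frame_report and reassemble are invented for illustration;
nothing here reflects an actual beacon wire format.

```perl
use strict;
use warnings;

# Hypothetical framing: 4-byte report id, 2-byte fragment index and
# 2-byte fragment count (network byte order), followed by the payload.
sub frame_report {
    my ($id, $report, $max) = @_;
    my @parts = unpack "(a$max)*", $report;   # chunks of at most $max bytes
    @parts = ('') unless @parts;              # empty report: one empty fragment
    my $count = @parts;
    my $i = 0;
    return map { pack('N n n', $id, $i++, $count) . $_ } @parts;
}

# Rebuild a report from its datagrams, in any arrival order.  Returns
# undef while fragments are still missing.  For brevity this assumes
# all datagrams passed in carry the same report id.
sub reassemble {
    my (@datagrams) = @_;
    my (%part, $count);
    for my $d (@datagrams) {
        my ($id, $idx, $n) = unpack 'N n n', $d;
        $part{$idx} = substr $d, 8;           # payload follows the 8-byte header
        $count = $n;
    }
    return if !defined $count || keys(%part) != $count;
    return join '', map { $part{$_} } 0 .. $count - 1;
}

my @datagrams = frame_report(7, "abcdefghij", 4);
my $rebuilt   = reassemble(reverse @datagrams);
my $partial   = reassemble(@datagrams[0, 1]);
```

Because each datagram says which report and which fragment it carries,
the server never has to wait on a half-delivered stream: it can process
whatever has arrived and get back to the multicast traffic.  A lost
fragment simply leaves the report incomplete.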
> 
> Comments?
> 
> I wonder: is 1.3 alpha any better in this regard?
> 
> Regards,
> 
> - Håvard
> 
> 
--
Mitch Kutzko | mitch@dast.nlanr.net | mitch@ncsa.uiuc.edu | 217-333-1199
Project: http://dast.nlanr.net  |  Personal: http://hobbes.ncsa.uiuc.edu 


