> o In 1.2.2, you may want to distinguish that the cost metric is round trip
> latency for 40-byte UDP packets (UNIX traceroute seems to use this by
> default). This will give us a good notion of whether the n-gon (or just
> "polygon"?) inequality holds for TCP acks, which should face the exact
> same latency, since they're also 40 bytes. And hopefully it will be close
> enough for the 576 byte and 1500 byte packets too.
Remember, this is not strictly true. There's actually a European server
putting out 100-byte packets, and there was a Polish one doing 20. Where
they don't tell me, I'm assuming 40 (tcpdump was showing non-40 sizes
even when the servers were producing 40-byte probes, and I haven't yet
taken the time to work out the proper conversion between the two
numbers).
"Universitaet Karlsruhe", "Karlsruhe,
Deutschland","http://www.ira.uka.de/I32/cgi-bin/trace.cgi?", true, 100,
50, "129.13.10.100"
(Site, Location, URL, (explicitly says sizes), packet size (bytes),
maxHops, sourceIP)
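In code, a record like that might be carried as something like this (a
sketch; the field names are my own invention, only the values are real):

    # Sketch: one traceroute-server record.  Field names are invented;
    # the values are the Karlsruhe entry above.
    from typing import NamedTuple

    class TraceServer(NamedTuple):
        site: str         # human-readable name
        location: str
        url: str          # HTTP-GET CGI prefix; the target gets appended
        says_sizes: bool  # page explicitly states its packet sizes
        packet_size: int  # probe size in bytes (assume 40 where unstated)
        max_hops: int
        source_ip: str

    KARLSRUHE = TraceServer(
        "Universitaet Karlsruhe", "Karlsruhe, Deutschland",
        "http://www.ira.uka.de/I32/cgi-bin/trace.cgi?",
        True, 100, 50, "129.13.10.100")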
That said, you're right; it should be included. :>
> o When the word "server" is used (lotsa places) you may wish to
> distinguish whether you mean "traceroute server", "nexus", or "web
> server". It's a little unclear sometimes (eg: for the definition of "Q").
Fixed. All such occurrences have been changed to "source", or to "web
server" where appropriate. Still not sure it reads right, though.
> o A nit: in 2.2.2, it seems that the stride algorithm would want to pick
> the nodes whose key values are *less* than or equal the current time.
Ha. Fixed.
> o Another nit: in 3.2, is that m = 60 *seconds*, i assume?
Indeed. Fixed.
> o In 3.2, we may want to use something like delta_min = 10min and
> delta_max = 20min to provide a guarantee to our poor traceroute servers
> that we won't ever hit them twice in the same 10min period. If we don't
> care about this, we may want to use exponential inter-measurement times,
> so we can enjoy the PASTA.
Tom and I were discussing the various distributions one might want to
use in the measurements, and pure randomization seemed more appropriate
than an exponential distribution at the time. Can you give a reason why
an exponential distribution would provide "better" data than the
uniform?
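For concreteness, the two schedules under discussion would look
something like this (a sketch; the bounds and mean are placeholders, not
decided values):

    # Sketch: the two candidate inter-measurement gaps, in seconds.
    # Uniform gives bounded gaps with mean (a + b)/2; exponential gives
    # unbounded gaps with mean 1/rate, making the measurement instants a
    # Poisson process (the precondition for PASTA).
    import random

    def uniform_gap(a=600.0, b=1200.0):    # e.g. 10-20 minutes
        return random.uniform(a, b)

    def exponential_gap(mean=900.0):       # e.g. 15-minute mean
        return random.expovariate(1.0 / mean)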
As for 10-minute restrictions, I look at it this way: if I were looking
at my server logs, which would piss me off more? Three traces, ping ping
ping, then nothing for 45 minutes, or one every 15 minutes? I honestly
don't know. Keeping a zero min bound still sounds like a good idea,
although quantifying that feeling in a logical manner isn't easy;
"getting measurements near one another" is the closest I can come up
with.
Though checking the math: the expected value of a uniform random
variable on $[a, b]$ is $(a + b)/2$, not $(b - a)/2$, so min = 10,
max = 20 already gives the 15-minute mean.
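Writing it out, to keep us honest:
$E[X] = \int_a^b \frac{x}{b-a}\,dx = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}$,
so $E = 15$ for $[10, 20]$ and $E = 25$ for $[10, 40]$.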
> o In 3.2, where did l=60 (max # simultaneous traceroutes=60) come from? It
> sounds reasonable, but i was just wondering...
I made it up. We actually hadn't decided on any bounds for that. $m$ was
similarly derived. Which reminds me, does anyone know what happens when I
terminate the connection to a CGI script? Does execution cease?
> o Along with the max number of traceroutes in progress (l=60), we might
> want to state bounds on a peak expected rate of traceroutes. Something
> like:
> max <= 60 traceroutes/2 secs = 30 traceroutes per second.
> max <= T servers / 15 min <= 150/15min = 10 traceroutes/min
Can you provide more detailed reasoning as to why this would be a good
thing? I can imagine someone asking me what these values were, but I
can't imagine why they would.
> o A bigger point: do we want to design in a bound on how often a site can
> be tracerouted *to*? For instance, without this protection, we may end up
> tracerouting to www.cnn.com 60 times within a minute interval.
This is a good point. And actually, we could traceroute to cnn.com 60
times in a much smaller interval; imagine all traces started at such a
time as to cause their final 3 packets to all hit the site at ~once.
Some options I see at this point are:
1. Provide a min bound (only) on the delta between accesses to a target.
2. Provide a min/max, just like in the source case, and generate another
   random variable. Whenever a source selects a target, it looks to the
   list of those targets whose timers have expired, among the targets it
   has not yet hit in the current loop iteration, and picks one at random
   from that list. This would seem to synchronize the sources, though.
3. Do the min or min/max scheme, and just pass on the source if it
   doesn't have any ready targets: randomize the target vector, then do a
   linear search for an acceptable target (see the sketch below).
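Here's roughly what that last option looks like; everything here is a
sketch, and the names are mine, not anything we've decided on:

    # Sketch of option 3: per-target min-delta timers; a source passes
    # on its turn if no target is ready.
    import random, time

    def pick_target(targets, last_hit, min_delta, already_hit):
        now = time.time()
        order = list(targets)           # randomize the target vector...
        random.shuffle(order)
        for t in order:                 # ...then linear-search it
            if t in already_hit:        # skip ones hit this loop iteration
                continue
            if now - last_hit.get(t, 0.0) >= min_delta:
                last_hit[t] = now
                already_hit.add(t)
                return t
        return None                     # no ready target; pass on this source

The min/max variant would just draw each target's next-ready time from
[min, max] instead of using a fixed min_delta.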
> o Can you get that berkeley URL to work?
> http://www.net.berkeley.edu/cgi-bin/traceroute?www.cs.washington.edu
> I don't see it on the list.
So added. Turns out our department has a traceroute script too, but it's
not reachable via HTTP GET, and they do a bunch of pretty-printing on
the output, so it can't be used.
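To pin down what I mean by HTTP GET: the usable sources are the ones
where a single GET, with the target appended to the URL prefix, returns
the raw trace. A sketch (the function name is mine):

    # Sketch: drive an HTTP-GET traceroute CGI with a single request, e.g.
    # http://www.net.berkeley.edu/cgi-bin/traceroute?www.cs.washington.edu
    from urllib.request import urlopen

    def run_trace(url_prefix, target):
        with urlopen(url_prefix + target) as resp:
            return resp.read().decode("latin-1", errors="replace")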
> o One big remaining question is where to traceroute *to*. Here are some
> thoughts: 75% of flows are HTTP, so we could concentrate on web traffic.
Actually, I'll wager that although that figure may be accurate, the
remaining 25% is largely made up of people d/ling large files listed at
a web server (i.e., new distributions of Communicator, Opera, HotJava,
Lynx, etc.). So the web-server metric is good, IMO.
> It seems like we'd like to get a notion of paths from clients to servers,
> and vice versa. So we could:
>
> To get servers:
> o traceroute to the (say) 100 most popular web sites, as ranked by
> excite or yahoo or whoever does that sort of thing
http://www.zdnet.com/pcmag/special/web100/
Whew. Thanks!
_____________________________________________________________________________
"The human mind is a 400,000-year-old legacy application...and you expected
to find structred programming?" -- Randall Davis, 1996 AAAI Pres. Address