from david

Tom Anderson (tom@emigrant)
Wed, 8 Jul 1998 13:16:01 -0700 (PDT)

Received: from june.cs.washington.edu (june.cs.washington.edu [128.95.2.4]) by emigrant.cs.washington.edu (8.8.5+CS/7.2ws+) with ESMTP id JAA01503 for <tom@emigrant.cs.washington.edu>; Tue, 7 Jul 1998 09:56:56 -0700 (PDT)
Received: from salsa.lcs.mit.edu (salsa.lcs.mit.edu [18.31.0.37]) by june.cs.washington.edu (8.8.7+CS/7.2ju) with ESMTP id JAA13360 for <tom@cs.washington.edu>; Tue, 7 Jul 1998 09:56:55 -0700
Received: from juniper.lcs.mit.edu (juniper.lcs.mit.edu [18.31.0.36])
by salsa.lcs.mit.edu (8.8.5/8.8.5) with SMTP id MAA32051
for <tom@cs.washington.edu>; Tue, 7 Jul 1998 12:55:30 -0400
Received: by juniper.lcs.mit.edu (SMI-8.6/SMI-SVR4)
id MAA17477; Tue, 7 Jul 1998 12:55:13 -0400
Date: Tue, 7 Jul 1998 12:55:13 -0400
Message-Id: <199807071655.MAA17477@juniper.lcs.mit.edu>
From: "David J. Wetherall" <djw@juniper.lcs.mit.edu>
To: tom@cs.washington.edu
In-reply-to: <199807062302.QAA32500@emigrant.cs.washington.edu>
(tom@cs.washington.edu)
Subject: Re: xbone stuff
Reply-to: djw@lcs.mit.edu
Status: R

hi tom,

a little more detour routing feedback that i'd been meaning to send.

1. minor nit -- were the "long term averages" mean or median based?
it might be important to use medians. if congestion causes O(rtt)
latency noise at different points, then congestion values could be
outliers. paths with different short-term congestion-likelihoods
(which you'll find if you're looking) might bias the mean in different
ways, even if there is no long-term congestion variation.
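
a minimal sketch of the mean vs. median point, in python. the rtt samples
and the names path_a, path_b and summarize are made up for illustration;
they are not from the detour measurements. both paths have the same
baseline latency, but path_b hits short-term congestion more often:

```python
import statistics

def summarize(rtts_ms):
    """Return (mean, median) of a list of rtt samples in milliseconds."""
    return statistics.mean(rtts_ms), statistics.median(rtts_ms)

# hypothetical samples: same ~50 ms baseline, different numbers of
# congestion outliers (one on path_a, three on path_b)
path_a = [50, 51, 49, 50, 52, 50, 250, 51, 49, 50]
path_b = [50, 51, 49, 50, 52, 50, 250, 51, 240, 260]

mean_a, median_a = summarize(path_a)
mean_b, median_b = summarize(path_b)

print("path_a: mean %.1f ms, median %.1f ms" % (mean_a, median_a))
print("path_b: mean %.1f ms, median %.1f ms" % (mean_b, median_b))
```

with these numbers the means differ by about 40 ms while the medians
differ by 1 ms, so a mean-based comparison would rank path_a well ahead
even though the underlying paths are nearly identical.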

2. i think the right way to think about the "faster dogleg routes" is
not necessarily that internet routing is broken, but rather that the
existence of ASes might be costing us a lot. i frame it this way because
ASes are fundamental in such a large distributed system -- they're not a
temporary measure that will go away. given this perspective,
some logical questions are:

-how much more would you expect to pay because of the existence of
ASes? i don't know the answer, but presumably there's some difference.
simulations with random graphs, or an examination of whatever you can
recover of the real topology from the traceroute data might be
useful. [i think the merging of traceroute data would be a cool
project -- one that's useful because there's no other way to recover
topology data, and not trivial either, given that routing changes will
make the process ambiguous/inconsistent]

-if you observed worse than expected, are the routing metrics screwed?
perhaps pathchar can fill in the picture somewhat so that you can make
a reasonable first cut at a decent metric and see if that produces
what you observe.

-is there a better incentive structure than "early exit" that would
help to align local/global optima? fishing, but this would be way cool ...

3. given this characterization of ASes, and that they're organized as
well as can be expected, you're back to detour as an overlay -- as
minshall said, a superior vpn solution. this seems to be a great
context to explore how much better you can make it by using higher
layer information, because you're not forced to sell the "right"
solution for the entire Internet. for me, this addresses the
difficulty i was having matching the characterization phase of the
project to the solution phase.
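
for concreteness, a rough python sketch of what finding a "faster dogleg
route" could look like, assuming you already have a table of long-term
latencies between host pairs. the function name faster_doglegs, the host
names and the latency values are all hypothetical, and it only considers
a single intermediate host:

```python
def faster_doglegs(latency):
    """Find pairs where a one-hop detour beats the direct path.

    latency maps (src, dst) -> long-term latency in ms.
    Returns a list of (src, dst, direct_ms, via, detour_ms) tuples,
    keeping only the best intermediate host for each pair.
    """
    hosts = sorted({h for pair in latency for h in pair})
    results = []
    for (src, dst), direct in latency.items():
        best = None
        for via in hosts:
            if via in (src, dst):
                continue
            leg1 = latency.get((src, via))
            leg2 = latency.get((via, dst))
            if leg1 is None or leg2 is None:
                continue  # no measurement for one of the legs
            total = leg1 + leg2
            if total < direct and (best is None or total < best[1]):
                best = (via, total)
        if best is not None:
            results.append((src, dst, direct, best[0], best[1]))
    return results

# hypothetical measurements: a->b is slow directly, faster via c
latency = {
    ("a", "b"): 100,
    ("a", "c"): 30,
    ("c", "b"): 40,
}
doglegs = faster_doglegs(latency)
print(doglegs)
```

feeding it medians rather than means would keep congestion outliers from
deciding which doglegs look faster.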

hope this is useful.

cheers,

djw