RE: comments

Stefan Savage (savage@cs.washington.edu)
Tue, 11 Aug 1998 15:50:30 -0700

Hi Tom,
It looks like a good first draft. At the same time, I think it
needs some more focus and refinement to be a compelling pitch. Comments
follow:

Big Picture
------------
The story presented here is a bit too diffuse. There's some advocacy
about the Mbone, some stuff about some applications we work on
(including ambivalence about parallel computing), some kow-towing to
active networks, and a bit of requirements description. What it lacks
though is a thread that ties it all together. As Eric suggested, I
think this proposal needs to be structured around a clear problem
statement about the difficulties in doing research on wide-area
distributed networks and services. I think the problems need to be well
defined (limited availability of remote computation resources, hard to
administer, hard to scale, etc.), loosely backed up (how long did it
take to get npd going? the MBone? etc...), and then we can go into the
synergies that come out of a single platform (distributed measurement
efforts, 6bone, web caching, etc...) and the coincident changes that
make this possible (e.g., GigaPOPs). I think the MBone is a fine example
that such things can be done and can be useful, but I don't think it
should be the principal argument for the xBone. Also, a statement of
explicit goals (again, a la Eric's suggestion) seems necessary. Finally,
the document is broken into motivation, applications and local hardware,
but there isn't really much of a roadmap laid out and there should be.

Details
--------
- Dedicated reliable bandwidth? This seems like overkill and changes
the proposal into one for a private network (something much more
expensive and less realistic). Perhaps there is a need for different
experiments to be isolated from one another, but if we're doing
experiments concerning the commodity Internet then they should be using
the same bandwidth as everyone else. Ideally, what we DO want is to
make them highly configurable with respect to which bandwidth they use.
We'd like a mechanism (either through multi-homing or local loose source
routing) so particular hosts can choose which of the bandwidth resources
available at the gigaPOP are used (i.e., route over Sprint, or MCI,
or....). This is a technical point though and belongs in the technical
section.
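
As a rough illustration of the multi-homing option, a host could simply
bind its outbound socket to a local address associated with the desired
provider (the provider names and addresses below are made up for the
sketch; a real deployment would get them from the gigaPOP, and assumes
the site routes per source address):

    import socket

    # hypothetical local addresses, one per upstream provider at the gigaPOP
    PROVIDER_SOURCE_ADDRS = {
        "sprint": "198.51.100.10",
        "mci": "198.51.100.20",
    }

    def connect_via(provider, dest_host, dest_port):
        """Open a TCP connection whose traffic leaves via the chosen provider."""
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Binding to the provider-specific source address steers the outbound
        # route, assuming the site's routing is configured per source address.
        s.bind((PROVIDER_SOURCE_ADDRS[provider], 0))
        s.connect((dest_host, dest_port))
        return s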

- Selection by program committee? I think this is a very dangerous
option. Part of the success of the Mbone came from its
"self-administering" nature. Anyone could join, and there was
self-policing about what got broadcast when... as a consequence it was
very easy to do experiments (you didn't need to wait 4 months for the
program committee to meet). If self-policing isn't good enough, then I
think you should just have computer-scheduled slots that anyone with a
password can get access to. If this becomes a problem you can become
more strict, but it's dangerous to start strict.
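
Something as simple as the following would be enough to start with
(obviously just a sketch, with an assumed shared password and an
in-memory slot table):

    import hashlib

    # first-come, first-served slot table guarded by a shared password (sketch only)
    SLOTS = {}  # (machine, hour) -> experiment name
    PASSWORD_HASH = hashlib.sha1(b"xbone").hexdigest()  # placeholder secret

    def reserve(machine, hour, experiment, password):
        """Reserve a slot if the password checks out and the slot is free."""
        if hashlib.sha1(password.encode()).hexdigest() != PASSWORD_HASH:
            raise PermissionError("bad password")
        if (machine, hour) in SLOTS:
            raise ValueError("slot already taken")
        SLOTS[(machine, hour)] = experiment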

- Contributions. I'm uncomfortable with the whole paragraph "We are
seeking contributions from...". First, we've got some terminology and
jargon problems. Abilene, SuperNet, and HSCC aren't "networking bandwidth
providers," nor are they even really agencies one can talk to... "light a
lambda" is going to make some eyes glaze over and makes us sound
unrealistic to those who DO understand (no one is going to give us a
whole wavelength for this stuff), and Project Abilene is going to be able
to light their fibers just fine without the contributions we secure.
Also, none of this ever gets backed up... we don't explain what we need
from whom... why the contributions are really necessary... what role
these organizations play in our plan (do we require their involvement?
are there existing relationships? etc...)

- GigaPOPS. The paper doesn't really push hard enough on the synergy
that potentially comes from the emergence of GigaPOPs via the nascent
Internet2, InternetNG, InternetFoobar efforts. GigaPOPs are going to
provide a place where large amounts of regional connectivity will be
centralized under quasi-academic control. Many applications of the xBone
don't really work if it's just a bunch of universities, each with some
PC's, because they've all got 10 Mbps links and we can't really force or
observe much traffic on them. The GigaPOP synergy lets this happen.

- Sacred cows. In a broad appeal like this, I'd be careful not to get
on either side of various sacred cow issues, such as RSVP vs DiffServ,
how well Squid is working, or global parallel computing. The various
people who care about these efforts are going to be necessary
collaborators.

- Applications.
In general, it might be better to concentrate on how these applications
would use the xBone instead of how they would have to work otherwise.

Also, perhaps to justify some of these we should have actual citations
(e.g., for telecollaboration: Active Services from Berkeley, the various
SRM work including MS BARC, etc.) In citing universities we should be
careful not to slight anyone, especially if we include ourselves (we say
something about congestion control from Berkeley, Washington, Harvard,
and elsewhere... but no one has heard of any congestion control stuff
from us, while ISI/USC, UCL, and Arizona are MUCH better known for this).
We do talk about ourselves and applications that concern US quite a bit.
An alternative idea might be to just talk about all of the work
abstractly.

The IETF standards paragraph is weak and sounds like a catch-all.

These paragraphs have different styles. Some are very abstract and
academic (e.g., "wide area distributed systems") while others are very,
very concrete (e.g., the DNS section). Some are rhetorical, others are
declarative.

- Requirements
Generally speaking I think Amin is right that this would be better if
there were a working prototype in place at one or two universities.
Barring that, I think we need to talk explicitly about administrative
interfaces and explicit requirements. We do talk a bit about rebooting,
but this is only one of the problems. I think we need to make it clear
that we've done our homework here, and talk about software loading,
security, rebooting, remote debugging, remote consoles, etc. A picture
would make this look better. I'd be happy to help out with requirements
here since I have some ideas about what I think one would want (I'm into
software neutrality).

Also, I think we need to talk about storage (for things like collecting
traces) and at least pay lip service to issues of privacy.

- List of potential participants (i.e., the list of people we know). What is
the value in including this explicitly? It reads a bit like hubris
until we've talked to some of them. If there is another reason then we
should explain it.

- xBone
Given the similarity with Touch's effort we really need a different
name. We could do rBone for Research, or maybe skip "bone" altogether.
xNet?

-Stefan
-----Original Message-----
From: tom@emigrant [mailto:tom@emigrant]
Sent: Friday, August 07, 1998 10:43 AM
To: syn@cs
Subject: comments

My plan is to get comments from you, then send it out a bit
more widely to get comments from Ed, Corbato, Wetherall, McCanne, etc.,
and then to recruit participants (see below), and then to approach
the industrial partners.

tom
------
The xBone: Combining High Performance Communication and
Computation for Wide Area Distributed Systems and Networking Research

1. Introduction

We propose to develop an international infrastructure for novel research in
wide area distributed systems and applications. Today, wide area systems
research is limited by the lack of a general-purpose testbed for conducting
long-term experiments under real traffic load; without such a testbed, wide
area systems research is limited to paper design studies and small-scale
demonstrations, without the reality check of real users. The MBone (the
multicast backbone) has shown the value of a wide area testbed for
attracting users to new applications (such as Internet-based
teleconferencing), as well as the new problems that appear when systems
are deployed on the wide scale. A huge amount of research has been
motivated and enabled by the MBone, leading directly to new products in
industry. A "paper MBone" would simply not have been effective. The
MBone, however, was painstakingly put together piece by piece; it is hard
to imagine how the research community could do something of the same
scale more than once or twice, despite the many applications, such as new
telecollaboration tools, new approaches to web caching, new name services,
and new security architectures, to name a few, that could benefit from
widespread deployment. By analogy with the MBone, we call the proposed
testbed the xBone, for the "anything-backbone".

Our proposal is for a university, industrial, and government partnership to
place a rack of roughly twenty PC's at 50-100 geographically separate sites,
with dedicated, reliable gigabit bandwidth between sites and to large
numbers of early adopters. A single PC per site might be enough to support
a single experiment such as the MBone, but we would like to use the
infrastructure to support many simultaneous experiments, each running
continuously to attract real users. Proposals to use the framework would be
chosen by a program committee based on the potential for impact, the
benefit of demonstrating a system under long-term use, and the ability to
attract users. We explicitly do not envision this infrastructure being used as
a poor-man's parallel computer; geographically distributed computation is
required to optimize latency, bandwidth, and availability to dispersed end
clients, but geographic distribution is a hindrance to effective parallel
computing.

We are seeking contributions from emerging networking bandwidth
providers (such as Abilene, SuperNet, and HSCC), networking switch
providers (e.g., to light a lambda provided by Abilene), PC and workstation
vendors (such as Intel and Sun), government (e.g., to fund glue hardware
and software), and university operations (for connectivity to the local
gigaPOP, as well as hardware installation and maintenance). On the
software side, our plan is for the system to be completely remotely operated
and self-managed, with secure remote loading of software, remote
debugging and rebooting, and self-configuration. Most of the software
needed exists today in various forms, although it has not been put together
for this purpose. For example, DARPA's ongoing Active Networks
program is developing software to share geographically distributed
resources among multiple applications; we expect to use their work in the
long term as it becomes robust, although in the short term, to get started, we
plan to provide each experiment dedicated resources (e.g., a PC at every
site).
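
As a sketch of what "secure remote loading" could mean in practice, the
control PC at each site might fetch an experiment image and verify its
digest against a value distributed out of band before installing it on a
rack PC (the URL, digest, and destination path here are placeholders):

    import hashlib
    import urllib.request

    def fetch_and_verify(image_url, expected_sha256, dest_path):
        """Download an experiment image and install it only if the digest matches."""
        data = urllib.request.urlopen(image_url).read()
        digest = hashlib.sha256(data).hexdigest()
        if digest != expected_sha256:
            raise ValueError("digest mismatch; refusing to install")
        with open(dest_path, "wb") as f:
            f.write(data)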

We should point out that the infrastructure we are proposing is emerging as
the standard platform of choice for web services. Heavily used web
services such as AltaVista and Netscape have long had several
geographically distributed replicas, to improve latency, bandwidth, and
availability to end-users; Amazon.com now has a site on each coast.
However, the tools used to operate these distributed web sites are extremely
primitive. If the computer science research community is to figure out
solutions for what web services will need in the future, we will need an
experimental testbed for validating those solutions.

2. Applications

Our goal is to enable novel wide-area distributed systems and networking
research. There is a huge pent-up demand for a deployment mechanism for
new research ideas; we enumerate a few here. The scope of this list
suggests that even twenty PC's per site would be quickly utilized.

a. Telecollaboration. The MBone is a huge success story for widespread
deployment, but it is suffering from its own popularity. Its protocols are
based on manual coordination, and although there are several proposals for
self-managing multicast networks (e.g., for building the multicast tree, for
address allocation, for reliable multicast transmission), it is unclear how to
deploy those new ideas given the widespread use of the MBone today.
Either we must restrict ourselves to backward-compatible changes (a
serious limitation in a fast-moving research field), conduct a "flag day"
where everyone agrees to upgrade at the same time, or, with the xBone,
provide a framework for incrementally deploying an alternative "MBone2"
that operates in parallel with the original MBone while it is phased out.
Similarly, various proposals for telecollaboration tools require computation
inside of the network to be effective, for example, to mix audio streams on
the fly.

b. Real time. Providing end-to-end real time performance is an active
research area, with a proposed Internet standard in RSVP and several other
competing efforts at other universities. Without widespread deployment
with real users, however, it is unclear whether RSVP or its competitors
would work at a large scale, for example, to support Internet telephony in
the face of resource limits and router failures. Similarly, Internet switch
manufacturers are moving towards providing quality of service by
providing prioritized classes of service; however, there has been no
widespread prototyping effort to demonstrate that priority classes will be
sufficient to provide reasonable performance for real-time traffic.

c. Worldwide web caching. There are at least four major research efforts
proposing novel approaches to managing a distributed collection of web
caches (at UCLA, Washington, Wisconsin, and Texas); this is in addition to
the existing hierarchical Harvest/Squid caches. In fact, the research efforts
have been motivated by the surprising fact that the Squid caches do not
work -- with hit rates under 50%, going through the Squid cache hierarchy
hurts end client response time. Without being able to deploy and measure a
distributed web cache, the research community would not have been able to
determine the real problems that needed addressing; similarly, if we are
unable to deploy and measure the proposed solutions, we will be unable to
determine the next set of problems in operating a cooperating set of web
caches. The xBone provides the opportunity for a bake-off between
competing approaches; we could deploy two competing web caching
services and allow users to vote with their feet.
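
A back-of-the-envelope calculation shows why a sub-50% hit rate can make
the hierarchy a net loss (the timings below are purely illustrative, not
measured values):

    hit_rate = 0.40          # "under 50%", per the Squid experience above
    t_hit = 0.05             # seconds to fetch an object from the cache hierarchy
    t_miss_overhead = 0.20   # extra seconds spent traversing the hierarchy on a miss
    t_origin = 0.30          # seconds to fetch directly from the origin server

    with_hierarchy = hit_rate * t_hit + (1 - hit_rate) * (t_miss_overhead + t_origin)
    without_hierarchy = t_origin
    print(with_hierarchy, without_hierarchy)   # 0.32 vs 0.30: the hierarchy hurts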

d. IETF standards. There is an existing mechanism, via the IETF, for new
Internet standards to be proposed and adopted for use. The xBone would
be complementary to that process, enabling proposed standards to be
implemented, deployed, and tested under real use before and during their
adoption for the real Internet. For example, IPv6 and mobile IP standards
have been tested on a small scale, but with the xBone, clients could begin to
count on these services being continuously available.

e. Internet measurement. A number of research efforts have begun to focus
on measuring characteristics of the Internet, both to understand its behavior
and to use as input into large scale simulations of the Internet. Measurement
efforts are ongoing at LBL, Pittsburgh, Cornell, ISI, Michigan, and
Washington, among other places. Because there is no direct way to ask
routers to report on their buffer occupancy, link latencies/bandwidths,
utilization, or drop rate, measurements must be taken from multiple sites to
be effective at capturing the state of the Internet. LBL and Pittsburgh have
developed a new suite of software, for example, to be deployed at
participating sites for use in a widespread Internet measurement study.
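
Each participating site could run something as simple as the following
end-to-end probe, since routers will not report this state directly (the
target list is a placeholder, and TCP connect time is used as a stand-in
for a proper measurement tool):

    import socket
    import time

    def probe(host, port=80, timeout=2.0):
        """Return the TCP connect time to host:port, or None if the probe fails."""
        start = time.time()
        try:
            socket.create_connection((host, port), timeout).close()
            return time.time() - start
        except OSError:
            return None

    targets = ["www.example.com"]  # placeholder list of measurement targets
    print([(t, probe(t)) for t in targets])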

f. Internet operation. The ongoing Internet measurement efforts have begun
to illustrate that the Internet has substantial operational problems, including
high drop rates (5-6%), persistent congestion, poor route selection, and
route oscillations, just to name a few examples. Researchers at Berkeley,
Washington, Harvard, and elsewhere have proposed new approaches to
routing and congestion control to address these problems, but without a
deployment strategy that would allow them to be tested against substantial
numbers of users, there would be no way of validating these approaches to
the degree that would be necessary to think about using them in the real
Internet.

g. Distillation and compression. As more of the Web becomes graphics-
based, and as end-host displays become more heterogeneous (from PDA's
to reality engines), there is an increasing need for application-specific
compression to take place inside of the network to optimize around
bottleneck links. For example, it makes little sense to ship a full screen
picture across the Internet to a PDA; it also makes little sense to ask users to
manually select among small and large versions of the same image. Various
proposals exist to address this problem, but they are unlikely to have much
of an impact without a framework for providing reliable service to real
users. In the long run, one would hope compression and distillation would
be supported by both servers and clients, but before there is widespread
adoption, there is a need for translators embedded in the network to handle
legacy systems.
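
The distillation step itself is straightforward; the hard part is deploying
it reliably inside the network. A minimal sketch, assuming the Pillow
imaging library is available on the translator node:

    from PIL import Image  # assumes the Pillow imaging library

    def distill_for_pda(src_path, dst_path, max_size=(160, 120), quality=40):
        """Shrink and recompress an image so it fits a small, low-bandwidth display."""
        img = Image.open(src_path)
        img.thumbnail(max_size)   # shrink in place, preserving aspect ratio
        img.convert("RGB").save(dst_path, "JPEG", quality=quality)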

h. Wide area distributed systems. A number of projects, such as Globe,
Globus, Legion, WebOS, ProActive, and DARPA's Quorum, have recently
been started to provide a software framework to support applications that
can make effective use of remote computational and storage resources.
These systems face a huge research agenda; to illustrate just one example,
we have only a very limited understanding of how to provide cache and
replica consistency across the wide area. To focus this work on the real
problems of next-generation distributed applications, we need a strategy for
how applications can be developed for these frameworks and then be tested
in real use.

i. Naming. Similarly, a number of proposals have recently been developed
for replacing the Internet's Domain Name System (DNS). Although DNS
is effective at mapping individual machine names to IP addresses, as
services become replicated, there is an increasing need to carefully control
the mapping from names to instances of a service (e.g., binding clients on
the East Coast to the Amazon.com replica in Delaware vs. binding West
Coast clients to the one in Seattle). Point solutions to this problem have
started to be deployed, but without a generic framework the solutions are
likely to be ad hoc (for example, Cisco's Distributed Director selects the
closest replica based on hop count, ignoring link latency, bandwidth, server
load, etc.).
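
A generic framework could instead score candidate replicas on measured
latency, bandwidth, and server load; a toy version of such a selection
function (the weights, replica names, and numbers are invented for
illustration):

    def pick_replica(replicas):
        """Choose the replica with the lowest combined cost."""
        def cost(r):
            return r["latency_s"] + 1.0 / max(r["bandwidth_mbps"], 0.1) + r["load"]
        return min(replicas, key=cost)

    replicas = [
        {"name": "delaware", "latency_s": 0.03, "bandwidth_mbps": 45.0, "load": 0.7},
        {"name": "seattle",  "latency_s": 0.09, "bandwidth_mbps": 100.0, "load": 0.2},
    ]
    print(pick_replica(replicas)["name"])  # "seattle" wins despite the longer latency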

j. Wide area security. There is an obvious need for a national infrastructure
for secure, authenticated, accountable, and revocable access to remote
resources; several proposals have been made for how to provide the needed
framework, including wide area versions of Kerberos, MIT's SDSI, and
Washington's CRISIS system. Nevertheless, it is unlikely that such a
system will be deployed in the near future, because it would rely on
physically secure computers spread around the country. The security issues
for the xBone, the MBone, the Active Networks backbone, etc., are similar,
and are unlikely to be solved without the ability to deploy a reliable,
continuously available framework for authentication and key distribution.

k. Distributed databases. An active area of research in the database
community is how to integrate geographically distributed data sets, for
example, to integrate various Web databases or NASA's EOSDIS into a
usable system capable of supporting queries that span multiple sites. The
Mariposa project at Berkeley, for example, has proposed an architecture for
dynamically moving data and computation around to minimize network
bandwidth and local computation cost.

l. Active networks. Finally, DARPA's Active Networks program can be
seen as providing an architecture for applications that can benefit from
computing in the network. For example, what virtual machine do these
applications use? How are resources allocated among applications that
share a physical machine? Ideally, we could use the Active Network
framework for operating the xBone; however, it is still in the process of
being standardized. Currently, there are four separate proposals for the
Active Network architecture, and the quickest way to resolve which is the
most appropriate would be to provide support for each of them across a
geographically distributed set of computers, and let application developers
vote with their feet.

3. Local Hardware Installation

As a research community, over the past few years we have gained lots of
experience with assembling and operating machine-room-area clusters.
Since our graduate students have wanted to avoid having to run to the
machine room every time something goes wrong with our local clusters, we
have also gained considerable experience with remote cluster operation.

Our strategy is to simplify operations by (i) throwing hardware at the
problem (e.g., using two extra PCs to serve as fail-safe monitors on the
cluster operations) and (ii) having a standard baseline hardware and
software configuration, avoiding the management problems of trying to turn
random collections of PC's running random collections of software into a
usable system.

At each site, we envision:

2 control PC's to serve as fail-safe reboot engines, monitoring the operation
of the other PC's. The control PC's would control reboot serial lines
(X.10) for all of the machines in the cluster (including each other); new
experiments (including potentially new OS kernels) are downloaded over
the Internet to the control PC and then installed on the relevant PC in the
rack. The control PC's would also have the responsibility for passively
monitoring the cluster's Internet connection for illegal use by the
experimental PC's. As a fail-safe, the two PC's will not be used to run
experimental software (a sketch of this watchdog loop appears after the
list of components below). The control PC's would also have GPS's to
provide time synchronization for the rest of the cluster.

20 PC's to serve as experimental apparatus. Each PC would be configured
with a reasonable amount of memory and disk, a machine-room area
network connection, and a wide-area network connection.

A high-speed machine room area network, such as Myrinet or fast switched
Ethernet, connecting all of the PC's in the cluster. This network would be
dedicated to the cluster and isolated from any other machines at the site.

A high-speed Internet connection to the local gigaPOP (the local connection
point to Abilene, HSCC, SuperNet, etc.). This connection would be
something like Gigabit Ethernet that can be passively monitored by the
control PC's.
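
As a sketch of the control PC's watchdog loop (the host names, addresses,
and the X.10 reboot command are placeholders; the real reboot path would
go through whatever interface the serial-line controller exposes):

    import os
    import socket
    import time

    RACK_PCS = {"xbone-01": "10.0.0.1", "xbone-02": "10.0.0.2"}  # placeholder names/addresses

    def alive(addr, port=22, timeout=3.0):
        """Consider a rack PC alive if it still accepts TCP connections."""
        try:
            socket.create_connection((addr, port), timeout).close()
            return True
        except OSError:
            return False

    def power_cycle(name):
        # stand-in for the real X.10 serial-line reboot mechanism
        os.system("x10-reboot %s" % name)

    while True:
        for name, addr in RACK_PCS.items():
            if not alive(addr):
                power_cycle(name)
        time.sleep(60)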

A local operator would be needed at each site to install the system, to
replace any failed hardware components, and to reboot the control PC's in
the event that both crash. Otherwise, the local operator would have no
software responsibilities for the cluster.

All told, the list price of the system described above would be roughly
$150K (?) per site; the machine-room footprint is about 5 square feet (2
racks).

4. List of potential participants:

West Coast
----------
Thomas Anderson (UW) tom@cs.washington.edu
David Wetherall (UW)
Deborah Estrin (USC) estrin@cs.usc.edu
Lixia Zhang (UCLA)
Darrell Long (UCSC)
Steve McCanne (Berkeley) mccanne@cs.berkeley.edu
David Culler (Berkeley)
Sally Floyd (LBL)
Joe Pasquale (UCSD)
Mendel Rosenblum (Stanford)
Mary Baker (Stanford)
Bob Braden (ISI-West)
SDSC
Oregon?

Mid-Country
-----------
John Hartman (Arizona) jhh@cs.arizona.edu
John Carter (Utah) retrac@cs.utah.edu
Mike Dahlin (Texas-Austin) dahlin@cs.utexas.edu
Pei Cao (Wisconsin) cao@cs.wisc.edu
Garth Gibson (CMU) garth.gibson@cs.cmu.edu
Jon Turner (Washington U)
Raj Jain (Ohio State)
Farnam Jahanian (Michigan)
Mathis (Pittsburgh Supercomputing)
Dirk Grunwald (Colorado)
Gary Minden (Kansas)
Roy Campbell, Illinois
Minnesota?

East Coast
----------
Jonathan Smith (Penn) jms@cs.upenn.edu
Jeff Chase (Duke) chase@cs.duke.edu
John Guttag (MIT) guttag@lcs.mit.edu
Hari Balakrishnan (MIT)
Larry Peterson (Princeton) llp@cs.princeton.edu
B. R. Badrinath (Rutgers) badri@cs.rutgers.edu
Jim Kurose (UMass)
Margo Seltzer (Harvard)
ISI-East
Andrew Grimshaw (Virginia)
S. Keshav (Cornell)
Ian Foster (Argonne)
Ellen Zegura (Georgia Tech), ewz@cc.gatech.edu
David Kotz (Dartmouth)
Yechiam Yemini (Columbia)

International
------------
Roger Needham (Cambridge)
Gerry Neufeld (U British Columbia)
Ken Sevcik (Toronto) kcg@cs.toronto.edu
Andy Tanenbaum
Marc Shapiro (INRIA)

Industry
--------
Intel
Fred Baker (Cisco)
Peter Newman (Nokia)
Jim Gray (Microsoft BARC)
Microsoft Redmond
Chuck Thacker (Microsoft Cambridge)
Mike Schroeder, DEC SRC
Jeff Mogul, DEC NSL
Scott Shenker, Xerox PARC
Greg Papadopoulos (SUN)
Mike Schwartz (@Home)
K.K. Ramakrishnan (AT&T)
Eric Brewer (Inktomi)
Srini Seshan (IBM?)
Bellcore
TIS