RE: comments

Amin Vahdat (vahdat@cs.washington.edu)
Fri, 7 Aug 1998 12:19:36 -0700

I think the proposal looks really good. A few comments:

1. It seems to me that people would be much more willing to
participate if you could point out that the hardware/software testbed
has already been deployed at 3-5 sites. "We have it standardized and
deployed and we have been able to run some interesting measurements."

2. The proposal is currently vague on exactly what software would be
run on the cluster and how it would be administered. I think that
people would want the next level of detail before considering
participation. Again, some pointer to proof that it *does* work
would be very important here.

3. It was never clear who was going to pay for the hardware. Is
each site expected to come up with 150K?

4. I would be more careful in the paragraph about "poor man's
supercomputer." A number of high-visibility projects (Globus,
Legion, Netsolve, PVM, Condor) are trying to do exactly this, and
some of them appear on the list of people you are targeting. I think
that it would be better to say something like "other projects are
looking at issues of parallel programs running across the wide area.
The goal of this deployment is to explore a different set of
issues."

Amin

On Friday, August 07, 1998 10:43 AM, Tom Anderson [SMTP:tom@emigrant]
wrote:
> My plan is to get comments from you, then send it out a bit more
> widely to get comments from Ed, Corbato, Wetherall, McCanne, etc.,
> and then to recruit participants (see below), and then to approach
> the industrial partners.
>
> tom
> ------
> The xBone: Combining High Performance Communication and
> Computation for Wide Area Distributed Systems and Networking
> Research
>
> 1. Introduction
>
> We propose to develop an international infrastructure for novel
> research in wide area distributed systems and applications. Today,
> wide area systems research is limited by the lack of a
> general-purpose testbed for conducting long-term experiments under
> real traffic load; without such a testbed, wide area systems research
> is limited to paper design studies and small-scale demonstrations,
> without the reality check of real users. The MBone (the multicast
> backbone) has shown the value of a wide area testbed for attracting
> users to new applications (such as Internet-based teleconferencing),
> as well as for exposing the new problems that appear when systems are
> deployed on a wide scale. A huge amount of research has been
> motivated and enabled by the MBone, leading directly to new products
> in industry. A "paper MBone" would simply not have been effective.
> The MBone, however, was painstakingly put together piece by piece; it
> is hard to imagine how the research community could do something of
> the same scale more than once or twice, despite the many applications
> (new telecollaboration tools, new approaches to web caching, new name
> services, and new security architectures, to name a few) that could
> benefit from widespread deployment. By analogy with the MBone, we
> call the proposed testbed the xBone, for the "anything-backbone".
>
> Our proposal is for a university, industrial, and government
> partnership to place a rack of roughly twenty PC's at 50-100
> geographically separate sites, with dedicated, reliable gigabit
> bandwidth between sites and to large numbers of early adopters. A
> single PC per site might be enough to support a single experiment
> such as the MBone, but we would like to use the infrastructure to
> support many simultaneous experiments, each running continuously to
> attract real users. Proposals to use the framework would be chosen by
> a program committee based on the potential for impact, the benefit of
> demonstrating a system under long-term use, and the ability to
> attract users. We explicitly do not envision this infrastructure
> being used as a poor-man's parallel computer; geographically
> distributed computation is required to optimize latency, bandwidth,
> and availability to dispersed end clients, but geographic
> distribution is a hindrance to effective parallel computing.
>
> We are seeking contributions from emerging networking bandwidth
> providers (such as Abilene, SuperNet, and HSCC), networking switch
> providers (e.g., to light a lambda provided by Abilene), PC and
> workstation vendors (such as Intel and Sun), government (e.g., to
> fund glue hardware and software), and university operations (for
> connectivity to the local gigaPOP, as well as hardware installation
> and maintenance). On the software side, our plan is for the system to
> be completely remotely operated and self-managed, with secure remote
> loading of software, remote debugging and rebooting, and
> self-configuration. Most of the software needed exists today in
> various forms, although it has not been put together for this
> purpose. For example, DARPA's ongoing Active Networks program is
> developing software to share geographically distributed resources
> among multiple applications; we expect to use their work in the long
> term as it becomes robust, although in the short term, to get
> started, we plan to provide each experiment with dedicated resources
> (e.g., a PC at every site).
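>
> As a concrete (purely illustrative) sketch of the secure loading path,
> a control PC might verify a downloaded experiment image against a
> signature delivered out of band before staging it for installation;
> the package format, key handling, and function names below are
> assumptions for the example, not part of any existing software.
>
>     # Illustrative sketch only: verify and stage an experiment image on
>     # a control PC. Package format and key handling are assumptions.
>     import hmac, hashlib, tarfile
>
>     def verify_and_stage(package_path, sig_hex, shared_key, staging_dir):
>         """Accept a package only if its HMAC-SHA256 matches the signature
>         delivered out of band; then unpack it into a staging area."""
>         with open(package_path, "rb") as f:
>             digest = hmac.new(shared_key, f.read(), hashlib.sha256).hexdigest()
>         if not hmac.compare_digest(digest, sig_hex):
>             raise ValueError("signature mismatch; refusing to install")
>         with tarfile.open(package_path) as tar:
>             # The control PC would later push the staged image to the
>             # target experimental PC and trigger a reboot.
>             tar.extractall(staging_dir)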
>
> We should point out that the infrastructure we are proposing is
> emerging as the standard platform of choice for web services. Heavily
> used web services such as AltaVista and Netscape have long had
> several geographically distributed replicas, to improve latency,
> bandwidth, and availability to end-users; Amazon.com now has a site
> on each coast. However, the tools used to operate these distributed
> web sites are extremely primitive. If the computer science research
> community is to figure out solutions for what web services will need
> in the future, we will need an experimental testbed for validating
> those solutions.
>
> 2. Applications
>
> Our goal is to enable novel wide-area distributed systems and
> networking research. There is a huge pent-up demand for a deployment
> mechanism for new research ideas; we enumerate a few here. The scope
> of this list suggests that even twenty PC's per site would be quickly
> utilized.
>
> a. Telecollaboration. The MBone is a huge success story for
> widespread deployment, but it is suffering from its own popularity.
> Its protocols are based on manual coordination, and although there
> are several proposals for self-managing multicast networks (e.g., for
> building the multicast tree, for address allocation, for reliable
> multicast transmission), it is unclear how to deploy those new ideas
> given the widespread use of the MBone today. Either we must restrict
> ourselves to backward-compatible changes (a serious limitation in a
> fast-moving research field), conduct a "flag day" where everyone
> agrees to upgrade at the same time, or, with the xBone, provide a
> framework for incrementally deploying an alternate "MBone2" structure
> that operates in parallel to the original MBone while the latter is
> phased out. Similarly, various proposals for telecollaboration tools
> require computation inside the network to be effective, for example,
> to mix audio streams on the fly.
>
> b. Real time. Providing end-to-end real-time performance is an active
> research area, with a proposed Internet standard in RSVP and several
> other competing efforts at other universities. Without widespread
> deployment with real users, however, it is unclear whether RSVP or
> its competitors would work at a large scale, for example, to support
> Internet telephony in the face of resource limits and router
> failures. Similarly, Internet switch manufacturers are moving toward
> providing quality of service through prioritized classes of service;
> however, there has been no widespread prototyping effort to
> demonstrate that priority classes will be sufficient to provide
> reasonable performance for real-time traffic.
>
> c. Worldwide web caching. There are at least four major research
> efforts proposing novel approaches to managing a distributed
> collection of web caches (at UCLA, Washington, Wisconsin, and Texas);
> this is in addition to the existing hierarchical Harvest/Squid
> caches. In fact, the research efforts have been motivated by the
> surprising fact that the Squid caches do not work: with hit rates
> under 50%, going through the Squid cache hierarchy hurts end-client
> response time. Without being able to deploy and measure a distributed
> web cache, the research community would not have been able to
> determine the real problems that needed addressing; similarly, if we
> are unable to deploy and measure the proposed solutions, we will be
> unable to determine the next set of problems in operating a
> cooperating set of web caches. The xBone provides the opportunity for
> a bake-off between competing approaches; we could deploy two
> competing web caching services and allow users to vote with their
> feet.
>
> d. IETF standards. There is an existing mechanism, via the IETF, for
> new Internet standards to be proposed and adopted for use. The xBone
> would be complementary to that process, enabling proposed standards
> to be implemented, deployed, and tested under real use before and
> while they are adopted for the real Internet. For example, IPv6 and
> mobile IP standards have been tested on a small scale, but with the
> xBone, clients could begin to count on these services being
> continuously available.
>
> e. Internet measurement. A number of research efforts have begun to
> focus on measuring characteristics of the Internet, both to
> understand its behavior and to use as input into large-scale
> simulations of the Internet. Measurement efforts are ongoing at LBL,
> Pittsburgh, Cornell, ISI, Michigan, and Washington, among other
> places. Because there is no direct way to ask routers to report on
> their buffer occupancy, link latencies/bandwidths, utilization, or
> drop rate, measurements must be taken from multiple sites to be
> effective at capturing the state of the Internet. LBL and Pittsburgh
> have developed a new suite of software, for example, to be deployed
> at participating sites for use in a widespread Internet measurement
> study.
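>
> To give a flavor of the measurements such a deployment enables, each
> site could run a simple cross-site probe; the sketch below (hostnames,
> the probe port, and timing constants are illustrative assumptions)
> estimates round-trip time and loss rate to another xBone site by
> timing TCP connection attempts.
>
>     # Illustrative sketch only: estimate RTT and loss between two xBone
>     # sites by timing TCP connection setup to an assumed probe port.
>     import socket, time
>
>     def probe(host, port=7, count=20, timeout=2.0):
>         """Return (median connection RTT in ms, fraction of failed probes)."""
>         rtts, failures = [], 0
>         for _ in range(count):
>             start = time.time()
>             try:
>                 with socket.create_connection((host, port), timeout=timeout):
>                     rtts.append((time.time() - start) * 1000.0)
>             except OSError:
>                 failures += 1
>             time.sleep(0.5)   # pace the probes
>         rtts.sort()
>         median = rtts[len(rtts) // 2] if rtts else None
>         return median, failures / float(count)
>
>     # e.g. rtt_ms, loss = probe("xbone-node.example.edu")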
>
> f. Internet operation. The ongoing Internet measurement efforts have
> begun to illustrate that the Internet has substantial operational
> problems, including high drop rates (5-6%), persistent congestion,
> poor route selection, and route oscillations, just to name a few
> examples. Researchers at Berkeley, Washington, Harvard, and elsewhere
> have proposed new approaches to routing and congestion control to
> address these problems, but without a deployment strategy that would
> allow them to be tested against substantial numbers of users, there
> would be no way of validating these approaches to the degree that
> would be necessary to think about using them in the real Internet.
>
> g. Distillation and compression. As more of the Web becomes
> graphics-based, and as end-host displays become more heterogeneous
> (from PDA's to reality engines), there is an increasing need for
> application-specific compression to take place inside the network to
> optimize around bottleneck links. For example, it makes little sense
> to ship a full-screen picture across the Internet to a PDA; it also
> makes little sense to ask users to manually select among small and
> large versions of the same image. Various proposals exist to address
> this problem, but they are unlikely to have much of an impact without
> a framework for providing reliable service to real users. In the long
> run, one would hope compression and distillation would be supported
> by both servers and clients, but before there is widespread adoption,
> there is a need for translators embedded in the network to handle
> legacy systems.
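>
> As a rough illustration of the decision such an in-network translator
> makes, the sketch below picks a scale factor for an image from the
> client's reported screen width and an estimate of the bottleneck
> bandwidth; the thresholds are arbitrary illustrative values.
>
>     # Illustrative sketch only: choose how aggressively to distill an
>     # image before forwarding it toward a client. Thresholds are
>     # assumptions, not measured values.
>     def choose_scale(image_width_px, client_screen_px, bottleneck_kbps):
>         """Return a scale factor in (0, 1] for resizing an image."""
>         # Never send more pixels than the client can display.
>         scale = min(1.0, client_screen_px / float(image_width_px))
>         # On slow links, distill further; halving the width roughly
>         # quarters the bytes.
>         if bottleneck_kbps < 56:
>             scale = min(scale, 0.25)
>         elif bottleneck_kbps < 384:
>             scale = min(scale, 0.5)
>         return scale
>
>     # e.g. choose_scale(1024, 160, 28.8) -> 0.15625 (a PDA on a modem link)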
>
> h. Wide area distributed systems. A number of projects, such as
> Globe, Globus, Legion, WebOS, ProActive, and DARPA's Quorum, have
> recently been started to provide a software framework to support
> applications that can make effective use of remote computational and
> storage resources. These systems face a huge research agenda; to
> illustrate just one example, we have only a very limited
> understanding of how to provide cache and replica consistency across
> the wide area. To focus this work on the real problems of
> next-generation distributed applications, we need a strategy for how
> applications can be developed for these frameworks and then be tested
> in real use.
>
> i. Naming. Similarly, a number of proposals have recently been
> developed for replacing the Internet's Domain Name System (DNS).
> Although DNS is effective at mapping individual machine names to IP
> addresses, as services become replicated, there is an increasing need
> to carefully control the mapping from names to instances of a service
> (e.g., binding clients on the East Coast to the Amazon.com replica in
> Delaware vs. binding West Coast clients to the one in Seattle). Point
> solutions to this problem have started to be deployed, but without a
> generic framework the solutions are likely to be ad hoc (for example,
> Cisco's Distributed Director selects the closest replica based on hop
> count, ignoring link latency, bandwidth, server load, etc.).
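>
> A name service running on the xBone could instead fold measured
> conditions into the binding decision. The sketch below (replica names,
> the latency table, and the load metric are illustrative assumptions)
> picks a replica by a simple weighted combination of measured latency
> and reported server load rather than hop count.
>
>     # Illustrative sketch only: bind a client to a service replica using
>     # measured latency and reported load instead of hop count alone.
>     def pick_replica(replicas, latency_ms, load):
>         """replicas: list of names; latency_ms (ms) and load (0..1) are
>         dicts keyed by name. Returns the replica with the lowest cost."""
>         def cost(r):
>             # Weight latency directly and penalize loaded servers; the
>             # 100 ms per unit of load is an arbitrary illustrative constant.
>             return latency_ms[r] + 100.0 * load[r]
>         return min(replicas, key=cost)
>
>     # e.g. pick_replica(["seattle", "delaware"],
>     #                   {"seattle": 12.0, "delaware": 78.0},
>     #                   {"seattle": 0.9, "delaware": 0.1})
>     # -> "delaware" when the nearby replica is heavily loaded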
>
> j. Wide area security. There is an obvious need for a national
> infrastructure for secure, authenticated, accountable, and revocable
> access to remote resources; several proposals have been made for how
> to provide the needed framework, including wide area versions of
> Kerberos, MIT's SDSI, and Washington's CRISIS system. Nevertheless,
> it is unlikely that such a system will be deployed in the near
> future, because it would rely on physically secure computers spread
> around the country. The security issues for the xBone, the MBone, the
> Active Networks backbone, etc., are similar, and are unlikely to be
> solved without the ability to deploy a reliable, continuously
> available framework for authentication and key distribution.
>
> k. Distributed databases. An active area of research in the database
> community is how to integrate geographically distributed data sets,
> for example, to integrate various Web databases or NASA's EOSDIS into
> a usable system capable of supporting queries that span multiple
> sites. The Mariposa project at Berkeley, for example, has proposed an
> architecture for dynamically moving data and computation around to
> minimize network bandwidth and local computation cost.
>
> l. Active networks. Finally, DARPA's Active Networks program can be
> seen as providing an architecture for applications that can benefit
> from computing in the network. For example, what virtual machine do
> these applications use? How are resources allocated among
> applications that share a physical machine? Ideally, we could use the
> Active Network framework for operating the xBone; however, it is
> still in the process of being standardized. Currently, there are four
> separate proposals for the Active Network architecture, and the
> quickest way to resolve which is the most appropriate would be to
> provide support for each of them across a geographically distributed
> set of computers, and let application developers vote with their
> feet.
>
> 3. Local Hardware Installation
>
> As a research community, over the past few years we have gained lots
> of experience with assembling and operating machine-room-area
> clusters. Since our graduate students have wanted to avoid having to
> run to the machine room every time something goes wrong with our
> local clusters, we have also gained considerable experience with
> remote cluster operation. Our strategy is to simplify operations by
> (i) throwing hardware at the problem (e.g., using two extra PC's to
> serve as fail-safe monitors of cluster operations) and (ii) having a
> standard baseline hardware and software configuration, avoiding the
> management problems of trying to turn random collections of PC's
> running random collections of software into a usable system.
>
> At each site, we envision:
>
> 2 control PC's to serve as fail-safe reboot engines, monitoring the
> operation of the other PC's (a minimal sketch of this monitoring role
> appears after this list of components). The control PC's would
> control reboot serial lines (X.10) for all of the machines in the
> cluster (including each other); new experiments (including
> potentially new OS kernels) are downloaded over the Internet to the
> control PC and then installed on the relevant PC in the rack. The
> control PC's would also have the responsibility for passively
> monitoring the cluster's Internet connection for illegal use by the
> experimental PC's. As a fail-safe, the two control PC's will not be
> used to run experimental software. The control PC's would also have
> GPS receivers to provide time synchronization for the rest of the
> cluster.
>
> 20 PC's to serve as experimental apparatus. Each PC would be
> configured with a reasonable amount of memory and disk, a
> machine-room-area network connection, and a wide-area network
> connection.
>
> A high-speed machine-room-area network, such as Myrinet or fast
> switched Ethernet, connecting all of the PC's in the cluster. This
> network would be dedicated to the cluster and isolated from any other
> machines at the site.
>
> A high-speed Internet connection to the local gigaPOP (the local
> connection point to Abilene, HSCC, SuperNet, etc.). This connection
> would be something like Gigabit Ethernet that can be passively
> monitored by the control PC's.
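>
> To make the control PC's monitoring role concrete, the sketch below
> shows the kind of watchdog loop it might run: periodically ping each
> experimental PC and power-cycle any machine that stops responding.
> Hostnames, timing constants, and the power_cycle() helper (standing
> in for the X.10 controller interface) are illustrative assumptions.
>
>     # Illustrative sketch only: a control-PC watchdog. The power_cycle()
>     # helper stands in for whatever X.10/serial reboot interface the
>     # deployment actually uses; hostnames and timeouts are assumptions.
>     import subprocess, time
>
>     NODES = ["xbone-exp%02d" % i for i in range(1, 21)]   # 20 experimental PC's
>     PING_INTERVAL = 60      # seconds between sweeps
>     MAX_MISSES = 5          # power-cycle after this many consecutive misses
>
>     def alive(host):
>         """One ICMP echo; returns True if the host answered."""
>         return subprocess.call(["ping", "-c", "1", "-W", "2", host],
>                                stdout=subprocess.DEVNULL) == 0
>
>     def power_cycle(host):
>         """Placeholder for driving the reboot/power-control line."""
>         print("would power-cycle", host)
>
>     def watchdog():
>         misses = {h: 0 for h in NODES}
>         while True:
>             for h in NODES:
>                 misses[h] = 0 if alive(h) else misses[h] + 1
>                 if misses[h] >= MAX_MISSES:
>                     power_cycle(h)
>                     misses[h] = 0
>             time.sleep(PING_INTERVAL)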
>
> A local operator would be needed at each site to install the system,
> replace any failed hardware components, and reboot the control PC's
> in the event that both crash. Otherwise, the local operator would
> have no software responsibilities for the cluster.
>
> All told, the list price of the system described above would be
> roughly $150K (?) per site; the machine-room footprint is about 5
> square feet (2 racks).
>
> 4. List of potential participants:
>
> West Coast
> ----------
> Thomas Anderson (UW) tom@cs.washington.edu
> David Wetherall (UW)
> Deborah Estrin (USC) estrin@cs.usc.edu
> Lixia Zhang (UCLA)
> Darrell Long (UCSC)
> Steve McCanne (Berkeley) mccanne@cs.berkeley.edu
> David Culler (Berkeley)
> Sally Floyd (LBL)
> Joe Pasquale (UCSD)
> Mendel Rosenblum (Stanford)
> Mary Baker (Stanford)
> Bob Braden (ISI-West)
> SDSC
> Oregon?
>
> Mid-Country
> -----------
> John Hartman (Arizona) jhh@cs.arizona.edu
> John Carter (Utah) retrac@cs.utah.edu
> Mike Dahlin (Texas-Austin) dahlin@cs.utexas.edu
> Pei Cao (Wisconsin) cao@cs.wisc.edu
> Garth Gibson (CMU) garth.gibson@cs.cmu.edu
> Jon Turner (Washington U)
> Raj Jain (Ohio State)
> Farnam Jahanian (Michigan)
> Mathis (Pittsburgh Supercomputing)
> Dirk Grunwald (Colorado)
> Gary Minden (Kansas)
> Roy Campbell, Illinois
> Minnesota?
>
> East Coast
> ----------
> Jonathan Smith (Penn) jms@cs.upenn.edu
> Jeff Chase (Duke) chase@cs.duke.edu
> John Guttag (MIT) guttag@lcs.mit.edu
> Hari Balakrishnan (MIT)
> Larry Peterson (Princeton) llp@cs.princeton.edu
> B. R. Badrinath (Rutgers) badri@cs.rutgers.edu
> Jim Kurose (UMass)
> Margo Seltzer (Harvard)
> ISI-East
> Andrew Grimshaw (Virginia)
> S. Keshav (Cornell)
> Ian Foster (Argonne)
> Ellen Zegura (Georgia Tech), ewz@cc.gatech.edu
> David Kotz (Dartmouth)
> Yechiam Yemini (Columbia)
>
> International
> ------------
> Roger Needham (Cambridge)
> Gerry Neufeld (U British Columbia)
> Ken Sevcik (Toronto) kcg@cs.toronto.edu
> Andy Tanenbaum
> Marc Shapiro (INRIA)
>
> Industry
> --------
> Intel
> Fred Baker (Cisco)
> Peter Newman (Nokia)
> Jim Gray (Microsoft BARC)
> Microsoft Redmond
> Chuck Thacker (Microsoft Cambridge)
> Mike Schroeder, DEC SRC
> Jeff Mogul, DEC NSL
> Scott Shenker, Xerox PARC
> Greg Papadopoulos (Sun)
> Mike Schwartz (@Home)
> K.K. Ramakrishnan (AT&T)
> Eric Brewer (Inktomi)
> Srini Seshan (IBM?)
> Bellcore
> TIS