Thank you very much for the analysis. Changing ~1000 sites does not
sound like a daunting task - the ARPANET had a similar coordinated,
synchronized switchover in its early days. Likewise, if changing
just 30 sites would impact 30% of the requests (and presumably 30%
of the overall traffic), then having an actual impact on the internet
is certainly possible. Thanks again for the data collection and
analysis.
Gun.
From cardwell@cs.washington.edu Mon Jun 22 18:58:11 1998
Date: Mon, 22 Jun 1998 18:58:25 -0700 (PDT)
From: Neal Cardwell <cardwell@cs.washington.edu>
To: Emin Gun Sirer <egs@june>
cc: syn@cs.washington.edu, karlin@cs, levy@cs, molly@cs, nitin@cs, tkl@cs,
wolman@cs
Subject: Re: show me the clients
In-Reply-To: <199806112212.PAA18095@june.cs.washington.edu>
Message-ID: <Pine.LNX.3.96.980622183300.31132I-100000@saba.cs.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Thu, 11 Jun 1998, Emin Gun Sirer wrote:
> I was wondering what percentage of the overall requests came from the
> top 30 clients? That is, are the clients coming predominantly from
> the top 30 domains, or are they distributed all across the internet
> and comprise the long tail that you mentioned?
>
> It would be really interesting if there were large clumps of users
> hooked into the internet through common points, e.g. large ISPs, since
> they would all benefit from an infrastructural upgrade at the common
> point (e.g. Kimera, a web cache, a protocol translator). It would be
> less interesting if aol, netcom, boeing etc. made up only a tiny fraction
> of the overall request stream. The number of client hookup points into
> the internet determines the overall inertia against structural changes,
Good question. It looks like the answer is that the clients are not
clumped, but rather very widely spread out, with maybe only 20-30% of your
requests coming from your top 30 client sites. AOL is typically 3-12% of
client load, with a steep drop-off from there. It also looks like the more
popular your site is, the more widely spread out your clients are, as you
would expect.
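For concreteness, here's a rough sketch of the kind of tally behind those
top-30 numbers (Python, mine, not the actual analysis scripts; the Common
Log Format input and the last-two-DNS-labels notion of a "client site" are
assumptions):

    from collections import Counter

    def top_n_share(log_path, n=30):
        """Return the fraction of requests coming from the top-n client sites."""
        sites = Counter()
        with open(log_path) as log:
            for line in log:
                if not line.strip():
                    continue
                host = line.split()[0]             # first CLF field: client host
                labels = host.lower().split(".")
                sites[".".join(labels[-2:])] += 1  # crude site key: last two labels
        total = sum(sites.values())
        top = sum(count for _, count in sites.most_common(n))
        return top / total

    # e.g. top_n_share("access.log", 30) comes out around 0.2-0.3 on these traces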
So the bad news is that it looks like you'd have to deploy services
(cache, Kimera, translator) at O(1,000) end sites to capture a large
fraction of your clients. I guess the good news is that it can't get too
much worse than that, given that the number of ASes and prefixes in the
internet routing tables is O(50,000) (I think).
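One way to see where the O(1,000) figure comes from: sort the sites by
request volume and walk the cumulative sum until you reach a target
coverage. A minimal sketch, under the same assumptions as above:

    def sites_needed(counts, target=0.90):
        """Number of busiest sites needed to cover `target` of all requests."""
        total = sum(counts)
        covered = 0.0
        for i, count in enumerate(sorted(counts, reverse=True), start=1):
            covered += count / total
            if covered >= target:
                return i
        return len(counts)

With a long tail like the one in the data below, the answer lands in the
thousands, not the tens.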
neal
----------
Here's the data I was looking at:
For www.travelzoo.com, June 1998 sessions by client origin:
-------------------------------------------------
America O (VA,US): 12.63%
Artemis R (CA,US): 0.80%
UNKNOWN: 0.67%
Microsoft (WA,US): 0.54%
At & T It (FL,US): 0.49%
Ibm Corpo (CT,US): 0.43%
Prodigy S (NY,US): 0.42%
Hewlett-P (CA,US): 0.27%
Uunet Tec (VA,US): 0.25%
Intel Cor (CA,US): 0.25%
The Boein (WA,US): 0.25%
Advantis (NY,US): 0.24%
Motorola (IL,US): 0.21%
Ernst & Y (NJ,US): 0.20%
Oracle Co (US): 0.18%
Northwest (WA,US): 0.16%
Is / Ie (TX,US): 0.15%
Texas Ins (US): 0.15%
Ans Co + (NY,US): 0.14%
Arthur An (IL,US): 0.14%
Coopers + (NY,US): 0.13%
Webtv Net (CA,US): 0.12%
Kpmg Peat (NJ,US): 0.12%
Digital E (CA,US): 0.12%
Shell Oil (TX,US): 0.12%
Ing Barin (NY,US): 0.12%
Adaptec, (CA,US): 0.11%
Pilot Net (CA,US): 0.11%
General E (US): 0.10%
Mci Telec (NC,US): 0.10%
Applied T (NY,US): 0.10%
Fidelity (MA,US): 0.10%
Intel Cor (OR,US): 0.10%
... ...
total percentage: 25.4% of sessions came from the top 250 institutions
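As a quick sanity check on the 20-30% estimate: the 33 shares listed above
already sum to about 20%:

    shares = [12.63, 0.80, 0.67, 0.54, 0.49, 0.43, 0.42, 0.27, 0.25, 0.25,
              0.25, 0.24, 0.21, 0.20, 0.18, 0.16, 0.15, 0.15, 0.14, 0.14,
              0.13, 0.12, 0.12, 0.12, 0.12, 0.12, 0.11, 0.11, 0.10, 0.10,
              0.10, 0.10, 0.10]
    print(f"{sum(shares):.1f}% of sessions from the {len(shares)} sites listed")
    # -> 20.0% of sessions from the 33 sites listed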
a UW site (site, requests, percentage of all requests):
---------------------------------------------------------
com.dec: 870 0.50%
br.com: 953 0.55%
ca.on: 984 0.56%
com.netcom: 1208 0.69%
uk.ac: 1212 0.69%
net.prodigy: 1572 0.90%
com.compuserve: 1941 1.11%
com.aol: 17547 10.05%
a USGS site (site, requests, percentage of all requests):
-----------------------------------------------------------
edu.alaska: 1943 1.09%
com.infoseek: 1994 1.11%
us.ak: 2126 1.19%
199.131: 2372 1.33%
146.63: 2637 1.47%
164.159: 2814 1.57%
usgs.wr: 3218 1.80%
com.aol: 3732 2.09%
207.123: 4698 2.63%
gov.blm: 4866 2.72%
com.alexa: 4957 2.77%
com.dec: 6698 3.74%
gov.usgs: 11754 6.57%
net.alaska: 21671 12.11%
a DOE site (site, requests, percentage of all requests):
-----------------------------------------------------------
gov.lanl: 1853 0.75%
edu.mit: 1912 0.77%
edu.wisc: 2017 0.82%
146.138: 2074 0.84%
edu.ucsd: 2088 0.85%
edu.utexas: 2297 0.93%
com.netcom: 2307 0.93%
gov.llnl: 2327 0.94%
com.compuserve: 2554 1.03%
com.gat: 2893 1.17%
gov.pppl: 3469 1.41%
gov.ornl: 4755 1.93%
com.aol: 6413 2.60%
gov.doe: 16548 6.71%
a USDA site (site, requests, percentage of all requests):
-------------------------------------------------------
uk.ac: 1218 1.03%
com.atext: 1237 1.05%
162.79: 1788 1.51%
com.alexa: 1972 1.67%
com.dec: 2564 2.17%
com.aol: 5861 4.96%
gov.usda: 10190 8.62%
a Johns Hopkins site (site, requests, percentage of all requests):
-------------------------------------------------------
edu.umd: 771 0.34%
com.compuserve: 778 0.34%
com.alexa: 1153 0.51%
com.dec: 1413 0.62%
net.att: 1470 0.64%
net.uu: 1481 0.65%
com.erols: 1684 0.74%
com.aol: 7346 3.22%
edu.jhu: 108626 47.58%
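For completeness, a sketch of how tallies keyed like the tables above
(com.aol, gov.usgs, 199.131) can be regenerated from raw client hostnames;
the input format and the two-label site heuristic are assumptions on my
part:

    from collections import Counter

    def summarize(hostnames):
        """Tally requests per site, keyed like the tables (aol.com -> com.aol)."""
        sites, total = Counter(), 0
        for host in hostnames:
            labels = host.lower().split(".")
            if labels[-1].isdigit():                   # bare IP: keep leading octets
                key = ".".join(labels[:2])
            else:
                key = ".".join(reversed(labels[-2:]))  # aol.com -> com.aol
            sites[key] += 1
            total += 1
        for key, count in sites.most_common():
            print(f"{key}: {count} {100.0 * count / total:.2f}%")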