Fwd: Re: ftp://ftp.ietf.org/internet-drafts/draft-gettys-webmux-00.txt

Geoff Voelker (voelker@cs.washington.edu)
Mon, 14 Sep 1998 12:01:58 -0700 (PDT)

on HTTP/1.1 deployment...

------- start of forwarded message (RFC 934 encapsulation) -------
From: jg@pa.dec.com (Jim Gettys)
Sender: owner-end2end-interest@ISI.EDU
To: Sean Doran <smd@ebone.net>
Cc: Henrik Frystyk Nielsen <frystyk@w3.org>, Greg Minshall <minshall@siara.com>, end2end-interest@ISI.EDU
Subject: Re: ftp://ftp.ietf.org/internet-drafts/draft-gettys-webmux-00.txt
Date: Mon, 14 Sep 1998 11:49:54 -0700

This is a multi-part message in MIME format, created by Pachyderm.
The parts are separated by "--2" lines.
The first part is a covering note, the others are attachments.

- --2
Content-Type: text/plain

At the moment, actual HTTP/1.1 deployment from a traffic perspective
is relatively limited; I'm interested in getting a handle on what it
actually is. That it could have a significant impact, we're pretty
sure (see our SIGCOMM paper on the topic, from a year ago).

What impact HTTP/1.1 will actually have is much less clear, particularly
since style sheets are deploying roughly concurrently and may change the
content mix significantly; see the attached mail.

It is also not clear how good the client implementations actually are;
it would also be interesting to take a TCP dump of IE4 and/or Mozilla
5 (Communicator 4 does not support 1.1) and see if the implementations
are any good (a bad implementation could do as poorly as HTTP/1.0, in
the extreme case).

Brian Carpenter asked me to look into this at the last IETF; if there
are people on this list with access to the right sort of data, please
let me know. Now that I've got the HTTP/1.1 spec in (for hopefully draft
standard), I intend to spend more cycles trying to find out.

Attached is the mail I drafted for Brian and the IAB.
- Jim Gettys

- --2
Content-Type: message/rfc822
Content-Disposition: inline

Received: by src-mail.pa.dec.com; id AA11783; Mon, 31 Aug 1998 14:20:41 -0700
Received: by pachyderm.pa.dec.com; id AA24767; Mon, 31 Aug 1998 14:20:05 -0700
Date: Mon, 31 Aug 1998 14:20:05 -0700
From: jg@pa.dec.com (Jim Gettys)
Message-Id: <9808312120.AA24767@pachyderm.pa.dec.com>
X-Mailer: Pachyderm (client pachyderm.pa-x.dec.com, user jg)
To: brian@hursley.ibm.com
Cc: iab@isi.edu
Subject: Fwd: Do you have any stats on percentage of servers supporting HTTP/1.1?
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="1"

This is a multi-part message in MIME format, created by Pachyderm.
The parts are separated by "--1" lines.
The first part is a covering note, the others are attachments.

- --1
Content-Type: text/plain

Brian Carpenter asked me at the Chicago IETF to try to get insight into
HTTP/1.1 deployment. Rather than waiting several weeks to try to get
more complete numbers, here is what I know at the moment. Please consider
this a first installment.

See: http://www.w3.org/Protocols/HTTP/Forum/Reports/ for HTTP/1.1
implementation reports. Most of the raw data is publicly available;
some is not, due to "dirty pool" by some vendors' marketing, though
this private data is aggregated as part of the "rollups" of data available
on that page. (Sigh...)

This does not include some HTTP/1.1 implementations for which I do not
have data, but for which I know implementations exist (e.g. Lotus Domino;
promised, but not delivered), and one report I've not yet added to the
page. We do not query which implementations have been deployed, though
this is often evident from the version number in the data, and can be
divined from the protocol stream itself.

Servers
- -------

The attached data is from the June Netcraft survey (this tidbit is not
in the public data from Netcraft). It works out to about 60% of servers
(1,476,998 of 2,410,068 hostnames) now claiming to support HTTP/1.1. I'd
expect the penetration to be somewhat higher since it is now almost 3
months later. Netcraft would be interested
in a press release if it is deemed appropriate (personal affairs and the
HTTP/1.1 draft have prevented me from following up on this).

I have no current insight into what % of total Internet traffic these
servers represent; this would require packet level data I don't have.
Since Apache is around 50% or a bit more of HTTP servers on the public
Internet, the 60% figure also implies that there is significant server
deployment by vendors. I'll ask the Netcraft people for a more up to date
number.

Of the HTTP/1.1 servers, the only servers whose implementation quality
I have any information for are Apache and Jigsaw, both of which have pretty
good pipelining and buffering implementations (optimal from the network
point of view). Jigsaw's usage is very small, so Apache is what counts
(greater than 50% penetration by servers in the Internet).

Clients
- -------

As to the fraction of traffic currently 1.1, I'd expect HTTP/1.1 traffic
to be very much lower than 60%: it is at this date mostly confined to
IE 4.0x, as a fraction of that browser's penetration on the public internet.

This penetration is further diluted by the discovery of nasty behavior
of many deployed HTTP/1.0 proxy implementations. Microsoft made the only
reasonable decision: by default IE4 disables use of HTTP/1.1 if the browser
is talking to a proxy. IE4 can talk HTTP/1.1 to a proxy IFF it is enabled
(a relatively obscure dialog box in IE allows one to turn it on), but
to my knowledge, there is no way to do so "automatically". It would have
been nice if there were some protocol flag it could key against, so that
an HTTP/1.1 proxy could enable the downstream client to use HTTP/1.1
automatically if the upstream proxies were "not broken"; I don't know
if they were clever enough to try such a hack. So only IE implementations
directly on the Internet will show up as HTTP/1.1 traffic.
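
The default described above amounts to a two-input decision. The sketch
below only models the behaviour as stated in this mail; the function and
parameter names are hypothetical and are not taken from any browser's code.

```python
def request_version(via_proxy: bool, http11_through_proxy_enabled: bool = False) -> str:
    """Pick the HTTP version a client would put on its request line.

    Hypothetical helper modelling the default described above: speak
    HTTP/1.1 on direct connections, and fall back to HTTP/1.0 behind a
    proxy unless the user has explicitly opted in.
    """
    if via_proxy and not http11_through_proxy_enabled:
        return "HTTP/1.0"
    return "HTTP/1.1"


# Direct, proxied with the default setting, proxied with the opt-in.
for via_proxy, opted_in in [(False, False), (True, False), (True, True)]:
    print(via_proxy, opted_in, request_version(via_proxy, opted_in))
```

Only the first and last cases would show up as HTTP/1.1 traffic, which is
why proxied users dilute the measured share.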

Other major users of HTTP include "Push" technology implementations, for
which HTTP/1.1 provides some potentially significant performance benefits.
I do not know if their implementations have been updated or not. These
would include Marimba, PointCast, etc. Composing this mail reminds me
that I should ping those vendors to see if there are implementations there
that should be documented.

Mozilla (Netscape 5) claims to be an HTTP/1.1 implementation, but has
not shipped yet (though sources are available).

I do not know the quality of implementations of the clients at this time:
looking at a few TCP dumps from each of the major clients generating traffic
would be a good way to get a handle on this.

Proxies
- -------

We believe a majority of HTTP Internet traffic is through proxies, from the
Web trace data we have to date; I don't have a good handle on what the
current fraction actually is. I will ask if a better number can be
generated from reasonably current trace data.

I do not currently have any insight into what the penetration of HTTP/1.1
is into the proxy installed base. Estimates could be made by examining
the requests at origin servers. We are seeing issues raised that make
me believe that the corner cases of the protocol are being explored (see,
for example, http://www.w3.org/Protocols/HTTP/Issues/#CHUNKEDTRA, the
CHUNKEDTRAILERS issue recently raised in the HTTP working group). There
are 5 implementation reports for caching proxies (IBM, Inktomi, W3C, MIT
AI, Microsoft) as distinguished from special purpose proxies (e.g. firewall
or micropayment proxies). Some of these, at least, have been deployed
(at least in field test), and some are quite aggressive implementations.
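
One signal that might be usable at an origin server is the Via header,
which HTTP/1.1 asks proxies to add; each entry records the protocol
version with which that hop received the request. The following is a
minimal sketch of pulling those versions out. The function name and the
sample values are made up for illustration, and proxies that add no Via
header at all are invisible to it.

```python
from collections import Counter


def via_versions(via_value: str) -> list:
    """Return the received-protocol version recorded by each Via entry.

    Each comma-separated entry starts with a token such as "1.0" or
    "HTTP/1.1" (the protocol name is optional when it is HTTP).
    Comments that contain commas are not handled in this sketch.
    """
    versions = []
    for element in via_value.split(","):
        fields = element.split()
        if not fields:
            continue
        # Keep only the version part of "HTTP/1.1" or a bare "1.1".
        versions.append(fields[0].rsplit("/", 1)[-1])
    return versions


# Made-up Via values standing in for headers seen at an origin server.
sample_via_values = [
    "1.0 fred, 1.1 nowhere.com (Apache/1.1)",
    "1.0 cache.example.org",
    "HTTP/1.1 proxy.example.net",
]

tally = Counter(v for value in sample_via_values for v in via_versions(value))
print(tally)
```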

Note that the FULL benefit of HTTP/1.1 pipelining and buffering can
only occur if the client, server, and all proxies in between implement
it. It would be very difficult to impossible for a proxy to coalesce
a set of HTTP/1.0 requests into a reasonable HTTP/1.1 "batch".
A client can also implement HTTP/1.1 in such a way that it is not
significantly better than HTTP/1.0, from the Internet's perspective.
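
To make "pipelining and buffering" concrete, here is a rough sketch of
what a pipelined batch looks like before it is written to the connection.
The helper name, host, and paths are invented for the example; it builds
the bytes only and does not touch the network.

```python
def pipelined_batch(host: str, paths: list) -> bytes:
    """Build several GET requests as one buffer for a single write.

    Sending the requests back to back on one persistent connection,
    without waiting for each response, is what lets them share packets.
    """
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "\r\n"
        )
    return "".join(requests).encode("ascii")


# Hypothetical page: one HTML document plus a style sheet and an image.
batch = pipelined_batch("www.example.org", ["/", "/style.css", "/logo.png"])
print(batch.decode("ascii"))
```

A client that instead writes each request separately and waits for its
response before sending the next gains little over HTTP/1.0, which is the
"bad implementation" case mentioned above.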

Bottom line
- -----------

As a result, I'd be surprised if, as a fraction of total traffic, HTTP/1.1
traffic were more than 10-15% of total HTTP traffic. And in the extreme
case, a bad HTTP/1.1 implementation can eliminate any advantage over HTTP/1.0.
So to date, I would not expect that HTTP/1.1's benefits would have
a significant impact in "cooling" Internet traffic.

Deployment is therefore gated by deployment of HTTP/1.1 proxies and clients,
NOT origin servers.

Further insight
- ---------------

One way to get insight into penetration of HTTP/1.1 traffic, short of
full packet traces from an IX, would be to look at the request headers sent
to a number of "portal sites" in the net; I will ask the AltaVista people
to see if they can get the statistics at that site. I will also ask Brian
Reid if there is some way to get suitably anonymised data somewhere in
his area (if others have access to IX data, we'd be happy to help figure
out how to extract the useful data). I will also ask our Web
characterization group if the data they have access to at the
moment can shed light on current deployment levels. Most of this
data is, however, proxy logs, which do not necessarily contain all the
information we'd like to answer these questions.
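
The kind of tally meant here could be as simple as counting the version
token on each request line. This is a sketch only: the function name and
the sample lines are made up, and at a server behind proxies the version
seen is that of the last hop, not necessarily the browser.

```python
from collections import Counter


def request_version_tally(request_lines) -> Counter:
    """Count requests per protocol version from HTTP request lines.

    A request line looks like "GET /index.html HTTP/1.1". A line with no
    version token (an HTTP/0.9 simple request) is counted as "HTTP/0.9".
    """
    tally = Counter()
    for line in request_lines:
        fields = line.split()
        if len(fields) == 3 and fields[2].startswith("HTTP/"):
            tally[fields[2]] += 1
        elif len(fields) == 2:
            tally["HTTP/0.9"] += 1
        else:
            tally["unparsed"] += 1
    return tally


# Made-up request lines; a real run would read them from server logs.
sample = [
    "GET / HTTP/1.0",
    "GET /search?q=webmux HTTP/1.1",
    "GET /images/logo.gif HTTP/1.0",
    "GET /old",
]
tally = request_version_tally(sample)
total = sum(tally.values())
for version, count in sorted(tally.items()):
    print(f"{version}: {count} ({count / total:.0%})")
```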

Further complicating the situation is the roughly coincident deployment
and uptake of stylesheets in the Web, which may reduce the number of URL's
loaded per page, shortening the TCP packet train (and increasing the
responsiveness of the web). Content changes could therefore have as
significant an impact on Web traffic as what we are doing at the protocol
level.

This presumes that Web site authors do not just add more URL's since
they can get more useful information/unit time. One theory expressed to
us is that the "waiting time" a site author allows is a constant, and
that site authors will fill that time with as much content as they can
fit. If this "fill the bag" theory of site design is true, there might not
be as significant a change in traffic as style sheets deploy.

I don't know which theory is true (I guess a hybrid).

See: http://www.w3.org/Protocols/HTTP/Performance/ and
http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html for our work
in this area. (The latter is our SIGCOMM '97 paper.)

Vanity Host Names
- -----------------

On a related but different topic: Vanity host names.

HTTP/1.1 defines the "Host" header to allow vanity names without the use
of an IP address. This has been widely deployed well in advance of HTTP/1.1,
and exists in most of the currently deployed browsers. (Most of HTTP/1.1's
functionality can be deployed without an implementation claiming it speaks
"HTTP/1.1".)

I believe that Host header deployment is already widespread. If
deployment is already what I think it is, it may possibly be time to
encourage ISP's to "charge extra" if they need to assign an IP address
to a site. I will also try to get data on this topic.
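
For readers who have not looked at how the Host header removes the need
for one IP address per site, here is a minimal sketch of name-based
dispatch. The function name, hostnames, and directory paths are all
hypothetical; a real server does considerably more validation.

```python
def select_site(raw_request: bytes, sites: dict, default: str) -> str:
    """Choose a document root from the Host header of a raw HTTP request."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            # Drop an optional ":port" suffix and ignore case.
            hostname = value.strip().split(":", 1)[0].lower()
            return sites.get(hostname, default)
    return default  # no Host header (older client)


# Hypothetical sites sharing one IP address.
sites = {"www.example.com": "/srv/example", "www.example.org": "/srv/other"}
request = b"GET /index.html HTTP/1.0\r\nHost: WWW.Example.org:80\r\n\r\n"
print(select_site(request, sites, "/srv/default"))
```

Note that the sample request says HTTP/1.0 and still carries a Host
header, which is the situation described above. Clients that send no Host
header fall through to the default site, and those are the ones a
dedicated IP address would still be needed for.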

Besides just raw penetration as a fraction of traffic, though, remember
that we also need to get a handle on some browsers that are not mass market
(e.g. the lynx browser), if we are to avoid disenfranchising the disabled
or those on very low speed links. I believe AltaVista would
be a particularly good site to see if we can get data from, as they have
tried very hard to interoperate with almost any browser on the planet,
and will not have driven away "down-grade" clients as many other sites
have done.

Next Steps
- ----------

If anyone can help with trace data, please let me know. (Particularly
with anonymized traffic at an exchange point for port 80; URL's, IP
addresses, and all content can be removed for this purpose). Alternatively,
we can work with people with access to the traffic to extract the simple
numbers we need.

I will ask our web characterization group what insights they can provide
given the data they already have access to.

I will follow up with my sources.
- Jim Gettys

- --1
Content-Type: message/rfc822
Content-Disposition: attachment

Received: by src-mail.pa.dec.com; id AA04027; Tue, 2 Jun 98 12:43:55 -0700
Received: from mail1.digital.com by pobox1.pa.dec.com (5.65v3.2/1.1.10.5/07Nov97-1157AM)
id AA20575; Tue, 2 Jun 1998 12:43:55 -0700
Received: from ns0.netcraft.co.uk (ns0.netcraft.co.uk [195.188.192.4])
by mail1.digital.com (8.8.8/8.8.8/WV1.0e) with ESMTP id MAA18338
for <jg@pa.dec.com>; Tue, 2 Jun 1998 12:43:53 -0700 (PDT)
Received: (from mhp@localhost)
by ns0.netcraft.co.uk (8.8.8/8.8.8) id UAA01420
for jg@pa.dec.com; Tue, 2 Jun 1998 20:35:41 +0100 (BST)
(envelope-from mhp)
From: Mike Prettejohn <mhp@netcraft.co.uk>
Message-Id: <199806021935.UAA01420@ns0.netcraft.co.uk>
Subject: Re: Do you have any stats on percentage of servers supporting HTTP/1.1?
In-Reply-To: <9806021721.AA25686@pachyderm.pa.dec.com> from Jim Gettys at "Jun 2, 98 10:21:15 am"
To: jg@pa.dec.com (Jim Gettys)
Date: Tue, 2 Jun 1998 20:35:41 +0100 (BST)
Reply-To: mhp@netcraft.co.uk
X-Mailer: ELM [version 2.4ME+ PL28 (25)]
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-UIDL: 9ec3063054b1e0c64af986989d9ed65e

:
: BTW, what is the % of sites that claim to speak 1.1?
: Thanks,
: - Jim
:

zanussi: {8} awk '{print $3}' Results.9806 |egrep HTTP/1.1|wc -l
1476998
zanussi: {9} awk '{print $3}' Results.9806 | egrep HTTP/1.0 | wc -l
933070
zanussi: {10} wc -l Results.9806
2410068 Results.9806

So HTTP/1.1 wins by 1,476,998 sites [hostnames] to 933,070.
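
As a quick check, the counts above can be turned into shares. This is
only a sketch: the three numbers are copied from the transcript, and the
variable names are made up for illustration.

```python
# Counts copied from the Results.9806 transcript above.
http11_hosts = 1476998
http10_hosts = 933070
total_hosts = 2410068

# Lines in the file that matched neither pattern.
unclassified = total_hosts - http11_hosts - http10_hosts

share_11 = http11_hosts / total_hosts
share_10 = http10_hosts / total_hosts

print(f"HTTP/1.1: {share_11:.1%}  HTTP/1.0: {share_10:.1%}  other: {unclassified}")
```

The two counts add up to the line count of the file, so every surveyed
hostname falls into one of the two groups, and the HTTP/1.1 share comes
out a little over 61%.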

Bizarrely, there's a server called HTTP/1.0 which gives HTTP/1.1 responses:

www.zwg.com 193.5.163.86 HTTP/1.1 HTTP/1.0

And one called SAIC-HTTP/1.1 which gives a HTTP/1.0 response.

wwwintermedia-szeged.tiszanet.hu 195.228.98.20 HTTP/1.0 SAIC-HTTP/1.1

Mike
- --
Mike Prettejohn http://www.netcraft.com
mhp@netcraft.com Phone +44 1225 447500 Fax +44 1225 448600
Netcraft Rockfield House Granville Road Bath BA1 9BQ England

- --1--

- --2--
------- end -------