The URL is http://www.tornado.be/cgi-bin/traceroute?
And Neal brings up the additional point that I could special-case the
servers that require a hostname. Fortunately, further analysis of the logs
suggests that these are even less frequent than I thought. So even if a few
data points are dropped, using IP addresses is going to vastly increase the
amount of data retrieved.
As for clustering, my fixes haven't been working. So IP addresses it is,
with special-casing perhaps.
---------- Forwarded message ----------
Date: Fri, 29 May 1998 20:43:10 -0700 (PDT)
From: Neal Cardwell <cardwell@cs.washington.edu>
To: John Snell <geigudr@cs.washington.edu>
Subject: Re: oddities.
Ah-ha. Check this out (had to hack tcpdump to print ASCII. what fun!):
http://www.tornado.be/cgi-bin/traceroute?www.yahoo.com
causes this request:
20:31:20.712062 grad-pc26.cs.washington.edu.29370 >
spring.tornado.be.http: P 1:308(307) ack 1 win 32120 (DF)
45 00 01 5B D8 93 40 00 40 06 C9 81 80 5F 04 90 E..[..@.@...._..
C2 95 50 03 72 BA 00 50 1B EF 36 7D 35 D0 2D E0 ..P.r..P..6}5.-.
50 18 7D 78 B1 16 00 00 47 45 54 20 2F 63 67 69 P.}x....GET /cgi
2D 62 69 6E 2F 74 72 61 63 65 72 6F 75 74 65 3F -bin/traceroute?
77 77 77 2E 79 61 68 6F 6F 2E 63 6F 6D 20 48 54 www.yahoo.com HT
54 50 2F 31 2E 30 0D 0A 43 6F 6E 6E 65 63 74 69 TP/1.0..Connecti
6F 6E 3A 20 4B 65 65 70 2D 41 6C 69 76 65 0D 0A on: Keep-Alive..
55 73 65 72 2D 41 67 65 6E 74 3A 20 4D 6F 7A 69 User-Agent: Mozi
6C 6C 61 2F 34 2E 30 35 20 5B 65 6E 5D 20 28 58 lla/4.05 [en] (X
31 31 3B 20 49 3B 20 4C 69 6E 75 78 20 32 2E 30 11; I; Linux 2.0
2E 33 32 20 69 36 38 36 29 0D 0A 50 72 61 67 6D .32 i686)..Pragm
61 3A 20 6E 6F 2D 63 61 63 68 65 0D 0A 48 6F 73 a: no-cache..Hos
74 3A 20 77 77 77 2E 74 6F 72 6E 61 64 6F 2E 62 t: www.tornado.b
65 0D 0A 41 63 63 65 70 74 3A 20 69 6D 61 67 65 e..Accept: image
2F 67 69 66 2C 20 69 6D 61 67 65 2F 78 2D 78 62 /gif, image/x-xb
69 74 6D 61 70 2C 20 69 6D 61 67 65 2F 6A 70 65 itmap, image/jpe
67 2C 20 69 6D 61 67 65 2F 70 6A 70 65 67 2C 20 g, image/pjpeg,
69 6D 61 67 65 2F 70 6E 67 2C 20 2A 2F 2A 0D 0A image/png, */*..
41 63 63 65 70 74 2D 4C 61 6E 67 75 61 67 65 3A Accept-Language:
20 65 6E 0D 0A 41 63 63 65 70 74 2D 43 68 61 72 en..Accept-Char
73 65 74 3A 20 69 73 6F 2D 38 38 35 39 2D 31 2C set: iso-8859-1,
2A 2C 75 74 66 2D 38 0D 0A 0D 0A *,utf-8....
http://194.149.80.3/cgi-bin/traceroute?www.yahoo.com
(which doesn't work) causes this request:
45 00 01 47 D9 A5 40 00 40 06 C8 83 80 5F 04 90 E..G..@.@...._..
C2 95 50 03 72 C6 00 50 34 20 3A 2E 3E 9E 81 78 ..P.r..P4 :.>..x
50 18 7D 78 88 74 00 00 47 45 54 20 2F 63 67 69 P.}x.t..GET /cgi
2D 62 69 6E 2F 74 72 61 63 65 72 6F 75 74 65 3F -bin/traceroute?
77 77 77 2E 79 61 68 6F 6F 2E 63 6F 6D 20 48 54 www.yahoo.com HT
54 50 2F 31 2E 30 0D 0A 43 6F 6E 6E 65 63 74 69 TP/1.0..Connecti
6F 6E 3A 20 4B 65 65 70 2D 41 6C 69 76 65 0D 0A on: Keep-Alive..
55 73 65 72 2D 41 67 65 6E 74 3A 20 4D 6F 7A 69 User-Agent: Mozi
6C 6C 61 2F 34 2E 30 35 20 5B 65 6E 5D 20 28 58 lla/4.05 [en] (X
31 31 3B 20 49 3B 20 4C 69 6E 75 78 20 32 2E 30 11; I; Linux 2.0
2E 33 32 20 69 36 38 36 29 0D 0A 48 6F 73 74 3A .32 i686)..Host:
20 31 39 34 2E 31 34 39 2E 38 30 2E 33 0D 0A 41 194.149.80.3..A
63 63 65 70 74 3A 20 69 6D 61 67 65 2F 67 69 66 ccept: image/gif
2C 20 69 6D 61 67 65 2F 78 2D 78 62 69 74 6D 61 , image/x-xbitma
70 2C 20 69 6D 61 67 65 2F 6A 70 65 67 2C 20 69 p, image/jpeg, i
6D 61 67 65 2F 70 6A 70 65 67 2C 20 69 6D 61 67 mage/pjpeg, imag
65 2F 70 6E 67 2C 20 2A 2F 2A 0D 0A 41 63 63 65 e/png, */*..Acce
70 74 2D 4C 61 6E 67 75 61 67 65 3A 20 65 6E 0D pt-Language: en.
0A 41 63 63 65 70 74 2D 43 68 61 72 73 65 74 3A .Accept-Charset:
20 69 73 6F 2D 38 38 35 39 2D 31 2C 2A 2C 75 74 iso-8859-1,*,ut
66 2D 38 0D 0A 0D 0A f-8....
The www.tornado.be web server is apparently distinguishing between the
field
Host: www.tornado.be
vs
Host: 194.149.80.3
In the first case, it knows which of the many web sites it's hosting that
you are accessing. In the second, it has no idea that you want
www.tornado.be; it only sees the 194.149.80.3 site, which has no
traceroute CGI. This probably means that to get some of these traceroute
CGIs you'll have to use the domain name, so that Java puts the right thing
in the "Host:" field. So maybe for now you just need to use DNS for these
picky, fucked up, multi-site servers, whereas most hosts can just use IP
addresses?
neal
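
One way to keep connecting by IP address while still satisfying a multi-site server is to write the Host: line by hand instead of letting the HTTP library derive it from the URL. The following is only a rough sketch in Python, not the code actually used here; the function names build_request and fetch are made up for illustration, and it assumes a plain HTTP/1.0 GET over a raw socket is good enough for these traceroute CGIs.

```python
import socket


def build_request(path: str, host_header: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request with an explicit Host header.

    host_header is what the server uses to pick a virtual site, so it can
    differ from the address the socket actually connects to.
    """
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host_header}",
        "Accept: */*",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(ip: str, path: str, host_header: str,
          port: int = 80, timeout: float = 10.0) -> bytes:
    """Connect by IP address, but name the wanted site in the Host header."""
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        sock.sendall(build_request(path, host_header))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # HTTP/1.0 server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

With something like this, a call such as fetch("194.149.80.3", "/cgi-bin/traceroute?www.yahoo.com", "www.tornado.be") would send the same Host: line as the working request in the dump above, with no DNS lookup at request time. The hostname for each picky server still has to be recorded somewhere, so this only moves the special-casing rather than removing it.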