When in doubt, refer to the triangle pages.
<a
href="http://www.cs.washington.edu/research/networking/detour/local/triangle/">here</a>
---
To all {data, result} consumers,
Discrepancies were discovered in the data sets provided previously: specifically, the data was not as well sorted as thought.
If any of your code depended on the data file being sorted, you will likely want to rerun it.
After removing every trace whose target was an ICMP rate limiter, the xmas tree/loss knee has been vanquished.
If you view the dataset results prior to min50 sampling (see below), you'll note that all of our high-loss links were sadly spurious.
Additionally, an error was discovered in the calculation of the loss rate for alternate paths. This has also been corrected. A copy of the analysis code will be posted to the Triangle web page, along with an additional copy of this message.
New copies of the data sets are available in the following formats:
The unmodified, original data file. Not sorted, in original format.
The original data file, in the new format.
The data file after being sorted and having its dupes, A-A paths, robots.txt limiters, changed IPs, et al. removed.
The data file after the additional removal of all targets that do ICMP rate limiting.
The data file after all of the above, and having a minimum of 50 objects per path.
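As a rough illustration of the later filtering stages (this is not the analysis code itself), the sketch below assumes a hypothetical record layout of (timestamp, source, target, rtt_ms). The actual data file format is not described in this message, and the function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical record layout: (timestamp, source, target, rtt_ms).
# The real data file format is not given in this message.

def drop_dupes_and_self_paths(records):
    """Remove exact duplicate records and A-A paths (source == target)."""
    seen = set()
    kept = []
    for rec in records:
        _, src, dst, _ = rec
        if src == dst or rec in seen:
            continue
        seen.add(rec)
        kept.append(rec)
    return kept

def drop_targets(records, rate_limited):
    """Remove every record whose target is in the rate_limited set."""
    return [rec for rec in records if rec[2] not in rate_limited]

def min_samples(records, minimum=50):
    """Keep only records on (source, target) paths with >= minimum samples."""
    counts = defaultdict(int)
    for _, src, dst, _ in records:
        counts[(src, dst)] += 1
    return [rec for rec in records if counts[(rec[1], rec[2])] >= minimum]
```

Applying the three functions in order would correspond loosely to the last three data files listed above, under the stated assumptions.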
Result files for the best alternate path, per various metrics, are also included. These are not time-of-day or date filtered. Result files are provided for both the min50 and the non-min50 data files (the last two listed above).
The interpretation of the result filenames is as follows:
test0_-1_-1_0.EvilNap.csv
test0_-1_-1_0.csv
test0_-1_-1_0.evilNap.node.minus.csv
test0_-1_-1_0.evilNap.node.plus.csv
test0_-1_-1_1.EvilNap.csv
test0_-1_-1_1.csv
...
Given the following prefix,
test0_-1_-1_0
test is merely a prefix name. The first 0 denotes that this was the zeroth time zone to be scanned, and the {-1,-1} markers denote the time interval. A value of -1 denotes no time restriction; any other value gives the time of day in seconds since midnight.
The final digit ({0,2}) denotes the metric used to determine the best alternate 2-hop path. The metrics are as follows:
COST_METRIC_MEAN_LATENCY = 0;
COST_METRIC_MEDIAN_LATENCY = 1;
COST_METRIC_MIN_LATENCY = 2;
COST_METRIC_MAX_LATENCY = 3;
COST_METRIC_MEAN_HOPS = 4;
COST_METRIC_MEDIAN_HOPS = 5;
COST_METRIC_MIN_HOPS = 6;
COST_METRIC_MAX_HOPS = 7;
COST_METRIC_DROPS = 8;
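The naming scheme described above can be unpacked mechanically. The following is a minimal sketch, assuming that every result filename follows the pattern of the examples shown (alphabetic prefix, time zone index, two interval fields, metric code, then a suffix). The function name and the returned dictionary keys are made up for illustration.

```python
import re

# Metric codes as listed above.
COST_METRICS = {
    0: "COST_METRIC_MEAN_LATENCY",
    1: "COST_METRIC_MEDIAN_LATENCY",
    2: "COST_METRIC_MIN_LATENCY",
    3: "COST_METRIC_MAX_LATENCY",
    4: "COST_METRIC_MEAN_HOPS",
    5: "COST_METRIC_MEDIAN_HOPS",
    6: "COST_METRIC_MIN_HOPS",
    7: "COST_METRIC_MAX_HOPS",
    8: "COST_METRIC_DROPS",
}

# Assumed shape: <prefix><zone>_<start>_<end>_<metric>.<suffix>
_RESULT_NAME = re.compile(
    r"^(?P<prefix>[A-Za-z]+)(?P<zone>\d+)"
    r"_(?P<start>-?\d+)_(?P<end>-?\d+)_(?P<metric>\d+)"
    r"\.(?P<suffix>.+)$"
)

def _seconds_or_none(value):
    """-1 means no time restriction; otherwise seconds since midnight."""
    seconds = int(value)
    return None if seconds == -1 else seconds

def parse_result_name(filename):
    """Split a result filename into its documented fields."""
    match = _RESULT_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized result filename: {filename!r}")
    return {
        "prefix": match["prefix"],
        "zone": int(match["zone"]),
        "start": _seconds_or_none(match["start"]),
        "end": _seconds_or_none(match["end"]),
        "metric": COST_METRICS[int(match["metric"])],
        "suffix": match["suffix"],
    }
```

For example, `parse_result_name("test0_-1_-1_1.csv")` would report time zone 0, no time restriction, and the median-latency metric.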
As for the final suffix, the "standard" test file is the short filename -- ".csv".
EvilNap.csv attempts to quantify the loci of evility by listing a reference count for each IP: the count is incremented for each primary path that contained this IP and had a better alternate, and decremented for each good primary path that contained it. (Note that Verio is consistently at the top, with high evility scores.)
The evilNap.node.plus.csv and evilNap.node.minus.csv files list the actual paths that included the evilNaps, for the plus (increment) and minus (decrement) effects respectively.
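The evility count can be sketched as below. This is only an illustration under stated assumptions, not the posted analysis code: a "good" primary path is taken to mean one with no better alternate, each IP is counted once per path, and the input layout (a tuple of hop IPs plus a boolean flag) and the function name are hypothetical.

```python
from collections import Counter, defaultdict

def evility(paths):
    """Compute a per-IP reference count over primary paths.

    paths: iterable of (hops, has_better_alternate), where hops is a
    tuple of IP strings along the primary path (hypothetical layout).
    Returns (score, plus, minus): the count per IP, and the paths that
    contributed increments and decrements for each IP.
    """
    score = Counter()
    plus = defaultdict(list)
    minus = defaultdict(list)
    for hops, has_better_alternate in paths:
        for ip in set(hops):  # count each IP once per path (assumption)
            if has_better_alternate:
                score[ip] += 1
                plus[ip].append(hops)
            else:
                score[ip] -= 1
                minus[ip].append(hops)
    return score, plus, minus
```

Sorting the result with `score.most_common()` puts the highest-evility IPs first, analogous to the ordering in EvilNap.csv.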