In my mind there's also something vaguely disturbing about drops - or at
   least consistently provoking lots of drops - because it indicates a system
   where, fundamentally, senders literally have no clue, and end up sending
   so fast that they shoot themselves in the foot (by dropping their own
   packets and suffering timeouts) and others in the head (by filling up
   buffers that others would like to use, forcing them into timeouts). 
I used to be of the "but that's inefficient" school concerning drops.
But the RED paper convinced me otherwise -- drops are just a slightly
less efficient way of doing resource discovery and ECN.  In the case of 
small flows, a *very* inefficient way of doing resource discovery and ECN.
Perhaps there are other operating regimes where RED is very inefficient 
-- if so, we should find them.
On per-destination credits:
   This seems interesting. Maybe the "reverse multicast tree" idea gets at
   this (I can't visualize what that means), but it seems like you only get a
   big win here if the "destination" is an aggregation of hosts, since a
   single host will almost surely have at most 4-5 flows going to it.
By destination I imagined something more like netscape.com.
Perhaps we could fold together aggregation and route lookup --
for example, we could aggregate all of UW's traffic together.  
This sparked another idea -- you could do hierarchical aggregation
of credits.  For example, you would first share credits among all 
flows to the same autonomous region, and then once you get to the
destination region, you would share credits on a more fine-grained
basis, etc.  This fits nicely with our model for the classifier.
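
A rough sketch of the hierarchical sharing, just to make the idea concrete.
Everything here is made up for illustration: the class and method names, the
(region, host) destination tuple, the fixed per-pool credit budget, and the
two-level split are all assumptions, not a worked-out design.

```python
class HierarchicalCredits:
    """Hypothetical two-level credit sharing keyed on the destination."""

    def __init__(self, credits_per_pool):
        # Assumption: every aggregate starts with the same credit budget.
        self.credits_per_pool = credits_per_pool
        self.pools = {}  # aggregate key -> credits remaining

    def key_for(self, dest, local_region):
        """Pick the aggregate a flow belongs to at this point in the network."""
        region, host = dest
        if region == local_region:
            # Inside the destination region: share on a finer-grained basis.
            return (region, host)
        # Elsewhere: all flows headed to the same region share one pool.
        return (region,)

    def take(self, dest, local_region, n):
        """Grant up to n credits from the aggregate's pool; return the grant."""
        key = self.key_for(dest, local_region)
        remaining = self.pools.setdefault(key, self.credits_per_pool)
        granted = min(n, remaining)
        self.pools[key] = remaining - granted
        return granted


far = HierarchicalCredits(credits_per_pool=10)
far.take(("UW", "hostA"), "elsewhere", 6)   # drawn from the shared UW pool
far.take(("UW", "hostB"), "elsewhere", 6)   # same pool, so only the leftover

near = HierarchicalCredits(credits_per_pool=10)
near.take(("UW", "hostA"), "UW", 6)         # per-host pool inside UW
near.take(("UW", "hostB"), "UW", 6)         # separate per-host pool
```

The key_for step is where this would fold into the classifier: the same
lookup that picks the route also picks which pool the flow draws from.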
tom