---------- Forwarded message ----------
Date: Mon, 2 Nov 1998 19:19:40 -0800 (PST)
From: Neal Cardwell <cardwell@cs.washington.edu>
To: tcp-impl@cthulhu.engr.sgi.com
Cc: Neal Cardwell <cardwell@cs.washington.edu>
Subject: delayed ACKs for retransmitted packets: ouch!

Recently i've been looking at a scenario where i'm seeing delayed ACKs for
retransmitted packets really destroy the performance of New Reno.

The TCP congestion control draft (draft-ietf-tcpimpl-cong-control-00.txt)
specifies that "Out-of-order data segments SHOULD be acknowledged
immediately, in order to trigger the fast retransmit algorithm." Many
implementations -- at least FreeBSD 3.0 and Linux 2.1, and probably most
others, i'm guessing -- interpret this by sending an immediate
acknowledgment only if a data segment they receive is above a hole in
their receive queue. That is, they ACK immediately only if the sequence number
is strictly above rcv_next (see Figure 27.15 in Stevens vol 2 for the code
snippet that does this in Net/3 and FreeBSD).
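To make the rule concrete, here is a minimal Python sketch of the receiver
behavior described above. It is not the Net/3 code: the function name and the
plain-integer sequence numbers are hypothetical, and the sketch ignores
sequence-number wraparound and segments that start below rcv_next.

```python
def acks_immediately_today(seg_seq: int, rcv_next: int) -> bool:
    """Return True if the receiver ACKs at once, False if it delays the ACK.

    Hypothetical model of the common interpretation; assumes seg_seq >= rcv_next.
    """
    # Segment lies above a hole: it is out of order, so ACK immediately.
    # These are the duplicate ACKs that trigger fast retransmit.
    if seg_seq > rcv_next:
        return True
    # Segment starts exactly at rcv_next: it is treated as ordinary in-order
    # data and the ACK is left to the delayed-ACK timer, even when the
    # segment has just filled a hole.
    return False


# With rcv_next = 1460, a segment at 2920 is above a hole; one at 1460 is not.
print(acks_immediately_today(2920, 1460))
print(acks_immediately_today(1460, 1460))
```

The second call is the interesting case: a retransmission that repairs a loss
arrives at rcv_next and so gets no immediate ACK.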
Unfortunately, this means that if the sender retransmits a single segment
which fills in a hole, then the receiver finds that this segment fits in
nicely at rcv_next. So the receiver will sit around until its delayed ACK
timer expires, possibly hundreds of ms later. Only then will it ACK to the
sender that the hole has successfully been filled, and only then will the
sender be able to continue on, perhaps filling other holes.
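The cost of this stall can be sketched with a simple timing model. It assumes
the sender repairs exactly one hole per ACK, as it would without SACK, and
that every hole-filling segment waits out the full ACK delay. The function
name, the 5 holes, and the 200 ms delay are illustrative assumptions, not
values from any trace.

```python
def hole_fill_timeline(holes: int, rtt_s: float, ack_delay_s: float):
    """Return (retransmit_time, ack_arrival_time) pairs, one per hole.

    Model: the sender retransmits one segment, then waits for the ACK that
    covers it before retransmitting the next one.
    """
    timeline = []
    now = 0.0
    for _ in range(holes):
        sent_at = now
        # The ACK reaches the sender one RTT plus the receiver's ACK delay later.
        acked_at = sent_at + rtt_s + ack_delay_s
        timeline.append((sent_at, acked_at))
        now = acked_at  # the next retransmission goes out when this ACK arrives
    return timeline


# Illustrative numbers only: 5 holes, 22 ms RTT, 200 ms delayed-ACK timer.
delayed = hole_fill_timeline(holes=5, rtt_s=0.022, ack_delay_s=0.200)
immediate = hole_fill_timeline(holes=5, rtt_s=0.022, ack_delay_s=0.0)
print(f"delayed ACKs:   {delayed[-1][1]:.3f} s to fill 5 holes")
print(f"immediate ACKs: {immediate[-1][1]:.3f} s to fill 5 holes")
```

In this model the recovery time grows linearly with the number of holes, and
the ACK delay rather than the RTT dominates each round.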
Consider the following sequence plots of tcpdumps of two TCP connections:
http://www.cs.washington.edu/homes/cardwell/misc/xfer1.ps
http://www.cs.washington.edu/homes/cardwell/misc/xfer2.ps

These show a Linux 2.1.126 sender at UW sending 100KB to my Linux 2.0.32
machine at home over my 440Kbps DSL line. The traces are from the
perspective of the sender. The RTT is about 22ms for short packets, and
the MSS is about 1460 bytes.

These transfers should have taken about 2 seconds, judging from the slope
of the ACKs during slow start. But of course slow start overshoots, and
there are many losses at around the 1 second mark in both traces. Now
because the Linux 2.1.126 sender is using New Reno, it spends several
painful seconds in Fast Recovery filling in the holes, one segment at a
time. As a result the second transfer, for instance, spends nearly 5
seconds in Fast Recovery; during this period I'm getting about 30Kbps on
average, and not so happy about the $ i forked over for DSL buying me
modem performance!
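As a rough cross-check of these figures, the back-of-envelope calculation
below uses only the numbers quoted in this message. It assumes "100KB" means
100,000 bytes and that recovery repairs one full-size segment per round. The
implied ACK delay is only an estimate, since the 22 ms RTT was measured for
short packets.

```python
MSS_BYTES = 1460           # MSS quoted in the message
LINK_BPS = 440_000         # DSL line rate quoted in the message
TRANSFER_BYTES = 100_000   # "100KB", taken as 100,000 bytes (assumption)
RECOVERY_BPS = 30_000      # average rate observed during Fast Recovery
RTT_S = 0.022              # RTT for short packets

# Time to move the whole transfer at line rate, ignoring headers and slow start.
ideal_transfer_s = TRANSFER_BYTES * 8 / LINK_BPS

# At the observed recovery rate, how long each full-size segment takes.
seconds_per_segment = MSS_BYTES * 8 / RECOVERY_BPS

# If each round is one RTT plus the receiver's ACK delay, this is what is
# left over for the ACK delay (a rough estimate under that assumption).
implied_ack_delay_s = seconds_per_segment - RTT_S

print(f"line-rate transfer time: {ideal_transfer_s:.2f} s")
print(f"time per recovered segment: {seconds_per_segment:.3f} s")
print(f"implied ACK delay: {implied_ack_delay_s:.3f} s")
```

The line-rate time comes out near 1.8 s, in line with the 2-second estimate,
while each recovered segment costs roughly 0.39 s, far more than one RTT.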
Why does it spend so long in Fast Recovery? I think the main problem is
that the receiver is delaying its ACKs for the retransmitted segments that
are nicely filling holes in its receive queue. It happens to be delaying
them by a lot, due to the particular delayed ACK implementation in Linux
2.0. But i think the point is that delaying acknowledgments is a very bad
idea when the sender is filling in holes one packet at a time, as it will
tend to do in Fast Recovery, or immediately after an RTO (assuming no
SACK).

So what i'm asking is this: is it a good idea to clarify or extend the
notion of "out-of-order" data that should be ACKed immediately, in such a
way that data segments that fill in a hole in the receive queue should be
ACKed immediately? This would seem to alleviate this problem with New
Reno. Are there other scenarios where it would make things worse instead?
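One way the extended rule could look is sketched below in Python. This is an
illustration of the idea in the question above, not code from any stack: the
function name and the `ooo_queue_nonempty` flag (true when data above
rcv_next is already buffered) are hypothetical, and the sketch again ignores
wraparound and segments below rcv_next.

```python
def acks_immediately_proposed(seg_seq: int, rcv_next: int,
                              ooo_queue_nonempty: bool) -> bool:
    """Return True if the receiver should ACK at once under the extended rule.

    Hypothetical sketch; assumes seg_seq >= rcv_next.
    """
    # Existing behavior: a segment above a hole is ACKed immediately.
    if seg_seq > rcv_next:
        return True
    # Proposed extension: a segment that lands at rcv_next while later data
    # is already queued fills all or part of a hole, so ACK it immediately.
    if seg_seq == rcv_next and ooo_queue_nonempty:
        return True
    # Plain in-order data with nothing queued above it: delayed ACK as usual.
    return False


print(acks_immediately_proposed(1460, 1460, ooo_queue_nonempty=True))
print(acks_immediately_proposed(1460, 1460, ooo_queue_nonempty=False))
```

Ordinary in-order transfers keep their delayed ACKs under this rule; only
segments that arrive while a hole exists are affected.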
neal