Tutorial >> Summary Information | TOC |
Linux 2.4 TCP is NOT Reno (nor Vegas, nor Tahoe) and cannot be made to behave exactly like these other TCP flavors without substantial kernel modifications. There are three primary references:
The maximum socket buffer sizes have been set to 40 MB for both reading and writing, and the maximum per connection buffer sizes to 20 MB for both reading and writing. These values can be verified with:

    cat /proc/sys/net/core/[rw]mem_max    # max recv/send windows
    cat /proc/sys/net/ipv4/tcp_[rw]mem    # max TCP recv/send buffers

The SO_SNDBUF and SO_RCVBUF arguments to setsockopt() are bounded by one-half of the rmem_max and wmem_max values. You will notice this behavior when you use the -w flag in the iperf command. For example, "-w 4M" will result in a message indicating that the buffer size has actually been set to 8 MB (twice the requested 4 MB).
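The doubling described above can be modeled with a small shell function. This is a sketch of the arithmetic only, following the rule as stated in this section (the request is capped at half of the maximum, then doubled); `effective_bufsize` is a hypothetical helper name, not an ONL or kernel tool.

```shell
# effective_bufsize REQUESTED MAX
# Hypothetical helper: models the rule described in the text, where a
# requested SO_SNDBUF/SO_RCVBUF size is capped at MAX/2 and then doubled.
# Both arguments are in bytes.
effective_bufsize() {
    requested=$1
    limit=$(( $2 / 2 ))
    if [ "$requested" -gt "$limit" ]; then
        requested=$limit
    fi
    echo $(( requested * 2 ))
}

# "-w 4M" with a 40 MB maximum: prints 8388608 (8 MB)
effective_bufsize $(( 4 * 1024 * 1024 )) $(( 40 * 1024 * 1024 ))
```

Passing the value read from /proc/sys/net/core/rmem_max or wmem_max as the second argument gives the size you should expect iperf to report for a given -w request.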
You can inspect the TCP tuning parameters either by examining the files /proc/sys/net/ipv4/tcp* using the cat or more commands or by the sysctl command. For example, to display all TCP parameters, try one of these two commands:

    more /proc/sys/net/ipv4/tcp*
    sysctl -a | grep tcp

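If you prefer one parameter per line with its name, the same files can be read in a loop. The function below is a minimal sketch; `show_params` is a hypothetical name, and the optional directory argument exists only so the function can be pointed at another location.

```shell
# show_params [DIR]
# Hypothetical helper: prints "name = value" for every readable tcp* file
# in DIR (default: /proc/sys/net/ipv4). Tabs inside multi-valued entries
# such as tcp_rmem are replaced with spaces.
show_params() {
    dir=${1:-/proc/sys/net/ipv4}
    for f in "$dir"/tcp*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s = %s\n' "$(basename "$f")" "$(tr '\t' ' ' < "$f" 2>/dev/null)"
    done
}
```

Calling `show_params` with no argument lists the live kernel settings.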
All standard advanced TCP features are ON by default in the ONL testbed. Try one of these commands:

    cat /proc/sys/net/ipv4/tcp_{timestamps,window_scaling,sack}
    sysctl net.ipv4.tcp_{timestamps,window_scaling,sack}

and you will discover that each is set to 1 (ON). Timestamps and SACK can be turned OFF with:

    sudo /usr/local/bin/net/timestamps-off
    sudo /usr/local/bin/net/sack-off

They can be turned back ON by calling the complementary command (e.g., timestamps-on).
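Because these switches persist until someone flips them back, it can be worth confirming their state before starting an experiment. The following is a sketch of such a check; `check_features` is a hypothetical helper, and it treats a missing or unreadable file as OFF.

```shell
# check_features [DIR]
# Hypothetical helper: reports ON/OFF for timestamps, window scaling and
# SACK, and returns non-zero if any of them is not set to 1.
# DIR defaults to /proc/sys/net/ipv4; a missing file counts as OFF.
check_features() {
    dir=${1:-/proc/sys/net/ipv4}
    rc=0
    for name in timestamps window_scaling sack; do
        value=$(cat "$dir/tcp_$name" 2>/dev/null)
        if [ "$value" = "1" ]; then
            echo "tcp_$name: ON"
        else
            echo "tcp_$name: OFF"
            rc=1
        fi
    done
    return $rc
}
```

A script can then guard a run with `check_features || echo "a TCP feature is OFF"`.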
If you are planning to experiment with a long-delay path, you should look at Yee-Ting Li's work. In short, there are low-level buffers that may be sized too small when you have long, fat pipes (delays of hundreds of milliseconds and Gbps rates). Unfortunately, Linux will silently drop packets when these buffers are full, leaving no indication at the end host that packets have been lost. But these parameters may have little utility with the delay plugin since it limits traffic to 200 Mbps. The following two commands allow you to increase the size of the receive and send buffers from their defaults.
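Independently of those commands, the difference between a Gbps path and the 200 Mbps delay plugin can be put in numbers by computing the bandwidth-delay product, i.e., the amount of data in flight on a fully used path. The function below is a sketch of that arithmetic only; `bdp_bytes` is a hypothetical helper name.

```shell
# bdp_bytes RATE_MBPS RTT_MS
# Bandwidth-delay product in bytes:
#   RATE_MBPS * 10^6 bit/s * (RTT_MS / 1000) s / 8 bit/byte
#   = RATE_MBPS * RTT_MS * 125
bdp_bytes() {
    echo $(( $1 * $2 * 125 ))
}

bdp_bytes 200 100     # 200 Mbps, 100 ms RTT: prints 2500000
bdp_bytes 1000 200    # 1 Gbps, 200 ms RTT:  prints 25000000
```

At the plugin's 200 Mbps limit, a 100 ms path holds about 2.5 MB in flight, a tenth of what a 1 Gbps, 200 ms path would hold.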
Readers should consult the Sarolahti and Kuznetsov paper for a detailed discussion of IETF conformance. This section summarizes what we consider to be the important differences for bulk transfer experiments in the ONL testbed. The table below is a reproduction of the conformance table in the Sarolahti and Kuznetsov paper.
Specification | Status |
---|---|
RFC 1323 (Performance Extensions) | Same |
RFC 2018 (SACK) | Same |
RFC 2140 (TCP Control Block Sharing) | Same |
RFC 2581 (Congestion Control) | Differs |
RFC 2582 (New Reno) | Differs |
RFC 2861 (Cwnd Validation) | Same |
RFC 2883 (Duplicate SACK) | Same |
RFC 2988 (RTO) | Differs |
RFC 3042 (Limited Transmit) | Same |
RFC 3168 (ECN) | Differs |
Below are comments on some of the features listed in the table:
Perhaps the most noticeable Linux 2.4 TCP feature is its retentive destination caching, in which the ssthresh and RTO estimation parameters are cached for each destination. This means that once traffic has been sent to some destination D, the initial ssthresh value for the next TCP connection to D will be the same as the one at the end of the preceding connection to D. This is disconcerting when you change the packet delay to destination D and expect the initial ssthresh value to be infinite, i.e., expect the connection to stay in slow start until a packet drop occurs. The command

    sudo /usr/local/bin/net/tcp-route-flush

flushes the cache and, in effect, makes ssthresh infinite. Unfortunately, it needs to be run before EACH new TCP connection.
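Since the flush has to precede every new connection, it is easy to forget. One way to avoid that is a small wrapper that flushes and then runs whatever command opens the connection. This is a sketch: `run_fresh` and the `FLUSH_CMD` variable are hypothetical names introduced here, with `FLUSH_CMD` defaulting to the flush command given above.

```shell
# FLUSH_CMD is a hypothetical override hook; by default it is the ONL
# flush command quoted in the text.
FLUSH_CMD=${FLUSH_CMD:-"sudo /usr/local/bin/net/tcp-route-flush"}

# run_fresh COMMAND [ARGS...]
# Hypothetical wrapper: flushes the destination cache, then runs COMMAND.
# COMMAND is not run if the flush fails.
run_fresh() {
    $FLUSH_CMD || return 1
    "$@"
}
```

For example, `run_fresh iperf -c "$SERVER" -w 4M` (with `SERVER` set to your receiver's host name) starts each iperf run with a flushed cache.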