Hi,

First, great job on Mininet. It's a wonderful tool, and I think replicating the results of CS papers is extremely important.

Second, I've been having trouble replicating the DCTCP experiments described in http://reproducingnetworkresearch.wordpress.com/2012/06/09/dctcp-2/

With 3.2 and 3.2.54 kernels, it doesn't appear as if the ECN marking is occurring. The bottleneck queue length has the same oscillation between 200 and 425 packets for reno, reno+ecn, and dctcp. I added an execution of 'tc -s qdisc' on the switch at the end of dctcp.py, and it confirms that no packets are being marked.

The behavior improves somewhat with a 3.6 kernel (the patch required little modification up to that point in the series). At this point I see reno+ecn working to keep the bottleneck queue length below 30 packets, but dctcp still doesn't appear to work even though stats show the switch is marking packets. I have also uncommented the printks marking the transition between CE=0 and CE=1 states in the ACK generation state machine, but see nothing in dmesg.

Do you have any insights into what might be going wrong? Sometimes I worry that my laptop isn't fast enough, but see below.

Thank you for any information you might share,
Andrew Shewmaker

My laptop specs are:
2GHz 8-core i7 laptop w/ 8GB RAM
Fedora 18
custom 3.2, 3.2.54, and 3.6 kernels + dctcp patch
openvswitch 2.0
mininet from git

'sudo mn --test=iperf' yields:
550+ Mbps on 3.2.x
750+ Mbps on 3.6
10+ Gbps on 3.10+ (tso on)
1+ Gbps on 3.10+ (tso off, gso on)
550+ Mbps on 3.10+ (tso/gso off)

'sudo mn --link=tc,bw=100 --test=iperf' yields:
78+ Mbps on 3.2.x
90+ Mbps on 3.10+

And those rates decrease again for the dctcp.py experiment:
7-16 Mbps on 3.2.x
20-30 Mbps on 3.6+

These don't seem fast enough to cause congestion, but the bottleneck queue length zigzags between 200 and 425 packets, and I see the regular reno cwnd response.

--
Andrew Shewmaker
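(For reference, a minimal sketch of the kind of 'tc -s qdisc' check described above, as it might be called at the end of a Mininet script. The switch name 's1' and the helper itself are illustrative assumptions, not the actual dctcp.py code.)

```python
# Minimal sketch: dump qdisc statistics on the switch to see whether the
# RED qdisc is ECN-marking packets. Hypothetical helper, not part of dctcp.py.
from mininet.log import info

def dump_marking_stats(net, switch_name='s1'):
    """Print 'tc -s qdisc' output for every interface on the switch.

    Lines containing 'marked' report how many packets the RED qdisc has
    ECN-marked; a zero count suggests marking is not happening at all.
    """
    switch = net.get(switch_name)
    for intf in switch.intfList():
        if intf.name == 'lo':
            continue
        stats = switch.cmd('tc -s qdisc show dev %s' % intf.name)
        info('*** qdisc stats for %s:\n%s\n' % (intf.name, stats))
```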
Hi Andrew,
The README (https://bitbucket.org/nikhilh/mininet_tests/src/ad08368cf347/dctcp/README) suggests using a 3.2.18 kernel. Have you tried that same version? You may find these deb packages useful: http://www.scs.stanford.edu/~jvimal/kernels. You can try using the "alien" utility to convert the deb to rpm. IIRC, these were the kernel binaries we gave Stanford students for DCTCP-related experiments.
The TCP code in the kernel changes very often, and DCTCP hasn't been merged into mainline, so it would take a lot of time to debug the issue. Try the above kernel and let us know.
The 78 Mb/s average rate on the 3.2.x kernel with --bw=100 sounds troubling. At 100 Mb/s, you would need a 120us timer to trickle out 1500-byte packets for good rate limiting. I would try using a server for any experiment that requires good performance fidelity.
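(For context, the 120us figure is just the serialization time of one full-sized packet at the link rate:

\[
t_{\text{pkt}} = \frac{1500\ \text{bytes} \times 8\ \text{bits/byte}}{100\ \text{Mb/s}}
             = \frac{12\,000\ \text{bits}}{10^{8}\ \text{bits/s}}
             = 120\ \mu\text{s},
\]

so the software rate limiter has to release roughly one packet every 120us, which is why coarse timer resolution hurts fidelity at this bandwidth.)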
--
Vimal
Thanks Vimal,
I had forgotten that the README suggests that specific kernel. I've tried building that same version on multiple hosts, including a server that has a good timer.
The server has significantly better fidelity, but I don't see the RED queue on the switch marking any packets with the 3.2.18 kernel. With 3.6 I see marked packets and TCP with standard ECN working.
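(For readers following along, the "RED queue on the switch" in these Mininet DCTCP setups is a RED qdisc with the 'ecn' flag installed on the bottleneck interface, roughly as in the sketch below. The interface name, rate, and RED thresholds are illustrative assumptions, not the actual dctcp.py values.)

```python
# Hedged sketch: install a rate limiter plus a RED qdisc with ECN marking
# on the switch's bottleneck interface. Interface name, rate, and RED
# thresholds here are illustrative, not dctcp.py's exact parameters.
def enable_red_ecn(switch, intf='s1-eth1', bw_mbit=100):
    # Rate-limit the interface with HTB so a queue can actually build up.
    switch.cmd('tc qdisc add dev %s root handle 1: htb default 1' % intf)
    switch.cmd('tc class add dev %s parent 1: classid 1:1 '
               'htb rate %dmbit' % (intf, bw_mbit))
    # RED child qdisc with the 'ecn' flag: once the average queue exceeds
    # the threshold, packets are ECN-marked instead of dropped.
    switch.cmd('tc qdisc add dev %s parent 1:1 handle 10: red '
               'limit 1000000 min 30000 max 35000 avpkt 1500 burst 20 '
               'bandwidth %dmbit probability 1 ecn' % (intf, bw_mbit))
```

If 'tc -s qdisc' then shows a non-zero 'marked' counter under this qdisc while a flow is running, the switch side is doing its job, and the problem is more likely on the DCTCP sender/receiver side of the patch.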
I'll try your deb packages next.
Thanks again.
Andrew
Vimal,
I've used your kernel deb packages on Ubuntu 12.04.1. The results from dctcp look like those from tcpecn. The bottleneck queue length approaches 200 packets instead of oscillating around 20 packets as it should. I've attached the results, minus the large tcp_probe.txt file.
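(For anyone reproducing this, the queue-length traces being compared here are typically gathered by polling the bottleneck qdisc. A minimal, hypothetical monitor sketch follows; the interface name and parsing are assumptions, not the actual experiment's monitor code.)

```python
# Hedged sketch: sample the backlog of the bottleneck qdisc at a fixed
# interval by parsing 'tc -s qdisc'. Interface name and regex are
# illustrative; the real experiment's monitor may differ.
import re
import time

def monitor_queue(switch, intf='s1-eth1', interval=0.1, duration=30,
                  outfile='qlen.txt'):
    """Append (timestamp, packets-in-queue) samples to outfile."""
    end = time.time() + duration
    with open(outfile, 'w') as f:
        while time.time() < end:
            out = switch.cmd('tc -s qdisc show dev %s' % intf)
            # e.g. 'backlog 123450b 82p' -> 82 packets currently queued
            m = re.search(r'backlog\s+\S+\s+(\d+)p', out)
            qlen = int(m.group(1)) if m else 0
            f.write('%f,%d\n' % (time.time(), qlen))
            time.sleep(interval)
```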
The server I'm testing with is a 1.6GHz Opteron, and the timer precision is well below 120us.
Any ideas for what I should look at next?
Thanks!