Before putting our new 40Gbit/s link into production, I was given the task of testing it and checking whether it was production ready. We don’t have a dedicated 40Gbit/s tester yet; we only have a 40Gbit/s Ethernet card inside a Linux server. There are techniques for checking fast links with slower test equipment, and one of them is also described here. But my goal was also to find out whether, with a very limited budget (and equipment), I could do some testing on a 40Gbit/s link with a 40Gbit/s Ethernet card and a Linux server running the packETH application. This article gives some guidelines and examples for performing this kind of test with packETH.
The initial setup is shown in the photo below:
Based on the results we got at 10Gbit/s (http://packeth.sourceforge.net/packeth/Performance.html), it was quite clear that producing full 40Gbit/s line rate wouldn’t be a trivial task. Jumbo frames would certainly be necessary, so the first thing I did was set the MTU to 9000 on all ends.
On the Linux side you can do this simply by typing ifconfig ethX mtu 9000 (this value excludes the L2 header). You can always check with packETH whether the MTU is set correctly by sending out a packet of the desired length. If you send one packet from the Builder window, the status bar tells you how many bytes were sent on the link (including the L2 Ethernet header). A return value of -1 indicates an error. You typically get this error when you try to send a packet larger than the interface MTU allows. An example: if your interface MTU is 1500 but you create and send a 2000-byte packet, you get -1 as the result. If the MTU is set correctly, you will get: Sent 2000 bytes on ethX.
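As a quick sanity check of the numbers above, here is a small sketch (illustrative only, assuming a plain untagged Ethernet frame with a 14-byte header and no FCS counted) of how the interface MTU relates to the frame length packETH reports:

```python
# Illustrative numbers only: how the interface MTU relates to the
# frame length packETH reports (which includes the 14-byte L2 header).
ETH_HEADER = 14          # dst MAC + src MAC + EtherType, no VLAN tag


def max_frame(mtu):
    """Largest frame packETH can send on an interface with this MTU."""
    return mtu + ETH_HEADER


def fits(frame_len, mtu):
    """Would sending this frame succeed, or return -1 (too big)?"""
    return frame_len - ETH_HEADER <= mtu


print(max_frame(1500))   # 1514
print(fits(2000, 1500))  # False -> packETH returns -1
print(fits(2000, 9000))  # True  -> "Sent 2000 bytes on ethX"
```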
1) packETH performance on 40Gbit/s
The first test was to try to produce full line rate traffic with packETH directly. Because I wanted to test the link between the Ciscos in both directions, the routing setup was done as you can see in the photo below. The first 6500 switched the packet towards the second 6500, which routed the packet back to the first 6500, which finally discarded it (ip route -> Null0). The setup on the Linux server is pretty simple as well: the destination MAC address has to match the L3 interface of the second Cisco, and the destination IP has to match the final destination. Don’t forget to disable IP redirects on the second switch (the “no ip redirects” command on the interface), because you are sending the packets out of the same interface they are coming in on. Without this command the switch might punt packets to the CPU, which will overload the CPU and decrease performance as well.
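A minimal sketch of the relevant configuration could look like the one below. All interface names and addresses here are made up for illustration; only the commands themselves come from the setup described above.

```
! Second 6500: route the test destination back towards the first 6500
ip route 10.99.99.0 255.255.255.0 10.0.0.1
!
interface FortyGigabitEthernet1/1
 mtu 9000
 ip address 10.0.0.2 255.255.255.0
 no ip redirects   ! packets leave the same interface they arrived on
!
! First 6500: finally discard the looped traffic
ip route 10.99.99.0 255.255.255.0 Null0
```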
Increasing the load in packETH produced overrun errors on the 6500 input interface once the rate passed 12-13Gbit/s. I could see packETH easily going higher towards 40Gbit/s, and the input counters confirmed that, but the exit traffic was limited to about 12-13Gbit/s. Some investigation into the 6500 40Gbit/s card architecture showed that the 40Gbit/s ports are actually a sort of internal 4x16Gbit/s EtherChannel, which explains the above result (the 16Gbit/s includes all the internal headers and overhead). You can see the card architecture if you click on the picture below (the bottleneck is circled in red):
Knowing that, you cannot achieve full line rate with a single flow; you have to do some sort of load balancing. On the 6500 you can specify which criteria to use for load balancing. Since packETH supports the option “change source IP address while sending”, source IP was selected as the load-balancing criterion on the switch.
How much this influences the overall performance of packETH is visible below. The performance drop is understandable, since with this option enabled packETH has to recalculate the IP checksum (and the UDP or TCP checksum in the case of UDP or TCP packets*) for every packet sent on the link. But even with this option I got overrun errors while approaching 40Gbit/s. It could be that the randomness packETH uses (the rand() function from the C library) is not random enough for the Cisco. In any case, even running two or more parallel generators, which did give full line rate, was not good enough to pass the test without any errors, which was the final requirement.
* Correct TCP and UDP checksum recalculation works from version 1.7.3 on.
packETH, 9000bytes, type UDP, max speed, no options -> 37,5Gbit/s
packETH, 9000bytes type UDP, max speed, change source ip address -> 11Gbit/s
packETH, 9000bytes no UDP&TCP header, max speed, change source ip address -> 35,5Gbit/s (1)
packETHcli, 9000bytes UDP, max speed, no output to stdout -> 39,8Gbit/s (2)
packETHcli, 9000bytes UDP, max speed, display output every 1s -> 39,8Gbit/s (3)
Ad 1: if you change the source IP while sending, you need to recalculate the IP and UDP (or TCP) checksums, which requires even more computing. It is therefore better to create a packet that is neither UDP nor TCP, so that only the IP checksum has to be recalculated. The end result is 3x better!
Ad 2: an example of how to run the CLI version: ./packETHcli -i eth0 -d -1 -n 0 -m 2 -f juni2-9000.pcap
Ad 3: an example of how to run the CLI version: ./packETHcli -i eth0 -d 0 -n 0 -m 2 -f juni2-9000.pcap
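To illustrate the per-packet work that Ad 1 refers to, here is a small sketch of the standard RFC 1071 IP header checksum. The addresses and header values are made up for the example; the point is that when the source IP changes for every packet, this computation has to run for every packet as well (and for UDP or TCP packets, a second checksum over the payload on top of it).

```python
import socket
import struct


def ip_checksum(header: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit words."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# A 20-byte IPv4 header with the checksum field zeroed
# (addresses are made up for the example)
hdr = struct.pack("!BBHHHBBH4s4s",
                  0x45, 0, 9000,             # version/IHL, TOS, total length
                  0, 0,                      # ID, flags/fragment offset
                  64, 17, 0,                 # TTL, protocol (UDP), checksum=0
                  socket.inet_aton("10.0.0.1"),
                  socket.inet_aton("10.99.99.1"))
csum = ip_checksum(hdr)

# Sanity check: a header carrying a valid checksum sums to zero
full = hdr[:10] + struct.pack("!H", csum) + hdr[12:]
print(hex(csum), ip_checksum(full))  # the second value must be 0
```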
I also did some tests to see what the maximum performance in this setup is with different packet lengths. What if you want to use shorter packets; what performance can you expect then? As with the 10Gbit/s tests, I ran the test at 40Gbit/s and found some interesting things:
The first thing is obvious: we need jumbo frames to get rates close to 40Gbit/s. But even then, we can get different results. Same setup, same link, same parameters, yet two different results; why? I don’t know. Rebooting the server mostly helped to get better results, but even that was not always the case. The exact reason remains open for now.
Conclusion: it is clear that reaching 40Gbit/s is not as easy as reaching 10Gbit/s, especially if some parameters are changed during sending, which requires checksum recalculation. Without such changes, packETHcli can be used to put full line rate on the link.
Finally, because of the Cisco 40Gbit/s card architecture, I had to choose a different setup to get full link capacity.
2) using packETH with 4 different streams
Because of the Cisco 40Gbit/s card architecture I had to change the test scenario to get full link capacity between the Ciscos. In this case we have two different options. Either way, we first use the GUI to create four different packets that will meet the load-balancing criteria (it was enough to choose four different source IP addresses; with a little guessing it was clear they were correctly load balanced). Then we have two options. We can use the GUI’s Gen-s mode to send four different streams; for maximum performance choose manual stream mode and set all the gaps to 0 and the number of packets to 1. With this setup I achieved approx. 33,7Gbit/s. The other option is to start four CLI generators, each one sending a different packet. It turns out that the appropriate gap in this case is about 7 µs, which brings approx. 10Gbit/s per stream and, all together, close to full line rate.
packETH, Gen-s, 9000bytes, UDP, 4 streams, manual mode, 0ms gaps -> 33,7Gbit/s
packETHcli, 9000bytes, UDP, 4x streams 10Gbit/s each -> 39,8Gbit/s (1)
Ad 1: start 4 packETHcli generators in 4 different windows, each one started as:
./packETHcli -i eth0 -d 7 -n 0 -m 2 -f fileX.pcap (where fileX stands for one of the four different packet files)
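A back-of-the-envelope check of the gap value, assuming the -d parameter is the inter-packet period in microseconds (an assumption based on the measured rates, not on the packETH manual) and ignoring serialization time:

```python
# Rough per-stream rate for 9000-byte packets at a given inter-packet
# period, assuming -d is that period in microseconds (an assumption).
FRAME_BYTES = 9000


def rate_gbps(gap_us):
    """Approximate stream rate in Gbit/s for a given period in µs."""
    return FRAME_BYTES * 8 / (gap_us * 1e-6) / 1e9


print(round(rate_gbps(7), 1))      # ~10.3 Gbit/s per stream
print(round(4 * rate_gbps(7), 1))  # 4 streams together ≈ 41.1 Gbit/s
```

This matches the observed ~10Gbit/s per stream and near line rate with four streams.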
3) using packETH and looping the traffic between the 6500s
Finally, we can use the routers themselves as traffic generators. In this case we configure routes that loop the traffic between both routers until the TTL reaches 0. We use packETH to generate 4 different streams of approx. 78Mbit/s each, with the TTL set to 254. Since the TTL is decremented on every hop, each packet crosses the link 127 times in each direction, so our traffic is multiplied 127 times on the link, which totals nearly 40Gbit/s (4 x 78Mbit/s x 127 = 39,6Gbit/s). The picture below shows the setup for one stream; the same applies to all 4 streams. In this scenario you don’t even need a 40Gbit/s link between the Linux server and the router, which means you can do it without a 40Gbit/s Ethernet card.
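The arithmetic behind the multiplication, as a quick sanity check:

```python
# TTL-loop amplification: the packet bounces between the two routers
# until TTL expires, crossing the link twice per loop, so in one
# direction it appears TTL // 2 times.
streams = 4
per_stream_mbps = 78
ttl = 254

passes_per_direction = ttl // 2   # 127 passes over the link
total_gbps = streams * per_stream_mbps * passes_per_direction / 1000
print(total_gbps)                 # 39.624 -> approx. 39,6Gbit/s
```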
I think you can use packETH to successfully test 40Gbit/s links. It is not a dedicated hardware tester, and it cannot do full line rate tests with shorter packets. But with these limitations in mind you can still do tests that will, at least on some points, assure you that the link works as required and that no traffic gets lost as the load approaches 40Gbit/s.
In this post not much is said about how to check for errors or lost packets. In the tests above you mostly do it on the switch side, by observing the packet and error counters on the input and output interfaces. How to use the counters in packETH, and what you need to be aware of when relying on them, will be described in another post.
Ethernet card: Mellanox ConnectX-3 MCX313A
driver: mlx4_en (MT_1060140023_CX-3)
version: 1.5.10 (Jan 2013)
Server: HP DL380 Gen 8, 32GB RAM, 24 logical CPUs
DUT: Cisco 6500 sup2t, WS-X6904-40G card