Formal RingBuffer 3: Getting Low and Getting Results
pub fn buildPacket() []const u8 {
    // Define the packet components
    const dest_mac: [6]u8 = [_]u8{0xFF} ** 6; // Destination MAC (broadcast)
    const src_mac: [6]u8 = [_]u8{0x00} ** 6; // Source MAC (zeroed)
    const ethertype: [2]u8 = [_]u8{ 0x08, 0x00 }; // EtherType (IPv4)
    const payload: [46]u8 = [_]u8{0} ** 46; // Payload (46 bytes of zeroes)
    // Concatenate the components into a single 60-byte frame.
    // ++ works here because every component is comptime-known, so the
    // packet gets static storage and the returned slice stays valid.
    const packet: [60]u8 = dest_mac ++ src_mac ++ ethertype ++ payload;
    // Return as a slice
    return packet[0..];
}
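Sixty bytes is the minimum Ethernet frame size (64 bytes) minus the 4-byte FCS, which the NIC appends on the wire. A quick layout check, written as a hypothetical test of mine appended to the same file rather than anything from the gist:

const std = @import("std");

test "buildPacket produces a minimal broadcast IPv4 frame" {
    const pkt = buildPacket();
    try std.testing.expectEqual(@as(usize, 60), pkt.len);
    // Bytes 0-5: broadcast destination MAC
    const bcast = [_]u8{0xFF} ** 6;
    try std.testing.expectEqualSlices(u8, &bcast, pkt[0..6]);
    // Bytes 12-13: EtherType 0x0800 (IPv4)
    try std.testing.expectEqual(@as(u8, 0x08), pkt[12]);
    try std.testing.expectEqual(@as(u8, 0x00), pkt[13]);
}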
The Python speeds are as follows for tx.py:
870.232124 main_thread [2781] 17.326 Mpps (17.335 Mpkts 8.317 Gbps in 1000479 usec) 256.34 avg_batch 1 min_space
871.233117 main_thread [2781] 19.112 Mpps (19.131 Mpkts 9.174 Gbps in 1000994 usec) 256.22 avg_batch 1 min_space
872.234124 main_thread [2781] 17.408 Mpps (17.425 Mpkts 8.356 Gbps in 1001007 usec) 256.14 avg_batch 256 min_space
873.235122 main_thread [2781] 18.698 Mpps (18.717 Mpkts 8.975 Gbps in 1000998 usec) 256.34 avg_batch 1 min_space
874.236123 main_thread [2781] 19.919 Mpps (19.939 Mpkts 9.561 Gbps in 1001001 usec) 256.15 avg_batch 1 min_space
875.237122 main_thread [2781] 18.584 Mpps (18.603 Mpkts 8.921 Gbps in 1000999 usec) 256.10 avg_batch 1 min_space
876.238111 main_thread [2781] 19.494 Mpps (19.513 Mpkts 9.357 Gbps in 1000989 usec) 256.06 avg_batch 256 min_space
And now the new and improved Zig speeds:
956.379630 main_thread [2781] 24.087 Mpps (24.112 Mpkts 11.562 Gbps in 1001033 usec) 256.05 avg_batch 0 min_space
957.380678 main_thread [2781] 24.205 Mpps (24.230 Mpkts 11.618 Gbps in 1001048 usec) 256.03 avg_batch 512 min_space
958.381724 main_thread [2781] 24.406 Mpps (24.431 Mpkts 11.715 Gbps in 1001047 usec) 256.02 avg_batch 512 min_space
959.382651 main_thread [2781] 24.374 Mpps (24.396 Mpkts 11.699 Gbps in 1000927 usec) 256.03 avg_batch 1 min_space
960.383717 main_thread [2781] 23.921 Mpps (23.946 Mpkts 11.482 Gbps in 1001066 usec) 256.04 avg_batch 256 min_space
961.384121 main_thread [2781] 24.063 Mpps (24.073 Mpkts 11.550 Gbps in 1000403 usec) 256.02 avg_batch 512 min_space
962.385148 main_thread [2781] 24.095 Mpps (24.120 Mpkts 11.566 Gbps in 1001027 usec) 256.01 avg_batch 256 min_space
963.386189 main_thread [2781] 22.208 Mpps (22.231 Mpkts 10.660 Gbps in 1001042 usec) 256.02 avg_batch 256 min_space
964.387235 main_thread [2781] 21.936 Mpps (21.959 Mpkts 10.529 Gbps in 1001046 usec) 256.02 avg_batch 1 min_space
965.388283 main_thread [2781] 23.325 Mpps (23.349 Mpkts 11.196 Gbps in 1001047 usec) 256.02 avg_batch 256 min_space
966.389328 main_thread [2781] 24.206 Mpps (24.232 Mpkts 11.619 Gbps in 1001046 usec) 256.02 avg_batch 1 min_space
As you can see, the two are nearly the same, because for once they are actually doing the same thing! Imagine that. A quick sanity check: 24.2 Mpps at 60 bytes per frame is 24.2e6 × 60 × 8 ≈ 11.6 Gbps, exactly the rate reported, so the counters are finally consistent. After getting all excited I realized how seriously bad it was that I had earlier appeared to saturate the connection while sending packets of length 60; that simply is not possible if both programs are doing exactly the same work. I also finally get packets on the other end of the VALE switch using onepacket.py:
Waiting for a packet to come
Received a packet with len 60
ffffffffffff000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Waiting for a packet to come
Received a packet with len 60
ffffffffffff000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Waiting for a packet to come
Received a packet with len 60
ffffffffffff000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Waiting for a packet to come
Received a packet with len 60
ffffffffffff000000000000080000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
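Each dump is exactly the frame built above: six 0xFF bytes of broadcast destination MAC, six zero bytes of source MAC, the 0x0800 EtherType, and 46 zero payload bytes. A trivial check you could run against a captured frame (a hypothetical helper, not part of the gist):

const std = @import("std");

// Returns true if a received frame matches the one buildPacket() emits.
fn isTestFrame(frame: []const u8) bool {
    return std.mem.eql(u8, frame, buildPacket());
}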
The number one thing to remember is that netmap does not make your network go faster! Despite this, many of the applications that use it are built around doing things fast. Go figure.
100 Gbps seems excessive, doesn't it? This is because, it turns out, I was updating the tx and rx slot lengths simultaneously in some hideous mistake. Whenever you get astronomical values like that, it is worth checking further to see what is actually going on. Since writing this I had a power outage, and after bringing the machine back up nothing worked; it turned out a few things were still wrong. Since then I have simplified the API some more to store the addresses of both rings and buffers as pointers, and I successfully use anyopaque to store the buffers. This looks a bit more concise:
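The embed cuts off before that code, but the shape being described (ring and buffer addresses held as raw pointers, buffers type-erased behind anyopaque) might look roughly like this; the field and function names here are my guesses, not the gist's actual code:

const RingBuffer = struct {
    // Hypothetical fields: the gist stores the addresses of both rings
    // and of the buffer region as pointers; the exact names are not shown.
    tx_ring: *anyopaque,
    rx_ring: *anyopaque,
    bufs: *anyopaque, // buffer base, cast back to [*]u8 at the point of use
    buf_size: usize,

    // Recover a typed pointer to the n-th buffer.
    fn bufAt(self: *const RingBuffer, n: usize) [*]u8 {
        const base: [*]u8 = @ptrCast(self.bufs);
        return base + n * self.buf_size;
    }
};

Keeping everything behind *anyopaque means the struct does not depend on the C ring types at all; the casts happen only where a buffer is actually touched.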