Zig Pcap Bitcast || Nadim's adventures at sea

In this article, I want to go through how to use Zig’s @bitCast to extract specific fields from a network packet’s TCP header.

Context

Tigerbeetle have a demo on @bitCast which shows how to fill a struct’s values by pointing it to bytes in memory without any parsing, deserialisation or other processing. This works so long as the memory layout at the specified address matches the struct’s layout. Zig will also check that the size of the type we are casting to (e.g. the struct’s type) is the same as the size of the data we are passing in.

[Image: Bitcast]

In the image above, we show that 4 consecutive bytes in memory could be interpreted in multiple ways, for example as a single u32, or as two u16 values (which could be a struct field each). This can be applied to bits as well, where a u8 such as 10101010 could be interpreted as eight u1 values, each representing a state of “on” or “off” for example.
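To make this concrete, here is a minimal sketch of those reinterpretations, assuming Zig 0.11 or later (where @bitCast takes a single argument and infers the result type). The TwoHalves and Bits types and the helper functions are made up for this illustration and are not part of the Tigerbeetle demo.

```zig
const std = @import("std");

// Hypothetical types used only for this illustration.
const TwoHalves = packed struct {
    first: u16,
    second: u16,
};

const Bits = packed struct {
    b0: u1,
    b1: u1,
    b2: u1,
    b3: u1,
    b4: u1,
    b5: u1,
    b6: u1,
    b7: u1,
};

// Four bytes viewed as one 32-bit integer (host byte order).
fn asU32(bytes: [4]u8) u32 {
    return @bitCast(bytes);
}

// The same four bytes viewed as two 16-bit fields.
fn asTwoHalves(bytes: [4]u8) TwoHalves {
    return @bitCast(bytes);
}

// One byte viewed as eight single-bit fields; b0 is the least significant bit.
fn asBits(byte: u8) Bits {
    return @bitCast(byte);
}

pub fn main() void {
    const bytes = [4]u8{ 0x01, 0x02, 0x03, 0x04 };
    const word = asU32(bytes);
    const halves = asTwoHalves(bytes);
    const bits = asBits(0b10101010);

    // On a little-endian host the u32 is 0x04030201 and the halves are 0x0201 and 0x0403.
    std.debug.print("u32:  0x{x:0>8}\n", .{word});
    std.debug.print("u16s: 0x{x:0>4} 0x{x:0>4}\n", .{ halves.first, halves.second });
    std.debug.print("b0: {d}, b7: {d}\n", .{ bits.b0, bits.b7 });
}
```

The size check mentioned earlier applies here too: casting a [4]u8 to a u16, for instance, is rejected at compile time because the bit sizes differ.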

In this article, we want to extract the source and destination ports from the TCP header of a packet that we captured. Wireshark and tshark are tools that can provide this information, but if we want to use those values in another program we would need to spawn them as a subprocess and parse their output. This would slow the program down if we were parsing millions of pcaps. We can however leverage the “pcap” library (libpcap) which those tools are built upon. Together with Zig’s @bitCast we can extract the fields that we want. We will use a simplified capture with a single packet extracted from a sample source pcap:

$ wget https://wiki.wireshark.org/uploads/__moin_import__/attachments/SampleCaptures/fix.pcap
$ editcap -r fix.pcap onefix.pcap 12

Let’s look at how we can achieve this!

Updates

Zig is still under heavy development and as such we can expect breaking changes when building older code against newer versions. I recently tried to build this code again and ran into some issues which are documented in this section.

20231127

Newer versions of Zig changed the @bitCast builtin to take only a single argument. According to this commit, zig fmt automatically fixes this.

$ zig version
0.12.0-dev.1744+f29302f91
$ zig build-exe -lpcap -lc src/main.zig  --cache-dir zig-cache
src/main.zig:17:25: error: expected 1 argument, found 2
    const tcp_src_dst = @bitCast(tcp_header_partial, packet[34..38].*); // <--
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ zig fmt src/main.zig
$ grep bitCast src/main.zig
    const tcp_src_dst = @as(tcp_header_partial, @bitCast(packet[34..38].*)); // <--

That looks good, although we could remove the @as cast and explicitly declare the type of tcp_src_dst. We’ll keep the code zig fmt produced and try to build again:

$ zig build-exe -lpcap -lc src/main.zig  --cache-dir zig-cache
src/main.zig:13:9: error: local variable is never mutated
    var errbuf: [*]u8 = undefined;
        ^~~~~~
src/main.zig:13:9: note: consider using 'const'

The compiler gives us a great hint, so let’s apply its suggestion. Here is the final code:

const std = @import("std");
const c = @cImport({
    @cInclude("pcap/pcap.h");
});

const tcp_header_partial = packed struct {
    src_port: u16,
    dst_port: u16,
};

pub fn main() !void {
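    var header = c.struct_pcap_pkthdr{ .ts = undefined, .caplen = undefined, .len = undefined };
    // libpcap writes an error message through errbuf on failure; a robust
    // program would pass a buffer of PCAP_ERRBUF_SIZE bytes instead.
    const errbuf: [*]u8 = undefined;
    const pcap = c.pcap_open_offline("onefix.pcap", errbuf);
    // pcap_close is libpcap's cleanup function; defer runs it on every exit path.
    defer c.pcap_close(pcap);
    const packet = c.pcap_next(pcap, &header);
    const tcp_src_dst = @as(tcp_header_partial, @bitCast(packet[34..38].*)); // <--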
    std.debug.print("src: {d}, dst: {d}\n", .{ @byteSwap(tcp_src_dst.src_port), @byteSwap(tcp_src_dst.dst_port) });
}

Looks like that fixes it and we are back to working code:

$ zig build-exe -lpcap -lc src/main.zig  --cache-dir zig-cache; ./main
src: 11001, dst: 53867
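As mentioned above, the @as wrapper that zig fmt produced is not the only option: declaring the type of the destination lets @bitCast infer it. The sketch below compares the two spellings on a plain byte array so that it runs without libpcap. The raw bytes are a stand-in for packet[34..38].* (the two ports from the tshark output, in network byte order), and the helper names are invented for this example.

```zig
const std = @import("std");

const tcp_header_partial = packed struct {
    src_port: u16,
    dst_port: u16,
};

// The form zig fmt produced: @as supplies the result type.
fn viaAs(raw: [4]u8) tcp_header_partial {
    return @as(tcp_header_partial, @bitCast(raw));
}

// Alternative: the declared type of the constant supplies the result type.
fn viaAnnotation(raw: [4]u8) tcp_header_partial {
    const hdr: tcp_header_partial = @bitCast(raw);
    return hdr;
}

pub fn main() void {
    // Stand-in for packet[34..38].*: ports 11001 and 53867 in network byte order.
    const raw = [4]u8{ 0x2a, 0xf9, 0xd2, 0x6b };
    const a = viaAs(raw);
    const b = viaAnnotation(raw);
    const same = a.src_port == b.src_port and a.dst_port == b.dst_port;
    std.debug.print("both spellings agree: {}\n", .{same});
}
```

Which one to prefer is mostly a matter of taste; the annotation form reads a little more naturally when the value is stored in a named constant anyway.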

Implementation

Pre-requisites

Zig is installed

$ zig version
0.11.0-dev.2639+4df87b40f

Tshark/Wireshark installed

$ tshark --version
TShark (Wireshark) 3.6.2 (Git v3.6.2 packaged as 3.6.2-2)

Part 1: Calling C from Zig

The first part of the puzzle is figuring out how to use the C library “pcap” from Zig. The official Zig docs are a good starting point to understand how to interact with C. We will be trying to use the function pcap_open_offline which lets us open an existing pcap file for reading.

First, let’s create a directory and a main.zig file:

$ mkdir bitcast-pcap
$ cd bitcast-pcap
$ cat >main.zig <<EOF
const std = @import("std");
const c = @cImport({
    @cInclude("pcap/pcap.h");
});

pub fn main() !void {
    var errbuf: [*]u8 = undefined;
    var pcap = c.pcap_open_offline("onefix.pcap", errbuf);
    std.debug.print("{any}\n", .{pcap});
}
EOF

Next, let’s try to build the executable:

$ zig build-exe main.zig 
main.zig:2:11: error: C import failed
const c = @cImport({
          ^~~~~~~~
main.zig:2:11: note: libc headers not available; compilation does not link against libc

The error tells us that by default, compilation does not link against libc. Maybe we need to tell build-exe what libraries to link against. Let’s look at the help:

$ zig build-exe --help | grep -A1 "Link Options"
Link Options:
  -l[lib], --library [lib]       Link against system library (only if actually used)

There is an option for linking against specified system libraries. In our case we want to use “pcap.h” so let’s specify the library to link against:

$ zig build-exe -lpcap  main.zig
[...]
/home/nequo/.cache/zig/o/d644b3b2a9abdbbfef932e09d37e5788/cimport.h:1:10: error: 'pcap/pcap.h' file not found
#include <pcap/pcap.h>

The error we get indicates that we are missing the required C header files. Let’s install the package that provides them¹ and try again:

$ sudo apt install libpcap-dev
$ zig build-exe -lpcap main.zig 
$ ./main 
Segmentation fault at address 0x0
???:?:?: 0x0 in ??? (???)
/home/nequo/zig-linux-x86_64-0.11.0-dev.2639+4df87b40f/lib/std/start.zig:609:37: 0x208fbd in posixCallMainAndExit (main)
            const result = root.main() catch |err| {
                                    ^
/home/nequo/zig-linux-x86_64-0.11.0-dev.2639+4df87b40f/lib/std/start.zig:368:5: 0x208a70 in _start (main)
    @call(.never_inline, posixCallMainAndExit, .{});
    ^
Aborted

This time, the build worked but running the code produced a segmentation fault. My intuition is that pcap.h requires some components of libc². Let’s also link against libc:

$ zig build-exe -lpcap -lc main.zig
$ ./main 
.home.nequo..cache.zig.o.c835a6c34d3e1a61bd792506e3063b28.cimport.struct_pcap@d104c0

Ok great! We now have a way of using a function from the C library “pcap” in our Zig code!

Part 2: Using libpcap

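In Part 1, we used the function pcap_open_offline, which opens a pcap file and returns a pcap_t * if the operation was successful, or returns NULL and writes an error message to errbuf otherwise. libpcap expects errbuf to point to a buffer of at least PCAP_ERRBUF_SIZE bytes; the listings in this article pass an undefined pointer for brevity, which is only safe as long as the open succeeds. How do we use that return value? Reading through programming with pcap, several interesting functions are mentioned. Here are their function signatures extracted from the documentation: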

const u_char *pcap_next(pcap_t *p, struct pcap_pkthdr *h);
int pcap_loop(pcap_t *p, int cnt, pcap_handler callback, u_char *user);
int pcap_dispatch(pcap_t *p, int cnt, pcap_handler callback, u_char *user);

For our simple example, let’s use pcap_next, which takes a pointer to a pcap_t and a pointer to a pcap_pkthdr struct, and returns a pointer to the start of the packet’s data³.

We are going to need a pointer to an empty pcap_pkthdr struct to pass in, but how do we use a struct declared in a C library? If we add --cache-dir zig-cache to our build-exe command, we will be able to see some extra build artifacts that include a cimport.zig file with libpcap’s functions and structs imported into Zig:

$ zig build-exe -lpcap -lc main.zig  --cache-dir zig-cache
$ find zig-cache -name cimport.zig 
zig-cache/o/c835a6c34d3e1a61bd792506e3063b28/cimport.zig

Looking at cimport.zig, we find the definition of our pcap_pkthdr struct:

pub const struct_pcap_pkthdr = extern struct {
    ts: struct_timeval,
    caplen: bpf_u_int32,
    len: bpf_u_int32,
};

Let’s use this to create an uninitialised struct in our code, then run pcap_next to fill it. Here is the full main.zig file:

const std = @import("std");
const c = @cImport({
    @cInclude("pcap/pcap.h");
});

pub fn main() !void {
    var header = c.struct_pcap_pkthdr{ .ts = undefined, .caplen = undefined, .len = undefined };
    var errbuf: [*]u8 = undefined;
    var pcap = c.pcap_open_offline("onefix.pcap", errbuf);
    var packet = c.pcap_next(pcap, &header);
    _ = packet;
    std.debug.print("{any}\n", .{header});
}

Let’s build and run our new program:

$ zig build-exe -lpcap -lc main.zig  --cache-dir zig-cache
$ ./main 
zig-cache.o.c835a6c34d3e1a61bd792506e3063b28.cimport.struct_pcap_pkthdr{ .ts = zig-cache.o.c835a6c34d3e1a61bd792506e3063b28.cimport.struct_timeval{ .tv_sec = 1448733590, .tv_usec = 539999 }, .caplen = 505, .len = 505 }

This prints out the values from the struct and we can see the length of the packet is 505 bytes. We can verify this with tshark or wireshark:

$ tshark -r onefix.pcap -T fields -e frame.len
505

At this point, packet is a pointer to the start of the packet’s bytes, which will be the start of the packet’s ethernet header!
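A side note on how the header struct was created in this part: naming every field as undefined works, but there are shorter spellings. The sketch below shows two of them. So that it runs without libpcap, PktHdr and Timeval are hand-written mirrors of the translated structs (an assumption of this example, not something cimport.zig exports under these names), and fillHeader is a made-up stand-in for pcap_next.

```zig
const std = @import("std");

// Hypothetical mirrors of the structs from cimport.zig, so this builds without libpcap.
const Timeval = extern struct {
    tv_sec: c_long,
    tv_usec: c_long,
};

const PktHdr = extern struct {
    ts: Timeval,
    caplen: u32,
    len: u32,
};

// Stand-in for pcap_next: writes the lengths through the pointer it is given.
fn fillHeader(h: *PktHdr) void {
    h.caplen = 505;
    h.len = 505;
}

pub fn main() void {
    // Option 1: leave the whole struct undefined. Every field must be written
    // by the callee before we read it.
    var a: PktHdr = undefined;
    fillHeader(&a);

    // Option 2: start from all-zero bytes, so fields the callee skips still
    // hold a known value.
    var b = std.mem.zeroes(PktHdr);
    fillHeader(&b);

    std.debug.print("a.len={d} b.len={d} b.ts.tv_sec={d}\n", .{ a.len, b.len, b.ts.tv_sec });
}
```

With the real library, the same two spellings would presumably be written against c.struct_pcap_pkthdr instead of the hypothetical PktHdr.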

Part 3: The @bitCast trick

In this section, the goal will be to use the pointer we produced in “Part 2” to retrieve the source and destination ports from the packet’s TCP header. Let’s first get that information from tshark so that we know what values we are looking for:

$ tshark -r onefix.pcap -T fields -e tcp
Transmission Control Protocol, Src Port: 11001, Dst Port: 53867, Seq: 1, Ack: 1, Len: 439

Recall that in our code, packet is pointing to the start of the ethernet header. To get to the TCP header, we need to offset our pointer by sizeof(ETHERNET_HEADER) + sizeof(IP_HEADER). A regular ethernet header is 14 bytes long, and for this packet, the IP header is 20 bytes long ($ tshark -r onefix.pcap -T fields -e ip.hdr_len), so the TCP header starts at byte offset 14 + 20 = 34. The TCP header starts with 16 bits for the source port, so we take the 2 bytes at offsets 34 and 35 by slicing our packet pointer and dereferencing the result:

const std = @import("std");
const c = @cImport({
    @cInclude("pcap/pcap.h");
});

pub fn main() !void {
    var header = c.struct_pcap_pkthdr{ .ts = undefined, .caplen = undefined, .len = undefined };
    var errbuf: [*]u8 = undefined;
    var pcap = c.pcap_open_offline("onefix.pcap", errbuf);
    var packet = c.pcap_next(pcap, &header);
    const tcp_src_port = @bitCast(u16, packet[34..36].*); // <---
    std.debug.print("{d}\n", .{tcp_src_port});
}

We build and run the code:

$ zig build-exe -lpcap -lc main.zig  --cache-dir zig-cache
$ ./main 
63786

Oops, that does not look like 11001.. I’ll save you the pain that I went through with debugging this with a single word: “endianness”.

11,001 in decimal is 2AF9 in Hex.
63786 in decimal is F92A in Hex.
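These conversions can be checked in a few lines of Zig. The sketch below (assuming Zig 0.11 or later) takes the two source-port bytes as they appear in the capture and reads them three ways: reinterpreted in the host's byte order, byte-swapped, and assembled explicitly as big endian. The helper names are made up for this example.

```zig
const std = @import("std");

// Reinterpret the two bytes using the host's byte order, as @bitCast does.
fn hostOrder(wire: [2]u8) u16 {
    return @bitCast(wire);
}

// Assemble the value explicitly as big endian: the first byte is the high byte.
// This gives the same answer on any host.
fn bigEndian(wire: [2]u8) u16 {
    return (@as(u16, wire[0]) << 8) | wire[1];
}

pub fn main() void {
    // Source port bytes in the order they sit in the packet.
    const wire = [2]u8{ 0x2a, 0xf9 };
    const host = hostOrder(wire);

    // On a little-endian host: 63786 (0xf92a), then 11001 after the swap.
    std.debug.print("host order: {d} (0x{x})\n", .{ host, host });
    std.debug.print("swapped:    {d}\n", .{@byteSwap(host)});
    std.debug.print("big endian: {d}\n", .{bigEndian(wire)});
}
```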

On one hand, @bitCast uses the endianness of the host when casting. On the machine I am running this on, that is little endian. On the other hand, our pcap’s bytes are arranged in Network Byte Order which is big endian. The bytes are ordered 2A F9 in memory, but the bitcast is reading them as little endian and producing F9 2A. Note that an unconditional swap is only right on a little-endian host; on a big-endian host the bytes would already be in the expected order. Let’s use @byteSwap to fix that and print the expected value:

const tcp_src_port = @bitCast(u16, packet[34..36].*);
std.debug.print("{d}\n", .{@byteSwap(tcp_src_port)}); // <--

We build and run the code again:

$ zig build-exe -lpcap -lc main.zig  --cache-dir zig-cache
$ ./main 
11001

This time we get what we expect! But we promised at the start of this article that we would be bitcasting to a struct so let’s see what that would look like. In our TCP header, the 16 bits after source port represent the destination port, so we just need the first 4 bytes of the header:

const std = @import("std");
const c = @cImport({
    @cInclude("pcap/pcap.h");
});

const tcp_header_partial = packed struct {
    src_port: u16,
    dst_port: u16,
};

pub fn main() !void {
    var header = c.struct_pcap_pkthdr{ .ts = undefined, .caplen = undefined, .len = undefined };
    var errbuf: [*]u8 = undefined;
    var pcap = c.pcap_open_offline("onefix.pcap", errbuf);
    defer c.pcap_close(pcap);
    var packet = c.pcap_next(pcap, &header);
    const tcp_src_dst = @bitCast(tcp_header_partial, packet[34..38].*); // <--
    std.debug.print("src: {d}, dst: {d}\n", .{ @byteSwap(tcp_src_dst.src_port), @byteSwap(tcp_src_dst.dst_port) });
}

Instead of casting into a u16 like in our previous example, we are casting into a packed struct containing two u16s. The “packed” keyword guarantees the struct’s memory layout. Try removing it and see what the compiler outputs. Notice that we also added defer c.pcap_close(pcap);. This calls libpcap’s pcap_close when main returns, whether it returns normally or with an error, which releases the pcap handle. (errdefer, by contrast, would only run when the function returns an error.)

Let’s run our new program:

$ zig build-exe -lpcap -lc main.zig  --cache-dir zig-cache
$ ./main 
src: 11001, dst: 53867

Those values match what tshark gave us - we did it!
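One detail of the cleanup line in that listing is worth a closer look: Zig has both defer and errdefer, and they fire under different conditions. The sketch below is a toy illustration with no libpcap involved; Resource and work are hypothetical names standing in for a pcap handle and the code that uses it.

```zig
const std = @import("std");

// Hypothetical stand-in for a resource such as a pcap handle.
const Resource = struct {
    deferred: bool = false,
    errdeferred: bool = false,
};

fn work(res: *Resource, fail: bool) !void {
    // Runs on every exit from this scope, success or error.
    defer res.deferred = true;
    // Runs only when the function returns an error.
    errdefer res.errdeferred = true;

    if (fail) return error.Failed;
}

pub fn main() void {
    var ok = Resource{};
    work(&ok, false) catch {};

    var failed = Resource{};
    work(&failed, true) catch {};

    std.debug.print("success path: defer={}, errdefer={}\n", .{ ok.deferred, ok.errdeferred });
    std.debug.print("error path:   defer={}, errdefer={}\n", .{ failed.deferred, failed.errdeferred });
}
```

For a handle that must always be released, such as the one returned by pcap_open_offline, defer is usually the one to reach for; errdefer fits cases where ownership is handed to the caller on success.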

The end

We looked at how to use an external C library in Zig and learned about the basics of working with libpcap. We then used Zig’s @bitCast to parse the source and destination port numbers from a packet’s TCP header without needing to allocate extra memory or call any parsing/deserialising functions. Remember that you can play around with the pointer offset to retrieve other field values from the packet!
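When playing with the offset, keep in mind that the 34 used in this article only holds for an untagged ethernet frame carrying an IPv4 header without options. The sketch below computes the TCP header offset from the frame itself, using the IPv4 header-length nibble, and refuses frames that are too short. tcpHeaderOffset is a hypothetical helper written for this example; it is deliberately simplified and does not check the EtherType or the IP protocol field, so it assumes the frame really is TCP over IPv4.

```zig
const std = @import("std");

// Length of an untagged Ethernet II header.
const ethernet_header_len: usize = 14;

/// Returns the byte offset of the TCP header inside `frame`, or null when the
/// frame is too short. Simplified: assumes untagged Ethernet carrying IPv4/TCP.
fn tcpHeaderOffset(frame: []const u8) ?usize {
    if (frame.len <= ethernet_header_len) return null;

    // The low nibble of the first IPv4 byte is the header length in 32-bit words.
    const ihl_words: usize = frame[ethernet_header_len] & 0x0f;
    const ip_header_len = ihl_words * 4;
    if (ip_header_len < 20) return null; // shorter than a minimal IPv4 header

    const offset = ethernet_header_len + ip_header_len;
    // Make sure the four port bytes are actually present.
    if (frame.len < offset + 4) return null;
    return offset;
}

pub fn main() void {
    // Synthetic frame: all zeroes except the IPv4 version/IHL byte.
    var frame = [_]u8{0} ** 64;
    frame[ethernet_header_len] = 0x45; // version 4, IHL 5 words = 20 bytes

    if (tcpHeaderOffset(&frame)) |offset| {
        std.debug.print("TCP header starts at byte {d}\n", .{offset});
    } else {
        std.debug.print("frame too short\n", .{});
    }
}
```

In the real program, the slice passed in could be built from the packet pointer and the caplen field of the header filled by pcap_next, which also guards against reading past the captured bytes.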

There might be better ways to @bitCast with different endianness, but @byteSwap was the only solution that I found which worked for me. If you know of a better way please do let me know!
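One candidate is std.mem.bigToNative from the standard library, which swaps bytes only when the host is little endian and so expresses the intent ("this value arrived in big endian") rather than the mechanism. The sketch below is a hedged example of that idea, assuming Zig 0.11 or later. It reads each port from its own two bytes instead of going through the packed struct, and the Ports struct and function names are invented for this example.

```zig
const std = @import("std");

// Hypothetical result type for this example.
const Ports = struct {
    src: u16,
    dst: u16,
};

// Convert two bytes in network byte order into a host-order u16.
fn portFromWire(wire: [2]u8) u16 {
    const host_view: u16 = @bitCast(wire);
    // No-op on big-endian hosts, byte swap on little-endian hosts.
    return std.mem.bigToNative(u16, host_view);
}

// Parse the first four bytes of a TCP header: source port, then destination port.
fn parsePorts(raw: [4]u8) Ports {
    return .{
        .src = portFromWire(raw[0..2].*),
        .dst = portFromWire(raw[2..4].*),
    };
}

pub fn main() void {
    // Stand-in for packet[34..38].* from the sample capture.
    const raw = [4]u8{ 0x2a, 0xf9, 0xd2, 0x6b };
    const ports = parsePorts(raw);
    std.debug.print("src: {d}, dst: {d}\n", .{ ports.src, ports.dst });
}
```

I have only run the article's code on a little-endian machine, so treat the big-endian behaviour as following from the documentation of bigToNative rather than from a test on real hardware.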

Taking this further

Use pcap_loop instead of pcap_next to iterate over a pcap with multiple packets.
Produce an output pcap after applying filtering based on the extracted struct fields.
Performance tests for the above.
Use Zig’s build system rather than build-exe.

I am using Ubuntu in WSL for this example - package names will vary between distros and operating systems. ↩︎
Looking at the codebase for pcap.h, we see that it includes stdio.h. ↩︎
Pcap files have a defined file format described in pcap_savefile. There is a per-file header and a per-packet header. Here we read the per-packet header of the single packet in our pcap. ↩︎