Zig Structs of Arrays (2024)

Hacker News Top Tools

Summary

Explains how Zig's comptime and type reflection enable creating struct-of-arrays (SoA) data structures like MultiArrayList, which improve cache performance in high-performance applications.

No content available
Original Article
View Cached Full Text

Cached at: 06/08/26, 12:16 PM

# Zig structs of arrays | Andreas Hohmann Source: [https://andreashohmann.com/zig-struct-of-arrays/](https://andreashohmann.com/zig-struct-of-arrays/) Andreas HohmannMay 04, 2024\#[software](https://andreashohmann.com/tags/software/)\#[zig](https://andreashohmann.com/tags/zig/)\#[data oriented programming](https://andreashohmann.com/tags/data-oriented-programming/)\#[comptime](https://andreashohmann.com/tags/comptime/)If you want to see the power of Zig's comptime, look no further than[MultiArrayList](https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig)\. This collection stores the data of a list of structs in struct of array \(SoA\) rather than an array of structs \(AoS\)\. This technique is a staple of[data\-oriented design](https://en.wikipedia.org/wiki/Data-oriented_design)and[array\-oriented programming](https://en.wikipedia.org/wiki/Array_programming)\(long live APL\) as used in high\-performance applications such as game engines, scientific computing, and compilers\. The latter is probably the reason why it already exists in Zig's standard library \(see Andrew Kelley's[Practical Data\-Oriented Design](https://vimeo.com/649009599)presentation\)\. Generating a struct\-of\-arrays type for a struct is a non\-trivial type manipulation that I expect from type\-oriented languages such as TypeScript or Scala, but not from a low\-level system programming language such as Zig\. How does Zig pull this off? It turns out that it's "just" compile\-time execution and reflection\. Types in Zig are compile time values\. They can be assigned to constants, passed as arguments to \(compile\-time\) functions, and returned by those functions\. This is already visible in the syntax for type definitions\. A named struct type, for example, is defined by initializing a constant with an anonymous struct declaration\. ``` 1const Token = struct { 2 kind: enum { id, string, number }, 3 data: []const u8, 4}; 5 6test "token size" { 7 try testing.expectEqual(24, @sizeOf(Token)); 8 try testing.expectEqual(2400, @sizeOf([100]Token)); 9} ``` As the test shows, this structure needs 24 bytes on my 64 bit machine, 2\*8 bytes for the`data`slice \(pointer and length\) and 8 bytes for the`kind`enum and alignment\. An array of 100 Token structs needs 2400 bytes\. The[MultiArrayList](https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig)cuts the memory needed down to 1700 bytes: 1600 bytes for the data slices and 100 bytes for the kinds\. The test below demonstrates this by using a fixed buffer allocator with exactly 1704 bytes \(I don't know why the 4 extra bytes are needed\)\. ``` 1const TokenList = std.MultiArrayList(Token); 2 3test "token lists" { 4 var buffer: [1704]u8 = undefined; 5 var fba = std.heap.FixedBufferAllocator.init(&buffer); 6 const allocator = fba.allocator(); 7 8 var tokens = TokenList{}; 9 try tokens.setCapacity(allocator, 100); 10 defer tokens.deinit(allocator); 11 12 tokens.appendAssumeCapacity(.{ .kind = .number, .data = "1000" }); 13 try testing.expectEqual(Token{ .kind = .number, .data = "1000" }, tokens.get(0)); 14} ``` How does this work under the hood? Types are passed to a function using a compile\-time \(`comptime`\) parameter of type`type`\. This is the foundation for Zig's generic types such as the collections in the standard library\. Here is a minimal version of an array list of fixed size \(see[array\_list\.zig](https://github.com/ziglang/zig/blob/master/lib/std/array_list.zig)for the real thing\)\. ``` 1pub fn FixedArrayList(comptime T: type) type { 2 return struct { 3 const Self = @This(); 4 5 items: []T, 6 allocator: Allocator, 7 8 pub fn init(allocator: Allocator, length: usize) Allocator.Error!Self { 9 return .{ 10 .allocator = allocator, 11 .items = try allocator.alloc(T, length), 12 }; 13 } 14 15 pub fn deinit(self: *Self) void { 16 self.allocator.free(self.items.ptr[0..self.items.len]); 17 } 18 }; 19} 20 21test "allocates array list" { 22 const allocator = testing.allocator; 23 24 const n = 10; 25 const PointList = FixedArrayList(Point); 26 var points = try PointList.init(allocator, n); 27 defer points.deinit(); 28 29 points.items[5] = .{ .x = 10, .y = 20 }; 30 try testing.expectEqual(20, points.get(5).y); 31} ``` This is not a very useful structure as it only wraps a slice that one could use in most cases directly, but it demonstrates a simple type generator function\. The type argument`T`is only used once for the type of the`items`slice, and we don't need to know anything about the inner structure of`T`\. To construct the struct\-of\-arrays for a given struct, we have to go a step further\. Our type construction function must be able to look at the original struct's fields and their types to construct the new struct\-of\-arrays type\. Zig offers a compile\-time reflection API for this purpose\. To get a feel for this API, let's define a type constructor for points with explicit coordinates`x1`,`x2`,`x3`, and so forth\. Our type function takes the type`T`of the coordinates and the dimension`n`and returns a struct with one field for each of the dimensions\. ``` 1pub fn PointN(comptime T: type, comptime N: comptime_int) type { 2 var fields: [N]std.builtin.Type.StructField = undefined; 3 for (0..N) |i| { 4 var num_buf: [128]u8 = undefined; 5 fields[i] = .{ 6 .name = std.fmt.bufPrintZ(&num_buf, "x{d}", .{i + 1}) catch unreachable, 7 .type = T, 8 .default_value = null, 9 .is_comptime = false, 10 .alignment = @alignOf(T), 11 }; 12 } 13 return @Type(.{ 14 .Struct = .{ 15 .is_tuple = false, 16 .layout = .Auto, 17 .decls = &.{}, 18 .fields = &fields, 19 }, 20 }); 21} 22 23test "n-dimensional point" { 24 const P3 = PointN(i32, 3); 25 const p = P3{ .x1 = 10, .x2 = 20, .x3 = 30 }; 26 try expectEqual(p.x2, 20); 27} ``` We construct the struct type dynamically using the[@Type](https://ziglang.org/documentation/master/#Type)function\. The metadata of the fields is an array of`StructField`objects\. We are still not using the inner structure of the type`T`\(besides its alignment obtained with[@alignOf](https://ziglang.org/documentation/master/#alignOf)\), but we can see what kind of metadata Zig is using to describe structs and fields\. The final missing piece is a way to obtain this metadata for a given type\. Zig's[@typeInfo](https://ziglang.org/documentation/master/#typeInfo)function does just that\. The[MultiArrayList](https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig)stores the data in a single[byte array](https://github.com/ziglang/zig/blob/6a65561e3e5f82f126ec4795e5cd9c07392b457b/lib/std/multi_array_list.zig#L22)\. This array is the concatenation of the arrays for the fields of the original structure\. To optimize alignment, these sub\-arrays are included in[decending order](https://github.com/ziglang/zig/blob/6a65561e3e5f82f126ec4795e5cd9c07392b457b/lib/std/multi_array_list.zig#L139-L173)of their alignment \(the alignment of the corresponding field\)\. The generated struct\-of\-arrays type keeps the field sizes and field index permutation \(due to the sorting\) in the[sizes](https://github.com/ziglang/zig/blob/6a65561e3e5f82f126ec4795e5cd9c07392b457b/lib/std/multi_array_list.zig#L142)structure\. This structure is used to compute the overall allocation size \([capacityInBytes](https://github.com/ziglang/zig/blob/6a65561e3e5f82f126ec4795e5cd9c07392b457b/lib/std/multi_array_list.zig#L537)\) and the offsets for the[field slices](https://github.com/ziglang/zig/blob/6a65561e3e5f82f126ec4795e5cd9c07392b457b/lib/std/multi_array_list.zig#L198)\. The rest of the[MultiArrayList](https://github.com/ziglang/zig/blob/master/lib/std/multi_array_list.zig)code implements the various methods for adding and removing items, resizing, and conversion to a slice of structs\. Considering the low\-level pointer and index gymnastics, it is fairly readable\. Zig's compile\-time type construction has its limits\. It's not possible, for example, to generate methods dynamically, in contrast to the field construction that we used for the`PointN`structure\. However, this is a current restriction of the reflection API and not a fundamental issue of the type generation functions, and this restriction may be lifted in the future\. The main benefit of the comptime function approach is that we don't have to learn a new language \(such as Rust's procedural macros\), even though we do have to learn the reflection API\.

Similar Articles

Tagged Union Subsets with Comptime in Zig

Mitchell Hashimoto

Mitchell Hashimoto demonstrates how to use Zig's comptime to create subset types of tagged unions, enabling compile-time safety without exhaustive case handling.

Steering Zig Fmt

Lobsters Hottest

A blog post describing two tips for using `zig fmt` effectively, highlighting its 'steerable' formatting approach where trailing commas and line breaks control layout decisions, and showcasing columnar array formatting.

Writing a C Compiler, in Zig

Hacker News Top

A developer documents their experience building a C compiler named paella in Zig, following Nora Sandler’s tutorial series.

Zig Builds Are Getting Faster

Mitchell Hashimoto

Zig 0.15 shows significant compile-time improvements over 0.14, with build script compilation dropping from ~7s to ~1.7s and full builds from 41s to 32s, even while still using LLVM. The article highlights progress toward self-hosted backends and incremental compilation.