Tag
This blog post explores an optimization for LLVM's SmallVector::push_back that tail-calls the grow-and-push path, eliminating callee-saved register spills and improving fast-path performance.
This tweet explains the effectiveness of reverse engineering skills, citing examples like stopping a $4 billion cyberattack and exposing NSA tools, and describes the technical process of converting binary to assembly.
Decomp Academy is an interactive online platform that teaches users how to decompile GameCube games by writing byte-matching C code against real PowerPC assembly output from the 2001 MWCC compiler.
Slisp is a simple compiler that reads Lisp programs and compiles them to standalone assembly for Linux/AMD64, with support for basic primitives, closures, and a standard library.
An educational article that explains how ARM64 (AArch64) instructions are encoded in 32-bit fixed-length words, debunking common misconceptions and providing hands-on decoding examples using ADD immediate on Apple Silicon.
This article revisits techniques for creating extremely small ELF executables on Linux, exploring how to reduce size to 45 bytes by abusing header fields and overlapping structures while maintaining ELF specification conformance.
ymawky is a web server written entirely in ARM64 assembly, supporting CGI, static files, and multiple HTTP methods, now available on Linux.
A blog post explaining a counterintuitive optimization in which float division (DIVSD) runs faster than integer division (IDIVQ) on modern CPUs, with benchmarks and assembly analysis.
ASM SHADER TOY is a tool that lets you write shaders using assembly language, similar to the popular Shader Toy platform.
A tweet highlights James Molloy's 2008 free tutorial 'Roll Your Own Toy UNIX Clone OS', which teaches building a Unix-like kernel from scratch in C and assembly, covering bootloader, memory management, filesystems, and multitasking.
A guide to writing ARM64 assembly code that is portable across Apple's Darwin and Linux/BSD systems, covering differences in ABI, symbol naming, and vector mnemonics.
The article introduces Linux restartable sequences (rseq), a kernel feature that enables thread-safe data structures without locks or atomics, achieving dramatic performance improvements on many-core CPUs. It provides a tutorial and demonstrates up to 43x speedup on a 96-core AMD Threadripper.
A technical blog post exploring how to use SBCL as a breadboard for assembly code, focusing on stack-based virtual machine techniques such as rotating stacks and efficient primop dispatch, with references to the F18 processor and x87 stack.
VirtualPC is an open-source 8-bit computer simulator that can train small neural networks from assembly code, demonstrating machine learning at the bare-metal level.
The article explores the origin and meaning of the phrase 'Halt and Catch Fire' in computing, tracing it from a joke mnemonic to actual CPU behavior in the Motorola 6800 and IBM System/360.
A developer rebuilt their entire Linux desktop stack (shell, terminal, window manager, and utilities) in pure x86_64 assembly using Claude Code, achieving microsecond startup times and hours of extra battery life.
Raymond Chen explores why x86 compilers universally prefer "xor eax,eax" over "sub eax,eax" to zero a register, attributing it to historical momentum and slightly safer flag behavior rather than technical superiority.