When compilers surprise you

Lobsters Hottest 04/19/26, 05:34 AM News

Summary

Matt Godbolt explores compiler optimizations that convert an O(n) summation loop into an O(1) closed-form solution, highlighting how Clang and GCC employ sophisticated techniques like loop unrolling and mathematical simplification to dramatically improve code performance.

<p><a href="https://lobste.rs/s/vugypt/when_compilers_surprise_you">Comments</a></p>

Original Article Export to Word Export to PDF

View Cached Full Text

Cached at: 04/20/26, 02:44 PM

# When compilers surprise you — Matt Godbolt’s blog Source: [https://xania.org/202512/24-cunning-clang](https://xania.org/202512/24-cunning-clang) Written by me, proof\-read by an LLM\. Details at end\. Every now and then a compiler will surprise me with a really smart trick\. When I first saw this optimisation I could hardly believe it\. I was looking at loop optimisation, and wrote something like this simple function that sums all the numbers up to a given value: So far so decent: GCC has done some preliminary checks, then fallen into a loop that efficiently sums numbers using`lea`\(we’ve[seen this before](https://xania.org/202512/02-adding-integers)\)\. But taking a closer look at the loop we see something unusual: ``` .L3: lea edx, [rdx+1+rax*2] ; result = result + 1 + x*2 add eax, 2 ; x += 2 cmp edi, eax ; x != value jne .L3 ; keep looping ``` The compiler has cleverly realised it can do two numbers[1](https://xania.org/202512/24-cunning-clang#fn:check)at a time using the fact it can see we’re going to add`x`*and*`x \+ 1`, which is the same as adding`x\*2 \+ 1`\. Very cunning, I think you’ll agree\! If you turn the optimiser up to`\-O3`you’ll see the compiler works even harder to vectorise the loop using parallel adds\. All very clever\. This is all for GCC\. Let’s see what clang does with our code: This is where I nearly fell off my chair:**there is no loop**\! Clang checks for positive`value`, and if so it does: ``` lea eax, [rdi - 1] ; eax = value - 1 lea ecx, [rdi - 2] ; ecx = value - 2 imul rcx, rax ; rcx = (value - 1) * (value - 2) shr rcx ; rcx >>= 1 lea eax, [rdi + rcx] ; eax = value + rcx dec eax ; --eax ret ``` It was not at all obvious to me what on earth was going on here\. By backing out the maths a little, this is equivalent to: ``` v + ((v - 1)(v - 2) / 2) - 1; ``` Expanding the parentheses: ``` v + (v² - 2v - v + 2) / 2 - 1 ``` Rearranging a bit: ``` (v² - 3v + 2) / 2 + (v - 1) ``` Multiplying the`\(v \- 1\)`by 2 / 2: ``` (v² - 3v + 2) / 2 + (2v - 2)/2 ``` Combining those and cancelling: Simplifying and factoring gives us`v\(v \- 1\) / 2`which is the closed\-form solution to the “sum of integers”\! Truly amazing[2](https://xania.org/202512/24-cunning-clang#fn:why)\- we’ve gone from an O\(n\) algorithm as written, to an O\(1\) one\! I love that despite working with compilers for more than twenty years, they can still surprise and delight me\. The years of experience and work that have been poured into making compilers great is truly humbling, and inspiring\. We’re nearly at the end of this series \- there’s so much more to say but that will have to wait for another time\. Tomorrow will be a little different: see you then\! *See[the video](https://youtu.be/V9dy34slaxA)that accompanies this post\.* --- *This post is day 24 of[Advent of Compiler Optimisations 2025](https://xania.org/AoCO2025-archive), a 25\-day series exploring how compilers transform our code\.* *←[Switching it up a bit](https://xania.org/202512/23-switching-it-up)\|[Thank you](https://xania.org/202512/25-thank-you)→* *This post was written by a human \([Matt Godbolt](https://xania.org/MattGodbolt)\) and reviewed and proof\-read by LLMs and humans\.* *Support Compiler Explorer on[Patreon](https://patreon.com/c/mattgodbolt)or[GitHub](https://github.com/sponsors/compiler-explorer), or by buying CE products in the[Compiler Explorer Shop](https://shop.compiler-explorer.com/)*\. Posted at 06:00:00 CST on 24thDecember 2025\.

When compilers surprise you

Similar Articles

Building a C compiler with a team of parallel Claudes

Writing a C Compiler, in Zig

Making Julia as Fast as C++ (2019)

@agupta: some ideas are much clearer when you can use coding agents to show a proof of concept. eg I hadn’t really understood ho…

Testing Local LLMs in Practice: Code Generation, Quality vs. Speed

Submit Feedback

Similar Articles

Building a C compiler with a team of parallel Claudes

Making Julia as Fast as C++ (2019)

@agupta: some ideas are much clearer when you can use coding agents to show a proof of concept. eg I hadn’t really understood ho…

Testing Local LLMs in Practice: Code Generation, Quality vs. Speed