People get confused when language implementations break language guarantees

Hillel Wayne — Computer Things News

Summary

TLA+ semantics guarantee nonordered updates, but the TLC model checker breaks these guarantees by requiring ordered assignments and adding effectful operators like PrintT, causing confusion for beginners.

Take the following Python program:

```
# x = 1, y = 2
x = 0
y = x
print([x, y])
```

It'll print `[0, 0]`. If we swapped the two assignments, it'd instead print `[0, 1]`. Each assignment happens in a separate temporal step. Pretty much all imperative languages behave this way.

Now take the following TLA+ snippet:

```
\* x = 1, y = 2
/\ x' = 0
/\ y' = x
/\ PrintT(<<x', y'>>)
```

This'll print `<<0, 1>>`. Unlike imperative languages, TLA+ separates the notion of update from the notion of temporal step. We read `x' = 0` as "in the *next* state, `x` will be 0, but it still has the same value in the current state". So in every state, `x` and `x'` are essentially separate variables. As a consequence, the order of statements doesn't matter in TLA+ semantics, and swapping the two assignments doesn't change the printed output. The language is really clever like that! This means, among other things, that there are basically no intrastep race conditions. One function can update a variable without affecting how any other function uses it.

Okay, so now beginners inevitably run into a problem with this:

```
\* x = 1, y = 2
/\ x' = y'
/\ y' = x
/\ PrintT(<<x', y'>>)
```

This'll crash because `y'` isn't defined yet. But if we swap the two assignments, it works fine and prints `<<1, 1>>`.
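(As an aside, the nonordered-update model described above can be mimicked in Python — my illustrative sketch, not something from the post: compute every next-state value from a snapshot of the current state, then commit them all together, so the order of the "assignments" is irrelevant.)

```python
# Sketch (my illustration, not from the post): TLA+-style "primed" updates
# in Python. Every next-state expression sees the same snapshot of the
# current state, so no update can observe another update's result.

def step(state, updates):
    """Apply a dict of next-state expressions simultaneously."""
    return {var: expr(state) for var, expr in updates.items()}

state = {"x": 1, "y": 2}

# Like  /\ x' = 0  /\ y' = x :  y' reads the *current* x (1), not x' (0).
s1 = step(state, {"x": lambda s: 0, "y": lambda s: s["x"]})

# Swapping the two "assignments" changes nothing:
s2 = step(state, {"y": lambda s: s["x"], "x": lambda s: 0})

print(s1 == s2)  # True — order of updates doesn't matter
```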
So clearly that whole thing about nonordering is a pack of bunk.

Well, not exactly. TLA+ semantics still guarantee nonordering. The problem is that verifiers don't perfectly implement the TLA+ semantics. `y' > 0` is a totally reasonable "assignment", but it has an infinite number of possible next states! So the main model checker (TLC) instead requires that `y' = some_value` come before any other use of `y'`, which means ordering now matters.

The bigger problem is with `PrintT`. The TLA+ semantics guarantee nonordering because the semantics don't allow for side effects. The model checker adds in effectful operators like `PrintT`, `Assert`, and `IOExec`. This can cause a problem with guard statements. Theoretically speaking, these two script blocks are equivalent:

```
/\ x = 0 \* guard statement
/\ P()
/\ x' = x + 1

/\ x' = x + 1
/\ P()
/\ x = 0 \* guard statement
```

When `x = 1`, neither leads to a new state, due to the guard clause. But the model checker evaluates each line one at a time, meaning in block 2 it will run `x' = x + 1` and `P()` before getting to `x = 0` and discarding the state. If `P` is a proper TLA+ operator, this isn't a problem, but if it's `PrintT` or `Assert` it will take its effect first, leading to weird ghost prints that don't correspond to any next states.

This difference between "what TLA+ semantics guarantee" and "the specific ways TLC can break those guarantees" is a huge source of confusion for people! On top of that, many of these operators, like `IOExec` and `TLCSet`, are meant as escape hatches. So if you need them, you're already doing something pretty weird, and that makes it even more confusing.
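The ghost-print behavior is roughly what you get from short-circuit conjunction plus side effects. A Python analogy (my sketch, not TLC's actual evaluation code): model each conjunct as a callable evaluated left to right, stopping at the first false one.

```python
# Sketch (my analogy, not TLC internals): conjuncts evaluated in order
# with short-circuiting, the way `and` works in Python.

def eval_action(conjuncts):
    """Evaluate conjuncts left to right; stop at the first false one."""
    return all(conjunct() for conjunct in conjuncts)

x = 1
printed = []

def print_t(value):      # stand-in for TLC's side-effecting PrintT
    printed.append(value)
    return True          # PrintT always evaluates to TRUE

# Block 1: guard first — when x = 1 it fails immediately, PrintT never runs.
eval_action([lambda: x == 0, lambda: print_t("P"), lambda: True])
assert printed == []

# Block 2: guard last — the side effect fires before the state is discarded,
# producing a "ghost print" for a transition that never happens.
eval_action([lambda: True, lambda: print_t("P"), lambda: x == 0])
assert printed == ["P"]
```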
On top of all that, there's no syntactic or visual distinction between a guarantee-breaking TLC operator and a regular, safe TLA+ operator. In compiled languages you've got pragmas and preprocessors, which let the compiler do things the language can't. But those usually have a visually distinct syntax, so you know from looking that here be dragons.

I'm reminded of Neel Krishnaswami's incredible post [What Declarative Languages Are](https://semantic-domain.blogspot.com/2013/07/what-declarative-languages-are.html)[^laurie]:

> This also lets us make the prediction that the least-loved features of any declarative language will be the ones that expose the operational model, and break the declarative semantics. So we can predict that people will dislike (a) backreferences in regular expressions, (b) ordered choice in grammars, (c) row IDs in query languages, (d) cut in Prolog, (e) constraint priorities in constraint languages.

Prolog cuts have a visually distinct syntax, but a predicate that uses a cut isn't visually distinct from a [logically pure](https://www.metalevel.at/prolog/purity) predicate. That's still a little less tricky than what we've got in TLA+, though, since the cut is part of the language semantics.

(Then again, different Prolog dialects have different ways of [printing strings](https://eu.swi-prolog.org/pldoc/man?section=printmsg), which add side effects to Prolog, and those aren't visually distinct from other predicates. So it's the same problem!)
I don't actually have any fix for this. I just find it a fascinating example of a [leaky abstraction](https://en.wikipedia.org/wiki/Leaky_abstraction). Maybe we could write a code highlighter that highlights all functions that transitively use a "weird" function or something.

[^laurie]: [Obligatory response post](https://tratt.net/laurie/blog/2013/relative_and_absolute_levels.html)
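That highlighter idea is essentially a transitive-closure computation over the call graph: flag every operator that can reach a known effectful one. A tiny sketch of the propagation step (the call graph and names below are hypothetical examples, not real TLC tooling):

```python
# Sketch of the "transitively weird" highlighter idea. The call graph and
# the seed set here are made-up examples, not output of any real tool.

def transitively_weird(call_graph, weird):
    """Return every function that directly or transitively calls a weird one."""
    flagged = set(weird)
    changed = True
    while changed:  # fixed-point iteration over the call graph
        changed = False
        for fn, callees in call_graph.items():
            if fn not in flagged and flagged & set(callees):
                flagged.add(fn)
                changed = True
    return flagged

call_graph = {
    "Spec": ["Next"],
    "Next": ["Inc", "Log"],
    "Inc":  [],
    "Log":  ["PrintT"],  # PrintT is one of TLC's effectful operators
}

print(sorted(transitively_weird(call_graph, {"PrintT"})))
# ['Log', 'Next', 'PrintT', 'Spec']  — Inc stays unflagged
```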
Original Article

Source: [https://buttondown.com/hillelwayne/archive/people-get-confused-when-language-implementations](https://buttondown.com/hillelwayne/archive/people-get-confused-when-language-implementations)

Similar Articles

Can LLMs model real-world systems in TLA+?

Hacker News Top

Researchers from the Specula team created SysMoBench, a benchmark evaluating whether LLMs can faithfully model real-world computing systems in TLA+ or merely recite textbook specifications. The benchmark tests 11 systems across four phases and reveals systematic gaps in current LLMs' ability to accurately model system implementations versus reference papers.

Borrow-checking without type-checking

Lobsters Hottest

A blog post presents a toy language that enforces borrow-checking at runtime without static typing, using cheap reference-counting on the stack to enable interior pointers and single ownership in a dynamically-typed setting.

LLMs are bad at vibing specifications

Hillel Wayne — Computer Things

Hillel Wayne discusses how LLMs, while popular for writing formal specifications like TLA+ and Alloy, often produce shallow, tautological properties that fail to capture subtle bugs, based on analysis of community projects.