Record type inference for dummies

Lobsters Hottest News

Summary

The article explains the basics of type inference for anonymous records in statically typed languages, using type theory notation and Haskell as the implementation language.

<p><a href="https://lobste.rs/s/ufml52/record_type_inference_for_dummies">Comments</a></p>
Original Article
View Cached Full Text

Cached at: 06/23/26, 01:46 PM

# Record type inference for dummies Source: [https://haskellforall.com/2026/06/record-type-inference-for-dummies](https://haskellforall.com/2026/06/record-type-inference-for-dummies) The reason I'm writing this post is because I actually wanted to write a more advanced post on type inference for anonymous records, but then I realized that most of my readers wouldn't understand the latter post in isolation\. So I figured I would write this introductory post to teach people new to type theory the basics\. The reason I'm writing*both*posts is because I believe that good type inference for anonymous records is one of the big things holding back statically typed languages[1](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fn-1)and not enough people appreciate this or understand why\. I also think that there is a large disconnect between what programming language experts understand is possible and what lay programmers are comfortable or familiar with\. To make a play on[XKCD \#2501](https://xkcd.com/2501/): ![](https://haskellforall.com/imgs/record-type-inference-for-dummies/experts.webp) So this post \(and the next one\) are going to be an exposition of where the field of type theory was at**over three decades ago**, which our industry still hasn't really caught up to, yet\. ## Anonymous records I mentioned that this is a post about type inference for*anonymous records*and I've met some programmers who either don't know what anonymous records are or at least don't recognize them by that name, so I'll briefly touch upon them\. An anonymous record is a record that doesn't require an associated datatype declaration, and they're quite common in dynamically typed languages\. For example, JavaScript calls them "objects": ``` { name: "Alice", age: 25 } ``` Python calls them "dictionaries": ``` { "name": "Alice", "age": 25 } ``` Ruby calls them "hashes": ``` { :name => "Alice", :age => 25 } # … or equivalently: { name: "Alice", age: 25 } ``` …, and Nix calls them "attribute sets": ``` { name = "Alice"; age = 25; } ``` If you've ever worked with JSON then you've worked with anonymous records because JSON objects \(just like JavaScript objects\) are anonymous records\. ## Static typing A smaller number of statically typed programming languages support anonymous records because statically typed languages usually prefer named datatypes\. For example, Haskell does not support anonymous records and requires a datatype declaration for all records, like this: ``` data Person = Person{ name :: Text, age :: Integer } example :: Person example = Person{ name = "Alice", age = 25 } ``` … but there are still some statically typed programming languages that do support anonymous records, like TypeScript: ``` { name: "Alice", age: 25 } : { name: string, age: number } ``` … or C\# \(which calls them "anonymous types"\): ``` new { Name = "Alice"; Age = 25 } ``` … or PureScript \(which calls them records\): ``` { name: "Alice", age: 25 } : { name :: String, age :: Int } ``` If all we needed was language support for record literals without any operations on records then it's fairly easy to infer their types\. To do so, though, I'm going to introduce some basic type theory notation\. Let's start by defining a basic abstract syntax tree that says that an expression \(\) can be either: - a\(e\.g\.\) - a\(e\.g\.\) - a record containing 0 or more fields \(e\.g\.\) The way we would write that[2](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fn-2)formally is: … and the equivalent Haskell code would be something like: ``` type Identifier = Text data Expression = Boolean Bool | String Text | Number Double | Record (Map Identifier Expression) ``` Fields can store arbitrary expressions, which means you can nest records, like: We'll also need to define an abstract syntax tree for our inferred types, which can be either: - thetype - thetype - thetype - a record type containing 0 or more fields \(e\.g\.\) The notation we'd use for that is: The equivalent Haskell code would be something like: ``` data Type = BooleanType | StringType | NumberType | RecordType (Map Identifier Type) ``` Now that we have a syntax for expressions and types we can define some basic type inference rules: - andalways have type - a\(like\) always has type - a\(like\) always has type We write out those rules using this notation: If you've never seen this notation before, you can think of it as mathematical pseudocode for how to implement a type inference function\. The equivalent Haskell function would be something like: ``` infer :: [(Identifier, Type)] -- ^ context, a.k.a. "Γ" (currrently unused) -> Expression -- ^ input expression -> Either Text Type -- ^ output inferred type (currently never fails) infer context (Boolean _) = return BooleanType infer context (String _) = return StringType infer context (Number _) = return NumberType ``` > **Note:**You can find the complete Haskell code in the Appendix\. Now suppose that we wanted to infer the type of record literal like this one: Normally we'd reason about the expression's type by hand like this: 1. to infer the type ofwe need to infer the type of each field: A\. first, infer the type of\(which is\) B\. then, infer the type of\(which is\) 2. now combine those into the final record type: Type theorists have a notation for that sort of reasoning process, which looks like this: That is known as a "typing derivation" and the way it works is that "outer" reasoning steps \(like steps 1 and 2\) go on the bottom and "inner" reasoning steps \(like steps A and B\) go on top\. If we take that reasoning process and generalize it to all records we might write something like this: … which you can read as saying "if you want to infer the type of a record then infer the type of each field and replace each field with its inferred type"\. In Haskell this would be: ``` infer context (Record fields) = do fieldTypes <- traverse (infer context) fields return (RecordType fieldTypes) ``` … and we can verify this all works in the Haskell REPL: ``` ghci> :set -XOverloadedStrings -XOverloadedLists ghci> infer [] (Record [("name", String "Alice"), ("age", Number 25)]) Right (RecordType (fromList [("age",NumberType),("name",StringType)])) ``` ## Field access Any programming language worth its salt[3](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fn-3)will also support record field access using something like dot notation \(e\.g\.\)\. For example, in Nix that would look like this: ``` nix-repl> { name = "Alice"; age = 25; }.name "Alice" ``` So we'll add a type inference rule for field access, but first we need to extend our expression syntax to support dot notation: … and now we can add this type inference rule: This says that we can access a field namedfrom an expressionif's inferred type is a record with a field named\. The inferred type of the fieldbecomes the inferred type of the field access\. The equivalent Haskell code would be something like: ``` data Expression = … | FieldAccess Expression Identifier … infer context (FieldAccess expression field) = do expressionType <- infer context expression case expressionType of RecordType fieldTypes -> case Map.lookup field fieldTypes of Just fieldType -> return fieldType Nothing -> Left "missing field" _ -> Left "not a record" ``` This is the first type inference rule we've added that can fail\. If we were to infer the type of an expression likethen that wouldn't match that type inference rule \(or any other rule\) so we would reject the expression with a type error \("not a record"\)\. However, the rule would also reject a field access if the field is missing, too\. If we were to infer the type ofthat*also*doesn't match our type inference rule so we would reject the expression with a type error \("missing field"\)\. Before we move on, let's test out this type inference rule on an example\. Suppose that we want to infer the type of this expression: Our reasoning process might go something like: - to know the type of the field access \(\) I need the type of the record- to know the type of the record I need the type of thefield- the type of thefield \(set to\) is - the type of the record is - the type of the field access \(\) is … and the equivalent formal derivation would be: … and we can confirm in Haskell that the inferred type is indeed: ``` exampleRecord :: Expression exampleRecord = Record [("name", String "Alice"), ("age", Number 25)] exampleAccess :: Expression exampleAccess = FieldAccess exampleRecord "age" ``` ``` ghci> infer [] exampleAccess Right NumberType ``` ## Variables You might wonder why we don't just write the rule like this: In other words, why don't we consult the expression's*value*instead of the expression's*type*when inferring the field's type? After all, that would make our reasoning process much more direct for that last example: - to know the type of the field access \(\) I need the type of thefield \(set to\)- the type ofis - the type of the field access \(\) is … and the equivalent formal derivation would also be simpler: Consulting the value instead of the type would work for our current \(incredibly simple\) programming language, but would no longer work once we add support for variables because then an expression like this would be rejected: You can read that as assigning an expression \(\) to a local variable \(\) and then accessing thefield from that variable\. We can only infer a type for this expression if we consult's type\. We can't consult's value because right nowis just an unevaluated variable \(not a record literal\)\. We*could*evaluate our expression first to get's value but that would defeat the purpose of type checking: normally you type check an expression*before*you evaluate the expression in order to catch mistakes before evaluation begins[4](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fn-4)\. On that note, let's go ahead and add variables and variable assignment to our very minimal language: Nowis a valid expression \(which evaluates to\) and so is: … which evaluates to\. > **Note:**I'm not going to spell out the rules for evaluation in this post since I'm just focused on explaining type inference\. The type inference rule foris the first time we'll actually use the context \(\) that we've ignored so far: This says that in order to infer the type of aexpression we need to first infer the type of the right\-hand side of each local variable assignment \(,, …\)\. We then extend our type inference context \(\) with the inferred type of each assigned local variable \(\) while inferring the type of the final result \(\)\. ``` data Expression = … | Let [(Identifier, Expression)] Expression … infer context (Let [] expression) = do infer context expression infer context (Let ((x, assignment) : assignments) expression) = do assignmentType <- infer context assignment infer ((x, assignmentType) : context) (Let assignments expression) ``` This rule pairs with the type inference rule for variables, which is: … which you can read as saying "to infer the type of a variable named, look up's inferred type in the context"\. Ifis missing from the context then this rule does not match and we reject the expression with a type error \("unbound variable"\)\. Thehere is acting like a "named ellipsis" standing in for other inferred types that we don't care about right now\. ``` data Expression = … | Variable Identifier infer context (Variable identifier) = do case lookup identifier context of Just assignmentType -> return assignmentType Nothing -> Left "unbound variable" ``` Armed with those two new rules we can now write out a typing derivation to infer the type of our earlier example: This is essentially saying: - has typebecause…- has typebecause …- has type - … thereforehas type - … thereforehas type … and we can confirm that in Haskell, too: ``` exampleLet :: Expression exampleLet = Let [("r", Record [("x", Number 1)])] (FieldAccess (Variable "r") "x") ``` ``` ghci> infer [] exampleLet Right NumberType ``` ## Functions If that's all our programming language needed then type inference for anonymous records would be easy and most statically typed programming languages would support anonymous records\. However, every programming language also supports functions and that's where type inference begins to get tricky\. To see why, consider this TypeScript function: ``` const getName = person => person.name; ``` … which in the lambda calculus would be written asor in Nix would be written as: ``` getName = person: person.name; ``` What type would we infer for the function? In order to answer that question we need to first extend our syntax to useto denote an anonymous function of one or more arguments: … and we also need to add a new syntax for function types: … wheremeans a function whose input type isand whose output type is\. Then we can add a type inference rule for functions: This is the first type inference rule we've written that can't just be directly translated to Haskell code as written \(it*is*possible to codify using[unification](https://en.wikipedia.org/wiki/Unification_(computer_science)#Application:_type_inference)or similar, but that's outside the scope of this post\)\. However, you can read that rule as saying thathas typeifhas typewhenhas type\. Even without code, we can still use that rule to reason through what the type forshould be: In other words,is a function whose input type is\(a record with at least afield of type\) and whose output type is\. That makes sense\! So far, so good\. However, this is not*exactly*the type most programming languages would infer or expect\. Specifically, most programming languages do not support an ellipsis likein a record type, but that's because there are two slightly better ways to handle types like that\. ### Subtyping Some statically typed programming languages \(like TypeScript\) treat a record type likeas a subtype of all larger records that share the same fields\. In other words: Under this approach the ellipsis is redundant because all record types have an implied ellipsis\. That means that in TypeScript you can write: ``` const getName = (person: { name: string }) => person.name; ``` … and`getName`will still work even if you call it on a larger record \(e\.g\.`\{ name: "Alice", age: 25 \}`\)[5](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fn-5)\. This approach works okay, but there is an even better approach for handling extra record fields, which brings us to: ### Row polymorphism The second approach is that instead of*dropping*the ellipsis we*name*the ellipsis, meaning that instead of this: … we write something like this: This is the approach taken by languages like PureScript and Elm, where PureScript will write that type like this: ``` { name :: String | other } ``` … and Elm uses a similar syntax: ``` { other | name : String } ``` This has some nice upsides, which we'll get to in the next section, but first let's formalize what it means to "name the ellipsis"\. First, we'd create a new class of identifier \(known as a "row variable"\): > "Row" is an antiquated name for "a set of fields"\. Remember that the research into this sort of type inference occurred a long time ago, before JSON even existed\. However, the upside of the name is that type theorists get to be a little cheeky and use the greek letter\("rho"\) to represent a "row"\. Then we'd change the abstract syntax for record types to permit an optional row variable: … and update our type inference rule for field access to use a row variable instead of an ellipsis: This approach of using named ellipses is known as "row polymorphism" because it lets us abstract \(be "polymorphic"\) over the set of other fields \("row"\)\. ## Record extension You might wonder: why do we want to abstract over the set of other fields? In particular, why do we need to give a*name*to these other fields? One reason why is to support our next record operator: record extension\. We'll use this syntax: … which you can read as "extend the recordwith a fieldwhose value is"\. If the fieldalready exists withinthen this extension replaces \(a\.k\.a\. "shadows"\) the old field\. So let's add that to our abstract syntax for expressions: Now what type would we infer for a record update? Our first stab at type inference rule might look like this: … but that is**not**correct because the fieldmight already exist within the old record \(\), which meanswould include\. That in turn means thatis not necessarily a well\-formed record type because anfield included withinwould conflict with thewe're adding to the final record type\. In order to fix this we need some notation that says that the old record may or may not have a field named: … and nowdenotes all fields within the old record that were not\. This fixes the problem because now both occurrences ofcannot include the field\. > You can think ofas analogous to TypeScript's`x?: T`notation for the type of a field that might be absent\. With that type inference rule we can now infer the type of a function like this one: … which is: In other words, the inferred type: … says that this is a function whose input is a record that may or may not have afield plus some other fields \(\) and the output is a record with afield storing aplus those same other fields \(\)\. > This is an example where we don't want to use ellipses in place of\. If we were to replacewith ellipses: … then the type would no longer convey that the two ellipses represent the**same**set of extra fields\. If you were to write the same function in TypeScript the type would be a lot clumsier: ``` function crown<T extends object>(woman: T): Omit<T, 'queen'> & { queen: boolean } { return { ...woman, queen: true }; } ``` … because TypeScript doesn't have a way to abstract over sets of fields \(rows\)\. TypeScript can only abstract over types \(like`T`in the above example\), which makes this sort of type more awkward\. ## Conclusion I'm personally interested in row polymorphism for another reason: right now I'm working on a type checker and language server for Nix and one of the trickier parts of the language to type check is Nix's`//`operator \(the "record concatenation" operator that combines two records\)\. Row polymorphism is basically the only approach to type inference framework handles that operator well, which is explained in the paper[Type inference for record concatenation and multiple inheritance](https://www.sciencedirect.com/science/article/pii/089054019190050C)and that paper is going to be the basis of my next post\. In the meantime, though, if you liked this post then you might also like another post of mine,[Datatype unification using Monoids](https://haskellforall.com/2025/08/type-inference-for-plain-data), which presents a simple type inference algorithm for JSON\-like data\. ## Appendix Here's the fully worked Haskell program that combines all the prior snippets from the post: ``` {-# LANGUAGE OverloadedStrings #-} module Example where import Data.Map (Map) import Data.Text (Text) import qualified Data.Map as Map type Identifier = Text data Expression = Boolean Bool | String Text | Number Double | Record (Map Identifier Expression) | FieldAccess Expression Identifier | Let [(Identifier, Expression)] Expression | Variable Identifier data Type = BooleanType | StringType | NumberType | RecordType (Map Identifier Type) infer :: [(Identifier, Type)] -- ^ context, a.k.a. "Γ" (currrently unused) -> Expression -- ^ input expression -> Either Text Type -- ^ output inferred type (currently never fails) infer context (Boolean _) = do return BooleanType infer context (String _) = do return StringType infer context (Number _) = do return NumberType infer context (Record fields) = do fieldTypes <- traverse (infer context) fields return (RecordType fieldTypes) infer context (FieldAccess expression field) = do expressionType <- infer context expression case expressionType of RecordType fieldTypes -> case Map.lookup field fieldTypes of Just fieldType -> return fieldType Nothing -> Left "missing field" _ -> Left "not a record" infer context (Let [] expression) = do infer context expression infer context (Let ((x, assignment) : assignments) expression) = do assignmentType <- infer context assignment infer ((x, assignmentType) : context) (Let assignments expression) infer context (Variable identifier) = do case lookup identifier context of Just assignmentType -> return assignmentType Nothing -> Left "unbound variable" ``` 1. … the other big thing being good type inference for anonymous variants/unions \(a\.k\.a\. "polymorphic variants"\)[↩](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fnref-1) 2. There isn't really an official or standard notation for this sort of thing\. Type theory notation is something that authors just sort of make up as they go and while there are some common conventions there is still a lot of notational variation between authors\. In other words, it's just vibes and as long as it's understandable in context that's usually good enough\.[↩](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fnref-2) 3. Haskell in shambles[6](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fn-6)[↩](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fnref-3) 4. There are some situations where it does make sense to interleave type checking and evaluation and I discuss one such case in another post of mine:[Type\-safe eval in Grace](https://haskellforall.com/2026/01/typesafe-eval)\.[↩](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fnref-4) 5. … with the exception of a fresh object, which would be rejected by the[excess property checker](https://www.typescriptlang.org/docs/handbook/2/objects.html#excess-property-checks)\.[↩](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fnref-5) 6. Haskell does technically support accessing record fields nowadays using the`OverloadedRecordDot`language extension but it's still fairly unergonomic, in my opinion\. I still love Haskell to death, though, as you might guess from the name of my blog\.[↩](https://haskellforall.com/2026/06/record-type-inference-for-dummies#user-content-fnref-6)

Similar Articles

Data types à la carte (2008)

Lobsters Hottest

This paper presents a technique for composing data types and functions from independent components, and extends the approach to combine free monads, enabling a modular structuring of Haskell's IO monad.

Borrow-checking without type-checking

Lobsters Hottest

A blog post presents a toy language that enforces borrow-checking at runtime without static typing, using cheap reference-counting on the stack to enable interior pointers and single ownership in a dynamically-typed setting.

Types and Neural Networks

Hacker News Top

This article explores the theoretical and practical challenges of training LLMs to produce typed outputs natively, rather than relying on post-hoc typechecking, with a focus on formally typed languages like Idris, Lean, and Agda. It analyzes current ad-hoc approaches to enforcing types during inference and proposes rebuilding LLMs from the ground up to generate inherently typed outputs.

Static types and shovels (2026)

Lobsters Hottest

The author argues that the resurgence of static typing in the 2010s was due to improved type systems (e.g., TypeScript, Haskell, Rust) that offer nullable type handling, sum types, and type inference, contrasting with poor static typing in earlier languages like Java and C++98.

Deconstructing Datalog

Lobsters Hottest

The article describes the author's PhD dissertation 'Deconstructing Datalog', which integrates Datalog's recursive query capabilities into a typed functional language (Datafun) by using least prefix points and monotonicity tracking in the type system.