Odyssey: Constructing Verifiable Local Truth-Preserving Foundation Models

arXiv cs.AI 06/29/26, 04:00 AM Papers
foundation-models verifiability categorical-framework sheaf-theory argumentation topos-theory icml-2026
Summary
This paper introduces a categorical framework for constructing verifiable, local truth-preserving foundation models using composable foundries, implemented in the Odyssey system, and scheduled for a tutorial at ICML 2026.
arXiv:2606.27593v1 Announce Type: new Abstract: We introduce a categorical framework called ODYSSEY for constructing verifiable, local truth-preserving foundation models as compositions of foundries: building-block architectural components that specify a cover of local contexts, local representation families, restriction maps, gluing rules, obstruction policies, update obligations, and human-facing views. A foundry is an organized sheaf of knowledge that carries within it an argumentation component. Concrete foundries are built from generic foundries such as evidence/argument, operational decision, institutional/financial, market meaning, scientific challenge, research-program, assistant-build, and evaluation-harness foundries. Universal Foundry Learning (UFL) formalizes foundry construction as a composition of left and right Kan extensions, with left Kan extension rolling local artifacts into candidate foundries and right Kan extension enforcing the restriction, gluing, obstruction, and argumentation conditions required for promotion. Foundry SQL (FSQL) is a small typed query surface for slicing maintained foundry artifacts that uses TICKET (Topos Integration using Causal Kan Extension Transformers) certification for admitting external or pre-built models into durable ODYSSEY state. ODYSSEY is fully implemented and tested across a wide spectrum of concrete foundries, showing that the same categorical machinery supports domain construction, artifact replay, sheaf diagnostics, grounded Toulmin/local-LLM scrutiny, residual-obstruction ledgers, and optimized TICKET-compatible causal-claim extraction across heterogeneous sources. This paper is to be presented as a 2.5 hour tutorial at ICML 2026. The tutorial home page is at https://bit.ly/4ajS0nA.
# Constructing Verifiable, Local Truth-Preserving Foundation Models

Paper to be presented as a 2.5 hour tutorial at ICML 2026, July 6, Seoul, South Korea.
Source: [https://arxiv.org/html/2606.27593](https://arxiv.org/html/2606.27593)
Sridhar Mahadevan, Adobe Research and University of Massachusetts, Amherst (smahadev@adobe.com, mahadeva@umass.edu)

###### Abstract

We introduce a categorical framework for constructing verifiable, local truth-preserving foundation models as compositions of *foundries*: building-block architectural components that specify a cover of local contexts, local representation families, restriction maps, gluing rules, obstruction policies, update obligations, and human-facing views. The framework is implemented using a system called Odyssey, where a foundry is not just an organized sheaf of knowledge, but one that carries within it an argumentation component for answering not only “what does the foundry contain?” but also “why is this claim warranted here, and where would the argument fail?” Concrete foundries are built from generic foundries such as evidence/argument, operational decision, institutional/financial, market meaning, scientific challenge, research-program, assistant-build, and evaluation-harness foundries. The resulting models are sheaf-like families of local predictive and logical models whose consistency, provenance, promotion decisions, and failures to glue remain explicit artifacts. Foundry management is assigned to five agents: Scylla for human-facing inspection, Homer for source ingestion, Athena for construction and audit, and Prometheus for causal and predictive state, while Toulmin supplies the argumentative reasoning layer. Foundry SQL (FSQL) provides a typed query surface for maintained artifacts. To turn external pretrained models, including GPT-style models, into durable Odyssey foundries, we introduce TICKET, Topos Integration using Causal Kan Extension Transformers.
The ICML tutorial version also integrates BRIDGE/SKFM causal-geometry screens for latent causal refinement (Mahadevan, [2026c](https://arxiv.org/html/2606.27593#bib.bib22)), infinitesimal-causality diagnostics for local causal variation (Mahadevan, [2026a](https://arxiv.org/html/2606.27593#bib.bib20)), and SkillOpt optimization of natural-language admission policies for causal claim tickets (Yang et al., [2026](https://arxiv.org/html/2606.27593#bib.bib33)). We formalize our specific implementation of foundry construction and management as Universal Foundry Learning (UFL): the universal property is expressed as a composition of left and right Kan extensions, with left Kan extension rolling local artifacts into candidate foundries and right Kan extension enforcing the restriction, gluing, obstruction, and argumentation conditions required for promotion. Odyssey is fully implemented and tested across a wide spectrum of concrete foundries, showing that the same categorical machinery supports domain construction, artifact replay, sheaf diagnostics, grounded Toulmin/local-LLM scrutiny, BRIDGE/SKFM residual-obstruction ledgers, and optimized TICKET-compatible causal-claim extraction across heterogeneous sources. This paper is to be presented as a 2.5 hour tutorial at ICML 2026. The tutorial home page is at [https://bit.ly/4ajS0nA](https://bit.ly/4ajS0nA).

*Keywords* Foundation Models · Large Language Models · Toulmin Argumentation · Sheaves · Topos Theory

## 1 Introduction

Large language models are useful general\-purpose interfaces, but their dominant form of reuse is still a poor fit for many scientific, industrial, and operational settings\. A user rarely wants only a larger embedding or a flat summary\. They want a foundation model for a domain that can be designed, inspected, adapted, refreshed, and trusted: a model of a retailer, a research program, a corpus benchmark, a company filing workflow, a repair\-manual procedure, or a scientific controversy\. Such a model must preserve local evidence, expose uncertainty, remember provenance, and say when a claim should not be transported from one regime to another\.

Odyssey is a framework for constructing such foundation models as *foundries*. A foundry is a reusable model-making architecture: it specifies a cover of local contexts, local representation families, restriction maps, gluing rules, obstruction policies, update obligations, and human-facing views. Concrete foundries are built from generic foundries such as evidence/argument, operational decision, institutional/financial, market meaning, scientific challenge, research-program, assistant-build, and evaluation-harness foundries. Specialized foundries are typed compositions of these generic objects, for example a sporting-goods retailer foundry that combines review evidence, store operations, brand meaning, and corporate filing evidence.

Figure 1: Foundries as reusable building blocks for foundation models. Panel labels: evidence/argument, operational decision, market meaning, institutional/financial, scientific challenge, evaluation harness. Free foundry space: specialized foundation models are typed combinations and restrictions of generic foundries. Examples: DKS = retail/brand/review/filing; KET = research/evaluation over PTB, WikiText-2, WikiText-103; Amazon Reviews 2023 = corpus-benchmark; MyFixIt = procedural repair PSR; IKEA ASM = multimodal assembly PSR. The “basis vector” language is an algebraic analogy: generic foundries span a free space of typed model-making objects, and concrete foundation models are compositions, restrictions, and gluing decisions over those objects.

The central design principle is:

$$\text{representation}=\text{covering}+\text{gluing}.$$

Documents are not reduced to one vector; they become overlapping regions with local logics. Brands are not reduced to personas; they become overlapping customer, product, promise, and channel contexts. Companies are not reduced to summaries; they become overlapping financial, risk, narrative, and market regions. Agents are not reduced to policy vectors; they become overlapping state, action, goal, tool, and evaluation contexts. The system is engineered so that agreement and non-agreement are both durable artifacts.

Odyssey is implemented as a five-agent stack. Scylla translates a human request into a model-design brief and explains what the resulting foundry can responsibly answer. Homer turns the brief into an executable workflow skeleton. Athena assigns the representational semantics: covers, local models, truth values, restrictions, gluing rules, obstruction policies, and cross-sheaf bridges. Prometheus is the engine-room operator: it instantiates the plan as a Topos World Model, evaluates local predicates, audits gluing, emits obstruction reports, and builds Scylla-facing dashboards and durable JSON artifacts. Toulmin is the argumentation agent: it turns maintained foundry state into warranted claims, explicit grounds, backing, qualifiers, rebuttals, and context-sensitive justification.

#### Contributions\.

This paper makes six contributions\.

1. We introduce Odyssey, a foundry architecture for building inspectable foundation models as sheaf-like families of local predictive and logical models rather than as single opaque embeddings.
2. We specify the implementation contract among Scylla, Homer, Athena, Prometheus, and Toulmin, including the durable artifacts exchanged at each boundary.
3. We describe a foundry algebra in which generic foundries can be composed, specialized, restricted, glued, lifted, and audited to produce domain-bound foundry instances.
4. We introduce Foundry SQL (FSQL), a small typed query surface for slicing maintained foundry artifacts, and TICKET, Topos Integration using Causal Kan Extension Transformers, for admitting external or pre-built Prometheus models into durable Odyssey state.
5. We formalize the local predictive-state and sheaf-theoretic core: contexts, covers, local sections, finite truth values, restrictions, gluing diagnostics, obstruction records, and promotion gates.
6. We document the current foundry families and examples implemented in the repository, including storefront, brand, corporation, Dick’s Sporting Goods, Amazon Reviews 2023, MyFixIt, Indus Script, TCC 44K, research-program, assistant-build, embedding-evaluation, and grounded Toulmin/local-LLM comparison foundries.

## 2 Related Work

Odyssey is primarily a proposal about *foundation-model construction*: how a durable domain model is specified, assembled, inspected, repaired, transported, and maintained after the initial sources have been collected. This places the system at the intersection of foundation-model engineering, data-centric model development, domain-specific foundation models, agentic orchestration, causal and argumentative NLP, and sheaf-like consistency management.

The central departure is the unit of construction. Much foundation-model work describes a model as a checkpoint, service, dataset, benchmark score, or application pipeline. Odyssey instead treats a model-building effort as a *foundry*: a typed artifact family with a source surface, local contexts, representation families, restriction maps, gluing rules, obstruction ledgers, promotion gates, refresh obligations, and human-facing views. The five Odyssey agents decompose this construction problem. Scylla fixes the human-facing contract, Homer makes the workflow executable, Athena supplies the representation and gluing laws, Prometheus materializes the local world-model artifacts, and Toulmin turns maintained state into warranted, qualified, rebuttable claims. Related work is therefore best read through this construction stack: which parts of the lifecycle does each prior approach make explicit, and which parts remain implicit in model state, prompt context, or ad-hoc application code?

#### Construction of foundation models\.

The term *foundation model* was introduced to name the emerging class of models trained on broad data at scale and adapted across many downstream tasks (Bommasani et al., [2021](https://arxiv.org/html/2606.27593#bib.bib3)). Much of the subsequent literature treats construction as an engineering pipeline: assemble data, train at scale, serve efficiently, adapt to downstream tasks, and evaluate risks. Surveys of training and serving systems emphasize the computational, memory, bandwidth, parallelism, and deployment constraints that make foundation-model development a systems problem as much as a modeling problem (Zhou et al., [2024](https://arxiv.org/html/2606.27593#bib.bib34)). Data-centric work argues that model construction also depends on curation, attribution, benchmark design, knowledge transfer, and inference-time context, all of which are underrepresented if foundation models are described only by architecture and parameter count (Xu et al., [2024](https://arxiv.org/html/2606.27593#bib.bib30)).

Recent work on domain-specific foundation models makes the construction problem more explicit: the goal is not merely to fine-tune a general model, but to customize data, objectives, architecture, adaptation strategy, and evaluation to the structure of a particular industry or scientific domain (Chen et al., [2024](https://arxiv.org/html/2606.27593#bib.bib4)). Odyssey agrees with this domain-specific turn but shifts the unit of construction. A foundry is not only a trained model checkpoint. It is a typed recipe for building a durable domain model: source surfaces, local contexts, representation families, restriction maps, gluing rules, obstruction policies, update contracts, and human-facing views.

There is also a growing engineering literature around how foundation models are released, observed, guided, and embedded into agentic systems. The Model Openness Framework argues that reproducibility and usability require releasing components across the model development lifecycle, not only weights (White et al., [2024](https://arxiv.org/html/2606.27593#bib.bib29)). Work on foundation-model “sherpas” studies ways that agents can guide models through knowledge augmentation, prompting, reasoning support, updating, and output evaluation (Bhattacharjya et al., [2024](https://arxiv.org/html/2606.27593#bib.bib2)). Agent design-pattern catalogues similarly treat foundation-model applications as architectures with memory, planning, tool use, reflection, and accountability tradeoffs (Liu et al., [2024](https://arxiv.org/html/2606.27593#bib.bib15)). Odyssey is closest to this architectural view, but its emphasis is representational and argumentative rather than only procedural: Scylla, Homer, Athena, Prometheus, and Toulmin make the domain model itself an inspectable sheaf-like artifact before it is used by a human or agent.

#### Causal relation extraction from text\.

There is a long line of work on identifying causal relations in natural language, from cue-phrase and pattern-based systems to neural classifiers; see surveys of causal relation extraction and event causality identification (Yang et al., [2022](https://arxiv.org/html/2606.27593#bib.bib32); Cheng et al., [2025](https://arxiv.org/html/2606.27593#bib.bib5)). Classical systems usually predict whether a pair of spans or events in a sentence stands in a causal relation, while corpus-scale work connects such local predictions to event forecasting or explanatory retrieval (Radinsky et al., [2012](https://arxiv.org/html/2606.27593#bib.bib27)). Prometheus uses such extracted relations as evidence units, but the paper’s object of study is downstream: how thousands of local claims should be localized, compared, transported, or blocked across corpus regions.

#### Causal knowledge bases and graphs from corpora\.

Causal knowledge-base projects mine cause–effect tuples from large corpora and aggregate them into graph-structured resources (Hassanzadeh et al., [2020](https://arxiv.org/html/2606.27593#bib.bib9)). This graph-building perspective is close to the first Democritus contribution, where LLM-generated causal statements are compiled into local causal models and larger causal atlases (Mahadevan, [2025a](https://arxiv.org/html/2606.27593#bib.bib16)). Prometheus extends that line by treating local graphs and cSQL rows as observations for local causal PSRs. The global object is therefore not one merged graph but a sheaf-like family of charts whose overlaps reveal agreement, drift, contradiction, and underdetermination.

#### LLMs for causal discovery and reasoning\.

A growing literature asks whether LLMs can propose causal directions, graph structures, interventions, or counterfactual explanations from variable descriptions and textual context (Kıcıman et al., [2024](https://arxiv.org/html/2606.27593#bib.bib12); Le et al., [2024](https://arxiv.org/html/2606.27593#bib.bib14)). Prometheus is deliberately more conservative. It does not treat the LLM as an oracle for ground-truth causal discovery. Instead, the LLM helps surface causal discourse: claims, mechanisms, modifiers, regimes, and source passages that can be normalized, audited, and compared. Local intervention probes in the atlas are therefore model-internal research tests unless paired with external data and identification assumptions.

#### Agentic systems for automated scientific discovery\.

Recent systems also aim to automate larger portions of the scientific workflow. The AI Scientist-v2, for example, uses agentic tree search to propose hypotheses, design and execute machine-learning experiments, analyze and visualize results, and write scientific manuscripts (Yamada et al., [2025](https://arxiv.org/html/2606.27593#bib.bib31)). This line of work is close in ambition to Prometheus: both ask how AI systems can participate in scientific discovery rather than merely answer questions about existing papers. The emphasis is different. AI Scientist-v2 organizes autonomous experimentation and manuscript generation, primarily in machine-learning research settings. Prometheus instead constructs an explicit causal topos world model from heterogeneous research artifacts (text, data, figures, source code, and scientific models) so that local claims, gluing failures, evidentiary limits, and grounded counterfactual revisions remain inspectable. In this sense, Prometheus can be viewed as a complementary world-model layer for scientific agents: it records what a research substrate supports, where it does not glue, and which counterfactuals can actually be evaluated.

#### Prometheus as prior work\.

The Prometheus paper introduced Topos World Models for deep causal research: language-derived causal episodes, local causal PSRs, restriction maps, gluing diagnostics, persistent causal atlases, and grounded counterfactual rebuilding over scientific substrates (Mahadevan, [2026d](https://arxiv.org/html/2606.27593#bib.bib23)). We do not repeat that contribution here. Odyssey treats Prometheus as one engine-room component inside a broader foundation-model foundry stack. The new questions in this paper are architectural and empirical: how human requests become typed foundry expressions; how Scylla, Homer, Athena, Prometheus, and Toulmin exchange artifacts; what families of foundries have been built; and whether the first procedural PSR foundry, MyFixIt, yields useful preliminary retrieval behavior.

#### Causality\-aware NLP\.

More broadly, causal ideas have been used to study text effects, counterfactual augmentation, representation robustness, and explanations for NLP systems (Feder et al., [2022](https://arxiv.org/html/2606.27593#bib.bib6)). Prometheus points in the opposite direction: it uses NLP and LLM extraction to construct explicit causal artifacts for human research. The Claims Atlas is meant to be inspected, corrected, extended, and rerun, so provenance and gluing failures are part of the output rather than post-hoc debugging aids.

## 3 Odyssey System Architecture

Odyssey separates foundation-model construction into five roles with explicit artifact boundaries. The point of this separation is not only software modularity. It is also epistemic discipline: a human request, an executable workflow, a representational semantics, an instantiated world model, and an argumentative justification layer should not be confused with one another.

Table 1: The implemented Odyssey handoff contract. Each module emits a durable object that can be inspected independently and reused by later stages.

The current implementation is deliberately deterministic in several places. For example, Scylla uses domain templates to compile broad requests into model briefs, and Prometheus validation runs evaluate finite truth predicates using stable rules. This is a feature of the present design stage: it lets us test the foundry contract, artifact schemas, and user-facing surfaces before replacing individual components with richer parsers, learned classifiers, retrieval systems, or external model calls.

#### Scylla: request to brief\.

Scylla names the user’s request in human terms\. A brief records the domain, goal, entities, decisions, uncertainties, trust questions, and output contract\. For a corporation foundry, for example, Scylla identifies financial statements, risk factors, management narrative, strategy signals, and market context as objects of concern\. For an Indus Script foundry, it separates symbols, inscription sequences, statistical structure, decipherment hypotheses, archaeological context, and uncertainty\. This prevents the system from answering as if every domain were a generic retrieval task\.

#### Homer: brief to workflow\.

Homer compiles the brief into an executable program skeleton: collect sources, request an Athena plan, instantiate the Prometheus model, run gluing audits, emit dashboards, and schedule refresh obligations\. In process foundries such as research\-program, assistant\-build, or evaluation\-harness foundries, Homer also records maintenance cadence, policy gates, simulation surfaces, and artifact rebuild obligations\.

#### Athena: workflow to sheaf plan\.

Athena owns the representational semantics\. It chooses the local contexts that form the cover, assigns each context a local representation type, lists predicates that can be evaluated inside the context, declares overlaps, and defines gluing logic\. The current truth\-value strategy uses the finite partition

$$\bot,\ \mathrm{WEAK},\ \mathrm{PLAUSIBLE},\ \mathrm{SUPPORTED},\ \top,$$

translated for users as unsupported, weak signal, plausible, well supported, and directly confirmed. An overlap glues when shared predicates are at least plausible and no paired predicate is bottom; otherwise Athena asks Prometheus to emit an obstruction record and a next-observation recommendation.
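The gluing rule can be illustrated with a minimal Python sketch. The names `Truth` and `glue_overlap` are hypothetical and are not taken from the Odyssey repository. The sketch assumes one particular reading of the rule: every shared predicate must be at least PLAUSIBLE on both sides of the overlap (which already excludes bottom), and an overlap with no shared predicates glues vacuously.

```python
from enum import IntEnum


class Truth(IntEnum):
    """Finite truth values, ordered from weakest to strongest."""
    BOTTOM = 0      # unsupported
    WEAK = 1        # weak signal
    PLAUSIBLE = 2   # plausible
    SUPPORTED = 3   # well supported
    TOP = 4         # directly confirmed


def glue_overlap(left, right):
    """Check one overlap between two local contexts.

    `left` and `right` map predicate names to Truth values.
    Returns (glues, obstructions), where `obstructions` lists the shared
    predicates that fall below PLAUSIBLE on at least one side.
    Assumption: this is one reading of the rule, not the audited policy.
    """
    shared = sorted(set(left) & set(right))
    obstructions = [
        p for p in shared
        if min(left[p], right[p]) < Truth.PLAUSIBLE
    ]
    return (not obstructions, obstructions)


# Hypothetical overlap between a review context and a store context.
reviews = {"price_fair": Truth.SUPPORTED, "stock_ok": Truth.WEAK}
store = {"price_fair": Truth.PLAUSIBLE, "stock_ok": Truth.TOP}
glues, blocked = glue_overlap(reviews, store)
```

In this sketch the returned list of blocking predicates plays the role of a very small obstruction record; a next-observation recommendation would be attached separately.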

#### Prometheus: plan to Topos World Model\.

Prometheus instantiates Athena’s plan as a compact Topos World Model. It emits local sections, restriction maps, gluing results, integration layers, update status, evidence sources, and HTML views. For domains with source sheaves, Prometheus also builds auxiliary sheaves such as the Dick’s Sporting Goods shopping-experience sheaf, the Amazon Reviews 2023 corpus sheaf, the Indus evidence sheaf, and procedural PSR candidates over MyFixIt repair manuals. Prometheus is therefore the engine room of Odyssey, but not the entire system: it runs after Scylla, Homer, and Athena have fixed the intent, workflow, and representation contract.

#### Toulmin: foundry state to argument\.

Toulmin is the argumentation agent layered over maintained foundry state. It does not replace Prometheus; Prometheus remembers and audits local sections, while Toulmin turns those sections into defensible arguments. A Toulmin ticket records a claim, its data or grounds, the warrant licensing the inference, backing for that warrant, qualifiers describing the force of the claim, rebuttals that restrict where it applies, and the contexts over which the argument can be transported. This makes the evidence/argument foundry an active structure: Odyssey can answer not only “what does the foundry contain?” but also “why is this claim warranted here, and where would the argument fail?”
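A Toulmin ticket with the fields listed above could be represented as a small record. The following is a sketch only: the class name `ToulminTicket`, the field types, and the `transportable_to` helper are assumptions made for illustration and do not reflect the actual Odyssey schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToulminTicket:
    """Hypothetical record mirroring the Toulmin roles named in the text."""
    claim: str
    grounds: tuple[str, ...]            # data supporting the claim
    warrant: str                        # licence for the inference
    backing: tuple[str, ...] = ()       # support for the warrant
    qualifiers: tuple[str, ...] = ()    # force of the claim
    rebuttals: tuple[str, ...] = ()     # conditions that restrict the claim
    contexts: frozenset[str] = frozenset()  # where the argument may travel

    def transportable_to(self, context: str) -> bool:
        """True if the argument is declared valid in `context`."""
        return context in self.contexts


# Illustrative content only; not drawn from any Odyssey artifact.
ticket = ToulminTicket(
    claim="In-store pickup is reliable",
    grounds=("review cluster R12",),
    warrant="Consistent reviews indicate consistent service",
    qualifiers=("plausible",),
    rebuttals=("holiday-season reviews excluded",),
    contexts=frozenset({"store_operations", "review_evidence"}),
)
```

Making the record immutable is a design choice in this sketch: a ticket that changes should be re-issued rather than edited in place, so that earlier arguments remain auditable.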

## 4 Foundry Algebra

Odyssey is organized around a small algebra of foundries. A generic foundry is a reusable representational or process architecture. Representational foundries include Toulmin-style evidence/argument, operational decision, institutional/financial, market meaning, scientific challenge, document coherence, database-atlas incorporation, and Prometheus-model incorporation foundries. Process foundries include research-program, assistant-build, evaluation-harness, product-development, codebase-evolution, and result-communication foundries.

A specialized foundry is a typed expression built from generic foundries\. In the current demos:

$$\begin{array}{l}
\mathrm{DKS}=\mathrm{specialize}(\mathrm{compose}(\mathrm{storefront\_operations},\mathrm{brand\_meaning},\\
\qquad\mathrm{corporation\_financial},\mathrm{review\_evidence}),\mathrm{domain}=\text{``Dick's Sporting Goods''}),\\[5.69054pt]
\mathrm{Indus}=\mathrm{specialize}(\mathrm{compose}(\mathrm{scientific\_challenge},\mathrm{evidence\_argument},\\
\qquad\mathrm{symbol\_sequence\_model},\mathrm{hypothesis\_lattice},\mathrm{archaeology\_context}),\mathrm{domain}=\text{``Indus Script''}).
\end{array}$$

The formal operations include composition, specialization, restriction, gluing, product, coproduct, quotient, and lift. In implementation terms, these operations determine which local sections must exist, which overlaps must be checked, which source sheaves may be bridged, and which artifacts become durable state.

This algebra gives Odyssey a discipline for growth. New demos should not enter the system as unrelated scripts. They should declare the generic foundries they instantiate, the target domain, the local contexts, the restriction maps, and the promotion gate that decides when an output can become maintained foundry state.
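As a rough illustration of what a typed foundry expression might look like in code, the sketch below encodes only `compose` and `specialize` as an expression tree and recovers the declared generic foundries from it. The classes `Generic`, `Compose`, and `Specialize` and the helper `generic_names` are hypothetical; the remaining operations (restriction, gluing, product, coproduct, quotient, lift) are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Generic:
    """A named generic foundry."""
    name: str


@dataclass(frozen=True)
class Compose:
    """Composition of several foundry expressions."""
    parts: tuple


@dataclass(frozen=True)
class Specialize:
    """A foundry expression bound to a target domain."""
    base: object
    domain: str


def compose(*parts):
    return Compose(tuple(parts))


def specialize(base, domain):
    return Specialize(base, domain)


def generic_names(expr):
    """List the generic foundries an expression declares, in order."""
    if isinstance(expr, Generic):
        return [expr.name]
    if isinstance(expr, Compose):
        return [n for part in expr.parts for n in generic_names(part)]
    if isinstance(expr, Specialize):
        return generic_names(expr.base)
    raise TypeError(f"not a foundry expression: {expr!r}")


# The DKS expression from the text, written in this hypothetical encoding.
DKS = specialize(
    compose(
        Generic("storefront_operations"),
        Generic("brand_meaning"),
        Generic("corporation_financial"),
        Generic("review_evidence"),
    ),
    domain="Dick's Sporting Goods",
)
```

Keeping the expression as data rather than as an executed script is what would let a registry check, before any run, which generic foundries and which domain a new demo claims to instantiate.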

## 5 Foundry SQL and TICKET Admission

Odyssey uses Foundry SQL (FSQL) as a compact typed query surface over foundry artifacts. Standard SQL queries build typed constructions over tables; FSQL generalizes the same idea from rows and joins to foundry sections, source sheaves, restriction maps, gluing checks, and promotion gates. In this analogy, tables become local charts and artifact stores, rows become claims or model states, foreign keys become shared identifiers and provenance links, joins become gluing checks, views become Scylla-facing dashboards, and constraints become Athena and Prometheus audits.

### 5.1 Universal Foundry Learners

The categorical template behind the admission story is the Universal Decision Learner (UDL) pattern from Mahadevan ([2025c](https://arxiv.org/html/2606.27593#bib.bib18)). UDL characterizes a decision rule by its universal property rather than by one numerical algorithm.

###### Definition 5.1 (Universal Decision Learner).

Given local decision data $J:\mathcal{O}\to\mathcal{C}$ and $D:\mathcal{O}\to\mathcal{D}$, a Universal Decision Learner is the composite Kan-extension semantics

$$\mathrm{UDL}_{J}(D)=\mathrm{Ran}_{J}(\mathrm{Lan}_{J}D),$$

whenever the displayed Kan extensions exist, with the evident restriction or precomposition steps understood when needed for typing. More generally, UDL denotes any decision semantics obtained by composing left and right Kan extensions along problem-specific inclusions of local context into global context.

Operationally, the construction has two stages:

$$\text{local decision data}\xrightarrow{\ \mathrm{Lan}\ }\text{rolled-out candidates}\xrightarrow{\ \mathrm{Ran}\ }\text{globally consistent decisions}.$$

The left Kan stage expands, aggregates, or propagates information. The right Kan stage filters, glues, or enforces consistency. In reward-enriched settings, for example over the max-plus semiring, the left Kan stage recovers the familiar dynamic-programming rollout

$$(\mathrm{Lan}_{J}D)(c)=\max_{Jd\to c}\bigl(D(d)+w(d\to c)\bigr),$$

while the right Kan stage imposes the tightest downstream inequalities induced by all continuations. In deterministic one-step planning this specializes to the Bellman recurrence

$$V(s)=\max_{a}\{r(s,a)+V(T(s,a))\}.$$
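The two formulas above can be made concrete on a toy example. The sketch below is illustrative only and makes several assumptions not stated in the text: the graph, rewards, and state names are invented; $J$ is treated as the identity on node names, so arrows $Jd\to c$ are just weighted edges; the planning problem is acyclic; and states with no actions are given value 0.

```python
from functools import lru_cache


def max_plus_rollout(D, edges):
    """One max-plus rollout step: out[c] = max over edges d -> c of D[d] + w.

    `D` maps source nodes to values; `edges` maps (d, c) pairs to weights.
    """
    out = {}
    for (d, c), w in edges.items():
        candidate = D[d] + w
        out[c] = max(out.get(c, float("-inf")), candidate)
    return out


# Hypothetical deterministic, acyclic planning problem:
# transitions[state][action] = (reward, next_state)
transitions = {
    "s0": {"a": (1.0, "s1"), "b": (0.0, "s2")},
    "s1": {"a": (2.0, "s3")},
    "s2": {"a": (5.0, "s3")},
    "s3": {},  # terminal state: no actions (assumed value 0)
}


@lru_cache(maxsize=None)
def value(state):
    """Bellman recurrence V(s) = max_a { r(s, a) + V(T(s, a)) }."""
    actions = transitions[state]
    if not actions:
        return 0.0
    return max(r + value(nxt) for r, nxt in actions.values())


rolled = max_plus_rollout({"x": 1.0, "y": 3.0},
                          {("x", "z"): 4.0, ("y", "z"): 1.0})
best = value("s0")
```

Here `rolled` holds the rolled-out value at node `z`, and `best` is the optimal return from `s0`; memoization stands in for the usual backward sweep of dynamic programming.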
###### Definition 5.2 (Universal Foundry Learner).

Let $x$ be a candidate source artifact, $Y$ a target foundry, and

$$F_{x,Y}:\mathcal{C}_{x}\to\mathcal{C}_{Y}$$

the declared source-to-target interface functor. If $X:\mathcal{C}_{x}^{op}\to\mathsf{Data}$ is the candidate presheaf, define its admitted target-side rollout by

$$A=\mathrm{Lan}_{F_{x,Y}}X.$$

The Universal Foundry Learner associated with this TICKET interface is the right-Kan consistency envelope

$$\mathrm{UFL}_{x,Y}(X)=\mathrm{Ran}_{F_{x,Y}}F_{x,Y}^{*}A.$$

Thus UFL is the foundry-construction specialization of UDL: left Kan admission constructs the least target-side foundry candidate, and right Kan consistency collects the target obligations visible through the same interface.

###### Theorem 5.3 (Canonical foundry rollout).

Let $X:\mathcal{C}_{x}^{op}\to\mathsf{Data}$ and $F_{x,Y}:\mathcal{C}_{x}\to\mathcal{C}_{Y}$ be as above. If $A=\mathrm{Lan}_{F_{x,Y}}X$ exists, then for every target foundry model $G:\mathcal{C}_{Y}^{op}\to\mathsf{Data}$ there is a natural bijection

$$\mathrm{Nat}(A,G)\cong\mathrm{Nat}(X,F_{x,Y}^{*}G).$$

Hence $A$ is initial among target-side foundry candidates receiving the local source artifact $X$.

###### Proof\.

This is the defining universal property of the left Kan extension, read in the foundry category. Comparing the rolled-out target candidate $A$ with any target model $G$ is equivalent to comparing the source artifact $X$ with the restriction of $G$ along the declared TICKET interface. ∎

###### Theorem 5.4 (Canonical foundry consistency).

If $\mathrm{Ran}_{F_{x,Y}}F_{x,Y}^{*}A$ exists, then every target model $G:\mathcal{C}_{Y}^{op}\to\mathsf{Data}$ whose restriction is compatible with the rolled-out candidate admits a canonical comparison map

$$G\longrightarrow\mathrm{UFL}_{x,Y}(X).$$

When this comparison is an isomorphism, $G$ computes the same foundry semantics as the Universal Foundry Learner.

###### Proof\.

This is the defining universal property of the right Kan extension. It says that globally defined foundry models satisfying the source-visible target obligations factor through the canonical consistency envelope. ∎

The first maintained FSQL admission construction is TICKET:

$$\textsc{TICKET}=\text{Topos Integration using Causal Kan Extension Transformers}.$$

The phrase “Kan Extension Transformer” comes from the broader categorical learning program developed in Mahadevan ([2025c](https://arxiv.org/html/2606.27593#bib.bib18)) and in the Kan Extension Transformers arXiv manuscript (Mahadevan, [2026b](https://arxiv.org/html/2606.27593#bib.bib21)). In the language-modeling setting, a KET is a neural sequence model whose inductive bias is organized by Kan-extension-style transport between local contexts. TICKET uses the same categorical idea, but not the same object. It is not a text generator and it is not a replacement for the KET language model. It is an admission transformer: a typed procedure that extends a source model along a declared map into an Odyssey target foundry, then tests whether the transported state satisfies the target cover, restrictions, and gluing laws. [Appendix A](https://arxiv.org/html/2606.27593#A1) gives the appendix-level categorical specification, including the source-to-target functor, the left-Kan admission move, the right-Kan UDL consistency pass, and the formal gluing operation used by the GUI TICKET cards.

Thus, TICKET is the bridge between *model construction* and *maintained foundry state*\. A Prometheus run, CSQL atlas, benchmark repository, repair\-manual slice, or scientific\-evidence bundle may already have local structure, but Odyssey does not admit it merely because it exists\. The source must be sliced into local sections, transported into the target foundry, checked against Athena’s representation contract, and compared with maintained state\. Only then can it be promoted, quarantined for later repair, or preserved as an obstruction record\.

\[Figure 2 diagram\. Nodes: source artifact $X:\mathcal{C}_{x}^{op}\to\mathsf{Data}$ \(documents, events, claims, traces\); left\-Kan rollout $\mathrm{Lan}_{F_{x,Y}}X=A$ \(document\-claim candidate\); target foundry $Y$ \(site $\mathcal{C}_{Y}$, cover, restrictions, Toulmin roles\); right\-Kan / pullback scrutiny $\mathrm{Ran}_{F_{x,Y}}(F_{x,Y}^{*}A)$ \(consistency of claim, grounds, warrant\); maintained state $M_{Y}$ \(existing local sections and obstruction ledger\); Toulmin ticket \(claim, grounds, warrant, qualifier, rebuttal\); TICKET decision \(promote, quarantine, or preserve obstruction\)\. Edge labels: $F_{x,Y}$, scrutinize rollout, glue / compare, argument roles, candidate claim\.\]

Figure 2: TICKET as Odyssey’s admission and consistency operator\. The left Kan extension rolls a structured source artifact into the target foundry as a document\-claim candidate\. The right Kan / pullback pass supplies the UDL\-style consistency check: the same interface asks whether the candidate claim, its grounds, and its warrant survive Toulmin scrutiny against the target cover and maintained foundry state\. Promotion occurs only when the rolled\-out claim is warranted, qualified, and compatible with the existing obstruction ledger\.

Table 2: TICKET specializes the Kan Extension Transformer idea to foundry ingestion\. The transformer acts on structured model artifacts and sheaf contracts, not directly on next\-token prediction\.

Operationally, TICKET admits a pre\-built or external model into Odyssey by slicing the source run, transporting its cover into a target foundry by a Kan\-extension\-style transformer, checking causal orientation and provenance, auditing restrictions and gluing, and then deciding whether the candidate state is promoted, quarantined, or held as an obstruction\. The current admission checks are:

$$\begin{gathered}\texttt{typed\_manifest},\ \texttt{subobject\_classifier},\ \texttt{restriction\_maps},\\ \texttt{gluing\_audit},\ \texttt{j\_closure},\ \texttt{promotion\_gate}.\end{gathered}$$

A typical FSQL surface has the form

```
SELECT object_type, object_id, status, source
FROM prometheus_runs
SLICE BY run("tcc_44k")
TICKET BY target_foundry
```

and returns the transformer, target foundry, checks, promotion record, and refresh obligations that govern admission\. The TCC 44K example treats the Testing Causal Claims cSQL atlas as a source causal\-claims foundation model: nodes, edges, support rows, methods, journals, p\-values, and causal\-direction metadata are lifted into a causal\-claims foundry while weak joins and direction uncertainties remain visible as guardrails\.
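The admission flow described above can be sketched in Python\. Only the six check names come from the text; the `AdmissionRecord` type, the `admit` and `make_checks` functions, and in particular the split between hard failures \(held as obstructions\) and soft failures \(quarantined\) are illustrative assumptions, not the Odyssey implementation\.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Check names are taken from the text; everything else is illustrative.
ADMISSION_CHECKS = (
    "typed_manifest",
    "subobject_classifier",
    "restriction_maps",
    "gluing_audit",
    "j_closure",
    "promotion_gate",
)

# ASSUMPTION: failures of these checks are held as obstructions;
# any other failure leads to quarantine. The paper does not state this split.
HARD_CHECKS = frozenset({"typed_manifest", "restriction_maps"})


@dataclass
class AdmissionRecord:
    target_foundry: str
    results: Dict[str, bool]
    failed: List[str]
    decision: str  # "promote", "quarantine", or "preserve_obstruction"


def admit(candidate: dict,
          target_foundry: str,
          checks: Dict[str, Callable[[dict], bool]]) -> AdmissionRecord:
    """Run every admission check and derive a three-way decision."""
    results = {name: bool(checks[name](candidate)) for name in ADMISSION_CHECKS}
    failed = [name for name in ADMISSION_CHECKS if not results[name]]
    if not failed:
        decision = "promote"
    elif HARD_CHECKS.intersection(failed):
        decision = "preserve_obstruction"
    else:
        decision = "quarantine"
    return AdmissionRecord(target_foundry, results, failed, decision)


def make_checks(failing=()):
    """Build toy checks: each one passes unless its name is in `failing`."""
    return {name: (lambda cand, n=name: n not in failing)
            for name in ADMISSION_CHECKS}
```

Keeping the full `results` map in the record, rather than only the decision, mirrors the text's point that the checks themselves are part of what an FSQL query returns\.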

## 6 BRIDGE/SKFM Refinement and SkillOpt Integration

The ICML tutorial version of Odyssey includes a new experimental lane that connects three pieces of the system: BRIDGE/SKFM causal\-geometry screens \(Mahadevan, [2026c](https://arxiv.org/html/2606.27593#bib.bib22)\), TICKET admission, and SkillOpt skill optimization\. The goal is not to collapse these into one optimizer\. BRIDGE/SKFM supplies a latent causal refinement foundry, Odyssey supplies the local sections, Toulmin qualifications, restriction maps, and promotion gates, and SkillOpt treats the natural\-language extraction policy itself as a trainable artifact\.

#### BRIDGE/SKFM foundry contract\.

The implemented contract is a generic *BRIDGE/SKFM latent causal refinement foundry*\. It refines existing Odyssey foundries by mapping local causal, argument, filing, repair, or product\-feedback charts into typed variables, influence masks, candidate edge masks, Frobenius/Lie residual ratios, latent\-obstruction ledgers, and optional SKFM vector\-field diagnostics\. Four source\-family lanes are currently declared:

$$\text{Democritus PDF causal corpus},\quad\text{MyFixIt repair PSRs},\quad\text{10-K filing arguments},\quad\text{brand/product feedback}.$$

The promotion rule is deliberately conservative: residual\-bearing pairs are kept as obstruction candidates unless the source family, local section, Toulmin warrant, qualifier, and missing\-context explanation are preserved\. Numeric SKFM training is enabled only when calibrated traces, interventions, probes, or comparable field targets exist; otherwise the current stage is reported as a proxy\-field BRIDGE screen rather than as a learned causal vector field\.

#### Stage\-0 BRIDGE/SKFM experiments\.

The first Democritus PDF scale\-up screened 29 article charts grouped into 6 cluster sections using Kimi 2\.5 served through a local `exo` OpenAI\-compatible endpoint\. It found 1 gluable mechanism overlap and 7 overlaps requiring review\. The Lie\-BRIDGE adapter emitted 15 language variables, 29 claim fields, 15 candidate edges, and 6 latent obstruction pairs\. These artifacts are useful precisely because they do not pretend that latent variables have been discovered as named hidden causes\. They are reviewable residual witnesses over causal\-PSR cells, suitable for later numeric BRIDGE/SKFM scoring when stronger intervention or trajectory data are available\.

The same contract has deterministic adapters for MyFixIt, 10\-K, brand feedback, and Democritus language\-extraction slices\. In the MyFixIt lane, repair\-manual action\-observation fields are converted into tool, part, verb, image\-context, and future\-test variables\. A follow\-up retrieval experiment separates two effects: simply adding BRIDGE vocabulary did not improve the retrieval metric, whereas adding manual\-context upper\-bound fields improved MRR and moved the Phillips\-screw failure slice to rank 1\. The 10\-K, brand\-feedback, and Democritus language\-extraction adapters likewise improve targeted section, theme, stance, or causal\-language retrieval while retaining latent obstruction rows for audit\.

#### SkillOpt as policy optimization for TICKET admission\.

SkillOpt is integrated through an `odyssey_causal_claim` environment \(Yang et al\., [2026](https://arxiv.org/html/2606.27593#bib.bib33)\)\. Each rollout gives a model a short scientific, news, or business passage and asks for a strict JSON causal claim bundle: claim, cause, effect, relation, qualifier, evidence span, mechanism, scope, guardrail, and TICKET status\. The scorer combines field accuracy with Odyssey\-specific diagnostics: TICKET compatibility, diagrammatic consistency energy inspired by infinitesimal\-causality tests \(Mahadevan, [2026a](https://arxiv.org/html/2606.27593#bib.bib20)\), graph similarity, and a KET\-style transfer score across local contexts\. The optimized object is the Markdown skill that instructs the target model how to avoid causal overclaiming, preserve source scope, downgrade associations, and emit admission\-compatible fields\.
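A minimal sketch of the strict\-bundle check and the field\-accuracy part of the scorer follows\. The ten fields are those listed above, but the snake\_case JSON keys, the function names `parse_claim_bundle` and `field_accuracy`, and the use of normalized exact match as the accuracy measure are assumptions made for illustration; the other diagnostics in the scorer are not modeled here\.

```python
import json
from typing import Dict

# ASSUMPTION: snake_case keys. The text lists the fields, not their JSON names.
BUNDLE_FIELDS = (
    "claim", "cause", "effect", "relation", "qualifier",
    "evidence_span", "mechanism", "scope", "guardrail", "ticket_status",
)


def parse_claim_bundle(raw: str) -> Dict[str, str]:
    """Parse model output and enforce the exact field set (no missing, no extra)."""
    bundle = json.loads(raw)
    if not isinstance(bundle, dict):
        raise ValueError("claim bundle must be a JSON object")
    missing = [f for f in BUNDLE_FIELDS if f not in bundle]
    extra = [k for k in bundle if k not in BUNDLE_FIELDS]
    if missing or extra:
        raise ValueError(f"missing={missing} extra={extra}")
    return {f: str(bundle[f]) for f in BUNDLE_FIELDS}


def field_accuracy(predicted: Dict[str, str], gold: Dict[str, str]) -> float:
    """Fraction of fields that match after case and whitespace normalization."""
    def norm(value: str) -> str:
        return " ".join(value.lower().split())

    hits = sum(norm(predicted[f]) == norm(gold[f]) for f in BUNDLE_FIELDS)
    return hits / len(BUNDLE_FIELDS)
```

Rejecting extra keys as well as missing ones is one way to read “strict”; a looser environment could ignore unknown keys instead\.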

Table 3: SkillOpt runs for the `odyssey_causal_claim` environment\. All three runs use the same split and local OpenAI\-compatible model route\. The important artifact is not only the final score, but the optimized skill and the failure traces that explain which admission rules, qualifiers, or guardrails changed\.

This integration gives a concrete empirical role to the categorical machinery\. Diagrammatic backpropagation contributes consistency losses over the claim diagram; infinitesimal\-causality diagnostics test local causal variation; geometric\-transformer\-style comparison scores relational fit between predicted and admitted claim graphs; KET\-style transfer tests whether a skill learned in one local regime extends to neighboring domains; and BRIDGE/SKFM records residual\-bearing pairs that should remain local until stronger evidence justifies promotion\. SkillOpt then optimizes the human\-readable policy used by the model, while Odyssey preserves the admission ledger that decides whether the resulting claims become maintained foundry state\.

## 7 From Democritus to Prometheus

Democritus is the predecessor pipeline: a language\-to\-causal\-model system for compiling documents into local causal models, causal databases, and interactive diagnostic artifacts \(Mahadevan, [2025a](https://arxiv.org/html/2606.27593#bib.bib16)\)\. A public implementation of the released Democritus client is available as `Democritus_OpenAI` \(Mahadevan, [2025d](https://arxiv.org/html/2606.27593#bib.bib19)\)\. The broader categorical machine learning background for this line of work is developed in Mahadevan \([2025c](https://arxiv.org/html/2606.27593#bib.bib18)\)\. Democritus extracts local causal claims, organizes them into causal triples or local causal models, and stores structured outputs in cSQL\-like causal tables\. This is already useful: it turns unstructured text into queryable causal objects\.

Prometheus changes the central object\. Instead of treating a local DAG or causal table as the final representation, Prometheus treats it as evidence for a local predictive\-state model\. A local model records not only that $X$ causes or influences $Y$, but which histories and tests are present, what the model predicts under those tests, how much support each cell has, and where the evidence came from\. The global object is not one merged DAG\. It is a sheaf\-like family of local models, equipped with restriction and gluing diagnostics\.

Table 4: Prometheus inherits the extraction discipline of Democritus but shifts the representation from graph\-centered synthesis to predictive\-state sheaves\.

This shift matters because deep research is rarely about finding one graph\. It is about understanding which local graphs, claims, and predictions are transportable\. Topological organization gives the system a way to say: these regions agree, these overlap but pull apart, and this claim should not be moved without additional evidence\.

## 8 Toulmin: Argumentation as a Foundry Agent

The Democritus–Prometheus lineage made causal extraction durable: a document could yield local causal triples, local causal models, and eventually local PSR sections\. What it did not yet make first class was the formal structure of an argument\. Recent NLP work has shown that Toulmin’s theory can also be used as a zero\-shot prompt frame for explicating informal arguments into claims, reasons, and warrants \(Gupta et al\., [2024](https://arxiv.org/html/2606.27593#bib.bib8)\)\. In Odyssey, Toulmin fills the complementary systems gap\. Following Toulmin’s analysis of practical reasoning as field\-dependent, defeasible, and context\-sensitive \(Toulmin, [1958](https://arxiv.org/html/2606.27593#bib.bib28)\), the new agent treats argumentation as a primary foundry capability rather than as prose wrapped around a causal triple\.

Table 5: Toulmin structure gives the evidence/argument foundry an explicit fiber of argumentative roles\. Gluing then tests not only whether claims agree, but whether their warrants, backing, qualifiers, and rebuttals remain compatible on overlaps\.

This changes the topology of an Odyssey foundry\. A causal extraction such as $A\to B$ is no longer only a directed edge or table row\. Toulmin expands it into a structured argument object:

$$(\mathrm{grounds},\mathrm{warrant},\mathrm{backing},\mathrm{qualifier},\mathrm{rebuttals})\Rightarrow\mathrm{claim}.$$

Prometheus can still build the persistent local sections, but Toulmin asks what licenses each section as an assertion\. On an overlap, two papers may disagree because their claims conflict, because they share a claim but use incompatible warrants, because one paper supplies a rebuttal that restricts the other, or because a qualifier narrows the open set over which the argument is valid\. These are different obstruction types, and Odyssey should preserve the difference\.
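The four kinds of overlap disagreement distinguished above can be kept apart with a small typed record\. The following is a sketch only: the `ToulminArgument` fields follow the argument object in the text, but modeling a claim as a proposition plus a polarity flag, the obstruction labels, and the use of plain string or set inequality as a stand\-in for real incompatibility judgments are all assumptions\.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ToulminArgument:
    proposition: str                 # what the claim is about
    asserted: bool                   # True if the claim affirms the proposition
    grounds: FrozenSet[str]
    warrant: str
    backing: str = ""
    qualifier: str = ""
    rebuttals: FrozenSet[str] = frozenset()


def overlap_obstructions(a: ToulminArgument, b: ToulminArgument) -> List[str]:
    """Label how two arguments about the same proposition disagree on an overlap."""
    if a.proposition != b.proposition:
        return []  # the arguments do not meet on this overlap
    found: List[str] = []
    if a.asserted != b.asserted:
        found.append("claim_conflict")
    elif a.warrant != b.warrant:
        # ASSUMPTION: textual difference stands in for warrant incompatibility.
        found.append("incompatible_warrants")
    if a.rebuttals != b.rebuttals:
        found.append("rebuttal_restriction")
    if a.qualifier != b.qualifier:
        found.append("qualifier_narrowing")
    return found
```

Returning a list of labels rather than a single Boolean keeps the obstruction types distinct, which is the property the paragraph above asks the ledger to preserve\.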

The sheaf interpretation is direct\. A Toulmin foundry is a presheaf of argument objects over a context category: each local site carries claims, grounds, warrants, backing, qualifiers, and rebuttals; restriction maps move arguments to smaller regimes; gluing tests descent not merely for extracted facts but for warranted arguments\. Thus Prometheus remembers persistent foundry state, while Toulmin reasons over it\. The resulting answer contract is not “the model says $X$,” but “$X$ is warranted in these contexts, by these grounds and warrants, with these qualifiers, unless these rebuttals or obstructions apply\.”

This is especially important for domains such as legal reasoning, policy analysis, scientific debate, compliance, peer review, strategic planning, and software\-architecture justification\. In these settings, global Boolean consistency is often the wrong target\. What matters is whether local arguments cohere under the warrants and standards of their field, whether rebuttals are visible, and whether cross\-context transport is licensed\. Toulmin therefore turns the evidence/argument foundry from a passive storage surface into an active argumentative structure inside Odyssey’s sheaf design\.

#### TRACE as an LLM\-centric Toulmin baseline\.

Recent work has also begun to import Toulmin\-style argumentation into the evaluation of large\-language\-model reasoning\. TRACE \(Kim and Yang, [2026](https://arxiv.org/html/2606.27593#bib.bib13)\) is a reference\-free framework for evaluating chain\-of\-thought outputs by segmenting a model’s reasoning into sentences, labeling each sentence with constructive elements such as Claim, Data/Evidence, Warrant, Backing, Qualifier, Rebuttal, Monitoring, and Evaluation, and then combining State Validity with Transition Coherence into an interpretable reasoning score\. This is an important point of contact for Odyssey: it gives a concrete Toulmin\-inspired baseline against which ToulminSheaf argument foundries can be tested on LLM reasoning traces\.

The difference is the unit of construction\. TRACE treats Toulmin structure as an evaluator for a generated chain\-of\-thought sequence: the input is an LLM reasoning trace, the output is a sentence\-label sequence and scalar score\. A ToulminSheaf treats argumentation as part of the foundry itself\. Its sites may be paragraphs, documents, studies, filings, tables, causal triples, code changes, tool traces, or model\-generated rationales; its restriction maps move arguments across contexts; and its gluing tests whether claims, grounds, warrants, backing, qualifiers, and rebuttals remain compatible on overlaps\. Thus TRACE is naturally an evaluation\-harness specialization inside Odyssey, whereas ToulminSheaf is a generic foundry agent\. Document\-coherence foundries and Democritus\-style causal foundries are special cases of ToulminSheaf foundries, while TRACE supplies an LLM\-centric baseline for testing whether ToulminSheaf gluing quality agrees with or improves upon sentence\-level CoT validity and transition\-coherence scores\.

#### Current grounded local\-LLM implementation\.

The repository now contains a first executable Toulmin layer over maintained Odyssey/Prometheus state\. The command\-line interface inventories locally cached Hugging Face and MLX models, exposes domain aliases, builds visible Toulmin prompts, and runs a selected local model over document\-grounded claim packets\. Each packet contains an extracted Prometheus event, its source context, the normalized document claim, source\-topic alignment, matched source terms, restriction and gluing\-diagnostic counts, and any source\-coherence warning\. The local model is asked to emit only a visible Toulmin ticket: claim, grounds, warrant, qualifier, and rebuttal\. Odyssey then compares the ticket with the grounded packet, storing the result in `local_llm_grounded_toulmin_comparison.json` and rendering it through a Scylla page\.

Before adding the neural inspector, we ran seven deterministic Toulmin certification cases across the current Odyssey domains: Indus uncertainty, TRACE science reasoning, TRACE math reasoning, TCC causal claims, 10\-K filing arguments, brand\-product feedback, and Democritus causal claims\. Table [6](https://arxiv.org/html/2606.27593#S8.T6) summarizes the result\. In all seven cases the foundry built a full Toulmin ticket with the required claim, grounds, warrant, qualifier, and rebuttal roles, plus a warrant bridge and argument hyperedge\. None of the seven was silently promoted\. Each retained an explicit rebuttal obstruction and was quarantined for review, which is the desired behavior for a certification layer: Toulmin certifies that the argument surface is complete enough to inspect, not that the claim has become globally true\.

Table 6: Deterministic Toulmin certification cases\. All seven runs constructed the required role coverage, one warrant bridge, and one argument hyperedge\. All seven also retained an explicit rebuttal obstruction, so the promotion decision was `quarantine_for_review` rather than admission without exception surface\.

Table 7: Scale\-independent obstruction in the grounded Toulmin pilot\. Odyssey ran the same six Democritus/Prometheus grounded claims through three local MLX models\. The models differed in ticket style and verbosity, but all converged on the same Scylla review case, indicating that the obstruction is tied to the sheaf\-grounded packet rather than to a particular local model\.

This implementation makes Toulmin a neural argument inspector rather than a truth oracle\. It checks whether the model preserves the extracted document claim and whether its warrant honestly respects source alignment, restrictions, gluing diagnostics, and rebuttals\. Table [7](https://arxiv.org/html/2606.27593#S8.T7) reports the first scale\-sensitivity check: on the same six Democritus/Prometheus grounded claims, Llama 3\.2–3B, Qwen3\-Next\-80B, and Qwen3\-235B all exposed the same Dinosaur Extinction review packet\. This is the intended failure mode\. Scylla does not merely display the model’s argument, but compares the argument against the sheaf\-grounded packet that licensed the prompt\.

## 9 Odyssey as Algebraic Headquarters

The earlier sections introduced Odyssey’s foundry algebra and admission logic\. The system\-level point is that Odyssey is not a renamed Prometheus run\. It is the algebraic headquarters that decides which representational object is being built, which local contexts must exist, which overlap laws must be checked, and which artifacts may enter durable state\. Prometheus remains crucial, but it is called through typed contracts rather than used as an ungoverned side channel\.

Figure [3](https://arxiv.org/html/2606.27593#S9.F3) shows the current Odyssey pipeline\. A human request first becomes a Scylla\-facing intent and answer contract\. Homer compiles that intent into an executable workflow and, when needed, a Prometheus job contract\. Athena type\-checks the foundry expression: cover, local representation family, truth values, restriction maps, gluing rule, bridges, and obstruction policy\. Prometheus then builds or refreshes the requested artifact bundle, including BRIDGE/SKFM refinement screens and IDC\-style local causal diagnostics when the target foundry calls for causal admission\. Toulmin constructs the argument ticket: claims, grounds, warrants, backing, qualifiers, rebuttals, and applicable contexts\. Finally, Odyssey runs TICKET admission against the target foundry, promoting compatible state and quarantining or blocking non\-gluing candidates; SkillOpt can use the resulting admission traces to optimize the natural\-language policy that produced the next candidate ticket\.

\[Figure 3 diagram\. Pipeline stages: user request and domain goal; Scylla: foundry intent, answer contract, responsible explanation; Homer: workflow skeleton, replay obligations, Prometheus job contract; Athena: cover, local representation types, restrictions, gluing rule, obstruction policy; Prometheus: source ingestion, local sections, audits, dashboards, artifact bundle, BRIDGE/SKFM screens; Toulmin: construct warranted claims, qualifiers, rebuttals, and argument tickets; TICKET: restrict to target foundry, run IDC\-style local causal diagnostics, promote or quarantine; SkillOpt: optimize admission skill from rollout, reflection, and held\-out gate traces; durable Odyssey foundry state with Scylla\-facing views and refresh contracts\.\]

Figure 3: Odyssey turns a user request into maintained foundry state through typed Scylla, Homer, Athena, Prometheus, Toulmin, TICKET, and SkillOpt contracts\. In causal foundries, Prometheus may emit BRIDGE/SKFM residual\-obstruction artifacts, TICKET may apply IDC\-style local causal diagnostics, and SkillOpt can optimize the admission skill using the resulting rollout and gate traces\.

#### Odyssey state\.

The promoted object is not only a generated HTML page or a model snapshot\. It is a typed foundry state: local sections, source sheaves, restriction maps, gluing records, obstruction ledgers, promotion decisions, refresh obligations, and user\-facing views\. This is why examples such as Dick’s Sporting Goods, Amazon Reviews 2023, Indus Script, MyFixIt, TCC 44K, research\-program, and assistant\-build can share one architecture while retaining different local semantics\.

#### Contract discipline\.

Odyssey’s main systems constraint is that every backend artifact must cross a typed interface\. A Prometheus run may propose a source manifest, world model, gluing audit, dashboard, model manifest, or refresh plan\. TICKET then decides whether those objects restrict into the target foundry and whether their overlaps glue with maintained state\. This turns ingestion from an append\-only file drop into an algebraic admission problem\.

## 10 Athena: Representation Contracts

Athena is the module that prevents Odyssey from collapsing into ordinary retrieval\. It decides what kind of object the system is allowed to build\. In the current implementation, a Homer program is compiled into `athena_sheaf_plan.json`, which records a representation strategy, a finite truth\-value strategy, a cover of local contexts, overlap declarations, gluing logic, artifact strategy, source sheaves, and cross\-sheaf bridges\.

The key Athena output is a cover\. For a corporation model, local contexts include financial statements, risk factors, management discussion, strategy signals, and market context\. For an Indus Script model, they include symbol inventory, inscription sequences, statistical structure, decipherment hypotheses, archaeological context, and controversy uncertainty\. For an assistant\-build foundry, they include users, tasks, tool permissions, memory boundary, autonomy level, policy, evals, simulations, deployment surface, and maintenance cadence\. These covers are not labels for a dashboard; they define where predicates may be evaluated and where transport is allowed\.

Athena also owns the finite truth semantics used by the deterministic validation artifacts:

$$\bot,\ \mathrm{WEAK},\ \mathrm{PLAUSIBLE},\ \mathrm{SUPPORTED},\ \top.$$

This partition gives Odyssey a small, inspectable substitute for uncalibrated confidence prose\. An overlap glues when shared predicates are at least plausible and no paired predicate is bottom\. If the overlap does not glue, Athena’s contract requires a Prometheus obstruction record: which predicates failed, which evidence was shared, and what next observation would be needed\.
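The gluing rule on the finite truth scale can be sketched as follows\. The five labels come from the text; treating them as totally ordered in the listed order, reading “at least plausible” as a requirement on both sides of the overlap, and the names `Truth` and `audit_overlap` are assumptions of this sketch\. Under that reading, the “no paired predicate is bottom” clause is implied, so bottoms are reported separately for the obstruction record rather than tested again\.

```python
from enum import IntEnum
from typing import Dict, List


class Truth(IntEnum):
    # ASSUMPTION: the labels are totally ordered as listed in the text.
    BOTTOM = 0
    WEAK = 1
    PLAUSIBLE = 2
    SUPPORTED = 3
    TOP = 4


def audit_overlap(left: Dict[str, Truth], right: Dict[str, Truth]) -> dict:
    """Apply the gluing rule to predicates evaluated in both contexts."""
    shared = sorted(set(left) & set(right))
    failed: List[str] = [
        p for p in shared if min(left[p], right[p]) < Truth.PLAUSIBLE
    ]
    bottoms = [p for p in shared if Truth.BOTTOM in (left[p], right[p])]
    return {
        "glues": not failed,             # vacuously True on an empty overlap
        "shared_predicates": shared,
        "failed_predicates": failed,     # would feed the obstruction record
        "bottom_predicates": bottoms,
    }
```

Predicates that appear in only one context never block gluing here; they simply stay local to that context\.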

Athena is therefore the type checker for foundry construction\. It specifies the local representation family before Prometheus builds artifacts, and it specifies the gluing law before Odyssey is allowed to promote an artifact into maintained state\.

## 11 Homer: Workflow and Replay Contracts

Homer is the orchestration layer\. Given a Scylla brief, it emits `model.homer.json`: a workflow skeleton, required contexts, required audits, Athena requests, and Prometheus build steps\. The current default workflow parses the user goal, constructs a candidate cover, requests Athena’s sheaf plan, builds a Prometheus world model, runs gluing audits, and serves a Scylla explanation\.

Homer becomes especially important for process foundries\. A research\-program foundry needs source manifests, goals, hypotheses, methods, experiments, evidence, artifact registries, refresh cadence, and obstruction policies\. An assistant\-build foundry needs tool\-permission manifests, eval suites, simulation traces, failure\-mode analysis, deployment surfaces, model cards, and maintenance plans\. An evaluation\-harness foundry needs dataset restrictions, task lanes, embedding adapters, query\-document objects, metric protocols, failure slices, and promotion gates\. Homer records these obligations as a program rather than as prose\.

When Prometheus is required, Homer compiles a job contract rather than issuing an informal backend request\. The contract names the target foundry, source family, collection plan, ingestion adapter, expected artifacts, budget limits, replay metadata, and refresh cadence\. This lets a future run be compared with the original build and lets TICKET decide whether new artifacts should replace, extend, or quarantine earlier state\.

## 12 Prometheus Inside Odyssey

The Prometheus described in the earlier arXiv paper built causal Topos World Models from research corpora\. Inside Odyssey, Prometheus has a narrower and more operational role\. It is the engine room that materializes Athena’s plan: local sections, predicate values, restriction maps, overlap results, integration layers, update status, evidence sources, HTML explorers, Scylla\-facing dashboards, ingestion packets, and callback bundles\.

This newer Prometheus is artifact\-first\. It returns a bundle rather than a trusted foundry:

$$\begin{gathered}\texttt{source\_manifest},\quad\texttt{artifact\_manifest},\quad\texttt{prometheus\_world\_model.json},\\ \texttt{gluing\_audit.json},\quad\texttt{obstruction\_ledger},\quad\texttt{dashboards},\quad\texttt{refresh\_plan}.\end{gathered}$$

Odyssey then performs admission\. A bundle may be promoted when it restricts to the target foundry and glues with maintained state\. It may be quarantined when it is useful as a source atlas but not yet part of the durable foundry\. It may be blocked when provenance, typing, or overlap compatibility fails\.

Prometheus also remains the place where heavier evidence construction happens\. It can ingest papers, filings, reviews, repair manuals, CSQL/DuckDB atlases, benchmark outputs, simulation traces, and prior Prometheus runs\. It can build domain\-specific source sheaves such as shopping\-experience, corporate\-workflow, Amazon Reviews 2023, Indus evidence, MyFixIt procedural PSRs, and TCC causal claim atlases\. What changes in Odyssey is governance: Prometheus proposes evidence\-bearing world\-model artifacts; Odyssey decides, through Athena’s laws and TICKET admission, what becomes maintained foundation\-model state\.

### 12\.1 Bidirectional Odyssey–Prometheus Transport

The current implementation is bidirectional\. In the forward direction, Odyssey can import an existing Prometheus run as a candidate model\. The importer inspects a Prometheus output directory, loads its world model, maps the source family to a target foundry, synthesizes a Homer\-style job contract when one is missing, builds a TICKET admission record, and exposes the result to Scylla through an ingestion console\. The current console records 129 inspected Prometheus\-family cases: 6 admitted Odyssey\-native Prometheus bundles, 6 queryable but quarantined bundles with preserved obstruction ledgers, and 117 legacy Prometheus v1 run\-file candidates that remain visible but are not treated as durable Odyssey state until TICKET review is complete\. This is an important empirical point: Odyssey already treats roughly one hundred pre\-existing Prometheus models as typed candidates rather than as opaque files or trusted global models\.

In the reverse direction, Odyssey can compile a foundry back into a Prometheus GUI\-compatible Topos World Model layout\. The compiler projects Odyssey local sections into Prometheus contexts, converts source\-sheaf events into Prometheus episodes, emits local PSRs and sheaf objects, writes restrictions and gluing diagnostics, and preserves the Odyssey ingestion packet and source artifacts\. The folder layout is deliberately compatible with the Prometheus atlas workbench, while the manifest records the actual Odyssey source family and target foundry\. For DKS, this reverse path exports the sporting\-goods retailer foundry as a DKS Prometheus run with shopping\-experience and corporate\-workflow sheaves, a hidden\-state\-capture diagnostic, gluing diagnostics, and a health\-history record\. The point is not to abandon Odyssey’s admission discipline, but to use Prometheus as a visual microscope: we can inspect the sheaf structure, local PSRs, restrictions, and gluing tensions explicitly, then carry the annotated tensions back into Odyssey’s promotion ledger\.

Table 8: Bidirectional transport between Odyssey and Prometheus\. Import keeps legacy Prometheus models behind TICKET gates; export uses Prometheus as an inspection surface for Odyssey foundries\.

## 13 Prometheus PSR Interface

Odyssey uses Prometheus as a world\-model construction engine, but the full language\-to\-PSR derivation belongs to the Prometheus paper \(Mahadevan, [2026d](https://arxiv.org/html/2606.27593#bib.bib23)\)\. Here we only need the interface that Odyssey receives from Prometheus and admits through TICKET\. A Prometheus artifact presents a finite family of local predictive\-state sections over a declared cover\. Each local section stores histories, tests or predictive motifs, prediction/support values, provenance, diagnostics, restriction maps, gluing results, and obstruction records\.

In Odyssey terms, the important object is not the estimator that produced the local PSR tables, but the typed contract they satisfy\. For each context $U$, Prometheus emits a local section of the schematic form

$$\mathcal{P}(U)=(H_{U},T_{U},M_{U},S_{U},\Pi_{U},D_{U}),$$

where $H_{U}$ and $T_{U}$ are the finite histories and tests visible in that context, $M_{U}$ is the predictive table or support score, $S_{U}$ records support, $\Pi_{U}$ records provenance, and $D_{U}$ records diagnostics such as sparsity, uncertainty, extraction confidence, and local mismatch\. A morphism $V\to U$ carries a restriction map that projects histories, tests, claims, and provenance from $U$ to the overlap visible in $V$\. Gluing diagnostics then compare the restricted local sections on shared signatures\.
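As an illustration of the schematic section and its restriction map, the sketch below stores the six components in a dataclass and projects them to a smaller context\. The concrete Python types \(cells keyed by a history and test pair, integer support counts, string provenance\), the names `LocalSection` and `restrict`, and the choice to copy diagnostics unchanged are assumptions; the text fixes only the roles of the components\.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple

Cell = Tuple[str, str]  # (history, test)


@dataclass
class LocalSection:
    histories: FrozenSet[str]            # H_U
    tests: FrozenSet[str]                # T_U
    table: Dict[Cell, float]             # M_U
    support: Dict[Cell, int]             # S_U (ASSUMPTION: counts)
    provenance: Dict[Cell, str]          # Pi_U (ASSUMPTION: source ids)
    diagnostics: Dict[str, float] = field(default_factory=dict)  # D_U


def restrict(section: LocalSection,
             visible_histories: FrozenSet[str],
             visible_tests: FrozenSet[str]) -> LocalSection:
    """Project a section on U onto the histories and tests visible in V."""
    h = section.histories & visible_histories
    t = section.tests & visible_tests
    keep = [c for c in section.table if c[0] in h and c[1] in t]
    return LocalSection(
        histories=h,
        tests=t,
        table={c: section.table[c] for c in keep},
        support={c: section.support[c] for c in keep if c in section.support},
        provenance={c: section.provenance[c] for c in keep
                    if c in section.provenance},
        # Simplification: diagnostics are copied, not recomputed on V.
        diagnostics=dict(section.diagnostics),
    )
```

The function returns a new section and leaves the original untouched, so the same section on $U$ can be restricted to several overlaps\.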

This is the only PSR machinery Odyssey needs for foundation\-model construction\. Scylla and Homer do not depend on a particular spectral estimator; Athena only needs the cover, restriction, and gluing law; TICKET only needs the admission packet, compatibility result, and obstruction ledger; and Toulmin only needs the claim, grounds, warrant, qualifier, and rebuttal that can be read from the maintained local state\. A Prometheus artifact can therefore be treated as a source foundry candidate: it is promoted when its local sections restrict and glue under the target foundry contract, quarantined when useful but not yet compatible, and blocked when the obstruction ledger shows missing provenance, failed overlap compatibility, or unsupported transport\.

The operational sheaf condition used by Odyssey is deliberately finite. Local sections over a cover may be promoted only when their restrictions agree on supported shared cells up to the declared tolerance. Unsupported cells remain local, and incompatible cells are preserved as obstruction records rather than averaged away. This is what lets Odyssey use Prometheus as an engine-room component without making this paper repeat the Prometheus construction: the vision paper studies how such artifacts become durable foundation-model state, not how the underlying local PSR estimator is derived.

## 14 Finite Guarantees from Sheaf Foundries

The sheaf formalism is not only a visualization language. It gives Odyssey a small set of finite guarantees that are useful for foundation-model construction. The guarantees are conditional on the declared cover, restriction maps, support thresholds, and truth partition; they do not assert that extracted claims are true in the external world. They assert that Odyssey will only promote global state when the local evidence can be transported and glued under the declared contract.

###### Definition 14.1 (Foundry health).

Let $\mathcal{U}=\{U_{i}\to U\}$ be a foundry cover and let each local PSR section $s_{i}$ expose a finite matrix of truth-weighted predictive cells $M_{i}[h,\tau]\in[0,1]$. Fix a promotion threshold $\theta$, corresponding in the implementation to the truth label PLAUSIBLE. The cell-level hidden-state-capture ratio is

$$H_{\mathrm{cell}}(\mathcal{U})=\frac{\sum_{i}|\{(h,\tau):M_{i}[h,\tau]\geq\theta\}|}{\sum_{i}|\mathrm{dom}(M_{i})|}.$$

The context-level capture ratio is

$$H_{\mathrm{ctx}}(\mathcal{U})=\frac{|\{i:H_{\mathrm{cell}}(U_{i})\geq 0.6\}|}{|\mathcal{U}|},$$

where $H_{\mathrm{cell}}(U_{i})$ is read as the same cell ratio computed over the single section $s_{i}$. Odyssey reports *stable hidden-state capture* when both ratios are at least $0.6$, *partial hidden-state capture* when the global cell ratio is at least $0.35$, and *weak hidden-state capture* otherwise.

This number is deliberately simple\. It measures whether the local predictive tables have enough plausible\-or\-better cells to serve as a maintained hidden state proxy\. It is not a final task score; it is a health diagnostic for the sheaf representation\. In the Prometheus export path, for example, DKS and MyFixIt receive a hidden\-state\-capture diagnostic before their sheaf structure is inspected visually\.
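The two ratios and the three reporting labels in Definition 14.1 can be computed directly from the local tables. The sketch below is illustrative rather than the repository's code: the function names are hypothetical, each table is assumed to be a dictionary from `(history, test)` cells to truth weights, and the numeric value of $\theta$ is left as a parameter because the text only ties it to the PLAUSIBLE label.

```python
def cell_ratio(tables, theta):
    """Fraction of cells with truth weight >= theta across the given tables."""
    total = sum(len(t) for t in tables)
    if total == 0:
        return 0.0
    hits = sum(1 for t in tables for v in t.values() if v >= theta)
    return hits / total


def foundry_health(tables, theta, capture_cut=0.6, stable_cut=0.6, partial_cut=0.35):
    """Return (H_cell, H_ctx, label) for a list of local tables, one per context."""
    h_cell = cell_ratio(tables, theta)
    # A context counts as captured when its own cell ratio reaches capture_cut.
    captured = sum(1 for t in tables if cell_ratio([t], theta) >= capture_cut)
    h_ctx = captured / len(tables) if tables else 0.0
    if h_cell >= stable_cut and h_ctx >= stable_cut:
        label = "stable"
    elif h_cell >= partial_cut:
        label = "partial"
    else:
        label = "weak"
    return h_cell, h_ctx, label
```

For example, with an assumed threshold of `theta=0.5`, a first context holding weights 0.9, 0.7, and 0.2 and a second holding 0.1 and 0.8 give a global cell ratio of 3/5 but only one captured context out of two, so the report is partial rather than stable.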

###### Theorem 14.2 (Consistent promotion).

Fix a foundry cover $\mathcal{U}$, finite local sections $s_{i}$, restriction maps $\rho_{ij}$, support weights, and tolerance $\epsilon$. If every overlap has support above threshold and

$$\|\rho_{ij}(s_{i})-\rho_{ji}(s_{j})\|\leq\epsilon$$

on every shared predictive cell, then Odyssey’s support-weighted aggregation constructs a promoted section $s$ whose restrictions agree with every local section up to $\epsilon$ on supported overlaps.

###### Proof\.

The implementation forms each promoted cell only from local cells that pass the declared support and compatibility checks. The promoted value is a support-weighted average of those compatible values. Since any two contributing local values differ by at most $\epsilon$, their weighted average lies in the interval they span and is therefore within $\epsilon$ of each of them on the overlap. Unsupported cells are not promoted, so no unsupported agreement is asserted. ∎
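The argument in the proof can be checked on a single shared cell. The following sketch assumes a simple reading of the aggregation rule: contributions below a minimum support are dropped, the remaining values must lie within $\epsilon$ of one another, and the promoted value is their support-weighted mean. The function name `promote_cell` and the three status strings are hypothetical.

```python
def promote_cell(values, supports, min_support, eps):
    """Aggregate one shared cell from several local sections.

    values, supports: parallel lists, one entry per contributing context.
    Returns (status, value) where status is 'promoted', 'local', or 'obstructed'.
    """
    kept = [(v, s) for v, s in zip(values, supports) if s >= min_support and s > 0]
    if not kept:
        return "local", None          # unsupported cells stay local
    vals = [v for v, _ in kept]
    if max(vals) - min(vals) > eps:
        return "obstructed", None     # incompatible cells are not averaged away
    total = sum(s for _, s in kept)
    return "promoted", sum(v * s for v, s in kept) / total
```

Because the promoted value is a convex combination of the kept values, it can never fall outside the interval they span, which is the finite content of Theorem 14.2 under these assumptions.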

###### Proposition 14.3 (Obstruction soundness).

If an overlap fails compatibility, lacks a restriction map, or lacks sufficient support, Odyssey cannot silently promote the corresponding global claim through TICKET. The failed overlap must appear as a gluing diagnostic, obstruction ledger entry, quarantine reason, or blocked promotion gate.

###### Proof\.

TICKET admission reads the gluing audit and promotion status emitted by Prometheus\. The admission record is green only when required artifacts are present and preserved obstructions are empty for the target promotion lane\. A failed overlap therefore changes the admission state from admitted to quarantined, candidate, or blocked, and the obstruction is retained with the affected contexts and next observation\. ∎
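The gate described in this proof is a conjunction of two checks, which the sketch below spells out. It is an assumption-laden illustration and not the TICKET implementation: the function name, the record keys, and the shape of an obstruction entry (`overlap`, `kind`) are hypothetical, and the sketch deliberately does not decide between the quarantined, candidate, and blocked lanes because the text does not specify that mapping.

```python
def admission_record(required_artifacts, present_artifacts, obstructions):
    """Build a hypothetical admission record for one promotion lane.

    The record is green only when nothing required is missing and the list of
    preserved obstructions for the lane is empty.
    """
    missing = sorted(set(required_artifacts) - set(present_artifacts))
    reasons = [f"missing artifact: {name}" for name in missing]
    reasons += [f"obstruction: {o['overlap']} ({o['kind']})" for o in obstructions]
    return {
        "green": not reasons,
        "reasons": reasons,
        "preserved_obstructions": list(obstructions),  # kept, never discarded
    }
```

A failed overlap therefore always surfaces as a reason on the record, so the admission state cannot stay green while the obstruction is silently dropped.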

###### Proposition 14.4 (Non-transportability certificate).

Suppose a claim $c$ is local to $U_{i}$. If there is no declared restriction path from $U_{i}$ to a target context $V$, or if every such path contains an overlap whose tension exceeds tolerance, then Odyssey cannot promote $c$ as a claim over $V$. The best available status is local, quarantined, or awaiting new evidence.

###### Proof\.

Promotion over $V$ requires a chain of restriction maps that transports the claim’s histories, tests, support, and provenance into $V$, followed by a compatible gluing check. If the chain is missing, the claim has no typed meaning in $V$. If the chain exists but contains a high-tension overlap, the operational sheaf condition fails on that path. In either case, TICKET lacks a valid promoted section over $V$. ∎
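The two failure cases in this proof correspond to two reachability questions on the graph of declared restriction maps. The sketch below is a hypothetical illustration: it assumes each declared restriction step carries a single scalar tension, and the function name and status strings are invented for the example.

```python
from collections import deque


def transport_status(restrictions, source, target, tolerance):
    """Classify whether a claim local to `source` may be considered over `target`.

    restrictions: dict mapping (a, b) -> tension of the declared step a -> b.
    """
    def reachable(edges):
        # Breadth-first search over the given set of directed steps.
        seen, queue = {source}, deque([source])
        while queue:
            node = queue.popleft()
            if node == target:
                return True
            for a, b in edges:
                if a == node and b not in seen:
                    seen.add(b)
                    queue.append(b)
        return False

    if not reachable(restrictions):
        return "no_restriction_path"
    low_tension = [edge for edge, t in restrictions.items() if t <= tolerance]
    if not reachable(low_tension):
        return "high_tension_on_every_path"
    return "eligible_for_gluing_check"
```

Searching once over all declared steps and once over only the low-tension steps separates a claim with no typed meaning in the target from one whose every route crosses an over-tolerance overlap. Even the last status only makes the claim eligible for the gluing check, it does not promote it.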

###### Proposition 14.5 (Health monotonicity under compatible refinement).

Let a foundry be refined by adding new local cells or new local sections whose truth weights are all at least the current threshold $\theta$, and whose overlaps are compatible with existing promoted sections. Then $H_{\mathrm{cell}}$ cannot decrease. If the added sections also satisfy the context capture threshold, $H_{\mathrm{ctx}}$ cannot decrease.

###### Proof\.

Write the cell ratio as $a/b$ with $0\leq a\leq b$ and $b>0$, and suppose the refinement adds $k$ cells. Every added cell is above threshold, so the numerator and denominator both increase by $k$, and $(a+k)/(b+k)\geq a/b$ because $a\leq b$. Such an update therefore cannot lower the ratio. The same argument applies to context capture when every added section is captured; compatible overlaps ensure that the added cells are eligible for the same foundry state rather than being routed to a separate obstruction ledger. ∎

###### Corollary 14\.6\(Auditability of model health\)\.

Changes in foundry health are attributable\. A decrease can only arise from adding weak cells, adding weak contexts, revising truth weights downward, or splitting previously compatible material into an obstructed local section\.

These guarantees explain the role of sheaves in the empirical sections\. DKS can be promoted because customer, assortment, brand, and filing sections glue\. Indus decipherment claims remain obstructed because the controversy/hypothesis overlap does not support a settled translation\. KET can make corpus\-local claims while blocking a global KET\-versus\-Transformer claim when the WikiText\-2 KET rows are missing\. MyFixIt improves retrieval through structured action\-observation state while keeping visual grounding outside promotion until image evidence is fetched or checksummed\.

#### Two\-stage gluing\.

The same formalism can be iterated across levels of analysis. In the filing experiments, the local workflow slices

$$x_{C,y}^{\mathrm{ops}},\quad x_{C,y}^{\mathrm{mkt}},\quad x_{C,y}^{\mathrm{fin}},\quad x_{C,y}^{\mathrm{inn}}$$

may first glue into a company-year section $s_{C,y}$. These company-year sections can then be compared over a second cover, such as sector-year or temporal-neighborhood covers. The important point is that there is no single unconditional global average over all firms, papers, reviews, or agents. Globality is always relative to a declared cover, and non-gluing at a coarser cover is a meaningful result.

#### Localized interventions\.

Prometheus treats interventions as local tests. A $j$-do query $\operatorname{do}_{j}(X=x)$ modifies histories or tests inside a context $U$ and asks how the local predictive-state table changes under comparable covers. The result is not automatically an identified causal effect in Pearl’s sense (Pearl, [2009](https://arxiv.org/html/2606.27593#bib.bib26)); it is an intervention-conditioned probe of the language-derived world model. This convention is also the point at which our recent IDC/infinitesimal-causality work becomes relevant: the local query is best read as a typed deformation of predictive state, with support, provenance, and local variation recorded before any global causal effect is asserted (Mahadevan, [2026a](https://arxiv.org/html/2606.27593#bib.bib20)). Prometheus reports the support and provenance behind the probe rather than presenting it as a source-free causal estimate.

More explicitly, let

$$j(U)=\{u_{i}:U_{i}\to U\}$$

be a cover of contexts considered comparable for the query, and let

$$I^{a}_{U_{i}}:\mathcal{P}(U_{i})\to\mathcal{P}^{\operatorname{do}(a)}(U_{i})$$

be a local intervention map that edits a test, fixes an action, inserts a repair step, or conditions on an explicitly declared regime. The $j$-localized intervention state is computed by restriction, local intervention, and aggregation:

$$\operatorname{do}_{j}(a)_{U}(s)=\operatorname{Agg}_{u_{i}:U_{i}\to U\in j(U)}\left(I^{a}_{U_{i}}\bigl(\rho_{U,U_{i}}(s)\bigr)\right).$$

Compatibility is then checked after the intervention. If the intervened local sections glue, the atlas may report a coherent intervention-conditioned prediction over $U$. If they do not, the query is only locally supported, and the failed overlaps identify where comparability, measurement, or evidence breaks down.
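The restrict, intervene, aggregate pattern above can be sketched in a few lines. This is an illustration under strong assumptions, not the Prometheus procedure: sections are flattened to dictionaries from cells to values, restriction is projection onto visible cells, and $\operatorname{Agg}$ is taken to be an unweighted mean over contexts, which the text does not specify. The function name `localized_do` is hypothetical.

```python
def localized_do(section, cover, intervene, eps):
    """Restrict a section to each context, intervene locally, then check gluing.

    section:   dict cell -> value in [0, 1] (stand-in for s over U)
    cover:     dict context name -> set of visible cells (stand-in for j(U))
    intervene: callable (context, restricted_table) -> new table (stand-in for I^a)
    eps:       tolerance for agreement on shared cells
    """
    local = {}
    for ctx, visible in cover.items():
        restricted = {c: v for c, v in section.items() if c in visible}
        local[ctx] = intervene(ctx, restricted)

    # Compatibility is checked after the intervention, on every pairwise overlap.
    names = sorted(local)
    failed = [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if any(abs(local[a][c] - local[b][c]) > eps
               for c in local[a].keys() & local[b].keys())
    ]
    if failed:
        return {"status": "local_only", "failed_overlaps": failed, "local": local}

    pooled = {}
    for ctx in names:
        for cell, value in local[ctx].items():
            pooled.setdefault(cell, []).append(value)
    prediction = {cell: sum(vs) / len(vs) for cell, vs in pooled.items()}
    return {"status": "coherent", "prediction": prediction}
```

Returning the failed overlaps alongside the untouched local tables mirrors the text: a query that does not glue is still locally supported, and the overlap list says where comparability broke.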

## 15 The Claims Atlas

The primary user\-facing object is the Claims Atlas\. It is designed to answer research questions that flat summaries obscure\.

#### Main causal spine\.

The atlas extracts recurrent, high\-support causal paths that organize the corpus\. In an ocean\-warming corpus, a spine may include warming, stratification, oxygen loss, prey availability, migration, recruitment, and population change\. In SEC workflows, a spine may include investment, supply\-chain constraints, margin pressure, capital allocation, and realized outcomes\.

#### Local context regions\.

Each spine is decomposed into local regions\. A region may correspond to a species group, geography, time period, document cluster, product aspect, or workflow stage\. Users can enter a region and inspect its local PSR, support, claims, and provenance\.

#### Drift detection\.

When local models change across time, retrieval runs, or document strata, Prometheus reports drift. Drift can be textual, causal, predictive, or topological: the support distribution changes, a causal polarity changes, a test prediction changes, or the overlap graph itself changes.

#### Regime tensions\.

The atlas highlights where local models resist gluing\. Some tensions are contradictions; others are legitimate regime boundaries\. The interface should make this distinction visible by exposing modifiers, populations, measurement protocols, and source provenance\.

#### Provenance drill\-downs\.

Every atlas claim points back to evidence units\. A user can inspect source passages, extracted rows, normalized claims, support counts, and neighboring contexts\. Provenance is not decorative metadata; it is the mechanism by which the atlas remains corrigible\.

## 16 Implemented Odyssey Foundries

The current Odyssey repository contains a family of concrete foundry instances rather than a single monolithic demo. Each instance follows the same artifact contract: Scylla emits a model brief, Homer emits a workflow skeleton, Athena emits a sheaf plan, and Prometheus emits a world model, gluing audit, and human-facing explorer. This section summarizes the foundries that have been implemented so far and the role each plays in testing the Odyssey design.

Table 9: Implemented Odyssey foundry families. Each row is backed by generated JSON artifacts and HTML views in the repository, not only by a conceptual taxonomy.

#### Common artifact contract.

The repeated pattern across these foundries is more important than any one domain\. Each run records the source request, a typed target foundry, the local cover, overlap checks, finite truth values, gluing outcomes, and update status\. The same surface can therefore represent a store operations problem, a filing workflow, an archaeological controversy, or a repair\-manual procedure\. The system’s current deterministic templates are intentionally simple; their role is to make the foundry interfaces replayable while richer extraction, retrieval, and learned local models are still under development\.

#### Generic and specialized foundries\.

Several examples test generic representational foundries directly: the storefront foundry tests operational decisions, the brand foundry tests market meaning, the corporation foundry tests institutional-financial evidence, and the evaluation harness tests metric protocols. Specialized foundries compose these generic objects and then restrict them to a concrete domain, corpus, or task lane. Dick’s Sporting Goods is the restriction of a retail/brand/review and institutional-financial composition to DKS: Odyssey glues a shopping-experience sheaf to corporate workflow and 10-K evidence, then checks whether store-experience, assortment, brand-promise, inventory, market, and risk signals agree. The KET language-modeling foundry is the restriction of the generic research-program and evaluation-harness foundries to the KET experiment suite. It keeps PTB, WikiText-2, and WikiText-103 as separate corpus slices: PTB and WikiText-103 admit near-tied KET/Transformer denoising rows, WikiText-2 blocks KET-vs-Transformer promotion because KET rows are absent from the admitted table, and full replay remains deferred until checkpoint checksums, hardware/runtime metadata, and command logs are attached. Amazon Reviews 2023 glues corpus, benchmark, model, search, and reproduction contracts. TCC 44K restricts a generic causal-claims atlas to the economics literature and glues nodes, edges, support rows, method/journal/$p$-value lookup tables, and uncertainty guardrails while keeping direction and polarity tensions visible. Indus Script glues visual, sequential, statistical, archaeological, and controversy contexts while preserving undecidable regions as explicit obstructions. IKEA ASM is the restriction of a generic procedural PSR and perception-action foundry to furniture assembly: actions, parts, pose, camera geometry, and object tracks must agree on the same episode rather than being evaluated as unrelated computer-vision tasks.

The following subsections give the artifact\-backed case studies for the implemented foundries rather than introducing new top\-level paper themes\.

### 16.1 Scientific Foundries: TCC 44K and Indus Script

TCC 44K and Indus Script are useful because they stress the same foundry contract in opposite regimes. TCC 44K is a high-volume map of the study of causality in economics, grounded in the Testing Causal Claims corpus (Garg and Fetzer, [2025](https://arxiv.org/html/2606.27593#bib.bib7)). The local CSQL atlas records 295,252 canonical cause/effect nodes, 261,714 aggregate causal edges, and 265,656 support rows drawn from a roughly 44K-paper corpus. Its sheaf cover keeps node identities, aggregate edges, document support, causal direction, method/journal/$p$-value lookup tables, and uncertainty guardrails separate. The important point is not only scale. A repeated claim such as monetary policy increasing inflation can glue through node-edge and edge-support provenance, but Odyssey still refuses to turn support mass into causal certainty when polarity, controversy, or reverse-causality evidence remains weak.

Indus Script sits at the other extreme. The artifact records 419 indexed sign images, 1,548 visual inscription sequences, a small decipherment literature sheaf anchored by Parpola’s authoritative study of the corpus and decipherment problem (Parpola, [2009](https://arxiv.org/html/2606.27593#bib.bib25)), and archaeological context including collapse-era figure data. Here the central risk is not leaderboard collapse but premature interpretation. Recent computational work argues that the Indus sign system does not cleanly match either heraldic or administrative non-linguistic baselines, while still preserving the central uncertainty about whether it encodes spoken language (Nair, [2026](https://arxiv.org/html/2606.27593#bib.bib24)). Symbol inventories glue to inscription sequences, sequences glue to $n$-gram and entropy statistics, and statistical structure can inform a hypothesis lattice. The controversy/hypothesis overlap deliberately obstructs: language-like sequential structure, archaeological plausibility, and candidate language-family priors do not by themselves license a trusted translation in the absence of bilingual anchors or stronger external evidence.

Table 10: Two scientific foundries illustrate why Odyssey treats gluing as an epistemic discipline. TCC prevents a large support graph from becoming an unqualified causal truth map; Indus prevents suggestive statistical structure from becoming a premature translation.

### 16.2 MyFixIt Procedural PSR Foundry

MyFixIt is the first Odyssey foundry where the representation is procedural rather than primarily argumentative or documentary. The source surface is a repair-manual dataset derived from iFixit’s repair-guide corpus (iFixit, [2026](https://arxiv.org/html/2606.27593#bib.bib11)) with ordered guide steps, tool annotations, part spans, removal verbs, image URLs, guide metadata, and a neighboring human annotation workflow. Odyssey treats this as a repair-manual predictive-state problem: the current instruction, tools, parts, and removal action define a local action-observation state, while the next step defines a lightweight future test.

The current MyFixIt Mac Laptop artifact contains 2,224 manuals and 53,482 steps in the admitted source sheaf. Athena assigns seven local sections: manual manifest, step-text observations, tool-action labels, part-state observations, removal-verb actions, image provenance, and future-step tests. Four overlaps are checked. Tool/part, verb/part, and step/future-test overlaps glue. The image/observation overlap is deliberately blocked: image URLs are present, but visual grounding is not promoted until assets are fetched, checksummed, or otherwise made inspectable.

The Scylla-facing MyFixIt page makes the PSR coding concrete. It reports 21,225 steps with removal verbs, 21,201 steps with annotated parts, and 53,203 steps with image handles. The most frequent part labels include screw, Phillips screw, battery, case, tape, upper case, and fan; the dominant removal actions include remove, lift out, lift off, pull out, pry up, lift, and lift up. These are not merely keywords. Odyssey converts them into typed histories and tests:

$$h_{t}=(\text{manual},\text{step order},\text{tool},\text{verb},\text{part},\text{image handle}),\qquad\tau_{t}=\text{next-step action/observation probe}.$$
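The typed pair $(h_{t},\tau_{t})$ can be illustrated with a small encoder over an ordered list of steps. This is a sketch under stated assumptions and not the MyFixIt loader: the step dictionaries and their keys (`tool`, `verb`, `part`, `image`) are hypothetical, and the next-step probe is simplified to the verb and part of the following step.

```python
from typing import NamedTuple, Optional


class History(NamedTuple):
    manual: str
    step_order: int
    tool: Optional[str]
    verb: Optional[str]
    part: Optional[str]
    image_handle: Optional[str]


def history_test_pairs(manual_id, steps):
    """Pair each step's typed history h_t with a next-step probe tau_t.

    steps: ordered list of dicts; missing annotations are kept as None
    rather than guessed. The last step has no future test and is skipped.
    """
    pairs = []
    for i in range(len(steps) - 1):
        cur, nxt = steps[i], steps[i + 1]
        h = History(manual_id, i, cur.get("tool"), cur.get("verb"),
                    cur.get("part"), cur.get("image"))
        tau = (nxt.get("verb"), nxt.get("part"))
        pairs.append((h, tau))
    return pairs
```

Keeping the image handle as an opaque field, without fetching or interpreting it, matches the blocked image/observation overlap: the handle is recorded as provenance but contributes nothing to the test.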
Table 11: Concrete MyFixIt language entries as PSR tests. The manual text is converted into action labels, observed part state, image handles, and a next-step prediction target.

Table 12: MyFixIt gluing audit. The blocked image overlap is a useful example of Odyssey’s design rule: missing grounding should remain visible rather than being averaged into a confidence score.

The MyFixIt artifact also records an annotation-tool contract. The neighboring Flask annotator is treated as a provenance and human-loop surface, with MongoDB configuration, processed-table locations, entry point, and runtime guardrails preserved as foundry metadata. This matters because the next promotion step is not merely a better ranker; it is reviewed repair semantics with stable image and annotation provenance.

### 16.3 MyFixIt Retrieval Experiments

The evaluation harness restricts MyFixIt to a declared query/document object surface. Queries ask for repair steps by action, affected part, tool, or future-step relation; documents are manual-step and action-observation objects. This is a deliberately local task: the claim being tested is not that Odyssey has solved all repair-manual retrieval, but that a procedural PSR/sheaf representation improves ranking inside a typed repair-manual restriction while preserving its failure slices.

We compare two ranking surfaces. The token baseline ranks candidates by standard token-style overlap over the textual fields. The PSR action-observation representation ranks over structured repair state: manual identity, step order, action labels, tools, affected parts, and future-test features. [Table 13](https://arxiv.org/html/2606.27593#S16.T13) reports three profiles emitted by the evaluation harness. The current compact profile uses 72 queries and 360 candidate documents. The broader profile uses 500 queries and 1,600 documents with manual-level separation and duplicate-guide checks. The strict profile uses 1,000 queries and 3,200 documents with manual-held-out evaluation, future-step leakage audit, image-provenance quarantine, and cross-category transfer holdout.

Table 13: MyFixIt retrieval results across three evaluation-harness profiles. The PSR action-observation representation improves nDCG@10, Recall@10, and MRR on the current compact, broader, and strict restrictions.

The pattern is stable across scale. The current compact profile shows a +0.2478 nDCG@10 lift; the broader profile shows +0.2460; and the strict profile shows +0.2726. Recall@10 improves more sharply as the candidate pool grows: the strict profile has 0.6130 recall for the PSR representation versus 0.1000 for the token baseline. This is the expected advantage of a procedural foundry. When many steps share surface words such as “remove”, “screw”, or “spudger”, the local action-observation state narrows the candidate set by manual identity, part state, tool use, and future-step tests. At the same time, the system does not erase failures. The failure-slice report records cases where token ranking still beats PSR ranking, especially when the query uses an alias not yet canonicalized into the action vocabulary.
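For readers who want to reproduce the three reported metrics on their own rankings, the standard per-query definitions are sketched below. The sketch assumes binary relevance and a ranked list without duplicates; the paper does not state which relevance grading or aggregation the evaluation harness uses, so these functions are illustrative and the function names are hypothetical.

```python
import math


def recall_at_k(ranked, relevant, k=10):
    """Share of the relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc in enumerate(ranked[:k], start=1) if doc in relevant)
    ideal_hits = min(len(relevant), k)
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / ideal if ideal else 0.0
```

MRR is the mean of `reciprocal_rank` over all queries, and the reported nDCG@10 and Recall@10 figures would likewise be means of the per-query values under this reading.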

Table 14: Failure slices preserved by the MyFixIt evaluation harness. Odyssey treats these slices as promotion gates and future work items, not as incidental error analysis.

The main empirical lesson is that procedural foundries need typed state, not only text similarity. Repair instructions have actions, objects, tools, order, visual references, and future tests. A token representation can still be strong on lexical aliases, but it does not by itself know which part is being removed, which tool mediates the action, or which next state should become a predictive test. The MyFixIt foundry is therefore a useful first domain for Odyssey: it is small enough to audit, structured enough to reward PSR-style state, and incomplete enough to exercise the obstruction machinery.

### 16.4 IKEA ASM Multimodal Assembly Foundry

The IKEA ASM dataset (Ben-Shabat et al., [2021](https://arxiv.org/html/2606.27593#bib.bib1)) gives Odyssey a second procedural foundry, complementary to MyFixIt. MyFixIt begins with manuals and asks whether procedural text can be lifted into action-observation state. IKEA ASM begins with embodied assembly episodes: multi-view RGB and depth video, atomic action labels, furniture-part segmentation, part tracking, human pose, and camera calibration. The natural Odyssey construction is therefore not a generic video benchmark, but a restriction of the generic procedural PSR and perception-action evaluation foundries to furniture assembly.

The cover has five local sections\. The episode section records furniture item, environment, view, time, and train/test split\. The action section records per\-frame and clip\-level assembly labels\. The object section records furniture parts, instance masks, and tracking identities\. The pose section records 2D human joints and pseudo\-ground\-truth 3D joints\. The calibration section records camera parameters and view geometry\. Gluing tension appears whenever these sections disagree: an action label may imply a part manipulation that is absent from the object track; a pose trajectory may be plausible in one camera view but inconsistent after triangulation; a part identity may persist through SORT tracking but fail across occlusion; an action\-recognition improvement may not transport from top\-view RGB to multi\-view RGB\+pose\.

Table 15: IKEA ASM as a multimodal procedural assembly foundry. The current local repository contributes the code and benchmark contracts; full promotion requires downloaded video, annotations, pretrained models, and replay logs.

The local checkout also supplies useful benchmark anchors. The action recognition README reports top-1 accuracy of 60.40 for P3D and 57.58 for I3D on single-view clip baselines, while multi-view RGB reaches 63.24 and multi-view RGB+pose with HCN reaches 64.25. The pose README reports that fine-tuned MaskRCNN improves 2D test PCK to 64.3, while the 3D test baselines remain substantially harder, with VP3D obtaining the best reported PCK of 47 under Procrustes alignment. We do not claim a new IKEA ASM result here. Instead, Odyssey turns these baselines into a foundry replication target: a future experiment should ask whether a PSR state that glues action, part, pose, and camera sections improves action recognition, step prediction, or part-state retrieval relative to the published single-task baselines.

### 16.5 Amazon Reviews 2023 and BLaIR-Bench Replication

Amazon Reviews 2023 is the natural large-scale next restriction for Odyssey, with BLaIR providing the released language-item modeling and benchmark interface over that corpus (Hou et al., [2024](https://arxiv.org/html/2606.27593#bib.bib10)). The current foundry already separates five local sections: corpus manifest, recommendation benchmarks, language-item models, product-search tasks, and reproduction contract. Its gluing audit reports four glued overlaps: manifest/benchmarks, benchmarks/models, models/search, and search/reproduction. The source sheaf also records that the released dataset has an approximately 750 GB footprint, so the correct paper claim is not that Odyssey has rerun the entire benchmark locally. The correct claim is that Odyssey has converted the repository into an executable foundry contract with typed sections, replay obligations, split-leakage checks, and BLaIR-Bench entry points.

BLaIR-Bench is useful because it has the right contrast class. It evaluates semantic item encoders on sequential recommendation, collaborative filtering, and product search. Odyssey can reuse those tasks, but it asks an additional sheaf question: when does a benchmark result transfer from one local section to another? A good text embedding on a general product-search query does not automatically glue to a sequential-recommendation claim, and a recommendation score on one product category should not automatically promote a corpus-wide representation claim. The Amazon Reviews foundry turns these into explicit overlap tests rather than leaderboard prose.

Table 16: BLaIR-Bench replication surface inside the Amazon Reviews 2023 foundry. Odyssey currently admits the local checkout as an artifact-backed replication contract, with task entry points, replay obligations, and promotion gates separated for product search, Amazon-C4/Reddit-Movie retrieval, and sequential-recommendation or collaborative-filtering claims.

This gives the paper a clean experimental progression. MyFixIt provides the small, fully inspectable procedural result where PSR sheaf state already beats a token baseline. Amazon Reviews 2023 provides the scale target. BLaIR-Bench provides external task definitions and baselines. The Odyssey contribution is to connect them without flattening them: each result is admitted through a specific foundry restriction, and non-transfer across products, tasks, encoders, or reproduction surfaces remains visible as gluing tension.

## 17 Evaluation

Odyssey should be evaluated at two levels. The first is architectural: does a human request become a replayable foundry artifact with a clear brief, workflow, sheaf plan, world model, gluing audit, promotion gate, and refresh contract? The second is task-level: once a foundry exists, does its local representation improve a concrete task while preserving failure slices and obstructions?

Table 17: Odyssey evaluation combines foundry-contract checks with task-level measurements.

The current artifacts already support a first set of foundry-level gluing experiments. These are not yet the final empirical story, but they make the evaluation target precise: perturb a local section, remove an overlap witness, or ask for an overbroad transport claim, then check whether Odyssey either glues the compatible restriction or preserves the resulting tension as an obstruction.

The Prometheus ingestion console is the first larger-scale architectural evaluation. It shows that the same admission machinery applies to more than a handful of curated examples: 129 Prometheus-family artifacts are surfaced to Scylla, with 12 Odyssey-native bundles already either admitted or quarantined and 117 older Prometheus v1 models preserved as reviewable candidates. Conversely, the DKS and MyFixIt export path verifies that an Odyssey foundry can be compiled back into Prometheus so that local sections, PSRs, restrictions, and gluing diagnostics can be inspected visually before their tensions are promoted, quarantined, or sent back for repair.

Table 18: Further validation directions supported by the current foundry artifacts. Each row identifies a concrete test of whether compatible restrictions are promoted while high-tension or underspecified transports remain local.

The MyFixIt profiles are the first concrete task-level evaluations. They show a measurable lift for the PSR action-observation representation on restricted retrieval tasks, while still recording text-alias failures, wrong-manual top-1 errors, and the image-grounding gap. This is the kind of result Odyssey is designed to produce: not just a single score, but a promoted local claim bounded by the cover, the restriction, the split policy, and the remaining obstructions.

## 18 Limitations and Future Research

The current implementation is still a design-stage system. Several Scylla and Athena steps are template-driven, and several Prometheus validation runs use deterministic finite truth assignments rather than learned estimators. This is useful for stabilizing interfaces, but it limits the strength of empirical claims. The MyFixIt retrieval results are therefore preliminary even though they now include 72-, 500-, and 1,000-query profiles: they remain deterministic restrictions over Mac Laptop repair-manual slices, not a full benchmark across all repair categories.

The foundry algebra also needs stronger type checking. Today, foundry expressions, TICKET admission records, and FSQL slices are represented as structured artifacts and compact interpreters. A mature Odyssey should enforce more of this algebra statically: admissible compositions, source/target compatibility, required restriction maps, and promotion gates should be checked before a run can become durable state.

The grounded Toulmin layer should also be read as an audit layer, not as a proof system\. The current implementation asks local LLMs to produce visible claim/grounds/warrant/qualifier/rebuttal tickets and then checks those tickets against Prometheus grounding packets\. This exposes claim drift and weak source\-topic alignment, but it does not establish external truth\. Stronger future versions should add calibrated warrant taxonomies, source\-span verification, cross\-model agreement tests, and explicit over\-licensing flags when a model treats review\-aligned evidence as if it fully licenses a claim\.

Finally, visual and multimodal grounding remain incomplete. MyFixIt contains image URLs, but the current run does not fetch, checksum, or model the images. Odyssey handles this correctly by blocking the image-observation overlap, but a full procedural foundation model will need image assets, visual part localization, and stronger human-reviewed annotation loops.

The present system demonstrates that foundries can be constructed, admitted, queried, and evaluated across several domains, but much of the current implementation remains deliberately finite and inspectable. The next research question is how far this discipline can be pushed as foundries become larger, less template-bound, and more automatically derived from source structure. In particular, the current deterministic covers should give way to learned or semi-automated cover synthesis, while still preserving the central contract: local sections, restriction maps, gluing diagnostics, obstruction ledgers, and promotion gates must remain explicit artifacts rather than hidden model state. [Table 18](https://arxiv.org/html/2606.27593#S17.T18) summarizes the concrete further validation agenda supported by the current artifacts.

A second direction is to study foundry transport. The Prometheus ingestion console shows that Odyssey can treat a large collection of pre-existing world models as typed candidates, and the DKS/MyFixIt exports show that Odyssey foundries can be inspected again inside a Prometheus sheaf workbench. This suggests a broader experimental program: measure when model state can move between foundries, when it must remain local, and how obstruction records change after new evidence or new restriction maps are introduced.

Procedural domains provide the most immediate testbed\. MyFixIt should be extended from the current Mac Laptop restriction to cross\-category repair manuals, with action and part aliases normalized, manual identity enforced, and image evidence fetched or checksummed before visual claims are promoted\. IKEA ASM then tests the same idea in a multimodal setting: actions, parts, pose, camera geometry, and object tracks should glue into a shared assembly state only when the evidence supports that transport\. The relevant comparison is not only against token retrieval, but also against single\-task action recognition, pose estimation, segmentation, and tracking baselines\.

Finally, large non\-procedural foundries such as Amazon Reviews 2023 make it possible to ask whether the same sheaf\-theoretic PSR discipline scales to recommendation, product search, and language\-item modeling\. The goal is not a single headline score\. The goal is a reusable empirical protocol for deciding when a foundry representation improves a task, when its local sections fail to glue, and what additional evidence is required before a model should be promoted\.

The grander version of the same question is whether a GPT-scale pretrained model can be Odyssey-ized. In the language of [Figure 2](https://arxiv.org/html/2606.27593#S5.F2), such a model would not be imported as a single opaque parameter object or treated as a universal source of truth. It would be sliced into local behavioral, evidential, tool-use, memory, task, and domain-restricted charts; rolled out by left-Kan admission into candidate foundries; and then pulled back by right-Kan restriction, gluing, obstruction, and Toulmin warrant checks before any slice became durable state. If this program succeeds, large foundation models would become substrates for verifiable local truth-preserving foundries rather than replacements for them: their scale would supply expressive material, while Odyssey’s categorical machinery would decide where that material is warranted, where it fails to glue, and where human review remains essential.

## 19 Conclusion

Odyssey reframes foundation-model construction as the design of inspectable foundries: sheaf-like families of local predictive, logical, and evidentiary models over a heterogeneous substrate. The point is not to produce one more flat summary or one more opaque embedding. It is to preserve locality, support, drift, contradiction, provenance, update obligations, and epistemic limits while giving users and agents a navigable model of a domain. Scylla names the user-facing contract, Homer makes it executable, Athena fixes the representational semantics, and Prometheus instantiates the Topos World Model and its audits. Toulmin then turns maintained state into warranted, qualified, rebuttable claims, while TICKET decides whether external runs, pretrained models, atlases, or benchmark artifacts can enter durable foundry state.

The BRIDGE/SKFM and IDC integrations sharpen the causal part of this story\. BRIDGE/SKFM supplies a latent\-causal\-refinement foundry: typed variables, influence masks, Lie/Frobenius residual ratios, candidate edge masks, and latent\-obstruction ledgers that prevent residual\-bearing pairs from being promoted as ordinary global causal edges\. Infinitesimal\-causality diagnostics add a local test surface for causal variation inside the admission loop\. In the SkillOpt experiments, those diagnostics become feedback for optimizing a human\-readable causal\-claim skill rather than merely scoring a final answer\. The resulting system can therefore learn better admission policies while still preserving the local warrants, qualifiers, rebuttals, residuals, and obstructions that make a claim inspectable\.

## Code Availability

The predecessor Democritus codebase is publicly available as the Democritus_OpenAI repository (Mahadevan, [2025d](https://arxiv.org/html/2606.27593#bib.bib19)). Odyssey currently builds on this released causal-extraction lineage and adds the Scylla–Homer–Athena–Prometheus–Toulmin product layer for foundry construction, Topos World Model instantiation, Claims Atlas navigation, persistent state, argument tickets, FSQL/TICKET admission, and grounded counterfactual execution. A public Odyssey release is planned once the system has matured into a stable, documented research product.

## Appendix A TICKET Specifications for GUI Foundries

The Odyssey GUI exposes a TICKET card for each foundry in the foundry algebra. The concise card syntax in the GUI suppresses the categorical data, so we spell it out here. A candidate artifact $x$ determines a small source context category $\mathcal{C}_{x}$, whose objects are source-side charts such as files, runs, tables, benchmark slices, dashboards, model cards, extracted claim sets, or process traces. The target foundry $Y$ determines a target site $\mathcal{C}_{Y}$, whose objects are the local contexts that Athena has declared for that foundry. A TICKET card declares a functor

$$F_{x,Y}:\mathcal{C}_{x}\longrightarrow\mathcal{C}_{Y}$$

that sends each source chart to the target context in which it is allowed to be interpreted. For example, a 10-K source chart may be sent to a company-year-financial context, a review shard to a product-use context, or a repair step trace to a procedural state/action context.

Let $\mathsf{Data}$ denote the finite artifact category used by the implementation: typed records, finite predicate tables, local PSR cells, provenance links, diagnostics, and promotion metadata. The candidate appears as a presheaf

$$X:\mathcal{C}_{x}^{op}\longrightarrow\mathsf{Data},$$

and the maintained target foundry state appears as a presheaf

$$M_{Y}:\mathcal{C}_{Y}^{op}\longrightarrow\mathsf{Data}.$$

The notation $\mathrm{Lan}_{F_{x,Y}}$ in the cards is shorthand for the left Kan extension of $X$ along the opposite functor,

$$\mathrm{Lan}_{F_{x,Y}}X:=\mathrm{Lan}_{F_{x,Y}^{op}}X:\mathcal{C}_{Y}^{op}\longrightarrow\mathsf{Data}.$$

Thus $\mathrm{Lan}_{F_{x,Y}}X$ is the least target-side candidate obtained by transporting the source charts through the declared TICKET map. Pointwise, it is computed by the finite colimit over source charts mapping into a target context:

$$(\mathrm{Lan}_{F_{x,Y}}X)(U)\cong\operatorname*{colim}_{(F_{x,Y}(V)\to U)}X(V).$$

In the software this colimit is realized as typed aggregation of source records, provenance-preserving joins, adapter outputs, and diagnostic summaries. The important point is that the functor $F_{x,Y}$, not the name of the target foundry alone, specifies what transport is permitted.
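To make the "typed aggregation" reading concrete, here is a minimal sketch of the pointwise colimit in the simplest possible case. It assumes that both context categories are discrete (identity arrows only) and that artifacts are finite sets of records, so the colimit reduces to a coproduct: the target context $U$ receives the tagged union of the records of every source chart sent to $U$. The function name `left_kan_discrete` and the chart and context names are hypothetical, and the real implementation may aggregate quite differently.

```python
from collections import defaultdict


def left_kan_discrete(F, X):
    """Pointwise left Kan extension for discrete source and target categories.

    F: dict mapping each source chart V to its target context F(V).
    X: dict mapping each source chart V to a finite set of records X(V).
    Returns a dict U -> set of (V, record) pairs: the coproduct of X(V)
    over all charts V with F(V) == U. Tagging each record with its chart
    keeps the provenance of every transported record.
    """
    out = defaultdict(set)
    for chart, context in F.items():
        for record in X.get(chart, ()):
            out[context].add((chart, record))
    return dict(out)


# Hypothetical TICKET map: two filing charts land in one financial context.
F = {
    "filing_a": "company_year_financial",
    "filing_b": "company_year_financial",
    "review_shard": "product_use",
}
X = {
    "filing_a": {"rec_1", "rec_2"},
    "filing_b": {"rec_2"},
    "review_shard": {"rec_9"},
}
candidate = left_kan_discrete(F, X)
```

Because the union is tagged, `rec_2` appears twice in the financial context, once per source chart. A target context that no chart maps to is simply absent from the result, which corresponds to the empty (initial) value of the colimit.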

The same functor can also support a dual consistency pass, following the Universal Decision Learner pattern of composing left and right Kan extensions (Mahadevan, [2025c](https://arxiv.org/html/2606.27593#bib.bib18)). Precomposition with $F_{x,Y}$ gives a restriction functor

$$F_{x,Y}^{*}:[\mathcal{C}_{Y}^{op},\mathsf{Data}]\longrightarrow[\mathcal{C}_{x}^{op},\mathsf{Data}],$$

and $F_{x,Y}^{*}$ has both a left and a right Kan adjoint. TICKET uses $\mathrm{Lan}_{F_{x,Y}}$ as its admission direction: it freely assembles the least target-side candidate compatible with the source charts. A stronger audit can then apply the right Kan direction

$$\mathrm{Ran}_{F_{x,Y}}F_{x,Y}^{*}A,\qquad A=\mathrm{Lan}_{F_{x,Y}}X,$$

as a target-side consistency envelope for the admitted candidate. Pointwise, this is a finite limit over target probes back through the same declared interface:

$$(\mathrm{Ran}_{F_{x,Y}}F_{x,Y}^{*}A)(U)\cong\operatorname*{lim}_{(U\to F_{x,Y}(V))}A(F_{x,Y}(V)).$$

The comparison

$$A\longrightarrow\mathrm{Ran}_{F_{x,Y}}F_{x,Y}^{*}A$$

is therefore a round-trip check: after the source artifact has been extended into the target foundry, it asks whether all target-side obligations that can be observed through the source interface are jointly satisfiable. In the document and evidence setting used by Toulmin, the left Kan move is the rollout that produces a target-side document claim, while the right Kan / pullback move is the consistency scrutiny that asks whether the claim, grounds, warrant, qualifier, and rebuttal survive the target cover and maintained obstruction ledger. Thus the same $F_{x,Y}$ carries both the left-Kan admission move and the right-Kan Toulmin-scrutiny move: the TICKET analogue of the UDL $\mathrm{Lan}/\mathrm{Ran}$ loop.
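A small worked case may help fix the direction of the two moves. Assume, purely for illustration, that $\mathsf{Data}$ is the category of finite sets and that both $\mathcal{C}_{x}$ and $\mathcal{C}_{Y}$ are discrete, so that $F_{x,Y}$ is just a function on objects with fibers $F_{x,Y}^{-1}(U)$. The paper does not make these assumptions. Under them, the pointwise colimit and limit reduce to a coproduct and a product:

$$
(\mathrm{Lan}_{F_{x,Y}}X)(U)\cong\coprod_{V\in F_{x,Y}^{-1}(U)}X(V),
\qquad
(\mathrm{Ran}_{F_{x,Y}}F_{x,Y}^{*}A)(U)\cong\prod_{V\in F_{x,Y}^{-1}(U)}A(F_{x,Y}(V))=A(U)^{F_{x,Y}^{-1}(U)},
$$

and the comparison map is the diagonal

$$
A(U)\longrightarrow A(U)^{F_{x,Y}^{-1}(U)},\qquad a\longmapsto(a)_{V\in F_{x,Y}^{-1}(U)}.
$$

When $F_{x,Y}^{-1}(U)=\emptyset$, the envelope at $U$ is a one-point set, so in this degenerate case the round trip carries no information about target contexts that no source chart reaches. Non-trivial constraints appear only once the context categories have non-identity arrows, which is where restriction maps and overlaps enter.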

After transport, TICKET checks that the transported candidate can be restricted back to overlaps in $\mathcal{C}_{Y}$ and compared with maintained state. For a cover $\{U_{i}\to U\}_{i\in I}$ in $\mathcal{C}_{Y}$, write

$$A=\mathrm{Lan}_{F_{x,Y}}X.$$

A family of local candidate sections $a_{i}\in A(U_{i})$ and maintained sections $m_{i}\in M_{Y}(U_{i})$ *glues* over $U$ when every declared overlap $U_{ij}=U_{i}\times_{U}U_{j}$ satisfies the target compatibility predicate:

$$\kappa_{Y}\!\left(\rho^{A}_{i,ij}(a_{i}),\rho^{A}_{j,ij}(a_{j}),\rho^{M}_{i,ij}(m_{i}),\rho^{M}_{j,ij}(m_{j})\right)\in\{\mathrm{PLAUSIBLE},\mathrm{SUPPORTED},\top\}.$$

Here $\rho$ denotes restriction, and $\kappa_{Y}$ is Athena’s finite truth-valued overlap test for the target foundry. Numerically, this is the same condition implemented by the gluing audit: shared predicates must be at least plausible, no paired predicate may be bottom, and weighted PSR or claim gaps must remain below the target tolerance. When the condition holds, $\mathrm{glue}_{Y}(A,M_{Y})$ returns a promoted section in the maintained foundry. When it fails, the partial operation is undefined as promotion and instead returns an obstruction record naming the failed overlap, source provenance, finite truth values, and next-observation recommendation.
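The overlap test can be sketched as a small partial operation. The sketch below is illustrative only and rests on several assumptions that are not in the paper: the truth values form a linear order, an intermediate value `UNDETERMINED` exists, an empty overlap passes vacuously, and the obstruction record is reduced to the failed overlap and its verdict (the text also lists provenance and a next-observation recommendation). Sections are modeled as dicts from predicate names to truth values that have already been restricted to the overlap.

```python
from dataclasses import dataclass
from enum import IntEnum


class Truth(IntEnum):
    """Finite truth values; the linear order and UNDETERMINED are assumptions."""
    BOTTOM = 0
    UNDETERMINED = 1
    PLAUSIBLE = 2
    SUPPORTED = 3
    TOP = 4


@dataclass(frozen=True)
class Obstruction:
    """Minimal obstruction record: the failed overlap and its verdict."""
    overlap: str
    verdict: Truth


def kappa(sections, gap, tolerance):
    """Overlap test on four already-restricted sections (predicate -> Truth)."""
    shared = set.intersection(*(set(s) for s in sections))
    # Meet of all paired values; an empty overlap is treated as vacuously TOP.
    verdict = min((s[p] for s in sections for p in shared), default=Truth.TOP)
    if gap >= tolerance:
        # A gap at or above tolerance caps the verdict below PLAUSIBLE.
        verdict = min(verdict, Truth.UNDETERMINED)
    return verdict


def glue(overlaps, tolerance):
    """Return "PROMOTED" if every overlap passes, else the first Obstruction."""
    for name, (a_i, a_j, m_i, m_j, gap) in overlaps.items():
        verdict = kappa((a_i, a_j, m_i, m_j), gap, tolerance)
        if verdict < Truth.PLAUSIBLE:
            return Obstruction(overlap=name, verdict=verdict)
    return "PROMOTED"
```

Returning an `Obstruction` value instead of raising an error mirrors the description of gluing as a partial operation: a failure is itself a recorded artifact rather than an exception.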

With this expanded notation, the cards all share the same admission template:

$$\textsc{TICKET}(x_{\mathrm{candidate}})\;\to\;\mathrm{Lan}_{F_{x,Y}}(X)\;\to\;\mathrm{glue}_{Y}(\mathrm{Lan}_{F_{x,Y}}X,M_{Y}),$$

where $Y$ is the target foundry named in the table and $F_{x,Y}$ is the source-to-target transport functor declared by that card. The corresponding FSQL surface is

$$\texttt{FROM generic\_foundries SLICE BY foundry("x") TICKET BY target\_foundry}.$$

Every card uses the same required checks: `typed_manifest`, `subobject_classifier`, `restriction_maps`, `gluing_audit`, `j_closure`, and `promotion_gate`. The acceptance gate is that the typed manifest, local predicate classifier, restriction maps, gluing audit, $j$-closure, and promotion gate must agree before the candidate becomes durable foundry state.
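The acceptance gate is a conjunction over the six named checks, which can be sketched as follows. The check names and the FSQL string are taken from the text above; the function names `fsql_ticket_query` and `acceptance_gate`, the boolean result format, and the choice to treat a missing check as a failure are assumptions made for this illustration and are not Odyssey's actual interface.

```python
REQUIRED_CHECKS = (
    "typed_manifest",
    "subobject_classifier",
    "restriction_maps",
    "gluing_audit",
    "j_closure",
    "promotion_gate",
)


def fsql_ticket_query(source_foundry: str, target_foundry: str) -> str:
    """Render the FSQL surface quoted in the text for a source and a target."""
    return (
        f'FROM generic_foundries SLICE BY foundry("{source_foundry}") '
        f"TICKET BY {target_foundry}"
    )


def acceptance_gate(results: dict) -> tuple:
    """Return (accepted, failing) for a dict of check name -> bool.

    A check that is missing from `results` counts as failing, so a
    candidate cannot become durable state by omitting an audit.
    """
    failing = [c for c in REQUIRED_CHECKS if results.get(c) is not True]
    return (not failing, failing)
```

Reporting the failing checks by name, rather than a bare boolean, keeps the gate consistent with the requirement that promotion decisions remain explicit, inspectable artifacts.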

Table 19: TICKET cards for generic and process foundries listed in the Odyssey GUI. Each row instantiates the shared TICKET template with a source family and target foundry.

Table 20: TICKET cards for specialized, concrete, and candidate foundries listed in the Odyssey GUI. These rows are restrictions of generic foundries to particular domains, datasets, corpora, or model-ingestion lanes.

## Appendix B TICKET as a Monad

The preceding appendix describes TICKET as an admission and consistency operator. There is a slightly higher-level categorical view that is useful for locating TICKET inside the Universal Decision Learner pattern, but that we do not yet exploit fully in the implementation. Once a TICKET card declares a source-to-target functor

$$F_{x,Y}:\mathcal{C}_{x}\longrightarrow\mathcal{C}_{Y},$$

precomposition gives the restriction functor

$$F_{x,Y}^{*}:[\mathcal{C}_{Y}^{op},\mathsf{Data}]\longrightarrow[\mathcal{C}_{x}^{op},\mathsf{Data}].$$

When the relevant finite Kan extensions exist, this restriction functor has a right Kan adjoint

$$F_{x,Y}^{*}\dashv\mathrm{Ran}_{F_{x,Y}}.$$

The adjunction induces a monad on target-side foundry candidates:

$$\mathsf{T}_{x,Y}=\mathrm{Ran}_{F_{x,Y}}F_{x,Y}^{*}:[\mathcal{C}_{Y}^{op},\mathsf{Data}]\longrightarrow[\mathcal{C}_{Y}^{op},\mathsf{Data}].$$

For a candidate target presheaf $A$, $\mathsf{T}_{x,Y}(A)$ is its right-Kan consistency envelope: restrict $A$ back through the TICKET interface, then pull forward all target-side obligations that are forced by that restricted view. The unit

$$\eta_{A}:A\longrightarrow\mathsf{T}_{x,Y}(A)$$

is the canonical comparison from the candidate to its consistency envelope. The multiplication

$$\mu_{A}:\mathsf{T}_{x,Y}\mathsf{T}_{x,Y}(A)\longrightarrow\mathsf{T}_{x,Y}(A)$$

says that applying the same TICKET consistency pass twice collapses to one normalized pass.
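For readers who want the structure maps written out, the following is the standard construction of a monad from an adjunction, specialized to this notation. The paper does not spell it out, and the symbol $\varepsilon$ for the counit is introduced here. Writing $\varepsilon:F_{x,Y}^{*}\,\mathrm{Ran}_{F_{x,Y}}\Rightarrow\mathrm{Id}$ for the counit of $F_{x,Y}^{*}\dashv\mathrm{Ran}_{F_{x,Y}}$, the multiplication is

$$
\mu_{A}=\mathrm{Ran}_{F_{x,Y}}\!\left(\varepsilon_{F_{x,Y}^{*}A}\right),
$$

and the monad laws read

$$
\mu\circ\mathsf{T}_{x,Y}\mu=\mu\circ\mu\,\mathsf{T}_{x,Y},
\qquad
\mu\circ\mathsf{T}_{x,Y}\eta=\mathrm{id}_{\mathsf{T}_{x,Y}}=\mu\circ\eta\,\mathsf{T}_{x,Y}.
$$

In general $\mu$ is only a natural transformation, so the second pass can be strictly larger than the first before it is collapsed. It is an isomorphism, making the monad idempotent, whenever $\varepsilon$ is an isomorphism, which holds for instance when $F_{x,Y}$ is fully faithful. In that case one consistency pass is already normalized and the "collapse" is literal.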

The left-Kan rollout still plays the constructive admission role. A raw source artifact $X:\mathcal{C}_{x}^{op}\to\mathsf{Data}$ first becomes the target-side candidate

$$A=\mathrm{Lan}_{F_{x,Y}}X.$$

TICKET then tests this candidate against the monadic envelope

$$\mathrm{Lan}_{F_{x,Y}}X\xrightarrow{\eta}\mathsf{T}_{x,Y}(\mathrm{Lan}_{F_{x,Y}}X).$$

Thus the implemented TICKET pipeline can be read as the composite

$$X\longmapsto\mathrm{Lan}_{F_{x,Y}}X\longmapsto\mathsf{T}_{x,Y}(\mathrm{Lan}_{F_{x,Y}}X).$$

In words: roll out source artifacts by a left Kan extension, then apply the right-Kan monad that records the admission, gluing, obstruction, provenance, and repair effects visible through the declared interface. Promotion occurs only when this monadic consistency pass is compatible with Athena’s target gluing policy and with the maintained foundry state $M_{Y}$.

This view also points to computational semantics that are not yet made explicit in Odyssey. An Eilenberg–Moore algebra for $\mathsf{T}_{x,Y}$ is a target foundry state $A$ equipped with a structure map

$$\alpha:\mathsf{T}_{x,Y}(A)\longrightarrow A.$$

Operationally, such an algebra is a maintained foundry that knows how to absorb its own TICKET consistency envelope back into stable state. A Kleisli arrow

$$A\longrightarrow\mathsf{T}_{x,Y}(B)$$

is a computation that produces a candidate $B$ together with TICKET effects: admission obligations, restriction diagnostics, gluing constraints, obstructions, provenance, and possible repair recommendations. This matches the practical shape of Odyssey workflows: Scylla intents, Homer jobs, Prometheus bundles, Toulmin tickets, and TICKET decisions are not pure maps of artifacts, but computations that carry admission and consistency effects.
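The two notions come with standard laws, stated here in the notation above as a reference point; they are textbook facts about monads rather than results specific to this paper. A structure map $\alpha$ is an Eilenberg–Moore algebra when it is compatible with the unit and the multiplication:

$$
\alpha\circ\eta_{A}=\mathrm{id}_{A},
\qquad
\alpha\circ\mathsf{T}_{x,Y}(\alpha)=\alpha\circ\mu_{A}.
$$

Kleisli arrows $f:A\to\mathsf{T}_{x,Y}(B)$ and $g:B\to\mathsf{T}_{x,Y}(C)$ compose by

$$
g\bullet f=\mu_{C}\circ\mathsf{T}_{x,Y}(g)\circ f:A\longrightarrow\mathsf{T}_{x,Y}(C),
$$

with $\eta$ acting as the identity arrow. Read operationally, the first law says that absorbing the envelope of a state that was only just compared with itself changes nothing, and Kleisli composition says that chaining two effectful workflow steps merges their consistency envelopes through $\mu$ instead of nesting them.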

We leave the explicit development of these Eilenberg–Moore and Kleisli semantics to future work\. The present paper uses the monad only to clarify the categorical status of TICKET: TICKET is not merely a gate after model construction\. It is the monadic admission semantics induced by the left\-Kan rollout and right\-Kan pullback structure of UDL\.

## Appendix C System Interface Screenshots

[Figure 4](https://arxiv.org/html/2606.27593#A3.F4) shows representative screens from the current Odyssey prototype. These are generated from the static HTML artifacts emitted by the repository rather than from hand-drawn mockups. The screenshots show the user-facing workbench, the TICKET ingestion console, the MyFixIt procedural PSR interface, the evaluation-harness sheaf explorer, and the grounded Toulmin/local-LLM Scylla interface.

![Refer to caption](https://arxiv.org/html/2606.27593v1/artifacts/system_screenshots/odyssey_workbench.png) (a) Odyssey workbench landing surface for browsing foundry families and generated artifacts.
![Refer to caption](https://arxiv.org/html/2606.27593v1/artifacts/system_screenshots/odyssey_ingestion_console.png) (b) TICKET ingestion console for admitting Prometheus runs into durable Odyssey foundry state.
![Refer to caption](https://arxiv.org/html/2606.27593v1/artifacts/system_screenshots/myfixit_scylla_interface.png) (c) MyFixIt Scylla interface summarizing the repair-manual PSR seed, gluing status, and image-grounding obstruction.
![Refer to caption](https://arxiv.org/html/2606.27593v1/artifacts/system_screenshots/evaluation_harness_sheaf_explorer.png) (d) Evaluation-harness sheaf explorer for the MyFixIt retrieval restriction and failure-slice workflow.

Figure 4: Screenshots of the current Odyssey interface and generated foundry artifacts. The visual surfaces are intentionally artifact-backed: each screen is a view over JSON, HTML, and audit files emitted by the foundry pipeline.

![Refer to caption](https://arxiv.org/html/2606.27593v1/artifacts/system_screenshots/local_llm_grounded_toulmin_scylla.png)

Figure 5: Grounded Toulmin/local-LLM Scylla interface. A local LLM run over Democritus/Prometheus claims identifies a Dinosaur Extinction case in which the model reformulates the extracted document claim while the underlying source event remains under alignment review. Scylla exposes both problems as Toulmin-level flags: claim mismatch and weak source alignment.

## Appendix D System Genealogy and GUI Modes

Prometheus inherits part of its interface genealogy from our earlier CLIFF chatbot and local research interface (Mahadevan, [2025b](https://arxiv.org/html/2606.27593#bib.bib17)). CLIFF began as a Categories-for-AGI companion system for interactive retrieval, teaching, and research workflows. The Prometheus GUI reuses several lessons from that system: a natural-language query box, long-running local sessions, background execution, artifact dashboards, route-specific reports, and persistent run directories. The conceptual boundary is different. CLIFF remains oriented toward the Categories for AGI book, course material, and general retrieval-conditioned chatbot workflows, whereas Prometheus is reserved for causal research artifacts, local PSR construction, gluing diagnostics, persistent world state, and Claims Atlas navigation.

The GUI is designed to accept broad natural\-language research requests and route them to specialized workflows\. In the current implementation, route families include literature and paper\-corpus synthesis, Democritus\-style causal\-claim analysis, SEC and company\-filing workflows, product\-feedback world models, targeted\-sentiment review benchmarks, Rock–Paper–Scissors and network\-economy agent traces, and small Topos/OOM experiments\. A route may emit several artifacts: a human\-readable report, a technical dashboard, a JSON world\-model bundle, a persistent\-state file, and auxiliary provenance or Claims Atlas HTML\.

The GUI exposes three execution modes. *Quick* mode runs the most compact version of a workflow and is useful for quick checks or shallow artifact inspection. *Interactive* mode keeps the local session open while background runs complete, letting a researcher launch follow-up queries and inspect completed artifacts from the session list. *Deep* mode allocates more work to acquisition, extraction, synthesis, and report generation, and is the intended setting for the case-study style runs described in this paper. The GUI also exposes an analysis-mode choice: *standard* runs the routed workflow in its ordinary reporting mode, while *Topos World Model* attaches the Prometheus layer when supported, producing local PSRs, restrictions, gluing diagnostics, and persistent-state artifacts.

Several additional controls specialize particular routes rather than changing the overall framework\. Democritus\-style claim analysis can run with full, lightweight, or mixture\-of\-experts manifold modes, optional dry\-run behavior, and optional deep\-dive report generation\. Filing workflows can use dry\-run paths for debugging\. Product\-feedback and persistent\-state workflows can take a parent state or state query, allowing a follow\-up run to compare against an earlier world model\. These options are engineering controls, not separate theoretical models; they let users trade runtime, cost, and depth while keeping the same artifact contract\.

## References

- Ben-Shabat et al. (2021) Yizhak Ben-Shabat, Xin Yu, Fatemeh Saleh, Dylan Campbell, Cristian Rodriguez-Opazo, Hongdong Li, and Stephen Gould. The IKEA ASM dataset: Understanding people assembling furniture through actions, objects and pose. In *Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision*, pages 847–859, 2021.
- Bhattacharjya et al. (2024) Debarun Bhattacharjya, Junkyu Lee, Don Joven Agravante, Balaji Ganesan, and Radu Marinescu. Foundation model sherpas: Guiding foundation models through knowledge and reasoning, 2024. URL [https://arxiv.org/abs/2402.01602](https://arxiv.org/abs/2402.01602).
- Bommasani et al. (2021) Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, et al. On the opportunities and risks of foundation models, 2021. URL [https://arxiv.org/abs/2108.07258](https://arxiv.org/abs/2108.07258).
- Chen et al. (2024) Haolong Chen, Hanzhi Chen, Zijian Zhao, Kaifeng Han, Guangxu Zhu, Yichen Zhao, Ying Du, Wei Xu, and Qingjiang Shi. An overview of domain-specific foundation model: Key technologies, applications and challenges, 2024. URL [https://arxiv.org/abs/2409.04267](https://arxiv.org/abs/2409.04267).
- Cheng et al. (2025) Qing Cheng, Zefan Zeng, Xingchen Hu, Yuehang Si, and Zhong Liu. A survey of event causality identification: Taxonomy, challenges, assessment, and prospects. *ACM Computing Surveys*, 58(3):1–37, 2025. doi: [10.1145/3756009](https://doi.org/10.1145/3756009). URL [https://dl.acm.org/doi/10.1145/3756009](https://dl.acm.org/doi/10.1145/3756009).
- Feder et al. (2022) Amir Feder, Katherine A. Keith, Emaad Manzoor, Reid Pryzant, Dhanya Sridhar, Zach Wood-Doughty, Jacob Eisenstein, Justin Grimmer, Roi Reichart, Margaret E. Roberts, Brandon M. Stewart, Victor Veitch, and Diyi Yang. Causal inference in natural language processing: Estimation, prediction, interpretation and beyond. *Transactions of the Association for Computational Linguistics*, 10:1138–1158, 2022. doi: [10.1162/tacl_a_00511](https://doi.org/10.1162/tacl_a_00511). URL [https://aclanthology.org/2022.tacl-1.66/](https://aclanthology.org/2022.tacl-1.66/).
- Garg and Fetzer (2025) Prashant Garg and Thiemo Fetzer. Testing causal claims in economics. *arXiv preprint arXiv:2501.06873*, 2025. URL [https://arxiv.org/abs/2501.06873](https://arxiv.org/abs/2501.06873). Dataset and analysis of causal claims extracted from economics papers.
- Gupta et al. (2024) Ankita Gupta, Ethan Zuckerman, and Brendan O’Connor. Harnessing Toulmin’s theory for zero-shot argument explication. In *Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)*, pages 10259–10276, Bangkok, Thailand, 2024. Association for Computational Linguistics. doi: [10.18653/v1/2024.acl-long.552](https://doi.org/10.18653/v1/2024.acl-long.552). URL [https://aclanthology.org/2024.acl-long.552/](https://aclanthology.org/2024.acl-long.552/).
- Hassanzadeh et al. (2020) Oktie Hassanzadeh, Debarun Bhattacharjya, Mark Feblowitz, Michael Perrone, Shirin Sohrabi, Kavitha Srinivas, and Michael Katz. Causal knowledge extraction through large-scale text mining. In *Proceedings of the AAAI Conference on Artificial Intelligence*, volume 34, pages 13520–13527, 2020.
- Hou et al. (2024) Yupeng Hou, Jiacheng Li, Zhankui He, An Yan, Xiusi Chen, and Julian McAuley. Bridging language and items for retrieval and recommendation. *arXiv preprint arXiv:2403.03952*, 2024. URL [https://arxiv.org/abs/2403.03952](https://arxiv.org/abs/2403.03952).
- iFixit (2026) iFixit. iFixit repair guides. Online repair-guide corpus, 2026. URL [https://www.ifixit.com/Guide](https://www.ifixit.com/Guide).
- Kıcıman et al. (2024) Emre Kıcıman, Robert Osazuwa Ness, Amit Sharma, and Chenhao Tan. Causal reasoning and large language models: Opening a new frontier for causality. *Transactions on Machine Learning Research*, 2024. URL [https://openreview.net/forum?id=mqoxLkX210](https://openreview.net/forum?id=mqoxLkX210). Accepted by TMLR; arXiv:2305.00050.
- Kim and Yang (2026) Yundong Kim and Heyoung Yang. TRACE: Toulmin-based reasoning assessment through constructive elements for LLM CoT evaluation, 2026. URL [https://arxiv.org/abs/2605.29656](https://arxiv.org/abs/2605.29656).
- Le et al. (2024) Hao Duong Le, Xin Xia, and Zhang Chen. Multi-agent causal discovery using large language models. *arXiv preprint arXiv:2407.15073*, 2024.
- Liu et al. (2024) Yue Liu, Sin Kit Lo, Qinghua Lu, Liming Zhu, Dehai Zhao, Xiwei Xu, Stefan Harrer, and Jon Whittle. Agent design pattern catalogue: A collection of architectural patterns for foundation model based agents, 2024. URL [https://arxiv.org/abs/2405.10467](https://arxiv.org/abs/2405.10467).
- Mahadevan (2025a) Sridhar Mahadevan. Large causal models from large language models, 2025a. URL [https://arxiv.org/abs/2512.07796](https://arxiv.org/abs/2512.07796).
- Mahadevan (2025b) Sridhar Mahadevan. CLIFF_CatAgi: Categories for AGI local research interface. GitHub repository, 2025b. URL [https://github.com/sridharmahadevan/CLIFF_CatAgi](https://github.com/sridharmahadevan/CLIFF_CatAgi).
- Mahadevan (2025c) Sridhar Mahadevan. Categories for AGI. Book manuscript, 2025c. URL [https://people.cs.umass.edu/~mahadeva/papers/catagi.pdf](https://people.cs.umass.edu/~mahadeva/papers/catagi.pdf).
- Mahadevan (2025d) Sridhar Mahadevan. Democritus_OpenAI: Whygraphs from large language models. GitHub repository, 2025d. URL [https://github.com/sridharmahadevan/Democritus_OpenAI](https://github.com/sridharmahadevan/Democritus_OpenAI).
- Mahadevan (2026a) Sridhar Mahadevan. Infinitesimal causality, 2026a. URL [https://arxiv.org/abs/2606.24621](https://arxiv.org/abs/2606.24621).
- Mahadevan (2026b) Sridhar Mahadevan. Kan extension transformers: A categorical unification of attention, diffusion, and predict-detach self-conditioning, 2026b. URL [https://arxiv.org/abs/2605.27259](https://arxiv.org/abs/2605.27259).
- Mahadevan (2026c) Sridhar Mahadevan. Latent confounded causal discovery via Lie bracket geometry, 2026c. URL [https://arxiv.org/abs/2606.19610](https://arxiv.org/abs/2606.19610).
- Mahadevan (2026d) Sridhar Mahadevan. PROMETHEUS: Automating deep causal research integrating text, data and models, 2026d. URL [https://arxiv.org/abs/2605.12835](https://arxiv.org/abs/2605.12835).
- Nair (2026) Ashish Nair. How non-linguistic is the Indus sign system? A synthetic-baseline scorecard, 2026. URL [https://arxiv.org/abs/2604.17828](https://arxiv.org/abs/2604.17828).
- Parpola (2009) Asko Parpola. *Deciphering the Indus Script*. Cambridge University Press, reissue edition, 2009. ISBN 978-0521795661.
- Pearl (2009) Judea Pearl. *Causality: Models, Reasoning, and Inference*. Cambridge University Press, 2 edition, 2009.
- Radinsky et al. (2012) Kira Radinsky, Sagie Davidovich, and Shaul Markovitch. Learning causality for news events prediction. In *Proceedings of the 21st International Conference on World Wide Web*, pages 909–918, 2012. doi: [10.1145/2187836.2187958](https://doi.org/10.1145/2187836.2187958).
- Toulmin (1958) Stephen E. Toulmin. *The Uses of Argument*. Cambridge University Press, 1958.
- White et al. (2024) Matt White, Ibrahim Haddad, Cailean Osborne, Xiao-Yang Yanglet Liu, Ahmed Abdelmonsef, Sachin Varghese, and Arnaud Le Hors. The model openness framework: Promoting completeness and openness for reproducibility, transparency, and usability in artificial intelligence, 2024. URL [https://arxiv.org/abs/2403.13784](https://arxiv.org/abs/2403.13784).
- Xu et al. (2024) Xinyi Xu, Zhaoxuan Wu, Rui Qiao, Arun Verma, Yao Shu, Jingtan Wang, Xinyuan Niu, Zhenfeng He, Jiangwei Chen, Zijian Zhou, et al. Data-centric AI in the age of large language models, 2024. URL [https://arxiv.org/abs/2406.14473](https://arxiv.org/abs/2406.14473).
- Yamada et al. (2025) Yutaro Yamada, Robert Tjarko Lange, Cong Lu, Shengran Hu, Chris Lu, Jakob Foerster, Jeff Clune, and David Ha. The AI scientist-v2: Workshop-level automated scientific discovery via agentic tree search. *arXiv preprint arXiv:2504.08066*, 2025. doi: [10.48550/arXiv.2504.08066](https://doi.org/10.48550/arXiv.2504.08066).
- Yang et al. (2022) Jie Yang, Soyeon Caren Han, and Josiah Poon. A survey on extraction of causal relations from natural language text. *Knowledge and Information Systems*, 64(5):1161–1186, 2022. doi: [10.1007/s10115-022-01665-w](https://doi.org/10.1007/s10115-022-01665-w).
- Yang et al. (2026) Yifan Yang, Ziyang Gong, Weiquan Huang, Qihao Yang, Ziwei Zhou, Zisu Huang, Yan Li, Xuemei Gao, Qi Dai, Bei Liu, Kai Qiu, Yuqing Yang, Dongdong Chen, Xue Yang, and Chong Luo. SkillOpt: Executive strategy for self-evolving agent skills, 2026. URL [https://arxiv.org/abs/2605.23904](https://arxiv.org/abs/2605.23904).
- Zhou et al. (2024) Jiahang Zhou, Yanyu Chen, Zicong Hong, Wuhui Chen, Yue Yu, Tao Zhang, Hui Wang, Chuanfu Zhang, and Zibin Zheng. Training and serving system of foundation models: A comprehensive survey, 2024. URL [https://arxiv.org/abs/2401.02643](https://arxiv.org/abs/2401.02643).