NeSyCat Torch: A Differentiable Tensor Implementation of Categorical Semantics for Neurosymbolic Learning

arXiv cs.AI 06/18/26, 04:00 AM Papers

Summary

This paper introduces NeSyCat Torch, a differentiable tensor implementation of categorical semantics for neurosymbolic learning, unifying classical, fuzzy, and probabilistic semantics under a monadic framework and demonstrating superior speed and accuracy on MNIST addition compared to existing systems like LTN and DeepProbLog.

arXiv:2606.19279v1 Announce Type: new Abstract: Neurosymbolic semantics is fragmented: classical, fuzzy, probabilistic and neural systems each define truth by their own inductive rules. NeSyCat, extending ULLER, subsumes them under a single inductive definition of truth, parametric in a strong monad and an aggregation structure on truth-values. NeSyCat has so far lacked an account of predicates and functions learned by neural networks. We provide NeSyCat Torch as the missing link and interpret computational symbols via neural networks, implementing the framework in probabilistic programming and tensor-based backends. We use the distribution monad for reference semantics and metric evaluation, and complement it by a monad for numerically stable, differentiable training: the lazy log-tensor monad over the log-semiring. For efficient training in batches, we furthermore employ a batch monad. The axioms are the source code: written once in monad-based do-notation, monadic bind performs marginalisation, lazily pruning unneeded branches. On MNIST addition, our HaskTorch, JAX, and PyTorch implementations outperform LTN and DeepProbLog in speed and accuracy, while achieving nearly the accuracy of DeepStochLog. However, unlike DeepStochLog, we stay in a uniform framework that applies to many first-order NeSy approaches. Namely, the construction is parametric in the monad; instantiating it with, e.g., the Giry monad extends the approach to continuous probability (working out a neural representation here is left for future work).


Cached at: 06/18/26, 05:42 AM

# NeSyCat Torch: A Differentiable Tensor Implementation of Categorical Semantics for Neurosymbolic Learning
Source: [https://arxiv.org/html/2606.19279](https://arxiv.org/html/2606.19279)
Daniel Romero Schellhorn (daniel.schellhorn@uni-osnabrueck.de), Till Mossakowski (till.mossakowski@uni-osnabrueck.de), Björn Gehrke (bjoern.gehrke@uni-osnabrueck.de)

University of Osnabrück, Osnabrück, Germany

###### Abstract

Neurosymbolic semantics is fragmented: classical, fuzzy, probabilistic and neural systems each define truth by their own inductive rules. NeSyCat, extending ULLER, subsumes them under a single inductive definition of truth, parametric in a strong monad and an aggregation structure on truth-values. NeSyCat has so far lacked an account of predicates and functions learned by neural networks. We provide NeSyCat Torch as the missing link and interpret computational symbols via neural networks, implementing the framework in probabilistic programming and tensor-based backends. We use the distribution monad for reference semantics and metric evaluation, and complement it by a monad for numerically stable, differentiable training: the lazy log-tensor monad over the log-semiring. For efficient training in batches, we furthermore employ a batch monad. The axioms *are* the source code: written once in monad-based do-notation, monadic bind performs marginalisation, lazily pruning unneeded branches. On MNIST addition, our HaskTorch, JAX, and PyTorch implementations outperform LTN and DeepProbLog in speed and accuracy, while achieving nearly the accuracy of DeepStochLog. However, unlike DeepStochLog, we stay in a uniform framework that applies to many first-order NeSy approaches. Namely, the construction is parametric in the monad; instantiating it with, e.g., the Giry monad extends the approach to continuous probability (working out a neural representation here is left for future work).

## 1 Introduction

Neurosymbolic (NeSy) AI combines the perceptual strength of neural networks with the structured, verifiable reasoning of symbolic logic. A recurring obstacle is fragmentation: classical, fuzzy, and probabilistic NeSy systems each come with their own logical language and semantics, so knowledge bases and learning objectives rarely transfer between them. ULLER, the Unified Language for Learning and Reasoning (vankriekenULLER2024), endows first-order logic (FOL) syntax with three pairwise-independent semantics (classical, fuzzy, and probabilistic), each carrying its own inductive definition of truth.

A recent line of work (schellhornNeSyCatCategorical2026) reformulates all three semantics as instances of a single *categorical* framework built on *monads*, Moggi's construct for computational effects in functional programming (moggiNotionsComputation1991). The key observation is that an ULLER computation formula $x := m(T_1,\dots,T_n)\,(F)$, interpreted as "run model $m$, then bind its result to $x$, then evaluate $F$", is exactly monadic do-notation. Fixing a strong monad $\mathcal{M}$ (the effect) and an aggregated truth-value space $\Omega$ with connectives and quantifiers yields a *NeSy framework*; classical, fuzzy, probabilistic, LTN, and possibilistic semantics all reappear as choices of $\mathcal{M}$ and $\Omega$, evaluated by *one* inductive definition of truth.

For efficiency reasons, we use *lazy* monads. The lifting operation in the distribution monad computes probabilities using marginalization; a lazy monad ensures that marginalization is only done in cases where it is actually needed.
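As a rough illustration of the laziness idea, the following Python sketch defers a marginalisation behind a thunk and forces it only when a conjunction actually needs it. The names `Lazy`, `lazy_and`, and `marginalise`, the toy distributions, and the use of a product conjunction are all assumptions made for this example; they are not taken from the NeSyCat Torch code base, whose lazy monads may work differently.

```python
from itertools import product

# Counter so we can observe how often a marginalisation is actually carried out.
calls = {"marginalise": 0}

def marginalise(dist_x, dist_y, pred):
    """P(pred(x, y)) for two independent finite distributions given as dicts."""
    calls["marginalise"] += 1
    return sum(px * py
               for (x, px), (y, py) in product(dist_x.items(), dist_y.items())
               if pred(x, y))

class Lazy:
    """A deferred value: the thunk runs at most once, and only if forced."""
    def __init__(self, thunk):
        self._thunk, self._done, self._value = thunk, False, None

    def force(self):
        if not self._done:
            self._value, self._done = self._thunk(), True
        return self._value

def lazy_and(p, q):
    """Product conjunction (an illustrative choice) that never forces q if p is 0."""
    def run():
        a = p.force()
        return 0.0 if a == 0.0 else a * q.force()
    return Lazy(run)

# Hypothetical distributions over two binary "digits".
d1 = {0: 0.5, 1: 0.5}
d2 = {0: 0.25, 1: 0.75}

impossible = Lazy(lambda: marginalise(d1, d2, lambda x, y: x + y == 5))
possible = Lazy(lambda: marginalise(d1, d2, lambda x, y: x + y == 1))

result = lazy_and(impossible, possible).force()
calls_after_and = calls["marginalise"]  # only the first conjunct was marginalised
```

Forcing the conjunction evaluates only the first marginalisation, because its probability is already zero; the second sum is computed only if something later forces `possible` directly.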

Besides the usual FOL function and predicate symbols, we consider computational function symbols $X \to \mathcal{M}Y$ and computational predicate symbols $X \to \mathcal{M}\Omega$. At the deep learning level, we also need to consider two-sided computational function symbols $\mathcal{M}X \to \mathcal{M}Y$ and two-sided computational predicate symbols $\mathcal{M}X \to \mathcal{M}\Omega$.

We now recall monads and present the monads used in this paper in Table `tab:monads-nesycat`.

## 2 Monads for Computational Effects

###### Definition 2.1 (Monad (kohlSchwaigerMonads2021, §3.1)).

A monad is given by a triple (`m`, `return`, `>>=`): `m` is a type constructor mapping a type `a` to the type `m a` of computational effects with values from `a`; `return` embeds values into computations; and `>>=` (bind) sequences computations. Their types are `return :: a -> m a` and `(>>=) :: m a -> (a -> m b) -> m b`. Here, `c >>= f` runs the computation `c` over type `a` and passes its value(s) to a function `f` delivering a computation over type `b`.

Haskell provides the do-notation: `do x <- y; f` is syntactic sugar for `y >>= \x -> f`.
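To make the definition concrete, here is a minimal Python sketch of the finite distribution monad, with distributions represented as dictionaries from outcomes to probabilities. The function names `unit` and `bind` stand in for `return` and `>>=`; the dictionary representation and the two toy digit distributions are assumptions for illustration only, not the paper's tensor implementation. The last expression spells out how a do-block for digit addition desugars into nested binds, with bind performing the marginalisation.

```python
from collections import defaultdict

def unit(x):
    """return: the point distribution that yields x with probability 1."""
    return {x: 1.0}

def bind(dist, f):
    """>>=: run f on every outcome of dist and marginalise over that outcome."""
    out = defaultdict(float)
    for x, px in dist.items():
        for y, py in f(x).items():
            out[y] += px * py
    return dict(out)

# Hypothetical classifier outputs for two digit images (restricted to a few digits).
digit_a = {0: 0.5, 1: 0.25, 2: 0.25}
digit_b = {1: 0.5, 2: 0.5}

# Desugaring of:  do { x <- digit_a; y <- digit_b; return (x + y) }
total = bind(digit_a, lambda x: bind(digit_b, lambda y: unit(x + y)))
```

Here `total` is the distribution of the sum: the intermediate digits `x` and `y` have been summed out, so only the probabilities of the possible sums remain.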

## Appendix A Categorical Background

### Monads.

Categorically, a monad on a category $\mathcal{C}$ is a functor $T\colon \mathcal{C} \to \mathcal{C}$ with natural transformations $\eta\colon \mathrm{id}_{\mathcal{C}} \Rightarrow T$ (*unit*) and $\mu\colon TT \Rightarrow T$ (*multiplication*) satisfying $\mu \mathbin{\circ} \eta T = \mathrm{id} = \mu \mathbin{\circ} T\eta$ and $\mu \mathbin{\circ} T\mu = \mu \mathbin{\circ} \mu T$. The programming definition above corresponds one-to-one to the equivalent presentation as a *Kleisli triple* $(T, \eta, (\cdot)^{\mathcal{M}})$, where $f^{\mathcal{M}}\colon TA \to TB$ for $f\colon A \to TB$ satisfies $\eta_A^{\mathcal{M}} = \mathrm{id}_{TA}$, $f^{\mathcal{M}} \mathbin{\circ} \eta_A = f$, and $g^{\mathcal{M}} \mathbin{\circ} f^{\mathcal{M}} = (g^{\mathcal{M}} \mathbin{\circ} f)^{\mathcal{M}}$: `(>>=)` is the Kleisli lift $(\cdot)^{\mathcal{M}}$ (applied flipped), and the do-notation of Section [2](https://arxiv.org/html/2606.19279#S2) is its syntactic sugar.
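The three Kleisli-triple laws can be checked numerically on small examples. The sketch below does this for the finite distribution monad in Python, writing `lift(f)` for the Kleisli lift of `f` and `unit` for $\eta$; the dictionary representation and the test functions `f` and `g` are assumptions chosen for illustration, and a check on one input is of course not a proof.

```python
import math
from collections import defaultdict

def unit(x):
    """eta: the point distribution on x."""
    return {x: 1.0}

def lift(f):
    """Kleisli lift: turns f : A -> T B into a map T A -> T B by marginalising."""
    def lifted(dist):
        out = defaultdict(float)
        for x, px in dist.items():
            for y, py in f(x).items():
                out[y] += px * py
        return dict(out)
    return lifted

def close(d1, d2):
    """Equality of finite distributions up to floating-point tolerance."""
    return d1.keys() == d2.keys() and all(math.isclose(d1[k], d2[k]) for k in d1)

# Hypothetical Kleisli maps: a noisy successor and a noisy parity test.
f = lambda n: {n: 0.5, n + 1: 0.5}
g = lambda n: {n % 2: 0.75, 2: 0.25}
d = {0: 0.25, 1: 0.75}

# Lifting the unit gives the identity.
law_unit_lift = close(lift(unit)(d), d)
# Lifting f and precomposing with the unit gives f back.
law_lift_unit = close(lift(f)(unit(3)), f(3))
# Lifts compose: lift(g) after lift(f) equals the lift of (lift(g) after f).
law_assoc = close(lift(g)(lift(f)(d)), lift(lambda a: lift(g)(f(a)))(d))
```

Bind is the same operation with its arguments flipped: `bind(m, f)` would be `lift(f)(m)`.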

### States.

We work over a *concrete* Cartesian category $\mathcal{C}$: objects are sets equipped with structure (for example measurable spaces, tensor spaces, or plain sets), morphisms are structure-preserving maps, finite products exist, and the terminal object $1$ is the one-element set. Effectful maps are always written explicitly as maps $f\colon S \to \mathcal{M}\,T$ for the chosen strong monad $\mathcal{M}$. Because $\mathcal{C}$ is concrete and Cartesian, a *state* on $S$, formally a Kleisli point $1 \to \mathcal{M}\,S$, is the same thing as an *element* of $\mathcal{M}\,S$; we use this identification throughout and simply write $D \in \mathcal{M}\,S$.

###### Proposition A.1 (Pointwise evaluation).

For each $i \in \underline{B}$, evaluation $\mathrm{ev}_i\colon \mathcal{B}\mathcal{M}\,X \to \mathcal{M}\,X$, $m \mapsto m(i)$, is a monad morphism, and therefore commutes with the interpretation of do-notation programs: for a batch $s\colon \underline{B} \to S$,

$$\llbracket\varphi\rrbracket^{\mathcal{B}\mathcal{M}}(s)(i) \;=\; \llbracket\varphi\rrbracket^{\mathcal{M}}(s_i) \qquad (i \in \underline{B}).$$

One batched run thus yields all per-sample truth values; over the full sample these are the $N$ numbers that $\widehat{L}$ averages, and a mini-batch gives an unbiased estimate of $\widehat{L}$.
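The proposition can be illustrated with a small Python sketch in which a batched computation is modelled as a list of per-sample distributions and batched bind acts index by index. This list-based model, the names `batch_unit`, `batch_bind`, and `program`, the toy samples, and the reading of $\widehat{L}$ as the plain mean of the per-sample truth values are all assumptions for illustration; a tensor backend would vectorise the same idea rather than loop. The same `program` is written once and run under both monads.

```python
import math
from collections import defaultdict
from itertools import combinations

def unit(x):
    return {x: 1.0}

def bind(dist, f):
    out = defaultdict(float)
    for x, px in dist.items():
        for y, py in f(x).items():
            out[y] += px * py
    return dict(out)

def batch_unit(size):
    # Batched return: the same point distribution at every batch index.
    return lambda x: [unit(x) for _ in range(size)]

def batch_bind(batch, f):
    # (m >>= f)(i) = m(i) >>= (x -> f(x)(i)). This recomputes f(x) for every
    # index, so it is quadratic in the batch size: fine for a sketch only.
    return [bind(m_i, lambda x, i=i: f(x)[i]) for i, m_i in enumerate(batch)]

def program(bind_, unit_, a, b, label):
    # do { x <- a; y <- b; t <- label; return (x + y == t) }
    return bind_(a, lambda x: bind_(b, lambda y: bind_(label, lambda t: unit_(x + y == t))))

# Hypothetical samples: two digit distributions and the target sum.
samples = [
    ({0: 0.5, 1: 0.5}, {0: 0.25, 1: 0.75}, 1),
    ({0: 1.0}, {1: 1.0}, 1),
    ({0: 0.75, 1: 0.25}, {0: 0.5, 1: 0.5}, 2),
    ({1: 1.0}, {0: 0.5, 1: 0.5}, 0),
]
N = len(samples)

# Unbatched semantics, one run per sample.
single = [program(bind, unit, a, b, unit(t)).get(True, 0.0) for a, b, t in samples]

# Batched semantics, one run for all samples; entry i is evaluation at index i.
batched_run = program(batch_bind, batch_unit(N),
                      [a for a, _, _ in samples],
                      [b for _, b, _ in samples],
                      [unit(t) for _, _, t in samples])
batched = [m_i.get(True, 0.0) for m_i in batched_run]

# Averaging over all size-2 mini-batches recovers the full-sample mean.
full_mean = sum(batched) / N
minibatch_means = [sum(batched[i] for i in idx) / 2 for idx in combinations(range(N), 2)]
mean_of_minibatch_means = sum(minibatch_means) / len(minibatch_means)
```

Entry `i` of the batched run agrees with the unbatched run on sample `i`, and the average of the mini-batch means over all equally likely mini-batches equals the full-sample mean, which is what unbiasedness under uniform sampling amounts to here.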
