formal-verification

#formal-verification

Formalizing statistical learning theory in Lean 4 [R]

Reddit r/MachineLearning · 2d ago

FormalSLT is a Lean 4 library that formally proves finite-sample statistical learning theory results (ERM, VC bounds, Rademacher bounds, PAC-Bayes, etc.) with explicit assumptions and zero sorry statements, providing a machine-checked foundation for ML theory.

MANTRA: Synthesizing SMT-Validated Compliance Benchmarks for Tool-Using LLM Agents

arXiv cs.CL · 2d ago

The paper introduces MANTRA, a framework that automatically synthesizes SMT-validated compliance benchmarks for tool-using LLM agents from natural-language manuals. It demonstrates that this approach enables scalable and reliable evaluation of agent adherence to complex procedural rules.

LemmaScript: A Verification Toolchain for TypeScript via Dafny

Lobsters Hottest · 2026-04-22

LemmaScript is a new toolchain that compiles TypeScript to Dafny for formal verification without altering the runtime, demonstrated by proving a CVE fix in the Hono framework.

Types and Neural Networks

Hacker News Top · 2026-04-21

This article explores the theoretical and practical challenges of training LLMs to produce typed outputs natively, rather than relying on post-hoc typechecking, with a focus on formally typed languages like Idris, Lean, and Agda. It analyzes current ad-hoc approaches to enforcing types during inference and proposes rebuilding LLMs from the ground up to generate inherently typed outputs.

Improving LLM Code Reasoning via Semantic Equivalence Self-Play with Formal Verification

arXiv cs.CL · 2026-04-21

Researchers from the University of Edinburgh propose a self-play framework that uses Liquid Haskell for formal verification to train LLMs on semantic equivalence reasoning. They release the OpInstruct-HSx dataset (28k programs) and report accuracy gains of 13.3 pp on EquiBench.

Signal Shot: a project to verify the Signal protocol and its Rust implementation using Lean

Lobsters Hottest · 2026-04-21

Signal Shot is a major formal verification initiative to verify the Signal protocol and its Rust implementation using Lean. It combines advances in Rust-to-Lean translation (Aeneas), mathematical foundations (Mathlib/CSLib), automated tactics (grind/SymM), and AI-assisted formalization. The project is a significant test of whether Lean can scale from pure mathematics to deployed real-world software systems.

Verus is a tool for verifying the correctness of code written in Rust

Hacker News Top · 2026-04-20

Verus is a static verification tool for Rust that uses SMT solving to prove full functional correctness of low-level systems code without runtime checks.

Creusot 0.11.0: VerifyThis winner

Lobsters Hottest · 2026-04-20

Creusot 0.11.0 has been released, and the Creusot team won the VerifyThis 2026 program verification competition. The release includes minor features such as explicit binders for result variables and support for weak memory atomics; major features are still in development.

Discover and Prove: An Open-source Agentic Framework for Hard Mode Automated Theorem Proving in Lean 4

arXiv cs.CL · 2026-04-20

This paper introduces Discover and Prove (DAP), an open-source agentic framework for automated theorem proving in Lean 4 that tackles 'Hard Mode' problems, where the answer must be discovered independently before a formal proof is constructed. The work releases new Hard Mode benchmark variants and achieves state-of-the-art results, while revealing a significant gap between LLM answer accuracy (>80%) and formal prover success (<10%).
