code

#code

Beyond Document Grounding: Span-Level Hallucination Detection over Code, Tool Output, and Documents

arXiv cs.CL · 2d ago

This paper introduces a unified benchmark for span-level hallucination detection in RAG systems that extends beyond natural language to code, tool output, and structured documents, and presents a fine-tuned Qwen3.5-2B detector that outperforms existing methods on these new domains while remaining competitive on standard NLP benchmarks.

Kimi and GLM on frontier code

Reddit r/LocalLLaMA · 5d ago

Moonshot AI's Kimi and Zhipu AI's GLM have achieved notable results on frontier code benchmarks.

@adithya_s_k: You can now train on 350+ RL Environments from OpenReward with TRL with just a few lines of code

X AI KOLs Following · 2026-06-17

OpenReward and TRL now support training on over 350 reinforcement learning environments with minimal code.

Your Teacher Can't Help You Here: Combating Supervision Fidelity Decay in On-Policy Distillation

arXiv cs.CL · 2026-06-01

This paper identifies Supervision Fidelity Decay (SFD) in on-policy distillation, where teacher supervision degrades as student sequences lengthen, and proposes Lookahead Group Reward (LGR) to mitigate it, improving performance on math and code benchmarks.

@dongxi_nlp: I saw discussions about whether to use Python for building Agents. Go check out Shunyu Yao's ReAct source code – just a few notebooks. I remember running those simple lines of code and collapsing into my chair; it was one of the rare experiences in life. No exaggeration, these note…

X AI KOLs Timeline · 2026-05-26

A tweet on whether to use Python for writing AI agents. It cites Shunyu Yao's ReAct source code (just a few Jupyter notebooks), claims these notebooks kicked off the agent era, and criticizes the attitude of looking down on specific programming languages.

@GregKamradt: "Code and math are taking off because they are easy to verify, the next frontier is domains that are hard to verify" Th…

X AI KOLs Timeline · 2026-05-21

Greg Kamradt proposes a 7-level spectrum of verification difficulty for AI, ranging from instantly verifiable domains like math and code to civilization-scale systems with slow, noisy feedback.

Framed

Product Hunt · 2026-05-21

Framed is a tool that turns screenshots, videos, and code into polished visuals, available on Product Hunt.

What Really Improves Mathematical Reasoning: Structured Reasoning Signals Beyond Pure Code

arXiv cs.AI · 2026-05-20

This paper challenges the belief that code improves reasoning in language models, finding through controlled pretraining experiments that code alone primarily enhances programming ability, while reasoning gains come from structured reasoning traces like code-text and math-text mixtures.

@leopardracer: RENAISSANCE TECHNOLOGIES EARNS 66% A YEAR. ZERO PUBLIC CODE. SOMEONE JUST LEAKED THEIR ENTIRE STRUCTURE ON GITHUB funda…

X AI KOLs Timeline · 2026-05-18

A tweet reports that Renaissance Technologies' entire trading structure has been leaked on GitHub under an Apache 2.0 license, amassing 76,800 stars.

@haider1: Yann LeCun says LLMs are strongest in domains where language itself is the substrate of reasoning, like math and code T…

X AI KOLs Following · 2026-05-15

Yann LeCun states that LLMs are strongest in domains where language is the substrate of reasoning, like math and code, but they are not creative mathematicians, software architects, or computer scientists.

@YoheiNishitsuji: #TweetGLSL float i,e,R,s;vec3 q,p,d=vec3((FC.xy-.5*r)/r.y,.6);for(q.z--;i++<97.;i>86.){o.rgb+=hsv(.08,-e,e/5e1)+.003;p=q…

X AI KOLs Following · 2026-05-08

A Twitter post by @YoheiNishitsuji sharing a compact GLSL shader program (a fractal/raymarching implementation).

A sufficiently comprehensive spec is not (necessarily) code

Hillel Wayne — Computer Things · 2026-04-15

This article argues that a comprehensive specification is not equivalent to code, because a spec defines a set of possible implementations while code is one concrete instance. It discusses the role of abstraction and why programmers are still needed to write specs even with automated code generation.
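
As a rough illustration of the spec-versus-code distinction (a sketch that is not taken from the article; every function name here is hypothetical), a spec can be written as a predicate that several different implementations satisfy, while each implementation is just one concrete choice:

```python
from typing import Callable, List


def satisfies_sort_spec(sort_fn: Callable[[List[int]], List[int]], xs: List[int]) -> bool:
    """Hypothetical spec: the output is ordered and is a permutation of the input."""
    out = sort_fn(list(xs))
    is_ordered = all(a <= b for a, b in zip(out, out[1:]))
    is_permutation = sorted(out) == sorted(xs)
    return is_ordered and is_permutation


def insertion_sort(xs: List[int]) -> List[int]:
    """One concrete implementation that meets the spec."""
    result: List[int] = []
    for x in xs:
        i = len(result)
        # Walk left until the element before position i is not larger than x.
        while i > 0 and result[i - 1] > x:
            i -= 1
        result.insert(i, x)
    return result


def builtin_sort(xs: List[int]) -> List[int]:
    """A second, different implementation that meets the same spec."""
    return sorted(xs)


def broken_sort(xs: List[int]) -> List[int]:
    """Returns an ordered list but drops elements, so it violates the spec."""
    return xs[:1]
```

Both `insertion_sort` and `builtin_sort` pass `satisfies_sort_spec`, while `broken_sort` fails it. The predicate says nothing about which algorithm to use, which is the sense in which a spec describes a set of implementations.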
