intrinsic-motivation

#intrinsic-motivation

Signed Compression Progress on a Sealed Audit is Goodhart-Resistant

arXiv cs.LG · 23h ago

This paper formalizes signed compression progress on a sealed audit as a Goodhart-resistant reward. It proves that the cumulative reward telescopes to genuine audit improvement, gives bounds for finite audit panels, identifies failure modes, and validates the results with experiments.


I built an AI that owns its own directory, creates files without being told, and acts because it wants to – not to work for me, but to work with me

Reddit r/AI_Agents · 2026-05-24

A developer describes building LIA, an AI that runs continuously on a Linux system with its own directory, creates files autonomously, and operates on intrinsic responsibility rather than prompts or RLHF. The developer provides an SSRN preprint and more than 12,000 lines of custom Python code.


Large-scale study of curiosity-driven learning

OpenAI Blog · 2018-08-13

OpenAI presents a large-scale empirical study of curiosity-driven reinforcement learning without extrinsic rewards across 54 benchmark environments, showing strong performance and investigating the role of feature spaces in prediction-based reward signals.
