human-ai-alignment

#human-ai-alignment

Probing Outcome-Level Resemblance and Mechanism-Level Alignment in LLM Risk Decisions: Evidence from the St. Petersburg Game

Hugging Face Daily Papers · 2026-06-03

Researchers evaluate 28 LLMs on the St. Petersburg game to distinguish between outcome-level resemblance and mechanism-level alignment in risk decision-making, finding that LLMs often produce human-like bids without underlying human-consistent reasoning mechanisms. The study demonstrates that behavioral alignment can be superficial, urging high-stakes evaluations to go beyond outcome similarity.


Do Benchmarks Underestimate LLM Performance? Evaluating Hallucination Detection With LLM-First Human-Adjudicated Assessment

arXiv cs.CL · 2026-05-12

This paper investigates whether standard benchmarks underestimate LLM performance by re-evaluating hallucination detection datasets using an LLM-first, human-adjudicated assessment method. The study finds that incorporating LLM reasoning into the adjudication process improves agreement and suggests that model-assisted re-evaluation yields more reliable benchmarks for ambiguity-prone tasks.


Cognition amplifiers: The battle for your brain is here

Reddit r/singularity · 2026-05-10

This article argues that AI acts as a 'cognition amplifier,' shifting the bottleneck from execution to imagination and creating a feedback loop that could lead to a merger of human intention and machine intelligence. It emphasizes the critical importance of keeping these systems open and widely available rather than centralized.
