methodology

Tag

Cards List
#methodology

METR evaluated an early version of Claude Mythos

Reddit r/singularity · 12h ago

METR evaluated an early version of Claude Mythos Preview in March 2026 using its time-horizons task suite and estimated a 50% time horizon of at least 16 hours. This places the model at the upper end of what current benchmarks can measure, with caveats about stability at longer time ranges.

#methodology

@jaynitx: https://x.com/jaynitx/status/2052734499319091384

X AI KOLs Timeline · yesterday

A personal reflection on first-principles thinking versus reasoning by analogy, drawing on Elon Musk's approach to reducing rocket costs at SpaceX and on the author's own startup failure.

#methodology

Advancing red teaming with people and AI

OpenAI Blog · 2024-11-21

OpenAI publishes a white paper detailing its approach to external red teaming for AI models. The paper outlines methods for selecting diverse red team members, determining model access levels, providing testing infrastructure, and synthesizing feedback to improve AI safety and policy coverage.
