Anthropic launched Claude Science, a flagship product for scientific research that can autonomously carry out tasks in computational biology and drug development, signaling a major bet on AI for science.
This perspective paper develops a conceptual and methodological framework for evaluating evidence-licensed claims in AI-assisted research, emphasizing calibration as a mechanism for managing scientific assertion rights and distinguishing between different AI research routes.
Anthropic launched Claude Science, a new flagship product for autonomous scientific research in computational biology and drug development, available to all paid Claude subscribers.
This piece discusses the challenge of verifying AI-generated hypotheses in scientific discovery where no ground truth exists, and presents Apodex's multi-agent approach, which uses independent verifier agents, as a solution.
This position paper argues that a scientific understanding of AI must go beyond post-hoc analysis and instead study the training dynamics that shape model behavior, with implications for predicting that behavior, intervening on it, and designing training procedures for desired properties such as capabilities and safety.
MIT Technology Review's newsletter covers three major stories: Anthropic's Code with Claude event showing developers increasingly shipping AI-written code without review, the upcoming Enhanced Games for athletes using performance-enhancing drugs, and Google I/O's shift towards agentic AI for science with Gemini for Science.
The article discusses how current AI systems can assist parts of the scientific workflow and may accelerate incremental discovery in data-rich fields, yet remain limited by their dependence on existing literature and human-defined objectives, which risks epistemic homogenization.
The Google I/O keynote highlighted a shift in AI-driven science, contrasting specialized tools like WeatherNext with the rise of agentic AI systems that can conduct research autonomously, signaling a realignment of resources and enthusiasm.
DeepMind's Co-Scientist AI tool bridges the expertise of two researchers from different biological fields to accelerate ALS research by generating testable hypotheses and identifying RNA-based mechanisms for potential therapies.
Tim O'Reilly discusses the challenges of integrating AI into scientific publishing, including hallucinated citations, propagation of retracted papers, and training on compromised literature, and calls for adapting existing scientific infrastructure for AI use.
Google DeepMind announces a strategic partnership with the Republic of Korea's Ministry of Science and ICT to support national AI strategy and scientific breakthroughs. The collaboration includes establishing an AI Campus in Seoul to provide access to advanced models like AlphaFold and AlphaGenome for local research institutions.
MIT hosted a 2025 workshop on the future of AI and mathematical/physical sciences, bringing together leading researchers to explore how these domains can advance each other. The resulting white paper emphasizes that AI and science should have a two-way relationship, with science informing AI development and AI improving scientific discovery.
OpenAI introduces FrontierScience, a new benchmark for measuring expert-level AI scientific capabilities across physics, chemistry, and biology, with GPT-5.2 achieving 77% on olympiad-style tasks and 25% on research-style tasks. The paper presents early evidence that GPT-5 meaningfully accelerates real scientific workflows, shortening work from weeks to hours while establishing metrics for tracking progress toward AI-accelerated science.
OpenAI organized a "1,000 Scientist AI Jam Session" across nine U.S. Department of Energy national labs, bringing together over 1,000 scientists to test advanced AI models like o3-mini for accelerating scientific discovery. The event demonstrates a major public-private collaboration aimed at strengthening U.S. AI leadership while addressing research challenges in materials science, renewable energy, and astrophysics.