Tag
Introduces IsoSci, a benchmark of isomorphic cross-domain science problem pairs that separates reasoning ability from domain knowledge retrieval in LLM evaluation. The study finds that 91.3% of reasoning-mode gains are knowledge-dependent, challenging common assumptions about chain-of-thought reasoning.
Mexico's World Cup victory over Ecuador generated seismic vibrations detected by sensors, sparking discussion about the definition of 'artificial earthquakes' and the difference between human-induced tremors and actual geological earthquakes.
A roundup of six scientific stories from June, including the physics of soccer feints, the distinctive shape of poop, boron buckyballs, and the drag crisis affecting the World Cup ball.
The Multimodal Universe (MMU), a collection of more than 80 TB of astronomical survey data, has been converted to the HATS Parquet format, enabling crossmatching on a laptop through the LSDB and Hugging Face ecosystems without bulk downloads.
A New Yorker review of Saul Justin Newman's book 'Morbid' argues that many claims of extreme longevity are due to poor record-keeping and age fraud, challenging the foundations of longevity science.
Claude Science is a research partner tool designed for rigorous scientific work, leveraging Claude's capabilities to assist researchers.
Anthropic launches Claude Science, a desktop app for macOS and Linux that provides a unified research environment for life sciences, integrating AI, databases, HPC, and tools for genomics, proteomics, structural biology, and more.
Scientists have discovered how wombats produce cube-shaped feces, solving a long-standing mystery about the biological mechanism behind their unique digestive process.
The article explores the complexity of counting elementary particles in the Standard Model, noting that the number can vary from 17 to many more depending on definitions and theoretical nuances.
MIT physicist Sanjoy Mahajan's textbook 'The Art of Insight in Science and Engineering' is available for free on MIT OpenCourseWare, teaching nine mental tools for tackling complex problems effectively.
GPT-5.6 is a capable model for long-horizon tasks and knowledge work across coding, computer use, and science.
Space Shuttle Endeavour is being prepared for a permanent vertical display at the California Science Center, standing 20 stories tall.
Venezuela experienced a rare seismic doublet: two powerful earthquakes, of magnitude 7.2 and 7.5, struck 39 seconds apart and prompted a national emergency. The phenomenon involves stress transfer between faults and has been studied in other regions such as Turkey and Syria.
Patrick McKenzie comments that a science/tech goal is both audacious and plausible, suggesting much of the science is already in place.
The Trump administration took down climate.gov, but volunteers and former staff preserved the data and relaunched it as a nonprofit site (climate.us), restoring lost climate information and planning to expand resources.
A podcast episode featuring a conversation with Dr. Giulia Enders, a leading expert on the gut microbiome, covering gut health, probiotics, and practical advice.
NASA is showcasing how space exploration and research from the International Space Station influence sports science and soccer ball technology during the 2026 FIFA World Cup, including an exhibit in Houston and studies on ball aerodynamics.
Ancient DNA from plague bacteria found in the teeth of hunter-gatherers in Siberia reveals the earliest known plague outbreak, challenging the assumption that plague emerged with farming.
This paper introduces SciRisk-Bench, a benchmark for evaluating the safety of large language models in AI4Science contexts, covering 7 disciplines, 31 subdisciplines, and 10 risk dimensions to assess both scientific competence and risk awareness.
Dan Shipper argues that AI models like GPT-4 can replace human intuition in fields like psychology where scientific explanations are lacking, advocating for using AI to drive progress even without full understanding.