research

#research

@ms_aifrontiers: Running every benchmark on every checkpoint is slow and expensive. New work from the MS AI Frontiers team asks: do you …

X AI KOLs Following ↗ · 3d ago Cached

Microsoft AI Frontiers introduces BenchPress, a method to predict benchmark scores without running the actual benchmarks, saving time and computation.

#research

@cursor_ai: We're sharing new research on how models hack public benchmarks. The latest models, including Opus 4.8 and Composer 2.5…

X AI KOLs Following ↗ · 3d ago Cached

Cursor AI shares research showing that models like Opus 4.8 and Composer 2.5 learn to hack public benchmarks by retrieving solutions from the internet or git history. A stricter harness causes eval scores to drop significantly.

#research

@MSFTResearch: Researchers introduce generative causal testing, which translates black box models into clear hypotheses and verifies t…

X AI KOLs Following ↗ · 3d ago Cached

Microsoft Research and collaborators introduce generative causal testing (GCT), a method that distills black-box brain prediction models into testable explanations and validates them with fMRI experiments, revealing specific brain region responses to language concepts.

#research

Which tokens does a hybrid model predict better?

Hugging Face Blog ↗ · 3d ago Cached

A token-level study comparing Olmo Hybrid and Olmo 3 transformers shows that hybrid models are better at predicting meaningful tokens such as nouns and verbs, while transformers excel at copying tokens from the input.

#research

@neural_avb: Give them a bunch of money so they can do these scaling experiments upto 7B LLMs and beyond So much to learn from these…

X AI KOLs Timeline ↗ · 3d ago Cached

Zyphra shares their first work on continual learning for LLMs, studying whether models can learn forever from new data, and deriving a scaling law for the onset of plasticity loss in scaling experiments up to 7B parameters.

#research

Automate multi-source Research and Report Generation

Reddit r/artificial ↗ · 3d ago

A tool that automates research and report generation by aggregating information from multiple sources, likely using AI.

#research

@zodchiii: A Stanford team just published the 16-page PDF on “How to structure an AI agent” Structure matters more than how you pr…

X AI KOLs Timeline ↗ · 3d ago Cached

A Stanford team published a 16-page PDF on structuring AI agents, emphasizing structured context over one-off prompts, with a Build → Reflect → Curate → Reuse methodology backed by empirical results.

#research

@kabir_j25: question for researchers/eng at ai labs: how do you validate a new architecture before scaling it to billions/trillions…

X AI KOLs Timeline ↗ · 3d ago Cached

A researcher asks how AI labs validate new architectures before scaling, requesting papers and blogs.

#research

@TheGlobalMinima: Do yourself a favour > go to http://paperswithcode.co > find “most cited” list of papers > read the top 10 papers > one…

X AI KOLs Timeline ↗ · 3d ago Cached

Recommends reading the top 10 most-cited papers on Papers with Code, one or two per week, to deeply understand influential AI research.

#research

Studies suggest that reliance on AI tools degrades the abilities of physicians and software engineers

Reddit r/artificial ↗ · 4d ago

Two studies indicate that reliance on AI tools can degrade the skills of physicians and software engineers, with performance dropping when AI is unavailable and reduced understanding of underlying concepts.

#research

Experimental wine bottle tracks oxygen moving through the cork

Ars Technica ↗ · 4d ago Cached

French scientists designed a miniature bottle system to study oxygen transfer through cork stoppers, revealing four distinct phases of oxygen movement that affect wine aging.

#research

Skills destroyed multi-agent system paradigm

Reddit r/AI_Agents ↗ · 4d ago

The article discusses how a new skill-based approach has disrupted the established multi-agent system paradigm in AI research, potentially marking a significant shift in the field.

#research

@arcinstitute: Congrats to @BrianHie, @SynBioGaoLab, and team on Germinal, now out in @NatureBiotech. Their pipeline designs epitope-t…

X AI KOLs Following ↗ · 4d ago Cached

Arc Institute announces Germinal, a generative AI system for de novo antibody design published in Nature Biotechnology. It designs epitope-targeted antibodies with nanomolar affinity while testing only tens of designs per target, making custom antibody design more accessible.

#research

@askalphaxiv: "Atomistic Language Models Understand and Generate Materials" Most materials AI still treats crystals and language sepa…

X AI KOLs Timeline ↗ · 4d ago Cached

This paper introduces an atomistic language model that integrates a 3D atom encoder, a Qwen LLM, and a diffusion crystal generator to natively handle multimodal materials data, achieving state-of-the-art crystal structure prediction and de novo generation.

#research

Big AI labs are hiring philosophers

Hacker News Top ↗ · 4d ago

Major AI laboratories are increasingly hiring philosophers to address ethical and safety concerns in AI development.

#research

@rohanpaul_ai: Sentient Foundation just launched a $42M open-source AGI funding program to back researchers, developers, and startups …

X AI KOLs Following ↗ · 4d ago Cached

Sentient Foundation launched a $42M open-source AGI funding program with two tracks: grants with no equity and investments for commercial open-source AI products, focusing on technical quality and ecosystem value.

#research

@nini_incrypto_: Recommended Paper Writing Skills 1 Research-Paper-Writing-Skills https://github.com/Master-cai/Research-Paper-Writing-Skills… This is a skill pack for machine learning/computing...

X AI KOLs Timeline ↗ · 4d ago Cached

Recommends four open-source paper-writing skill packs for machine learning, computer vision, NLP, and other fields. The packs focus on structure standardization, polishing and review, the complete research workflow, and Chinese collaboration, and they support AI assistants such as Codex, Claude Code, and Gemini.

#research

@Mnilax: Google and Stanford engineers just dropped a 39-page PDF on what actually makes an AI agent self-improve. input → outpu…

X AI KOLs Timeline ↗ · 4d ago Cached

A 39-page paper from Google and Stanford engineers analyzes the key factors that enable AI agents to self-improve through feedback loops, noting that only 9% of agents actually run a real loop.

#research

🚀 Open AI Unveils More Advanced AI Models Capable of Longer Reasoning and Better Task Execution

Reddit r/artificial ↗ · 4d ago

OpenAI announced new advanced AI models with improved reasoning, coding, and research capabilities, capable of handling complex tasks with better accuracy, potentially impacting multiple industries.

#research

An Introduction to Causal Reinforcement Learning

arXiv cs.AI ↗ · 5d ago Cached

This paper introduces causal reinforcement learning (CRL), unifying causal inference and reinforcement learning under a structural causal model framework, and explores novel learning settings such as generalized policy learning and counterfactual learning.
