Tag
Bellwethr is developing an open methodology for tracking the real USD cost of a single inference token from capable models. A draft benchmark suite and community contributions are underway.
This paper surveys the capabilities and limitations of AI across the full research lifecycle, from idea generation to dissemination, identifying a sharp boundary between reliable assistance and unreliable autonomy. It provides a taxonomy, benchmark suite, tool inventory, and design principles for human-governed AI collaboration in research.