Tag
This paper proposes that reliability in AI-assisted social science research depends on decision architecture, that is, how cognitive labor is divided between humans and machines. Through a pre-specified factorial experiment, the authors show that an unconstrained multi-agent baseline fails in 72% of runs, while a system organized around three architectural commitments (LLMs restricted to reasoning, deterministic data and estimation, and three human decision gates) fails in only 16%.
This paper analyzes 35,361 GitHub code comments referencing AI use to develop a taxonomy of AI-assisted development activities. It finds that developers primarily use LLMs for code implementation and enhancement, followed by human refactoring and bug fixes, and that usage shifts over time toward conceptual support rather than direct code generation.
A discussion of the ethical implications of fully autonomous AI agents, focusing on accountability, decision-making, privacy, and human oversight.
This paper surveys the capabilities and limitations of AI across the full research lifecycle, from idea generation to dissemination, identifying a sharp boundary between reliable assistance and unreliable autonomy. It provides a taxonomy, benchmark suite, tool inventory, and design principles for human-governed AI collaboration in research.
The article discusses how companies can integrate EU AI Act compliance into their product development from the design phase, highlighting transparency, guardrails, and human oversight as key architectural changes.
This paper introduces the Functional Intentionality Test (FIT) and FIT-Eval framework to quantify the degree of intentional-like behavior in agentic AI systems for governance and accountability purposes.