tool-calling

#tool-calling

TextGen is now a native desktop app. Open-source alternative to LM Studio (formerly text-generation-webui).

Reddit r/LocalLLaMA · 3h ago

TextGen (formerly text-generation-webui) is now a native, no-install desktop application for Windows, Linux, and macOS. It is positioned as an open-source alternative to LM Studio, offering enhanced privacy, ik_llama.cpp support, and native tool calling.

A 26M tool-router suggests tool calling should be split from reasoning

Reddit r/AI_Agents · 14h ago

The post introduces Needle, a 26M-parameter model from Cactus-Compute designed for single-shot tool calling. It argues that tool routing is a structured prediction task that should be separated from reasoning, improving agent efficiency and reducing latency.

Do Agents Need to Plan Step-by-Step? Rethinking Planning Horizon in Data-Centric Tool Calling

arXiv cs.CL · yesterday

This paper argues that full-horizon planning with lazy replanning is more efficient than step-by-step execution for data-centric LLM agent tasks, using fewer tokens while maintaining accuracy.

Switchcraft: AI Model Router for Agentic Tool Calling

arXiv cs.AI · 2d ago

This paper introduces Switchcraft, which the authors describe as the first AI model router specifically optimized for agentic tool calling, with the goal of reducing inference costs. Using a lightweight DistilBERT classifier, it achieves significant cost savings while maintaining high accuracy on tool-use tasks.

MIST: Multimodal Interactive Speech-based Tool-calling Conversational Assistants for Smart Homes

arXiv cs.CL · 2d ago

The paper introduces MIST, a synthetic dataset and framework for training multimodal voice assistants to control IoT devices in smart homes. It highlights significant performance gaps between open and closed-weight models in handling complex, speech-based tool-calling tasks.

@omarsar0: Cool paper from Apple. Most evaluation of tool-calling agents happens after the trajectory is over. By then the wrong c…

X AI KOLs Timeline · 2d ago

This Apple research paper introduces "Reinforced Agent," a method that moves evaluation into the execution loop: a specialized reviewer agent corrects tool-calling errors in real time. It demonstrates significant accuracy improvements on benchmarks such as BFCL and τ²-Bench without retraining the base agent.

@RhysSullivan: I'm now building Executor full time as a startup! The state of tool calling is a mess: - Everyone is using different ag…

X AI KOLs Timeline · 4d ago

Rhys Sullivan is building Executor, an open-source integration layer for AI agents that provides a unified tool catalog with access controls, approval flows for destructive actions, and support for MCP, OpenAPI, GraphQL, and more. It aims to standardize tool calling across different agents like Cursor and Claude Code.

BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

arXiv cs.CL · 5d ago

BioTool introduces a comprehensive biomedical tool-calling dataset with 34 tools and 7,040 human-verified query-API pairs, enabling fine-tuned LLMs to outperform GPT-5.1 on biomedical tool use and significantly enhance answer quality.

@codewithimanshu: Stanford professor just gave away the entire foundation of how AI Agents & automation actually works. 1-hour lecture. T…

X AI KOLs Timeline · 2026-04-22

A Stanford professor released a free 1-hour lecture covering the fundamentals of AI agents: tool calling, multi-step workflows, planning, and reflection.

ibm-granite/granite-4.1-8b · Hugging Face

Reddit r/LocalLLaMA · 2026-04-21

IBM releases Granite-4.1-8B, an Apache 2.0-licensed, 8B-parameter, long-context instruct model with enhanced tool-calling and multilingual support.

@KKaWSB: Moonshot just open-sourced Kimi K2.6—4,000 tool calls in one 12-hour session, 300 sub-agents in parallel building a full codebase. SOTA on SWE-Bench Pro, BrowseComp, HLE and more, ties Claude Opus 4.6 and G…

X AI KOLs Timeline · 2026-04-20

Moonshot has open-sourced the Kimi K2.6 model. The post reports 4,000 tool calls in a single 12-hour session and 300 parallel sub-agents, cites SOTA results on benchmarks such as SWE-Bench Pro, BrowseComp, and HLE, and claims performance on par with Claude Opus 4.6 and GPT-5.4.

PolicyBank: Evolving Policy Understanding for LLM Agents

arXiv cs.CL · 2026-04-20

PolicyBank proposes a memory mechanism that enables LLM agents to autonomously refine their understanding of organizational policies through iterative interaction and corrective feedback, closing specification gaps that cause systematic behavioral divergence from true requirements. The work introduces a systematic testbed and demonstrates PolicyBank can close up to 82% of policy-gap alignment failures, significantly outperforming existing memory mechanisms.

New tools and features in the Responses API

OpenAI Blog · 2025-05-21

OpenAI announced new tools and features for the Responses API, including support for remote Model Context Protocol (MCP) servers, image generation, Code Interpreter, and improved file search capabilities. The update also enables o3 and o4-mini models to call tools directly within their chain-of-thought, with new enterprise features like background mode and encrypted reasoning items.
