Tag
Cognition introduces Devin Fusion, an adaptive model router that reduces cost by 35% while maintaining frontier-level intelligence for agentic coding tasks.
The paper proposes DIF, a model-agnostic method for denoising implicit feedback in cold-start recommendation by using pseudo-labels from content-similar warm items and uncertainty estimation, achieving significant improvements in a billion-user video app.
Signature filtering is a detection-time module that improves statistical watermark detection in LLMs by learning and removing 'signature' tokens that make watermark tests unreliable, achieving large gains in detection rates while keeping false positives low.
Introduces P²CE, a model-agnostic algorithm that generates plausible Pareto-optimal counterfactual explanations, balancing feasibility, plausibility, and computational efficiency by using an isolation forest outlier detector and SHAP values.
Former Datadog engineers launch Niteshift, an AI coding cloud that routes between models to reduce lock-in, raising a $7M seed round led by Greylock.
ExpGraph is a model-agnostic framework that enables LLM agents to reuse past experiences via a self-evolving graph of skills and failures, improving task performance by 12–21% without retraining the executor.
This paper proposes CR4T, a model-agnostic safeguarding framework that rewrites unsafe or refusal-style LLM outputs into developmentally appropriate, guidance-oriented responses for adolescents, offering a more human-centered alternative to traditional refusal-centric guardrails.
INSIGHTS is a model-agnostic approach for providing global explanations of time-series models by generating diverse, informative sample summaries that capture domain-specific behaviors, outperforming local attribution methods in user studies.
This paper introduces COSMOS, a model-agnostic personalized federated learning framework that uses clustered server models and pseudo-label-only communication. It provides theoretical analysis showing exponential personalization risk contraction and demonstrates superior performance over existing baselines in heterogeneous environments.