arXiv

Articles from arXiv

Modularity-Free Conflict-Averse Training for Generalized PINNs

arXiv cs.AI · 3d ago

This paper identifies a capacity-induced failure mode in physics-informed neural networks (PINNs): overparameterized networks develop functional modularity that hinders convergence. It proposes Modular-Sparsity Synchronization (ModSync), a framework that penalizes task-exclusive connections to maintain cross-objective interaction, and reports state-of-the-art accuracy.

BIM-Edit: Benchmarking Large Language Models for IFC-Based Building Information Modeling

arXiv cs.AI · 3d ago

BIM-Edit is a benchmark for evaluating LLMs on natural-language editing of Building Information Models (BIM) in IFC format. Results show a substantial gap: the best model achieves an average score of only 49.5% across geometric, semantic, and topological metrics.

RACL: Reasoning-Agent Control Layers for Continuous Metaheuristic Learning

arXiv cs.AI · 3d ago

Introduces RACL, a reasoning-agent control layer that improves metaheuristic optimization by learning to control internal search behavior from operational memory, showing cost improvements in vehicle routing tests.

Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring

arXiv cs.AI · 3d ago

This paper proposes an adaptive, subject-aware prompt routing framework for LLM-based high-school tutoring, using 14 pedagogical features to switch strategies. A/B testing with 359 students shows improved efficiency and conversion rates over static baselines.

ScaffoldAgent: Utility-Guided Dynamic Outline Optimization for Open-Ended Deep Research

arXiv cs.AI · 3d ago

ScaffoldAgent introduces a utility-guided dynamic outline optimization framework for open-ended deep research, using expansion, contraction, and revision operations to improve long-form report generation and factual grounding.

Multi-Head Attention-Based Feature Extractor Integration with Soft Actor-Critic for Porosity Prediction and Process Parameter Optimization in Additive Manufacturing

arXiv cs.AI · 3d ago

This paper proposes a novel architecture integrating multi-head attention with the Soft Actor-Critic algorithm for porosity prediction and process parameter optimization in additive manufacturing, achieving faster convergence and higher rewards than standard RL methods.

Residual-Space Evolutionary Optimization via Flow-based Generative Models

arXiv cs.AI · 3d ago

Introduces a framework combining flow-based generative editing with evolutionary algorithms to perform optimization in residual space, enabling controllable data editing with non-differentiable objectives. Validated on MorphoMNIST and crystal data.

Process-Verified Reinforcement Learning for Theorem Proving via Lean

arXiv cs.AI · 3d ago

This paper presents Process-Verified Reinforcement Learning, using the Lean proof assistant as a process oracle to provide fine-grained tactic-level feedback during training, improving theorem proving performance.

Autonomous Event-Driven Multi-Agent Orchestration for Enterprise AI at Scale

arXiv cs.AI · 3d ago

This paper evaluates multi-agent orchestration architectures (DAG Plan and Execute, ReAct) at enterprise scale and introduces a Task Manager for continuous event-driven operation, showing improvements in latency and correctness.

Reward as An Agent for Embodied World Models

arXiv cs.AI · 3d ago

This paper introduces Reward as an Agent and DynDiff-GRPO to address reward hacking and limited exploration in reinforcement learning for embodied world models, achieving significant accuracy gains.

Advancing DialNav through Automatic Embodied Dialog Augmentation

arXiv cs.AI · 3d ago

This paper proposes an automatic generation pipeline to create a large-scale training dataset (RAINbow) for DialNav, a dialog-based vision-and-language navigation task. Combined with dual-strategy training and a localization model, it achieves substantial gains over the baseline.

PhysDrift: Bridging the Embodiment Gap in Humanoid Co-Speech Motion Generation

arXiv cs.AI · 3d ago

This paper identifies an embodiment gap in humanoid co-speech motion generation caused by human-centric pipelines, and proposes PhysDrift, an embodiment-aware framework that directly predicts executable humanoid joint trajectories from speech, improving speech-motion alignment and physical plausibility.

The Tao of Agency: Autotelic AI, Embedded Agency and Dissolution of the Self

arXiv cs.AI · 3d ago

This paper explores autotelic AI, in which agents generate their own goals, and discusses the implications for intrinsic motivation, embeddedness, and the dissolution of the self boundary. It proposes a framework that extends to a quantum formulation, non-dual philosophy, and an LLM-based instantiation.

eCNNTO: A Highly Generalizable ConvNet for Accelerating Topology Optimization

arXiv cs.AI · 3d ago

This paper proposes eCNNTO, a CNN with residual connections to accelerate density-based topology optimization by predicting near-optimal densities from early iteration histories, achieving up to 97% reduction in iterations and strong generalization across different boundary conditions, geometries, and mesh resolutions.

Multi-Agent Transactive Memory

arXiv cs.AI · 3d ago

Proposes Multi-Agent Transactive Memory (MATM), a framework for population-level storage and retrieval of agent-generated trajectories to improve task performance and reduce interaction steps in interactive environments like ALFWorld and WebArena.

MetaResearcher: Scaling Deep Research via Self-Reflective Reinforcement Learning in Adversarial Virtual Environments

arXiv cs.AI · 3d ago

MetaResearcher proposes a framework for training deep research agents using self-reflective reinforcement learning in adversarial virtual environments, addressing limitations of static environments and fact-retrieval-only tasks.

A Systematic Evaluation of Black-Box Uncertainty Estimation Methods for Large Language Models

arXiv cs.AI · 3d ago

This paper presents a systematic review and benchmark of 24 black-box uncertainty estimation methods for large language models across 4 models and 4 dataset settings. It finds that no single method dominates, but that hybrid methods combining multiple uncertainty signals perform well.

TelcoAgent: A Scalable 5G Multi-KPM Forecasting With 3GPP-Grounded Explainability

arXiv cs.AI · 3d ago

TelcoAgent is a foundation model-based framework for scalable and explainable multi-KPM forecasting in 5G networks, using automated 3GPP knowledge graph construction and a time-series foundation model for zero-shot prediction.

Human-on-the-Loop Orchestration for AI-Assisted Legal Discovery

arXiv cs.AI · 3d ago

This paper proposes a human-on-the-loop orchestration framework for AI-assisted legal discovery, introducing a taxonomy of agentic failures and a four-layer verification architecture to reduce privilege-waiver risk.

CombEval: A Framework for Evaluating Combinatorial Counting in Large Language Models

arXiv cs.AI · 3d ago

CombEval is a dynamic benchmark for evaluating combinatorial counting in large language models, using typed specifications to generate problems with solver-verified answers. It tests 11 LLMs under direct and code-augmented settings and finds brittleness on ordered objects, indistinguishable elements, relative constraints, and nested dependencies.
