arxiv

Tag

Cards List
#arxiv

ASALT: Adaptive State Alignment for Lateral Transfer in Multi-agent Reinforcement Learning

arXiv cs.AI · 23h ago Cached

ASALT introduces adaptive state and observation-level adapters for lateral transfer in multi-agent reinforcement learning, enabling effective strategy transfer between domains with mismatched state-space dimensionalities and reducing negative transfer.


CALIBER: Calibrating Confidence Before and After Reasoning in Language Models

arXiv cs.CL · 23h ago Cached

The paper introduces CALIBER, a method for calibrating confidence in reasoning language models by eliciting confidence estimates both before and after reasoning, with supervision targets matched to the information state. It achieves significant reductions in Expected Calibration Error (up to 52.5%) and strong Brier scores and AUROC across multiple benchmarks.
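For readers unfamiliar with the two calibration metrics named in this summary, here is a minimal sketch of how Expected Calibration Error and the Brier score are commonly computed for binary correctness labels. The equal-width binning with 10 bins and the function names are assumptions chosen for illustration; the paper may use a different binning scheme or estimator.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: floats in [0, 1]; correct: 0/1 labels of the same length.
    Equal-width bins are an assumption made for this sketch.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, label in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, label))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(label for _, label in members) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


def brier_score(confidences, correct):
    """Mean squared difference between confidence and the 0/1 outcome."""
    squared = [(conf - label) ** 2 for conf, label in zip(confidences, correct)]
    return sum(squared) / len(squared)
```

Both metrics are zero for a model whose confidence is 1.0 on every correct answer and 0.0 on every wrong one, and lower is better in both cases.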


An Introduction to Causal Reinforcement Learning

arXiv cs.AI · 23h ago Cached

This paper introduces causal reinforcement learning (CRL), unifying causal inference and reinforcement learning under a structural causal model framework, and explores novel learning settings such as generalized policy learning and counterfactual learning.


@N8Programs: Excited to announce an arXiv note on an interesting mathematical symmetry I noticed... that connects the classic MLP to…

X AI KOLs Timeline · yesterday Cached

Announces an arXiv note on a mathematical symmetry that connects the classic MLP to the Gated MLP, going beyond empirical performance.


@Gracker_Gao: AI Papers: Strong AI Doesn't Write Code by Writing Code. Two recent arXiv papers reveal a counterintuitive finding: when encountering an unfamiliar programming language, GPT-5.4 and Claude Opus 4.6 don't directly write code in the target language. Instead, they write a Python program to generate the target code, then debug it locally. This "meta-…

X AI KOLs Timeline · yesterday Cached

Two recent arXiv papers found that GPT-5.4 and Claude Opus 4.6 employ a metaprogramming strategy when handling unfamiliar programming languages: they generate the target code with a Python program and debug it locally rather than writing the target language directly. According to the post, this strategy is key to distinguishing top-tier agents from average ones, and strategy sophistication matters more than model parameter scale.
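The generate-then-verify pattern described here can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the papers: the target language (Brainfuck, restricted to the `+`, `-`, and `.` instructions), the single-cell encoding, and the function names `generate_bf` and `run_bf_subset`.

```python
def generate_bf(text):
    """Emit a Brainfuck program that prints `text` using a single cell.

    Illustrative generator only; assumes every character code is below 256.
    """
    parts = []
    current = 0  # value the cell holds after the code emitted so far
    for char in text:
        target = ord(char)
        delta = target - current
        # Move the cell up or down to the next character code, then print it.
        parts.append(("+" if delta >= 0 else "-") * abs(delta))
        parts.append(".")
        current = target
    return "".join(parts)


def run_bf_subset(program):
    """Interpret the '+', '-', '.' subset so generated code can be checked locally."""
    cell = 0
    printed = []
    for op in program:
        if op == "+":
            cell = (cell + 1) % 256
        elif op == "-":
            cell = (cell - 1) % 256
        elif op == ".":
            printed.append(chr(cell))
        else:
            raise ValueError(f"unsupported instruction: {op!r}")
    return "".join(printed)


# Local check: the generated program must reproduce the requested text.
assert run_bf_subset(generate_bf("Hi")) == "Hi"
```

The point of the sketch is the division of labor: the logic lives in a familiar language, and the unfamiliar target code is produced mechanically and then checked by running it.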


Thermodynamic Measure Of Intelligence

Reddit r/singularity · 2d ago Cached

This paper proposes a thermodynamic measure of intelligence defined as 'rare-valid lift' and argues that recursive self-simulation is necessary and nearly sufficient for high thermodynamic intelligence, making intelligence measurable on a universal scale.


@QingQ77: A local-first academic paper management desktop application supporting paper discovery, management, and visualization from sources like arXiv. https://github.com/linxiv-dev/linXiv… A local paper management tool for researchers, all data stored locally, nothing uploaded to any external service…

X AI KOLs Timeline · 2d ago Cached

linXiv is a local-first academic paper management desktop application that supports paper discovery, management, and visualization from sources like arXiv, and integrates a SQLite database, AI annotation, Obsidian notes, and a paper network graph.


@akshay_pachaar: Turn any paper into running code. Just swap arxiv → autoarxiv in the paper url. That hands the paper to an AI agent fro…

X AI KOLs Following · 3d ago Cached

autoarxiv lets you turn any arXiv paper into running code by changing the paper URL to autoarxiv.org. An AI agent from alphaXiv reads the paper, clones the repo, sets up dependencies, and runs a minimal reproduction to verify claims, logging everything live.


Autonomous Event-Driven Multi-Agent Orchestration for Enterprise AI at Scale

arXiv cs.AI · 4d ago Cached

This paper evaluates multi-agent orchestration architectures (DAG Plan and Execute, ReAct) at enterprise scales and introduces a Task Manager for continuous event-driven operation, showing improvements in latency and correctness.


PhysDrift: Bridging the Embodiment Gap in Humanoid Co-Speech Motion Generation

arXiv cs.AI · 4d ago Cached

This paper identifies an embodiment gap in humanoid co-speech motion generation caused by human-centric pipelines, and proposes PhysDrift, an embodiment-aware framework that directly predicts executable humanoid joint trajectories from speech, improving speech-motion alignment and physical plausibility.


TelcoAgent: A Scalable 5G Multi-KPM Forecasting With 3GPP-Grounded Explainability

arXiv cs.AI · 4d ago Cached

TelcoAgent is a foundation model-based framework for scalable and explainable multi-KPM forecasting in 5G networks, using automated 3GPP knowledge graph construction and a time-series foundation model for zero-shot prediction.


Which Sections of a Research Paper Best Reveal Its Research Methods? Evidence from Library and Information Science

arXiv cs.CL · 6d ago Cached

This paper proposes a segment combination strategy for automatically classifying research methods in academic papers by partitioning full-text content. Experiments on an annotated corpus from Library and Information Science journals show that methodological information is unevenly distributed, with middle-to-late segments having higher discriminative power.


Learning Robust Pair Confidence for Multimodal Emotion-Cause Pair Extraction

arXiv cs.CL · 6d ago Cached

This paper introduces RPCL, a training-only framework for robust pair confidence learning in multimodal emotion-cause pair extraction, which improves discriminative separation of gold pairs from hard negatives and achieves significant gains in Pair F1 and AUPRC on three datasets.


VISUALSKILL: Multimodal Skills for Computer-Use Agents

arXiv cs.CL · 6d ago Cached

VisualSkill proposes a hierarchical multimodal skill library for computer-use agents that combines text and figures, achieving a 15.3 point absolute lift on CUA benchmarks over text-only baselines by retaining visual information for GUI interaction.


How Well Do Large Language Models Capture Human Personality?

arXiv cs.AI · 6d ago Cached

This paper systematically evaluates assumptions about LLM persona prompting and identifies 'persona manifold collapse,' where richer persona descriptions reduce behavioral diversity and simulation fidelity. The findings show that simple age-gender personas often outperform more detailed profiles.


QSignAI: Quantum-Randomness-Seeded Identity Signatures at the Intersection of AI for Science and Science for AI

arXiv cs.AI · 6d ago Cached

QSignAI is a production-deployed open-source platform that combines quantum randomness from a Toeplitz two-source extractor with an AI bot on Telegram to generate unique identity signatures, demonstrating a bidirectional relationship between artificial intelligence and quantum science.


Human-AI Coevolution Dynamics: A Formal Theory of Social Intelligence Emergence Through Long-Term Interaction

arXiv cs.AI · 6d ago Cached

Proposes the Human-AI Coevolution Dynamics Framework (HACD-H) as a formal model of human-AI interaction, integrating emotional adaptation, relational organization, social memory, and personality consistency. Results show social intelligence emerges from long-term social cognitive coevolution.


@nickscamara_: New discoveries are gonna come from models that can reason over the latest science The rate of scientific progress beco…

X AI KOLs Timeline · 2026-06-17 Cached

Firecrawl released a state-of-the-art research index for AI/ML papers, claiming 18% better recall on arXivQA than competitors, designed for autonomous research agents.


MM++: Unsupervised Scale-Invariant Multilayer OOD Detection via Top-K Gated Feature Fusion

arXiv cs.LG · 2026-06-17 Cached

MM++ is a fully unsupervised, post-hoc framework for out-of-distribution detection that fuses discriminative intermediate layers via top-K gated feature fusion and uses a regularized tied covariance matrix for scale-invariant distance estimation.


@rohanpaul_ai: This paper shows a strange weakness in AI reasoning: models can solve math, yet fail to judge reasoning. The unsettling…

X AI KOLs Following · 2026-06-16 Cached

This paper introduces the Valid-Answer-Invalid-Reasoning (VAIR) benchmark to expose the production-evaluation gap in AI reasoning models, where models can generate correct answers but fail to detect flawed reasoning, revealing answer confirmation bias.
