end-to-end

#end-to-end

Selective Capability Unlearning in End-to-End Spoken Language Understanding

arXiv cs.CL · 12h ago Cached

Proposes BindingSubspace (BSU), a representation-level framework that isolates and attenuates intent-conditioned directions in end-to-end spoken language understanding models. It targets capability persistence, in which a model with a suppressed intent can still generate that intent's slots under forced prefixes. The method reduces forced-prefix recoverability while preserving performance on retained intents across SLU benchmarks.

ORAgentBench: Can LLM Agents Solve Challenging Operations Research Tasks End to End?

arXiv cs.AI · 4d ago Cached

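This paper introduces ORAgentBench, an execution-based benchmark for evaluating LLM agents on end-to-end operations research tasks, comprising 107 human-reviewed tasks. Experiments show that the best current agent passes only 35.51% of the tasks, revealing a major gap in reliable decision-making.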

Speech-Driven End-to-End Language Discrimination towards Chinese Dialects

arXiv cs.CL · 6d ago Cached

This paper investigates speech-driven features for fine-grained discrimination among Chinese dialects, using an end-to-end model that combines MFCC-based features with word-level embeddings via a CNN, outperforming text-driven methods.

For the first time ever, 8 Codex-AutoResearch agents BRING LIFE TO A ROBOT FLEET achieving end-to-end success in solving a task in the physical world with NO HUMAN BRIDGE in between...SELF IMPROVING a part of Nvidia Gear Lab

Reddit r/singularity · 6d ago

Researchers at Nvidia Gear Lab achieved a milestone where 8 Codex-AutoResearch agents autonomously controlled a robot fleet to solve a physical world task without human intervention, demonstrating self-improvement.

Towards End-to-End Automation of AI Research

arXiv cs.AI · 2026-06-16 Cached

A paper presenting The AI Scientist, a system that automates the entire research lifecycle from idea generation to peer review, demonstrating AI's growing capacity for scientific contribution.

Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack

Hugging Face Daily Papers · 2026-06-12 Cached

HyVLA-0.5 is an end-to-end robotic learning system that integrates data collection, model design, pre-training, fine-tuning, and reinforcement learning for real-world deployment.

zai-org/SCAIL-2 · Hugging Face

Reddit r/LocalLLaMA · 2026-06-09 Cached

SCAIL-2 is an open-source model for end-to-end controlled character animation that animates a reference character with a driving video, supporting character replacement and multi-character scenarios without intermediate pose representations.

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

Hugging Face Daily Papers · 2026-06-09 Cached

SCAIL-2 is a framework that achieves end-to-end controlled character animation by directly transferring motion from driving videos without intermediate representations, using unified task decomposition, synthetic data (MotionPair-60K), and novel conditioning techniques like in-context mask conditioning and Bias-Aware DPO.

LLMBridge: An LLM Pipeline for End-to-end Referential Bridging Resolution in English

arXiv cs.CL · 2026-05-29 Cached

LLMBridge introduces an LLM-based pipeline for end-to-end referential bridging resolution, achieving state-of-the-art performance on three English datasets. The system combines heuristic pre/post-processing with LLM natural language inference.

LELA: An End-to-end LLM-based Entity Linking Framework with Zero-shot Domain Adaptation

arXiv cs.AI · 2026-05-27 Cached

LELA is an LLM-based entity linking framework that combines zero-shot NER and entity disambiguation into an end-to-end Python library, validated across diverse settings.

FormalASR: End-to-End Spoken Chinese to Formal Text

arXiv cs.CL · 2026-05-20 Cached

FormalASR presents two compact end-to-end models that directly transcribe spoken Chinese into formal written text, achieving significant error reduction and eliminating the need for a separate LLM post-processing stage, enabling lightweight on-device deployment.

RankE: End-to-End Post-Training for Discrete Text-to-Image Generation with Decoder Co-Evolution

Hugging Face Daily Papers · 2026-05-20 Cached

RankE introduces an end-to-end post-training framework for discrete text-to-image generation that jointly optimizes both the generator and decoder to address the latent covariate shift problem, improving alignment and fidelity simultaneously.

@itsolelehmann: I remember in 2023-24 having to duct-tape so many different tools together to get half-decent AI outputs GPT 4 for writ…

X AI KOLs Following · 2026-05-15 Cached

Reflecting on the fragmented AI tool landscape of 2023-24, the user highlights the arrival of Higgsfield AI's Supercomputer, a cloud-native AI agent that consolidates 40+ tools for end-to-end task execution.

MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

arXiv cs.AI · 2026-05-15 Cached

MetaAgent-X introduces an end-to-end reinforcement learning framework that jointly optimizes the design and execution of automatic multi-agent systems, overcoming the frozen-executor ceiling and achieving up to 21.7% gains over existing baselines.

A Calculus-Based Framework for Determining Vocabulary Size in End-to-End ASR

arXiv cs.CL · 2026-05-15 Cached

This paper presents a calculus-based framework that uses first- and second-derivative tests to estimate the optimal vocabulary size for end-to-end ASR systems, improving performance on the LibriSpeech corpus.

All-in-one AI platforms are quietly taking over end-to-end production. Thoughts?

Reddit r/artificial · 2026-05-14

Higgsfield is an all-in-one AI video platform handling character consistency, generation, audio, and distribution, contrasting with single-model specialists like Kling, Runway, and Veo. The discussion questions whether vertical integration or specialized quality will dominate AI video production.

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents

Hugging Face Daily Papers · 2026-05-13 Cached

EVA-Bench introduces a comprehensive end-to-end framework for evaluating voice agents, simulating realistic multi-turn conversations and measuring performance across voice-specific failure modes with novel accuracy (EVA-A) and experience (EVA-X) metrics. The benchmark includes 213 scenarios across enterprise domains and a perturbation suite for accent and noise robustness, revealing substantial gaps in current systems.

@wsl8297: When learning AI, the scariest part is getting stuck at "understanding the theory" and freezing when it's time to write code — not knowing where to start, and unable to find decent practice projects. I unearthed a practical treasure trove on GitHub: AI-Project-Gallery. It collects 30+ high-quality AI projects, covering classic topics like house price prediction and disease classification, as well as hot applications like Gemini chatbot and document generator...

X AI KOLs Timeline · 2026-05-12 Cached

This post shares a curated GitHub repository containing over 30 practical AI projects, covering domains from regression to generative AI, with many end-to-end examples, suitable for learners and developers.

@paulabartabajo_: Advice for AI engineers If you're building voice agents, stop wiring up 3 separate models, for audio-to-text, text-to-a…

X AI KOLs Timeline · 2026-05-08 Cached

Announces liquid-audio, an open-source repository for Liquid AI's end-to-end speech-to-speech LFM models (LFM2-Audio-1.5B and LFM2.5-Audio-1.5B) with interleaved and sequential generation modes and fine-tuning support.

zhen-nan/L2P

Hugging Face Models Trending · 2026-05-03 Cached

L2P proposes an efficient transfer paradigm that leverages pre-trained latent diffusion models to build pixel-space diffusion models, enabling high-quality generation with minimal computational overhead and data requirements, and supporting native 4K resolution.
