encoder-decoder

#encoder-decoder

Robust Explanations for User Trust in Enterprise NLP Systems

arXiv cs.CL · 2026-07-20

This paper proposes a unified black-box robustness evaluation framework for token-level explanations in enterprise NLP, comparing encoder (BERT, RoBERTa) and decoder (Qwen, Llama) models. It finds that decoder LLMs produce substantially more stable explanations and that stability improves with scale, and it provides a cost-robustness tradeoff curve for pre-deployment model selection.

Beyond Clean Text: Evaluating Encoder and Decoder Robustness for Bangla Event Detection in Noisy Text

arXiv cs.CL · 2026-07-01

This paper introduces a Bangla event detection benchmark with noisy text (ASR, orthographic corruption) and evaluates encoder-only and decoder-only LLMs, finding decoder models more robust to noise.

KaLM-Reranker-V1: Fast but Not Late Interaction for Compressed Document Reranking

Hugging Face Daily Papers · 2026-06-22

KaLM-Reranker-V1 is a fast reranker that decouples query and passage computation using an encoder-decoder architecture with Matryoshka embedding pooling and cross-attention, achieving state-of-the-art reranking performance on BEIR and competitive results on multilingual benchmarks.

A Deep Reinforcement Learning (DRL)-Based Transformer Method for Solving the Open Shop Scheduling Problem

arXiv cs.AI · 2026-06-15

Presents a Transformer-based scheduling policy trained with reinforcement learning for the open shop scheduling problem, showing that a model trained on small instances can generalize to much larger problems and compete with classical dispatching heuristics.

Selective Synergistic Learning for Video Object-Centric Learning

Hugging Face Daily Papers · 2026-06-14

Selective Synergistic Learning (SSync) improves video object-centric learning by selectively distilling reliable cues via pseudo-labeling and transitive merging, avoiding error propagation from indiscriminate dense alignment.

End-to-End Context Compression at Scale

Hugging Face Daily Papers · 2026-06-08

This paper presents Latent Context Language Models (LCLMs), a family of encoder-decoder compressors that efficiently handle long contexts through architectural search and large-scale pretraining, outperforming traditional KV cache methods in accuracy, speed, and memory usage.

Task-Routed Mixture-of-Experts with Cognitive Appraisal for Implicit Sentiment Analysis

arXiv cs.CL · 2026-05-21

This paper proposes a task-routed mixture-of-experts model with cognitive appraisal theory for implicit sentiment analysis, introducing auxiliary tasks to improve reasoning about sentiment from context and outperforming existing approaches.

Physics-informed convolutional neural networks for fluid flow through porous media

arXiv cs.LG · 2026-05-21

This paper presents a physics-informed convolutional encoder–decoder network to predict pore-scale velocity fields from porous media geometry, and demonstrates that using network predictions to initialize Lattice-Boltzmann simulations accelerates convergence in over 90% of cases.

Block-Based Double Decoders

arXiv cs.LG · 2026-05-20

Proposes block-based double decoders, a novel transformer architecture using doubly-causal block-based attention masks to combine decoder-only training efficiency with encoder-decoder inference efficiency, achieving strong scaling performance and reduced KV-cache memory.

Reference-Free Reinforcement Learning Fine-Tuning for MT: A Seq2Seq Perspective

arXiv cs.CL · 2026-05-18

This paper applies Group Relative Policy Optimization (GRPO) to encoder-decoder Seq2Seq models for machine translation fine-tuning, using reference-free rewards (LaBSE and COMET-Kiwi) that require no parallel data, and achieves consistent improvements across 13 languages.

@lateinteraction: guess what NVIDIA used here for an "attention-based encoder-decoder to retrieve directly from its own internal represen…

X AI KOLs Following · 2026-05-08

NVIDIA utilized late interaction, a form of sparse attention, for an attention-based encoder-decoder to retrieve directly from internal representations.

Retrieval from Within: An Intrinsic Capability of Attention-Based Models

Hugging Face Daily Papers · 2026-05-08

INTRA demonstrates that attention-based models can perform retrieval directly from internal representations, unifying retrieval and generation while improving evidence recall and answer quality.

SAM 3D Body: Robust Full-Body Human Mesh Recovery

Papers with Code Trending · 2026-02-17

SAM 3D Body is a promptable 3D human mesh recovery model using a novel parametric representation (MHR) and encoder-decoder architecture, achieving state-of-the-art performance with strong generalization. The model supports auxiliary prompts and is open-source.

T5Gemma: A new collection of encoder-decoder Gemma models

Google DeepMind Blog · 2025-10-25

Google introduces T5Gemma, a new collection of encoder-decoder models adapted from the Gemma 2 decoder-only architecture, offering improved quality-efficiency trade-offs for tasks like summarization and translation.

Moonshine: Speech Recognition for Live Transcription and Voice Commands

Papers with Code Trending · 2024-10-21

Moonshine presents a family of encoder-decoder transformer models for speech recognition that use Rotary Position Embedding (RoPE) and are optimized for live transcription and voice commands, achieving a 5x reduction in compute compared to Whisper tiny.en with no increase in word error rate.
