encoder-decoder

End-to-End Context Compression at Scale

Hugging Face Daily Papers · 3d ago

This paper presents Latent Context Language Models (LCLMs), a family of encoder-decoder compressors that efficiently handle long contexts through architectural search and large-scale pretraining, outperforming traditional KV cache methods in accuracy, speed, and memory usage.

Task-Routed Mixture-of-Experts with Cognitive Appraisal for Implicit Sentiment Analysis

arXiv cs.CL · 2026-05-21

This paper proposes a task-routed mixture-of-experts model informed by cognitive appraisal theory for implicit sentiment analysis. It introduces auxiliary tasks to improve reasoning about sentiment from context and outperforms existing approaches.

Physics-informed convolutional neural networks for fluid flow through porous media

arXiv cs.LG · 2026-05-21

This paper presents a physics-informed convolutional encoder–decoder network to predict pore-scale velocity fields from porous media geometry, and demonstrates that using network predictions to initialize Lattice-Boltzmann simulations accelerates convergence in over 90% of cases.

Block-Based Double Decoders

arXiv cs.LG · 2026-05-20

This paper proposes block-based double decoders, a transformer architecture that uses doubly-causal block-based attention masks to combine the training efficiency of decoder-only models with the inference efficiency of encoder-decoder models, achieving strong scaling performance and reduced KV-cache memory.

Reference-Free Reinforcement Learning Fine-Tuning for MT: A Seq2Seq Perspective

arXiv cs.CL · 2026-05-18

This paper applies Group Relative Policy Optimization (GRPO) to encoder-decoder Seq2Seq models for machine translation fine-tuning, using reference-free rewards (LaBSE and COMET-Kiwi) that require no parallel data, and achieves consistent improvements across 13 languages.

@lateinteraction: guess what NVIDIA used here for an "attention-based encoder-decoder to retrieve directly from its own internal represen…

X AI KOLs Following · 2026-05-08

According to the post, NVIDIA used late interaction, described as a form of sparse attention, in an attention-based encoder-decoder that retrieves directly from its own internal representations.

Retrieval from Within: An Intrinsic Capability of Attention-Based Models

Hugging Face Daily Papers · 2026-05-08

INTRA demonstrates that attention-based models can perform retrieval directly from internal representations, unifying retrieval and generation while improving evidence recall and answer quality.

SAM 3D Body: Robust Full-Body Human Mesh Recovery

Papers with Code Trending · 2026-02-17

SAM 3D Body is a promptable 3D human mesh recovery model that uses a novel parametric representation (MHR) and an encoder-decoder architecture, achieving state-of-the-art performance with strong generalization. The model supports auxiliary prompts and is open-source.

T5Gemma: A new collection of encoder-decoder Gemma models

Google DeepMind Blog · 2025-10-25

Google introduces T5Gemma, a new collection of encoder-decoder models adapted from the Gemma 2 decoder-only architecture, offering improved quality-efficiency trade-offs for tasks like summarization and translation.
