dimensionality-reduction

Visualizing High-Dimensional Graph Embeddings via Informed Multi-View Projections

arXiv cs.LG ↗ · 3h ago Cached

Proposes a method to embed graphs in high-dimensional space and search for informative 2D viewpoints that optimize aesthetic and readability metrics, enabled by a novel differentiable surrogate for edge crossings. Introduces an interactive system, DataFly, for exploring multiple candidate viewpoints.

Graph Dimensionality Reduction for Contextual Bandits: Structure-Specific Regret Bounds under Approximate Smoothness and Noisy Eigenspaces

arXiv cs.LG ↗ · 2d ago Cached

Proposes GraphDR-LinUCB, a method for contextual bandits with graph-structured arms that projects features onto the graph's low-frequency spectral subspace. Achieves the first regret bound for spectral-projection-based contextual bandits and demonstrates 15x regret reduction on real datasets over full-dimensional LinUCB.

Exact Schur-Sylvester Dimensionality Reductions for Non-Smooth Stochastic Complexity and Manifold Sampling

arXiv cs.LG ↗ · 2026-06-24 Cached

This paper presents exact dimensionality reductions using Schur complement and Sylvester's determinant identity to reduce computational complexity from O(N^3) to O(k^3+N^2k) per step for non-smooth NML estimation, achieving over 14,000x speedup while maintaining numerical precision.
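Sylvester's determinant identity, det(I_N + U Vᵀ) = det(I_k + Vᵀ U), is what lets an N×N determinant be traded for a k×k one. The sketch below only checks that identity numerically on random data. It is not the paper's algorithm, and the sizes, the random matrix `U`, and the function names are assumptions made for illustration.

```python
import numpy as np

def logdet_direct(U, V):
    """log det(I_N + U V^T), computed on the full N x N matrix."""
    N = U.shape[0]
    sign, logdet = np.linalg.slogdet(np.eye(N) + U @ V.T)
    return sign, logdet

def logdet_sylvester(U, V):
    """Same quantity via Sylvester's identity, using only a k x k matrix."""
    k = U.shape[1]
    sign, logdet = np.linalg.slogdet(np.eye(k) + V.T @ U)
    return sign, logdet

# Hypothetical sizes: N is the ambient dimension, k << N the low rank.
rng = np.random.default_rng(0)
N, k = 200, 5
U = rng.standard_normal((N, k))

# Using V = U makes I + U U^T positive definite, so both signs are +1.
sign_full, logdet_full = logdet_direct(U, U)
sign_small, logdet_small = logdet_sylvester(U, U)
gap = abs(logdet_full - logdet_small)
```

Here `gap` should sit at floating-point noise level, while the second route never factorizes anything larger than k×k.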

Dual Dimensionality for Local and Global Attention

arXiv cs.CL ↗ · 2026-06-18 Cached

Proposes Distance-Adaptive Representation (DAR) which reduces key-value dimensionality for distant tokens while preserving full dimensionality for nearby tokens, improving KV cache efficiency without performance loss.

@vintcessun: Turns out LLM text embeddings are hijacked by high-frequency tokens (periods, articles)! The unembedding matrix implicitly defines a low-rank subspace dominated by these uninformative expressions. This is the root cause of LLMs' poor performance as universal embeddings, and the contamination is subtle. EmbedFilter…

X AI KOLs Timeline ↗ · 2026-06-12 Cached

This study finds that LLM text embeddings are dominated by high-frequency tokens (e.g., periods, articles) and proposes EmbedFilter, which performs SVD on the unembedding matrix and subtracts each embedding's projection onto the resulting low-rank subspace to recover the underlying semantics. The method requires no training and yields dimensionality reduction and retrieval efficiency gains.
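The general idea of removing a subspace found by SVD can be sketched as follows. This is a minimal illustration and not the EmbedFilter implementation: the random stand-ins for the unembedding matrix `W` and the embeddings `E`, the choice of the top `r` singular directions, and the name `subspace_filter` are all assumptions.

```python
import numpy as np

def subspace_filter(W, r):
    """Build a projector that removes the top-r right singular directions of W.

    W is a (vocab_size, d) matrix; its right singular vectors live in the
    d-dimensional embedding space. Which directions to remove, and how many,
    is an assumption of this sketch.
    """
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    V_r = Vt[:r].T                          # (d, r), orthonormal columns
    P = np.eye(W.shape[1]) - V_r @ V_r.T    # projector onto the complement
    return P, V_r

rng = np.random.default_rng(0)
W = rng.standard_normal((1000, 64))   # stand-in for an unembedding matrix
E = rng.standard_normal((8, 64))      # stand-in for 8 text embeddings

P, V_r = subspace_filter(W, r=4)
E_filtered = E @ P                    # P is symmetric, so this projects each row

# The filtered embeddings have no component left along the removed directions.
residual = np.abs(E_filtered @ V_r).max()
```

Because the filtered vectors lie in a (d − r)-dimensional subspace, they could also be stored in d − r coordinates, which is one way a filter of this kind can double as dimensionality reduction.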

Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

Hugging Face Daily Papers ↗ · 2026-06-05 Cached

The paper identifies that LLM text embeddings overly express high-frequency uninformative tokens and proposes EmbedFilter, a linear transformation that filters out this subspace to improve semantic representations and enable dimensionality reduction.

@kyndinfo: Principal Component Analysis [Math short write-up] https://notion.so/kyndinfo/Principal-Component-Analysis-351019…

X AI KOLs Timeline ↗ · 2026-06-04

A short mathematical write-up on Principal Component Analysis (PCA), explaining the concept and its applications.

ScaleMAP: Preserving Local Density and Neighborhood Structure in Low-Dimensional Embeddings

arXiv cs.LG ↗ · 2026-06-01 Cached

ScaleMAP is a new nonlinear dimensionality reduction method that preserves local density and neighborhood structure by rescaling embedding distances based on original-space local radii, achieving better density preservation than DensMAP while maintaining UMAP-level neighborhood preservation.

DIVE: Embedding Compression via Self-Limiting Gradient Updates

arXiv cs.CL ↗ · 2026-05-21 Cached

Proposes DIVE, a compression adapter for embedding dimensionality reduction that uses self-limiting gradient updates and head-wise NT-Xent contrastive loss to prevent overfitting on small datasets, outperforming existing methods on BEIR benchmarks.

Supervised Latent Restructuring for Small-Data Quantum Learning in Plant Phenomics

arXiv cs.LG ↗ · 2026-05-21 Cached

This paper proposes a hybrid quantum-classical workflow for plant phenomics classification under small-data regimes, using supervised latent restructuring (PCA + LDA) to improve geometric separability before quantum kernel alignment. Experiments show improved separability but highlight compression trade-offs and the difficulty of achieving strong quantum performance.

Unsupervised learning of acquisition variability in structural connectomes via hybrid latent space modeling

arXiv cs.LG ↗ · 2026-05-15 Cached

This paper introduces an unsupervised framework for modeling acquisition-related variability in structural connectomes using hybrid latent space modeling, eliminating the need for manual capacity tuning by architecturally annealing encoder outputs.

Rank Is Not Capacity: Spectral Occupancy for Latent Graph Models

arXiv cs.LG ↗ · 2026-05-13 Cached

This paper proposes Spectra, a method using spectral occupancy to analyze and control the realized capacity of latent graph models, arguing that rank is not equivalent to model capacity.

@probnstat: One theorem every ML engineer should know: The Johnson–Lindenstrauss Lemma. It states that high-dimensional data can be…

X AI KOLs Following ↗ · 2026-05-09

This post highlights the Johnson–Lindenstrauss Lemma, explaining its importance for ML engineers in understanding dimensionality reduction, random projections, and embedding efficiency.
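The Johnson–Lindenstrauss Lemma says that n points can be mapped into roughly O(log n / ε²) dimensions while every pairwise distance is preserved within a factor of 1 ± ε. A small experiment with a Gaussian random projection shows the effect. The point count, dimensions, and seed below are arbitrary choices for illustration and do not come from the post.

```python
import numpy as np

def random_projection(X, k, rng):
    """Project rows of X from d to k dimensions with a scaled Gaussian matrix."""
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)  # scaling keeps norms unbiased
    return X @ R

def pairwise_distances(X):
    """Euclidean distances for all unordered pairs of rows."""
    i, j = np.triu_indices(X.shape[0], k=1)
    return np.linalg.norm(X[i] - X[j], axis=1)

rng = np.random.default_rng(0)
n, d, k = 50, 2000, 500              # arbitrary sizes for the demonstration
X = rng.standard_normal((n, d))
Y = random_projection(X, k, rng)

# Ratio of projected to original distance for each of the n*(n-1)/2 pairs.
ratios = pairwise_distances(Y) / pairwise_distances(X)
max_distortion = np.abs(ratios - 1.0).max()
```

With these sizes, `max_distortion` should be a modest fraction of 1 even though the dimension dropped by a factor of four. Increasing `k` tightens it further.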

A polynomial autoencoder beats PCA on transformer embeddings

Hacker News Top ↗ · 2026-05-05 Cached

This article introduces a polynomial autoencoder that improves upon PCA for compressing transformer embeddings by using a quadratic decoder to capture nonlinear variance. Benchmarks on BEIR show it significantly outperforms standard PCA and Matryoshka embeddings in retrieval quality while maintaining high compression ratios.

Spectral Tempering for Embedding Compression in Dense Passage Retrieval

arXiv cs.CL ↗ · 2026-04-20 Cached

Spectral Tempering (SpecTemp) proposes a learning-free method for embedding compression in dense passage retrieval that adaptively determines optimal spectral scaling based on signal-to-noise ratio analysis, outperforming fixed hyperparameter approaches like PCA and whitening.
