text-embeddings

A Comparative Study on Affective Cues in Text Embeddings Across Psychological Emotion Theories

arXiv cs.CL · 3d ago

This paper evaluates twelve recent text encoders on their ability to encode affective cues from three psychological emotion theories, finding that instruction-aware open-weight encoders match or exceed proprietary ones at word level, while task-tuned embeddings are superior at sentence level.

Does My Embedding Reflect That $A = B$? Evaluating Mathematical Equivalence in Embedding Models

arXiv cs.CL · 2026-06-24

This paper introduces the MELD dataset for evaluating whether text embedding models capture mathematical equivalence across different terminologies, and finds that current models fail to capture it. It proposes a contrastive learning approach that aligns informal and formal mathematical statements, improving retrieval on both informal-formal and natural language tasks.

Measuring Curriculum Alignment across Topical Coverage, Competency, and Cognitive Depth: A Longitudinal Framework Applied to CS2013 and CS2023

arXiv cs.AI · 2026-06-20

This paper presents a human-in-the-loop pipeline for measuring a computer science program's coverage of curricular guidelines, applied longitudinally to CS2013 and CS2023. The framework reveals near-constant topical coverage but a gap in cognitive depth due to the newer guideline's raised expectations.

Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

Hugging Face Daily Papers · 2026-06-05

The paper identifies that LLM text embeddings overly express high-frequency uninformative tokens and proposes EmbedFilter, a linear transformation that filters out this subspace to improve semantic representations and enable dimensionality reduction.

JFinTEB: Japanese Financial Text Embedding Benchmark

arXiv cs.CL · 2026-04-20

JFinTEB introduces the first comprehensive benchmark for evaluating Japanese financial text embeddings, addressing a gap in domain-specific and language-specific evaluation resources. The benchmark includes retrieval and classification tasks evaluated across Japanese-specific, multilingual, and commercial embedding models, with datasets and evaluation framework publicly released.

New and improved embedding model

OpenAI Blog · 2022-12-15

OpenAI released text-embedding-ada-002, a unified embedding model that consolidates five previous models into one with superior performance, 4x longer context (8192 tokens), smaller dimensionality (1536), and 99.8% lower pricing than previous Davinci embeddings.

krthr/clip-embeddings

Replicate Explore · 2026-05-09

A CLIP-based embedding model hosted on Replicate that generates 768-dimensional embeddings for both images and text using the clip-vit-large-patch14 model, at a cost of about $0.00022 per run.
