Tag
This paper evaluates twelve recent text encoders on their ability to encode affective cues from three psychological emotion theories, finding that instruction-aware open-weight encoders match or exceed proprietary ones at word level, while task-tuned embeddings are superior at sentence level.
This paper introduces the MELD dataset for evaluating whether text embedding models capture mathematical equivalence across different terminologies, and finds that current models fail to do so. It proposes a contrastive learning approach to align informal and formal mathematical statements, improving retrieval on both informal-formal and natural language tasks.
This paper presents a human-in-the-loop pipeline for measuring a computer science program's coverage of curricular guidelines, applied longitudinally to CS2013 and CS2023. The framework reveals near-constant topical coverage but a gap in cognitive depth due to the newer guideline's raised expectations.
The paper finds that LLM text embeddings over-express high-frequency, uninformative tokens and proposes EmbedFilter, a linear transformation that filters out the corresponding subspace to improve semantic representations and enable dimensionality reduction.
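To make "a linear transformation that filters out a subspace" concrete, the following is a minimal sketch of a generic orthogonal-complement projection. It is not the paper's implementation: the summary does not say how the uninformative subspace is estimated, so `directions` is a hypothetical, randomly generated stand-in, and `build_filter` is an illustrative name.

```python
import numpy as np


def build_filter(directions: np.ndarray) -> np.ndarray:
    """Return a (d, d) matrix that removes the span of `directions`.

    `directions` has shape (k, d) and is assumed to hold k linearly
    independent vectors spanning the subspace to discard (hypothetical input).
    """
    # Orthonormalize the directions: reduced QR on the transpose gives Q of shape (d, k).
    q, _ = np.linalg.qr(directions.T)
    d = directions.shape[1]
    # I - Q Q^T projects onto the orthogonal complement of the subspace.
    return np.eye(d) - q @ q.T


rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))   # 5 toy embeddings of dimension 8
directions = rng.normal(size=(2, 8))   # assumed "uninformative" directions

P = build_filter(directions)
filtered = embeddings @ P              # P is symmetric, so right-multiplying is fine
```

After filtering, every embedding has zero component along the assumed directions, so the result lies in a subspace of dimension d - k, which is one way a filter of this kind could permit a lower-dimensional representation.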
JFinTEB introduces the first comprehensive benchmark for evaluating Japanese financial text embeddings, addressing a gap in domain-specific and language-specific evaluation resources. The benchmark includes retrieval and classification tasks evaluated across Japanese-specific, multilingual, and commercial embedding models, with datasets and evaluation framework publicly released.
OpenAI released text-embedding-ada-002, a unified embedding model that consolidates five earlier models into one with better performance, 4x longer context (8192 tokens), smaller dimensionality (1536), and 99.8% lower pricing than the earlier Davinci embeddings.
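The relative figures in this entry can be unpacked with a little arithmetic. The sketch below uses only the numbers stated above; the float32 storage size is an assumption for illustration, not something the summary states.

```python
# Figures taken from the summary above.
new_context_tokens = 8192
context_multiplier = 4
price_reduction = 0.998          # "99.8% lower pricing"
dimensions = 1536

# Implied context length of the earlier models.
previous_context_tokens = new_context_tokens // context_multiplier

# A 99.8% reduction means the new price is 0.2% of the old one,
# i.e. roughly a 500-fold decrease.
price_ratio = 1.0 - price_reduction
cheaper_factor = 1.0 / price_ratio

# Storage per vector, ASSUMING 4-byte float32 components.
bytes_per_vector = dimensions * 4

print(previous_context_tokens, round(cheaper_factor), bytes_per_vector)
```

Under these figures the earlier context length works out to 2048 tokens, and each stored vector would take 6144 bytes if kept as float32.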
This is a CLIP-based embedding model hosted on Replicate that generates 768-dimensional embeddings for both images and text using the clip-vit-large-patch14 architecture, at a cost of about $0.00022 per run.