JFinTEB introduces the first comprehensive benchmark for evaluating Japanese financial text embeddings, addressing a gap in domain- and language-specific evaluation resources. The benchmark covers retrieval and classification tasks, evaluated across Japanese-specific, multilingual, and commercial embedding models, and both the datasets and the evaluation framework are publicly released.
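To make the retrieval side of such a benchmark concrete, here is a minimal, illustrative sketch of how a retrieval task is typically scored: rank corpus documents by cosine similarity to a query embedding and check whether the gold document lands in the top-k. The vectors and the `recall_at_k` helper are toy placeholders, not JFinTEB's actual data or harness.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recall_at_k(query_vec, corpus_vecs, gold_idx, k=1):
    # Rank corpus documents by similarity to the query, descending,
    # and score 1.0 if the gold document is in the top-k.
    ranked = sorted(range(len(corpus_vecs)),
                    key=lambda i: cosine(query_vec, corpus_vecs[i]),
                    reverse=True)
    return 1.0 if gold_idx in ranked[:k] else 0.0

# Toy 3-d stand-ins for real embedding vectors.
query = [0.9, 0.1, 0.0]
corpus = [[0.1, 0.9, 0.0],   # unrelated document
          [0.8, 0.2, 0.1],   # gold document
          [0.0, 0.0, 1.0]]   # unrelated document
print(recall_at_k(query, corpus, gold_idx=1, k=1))  # → 1.0
```

A full benchmark would average this metric over many query/corpus pairs per task.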
OpenAI released text-embedding-ada-002, a unified embedding model that consolidates five previous models into one, with stronger performance, a 4x longer context window (8192 tokens), a smaller output dimensionality (1536), and 99.8% lower pricing than the previous Davinci embedding model.
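A quick sketch of how such embeddings are compared in practice: OpenAI's embeddings are unit-normalized, so cosine similarity reduces to a plain dot product. The 4-dimensional vectors below are placeholders standing in for the model's 1536-dimensional output.

```python
import math

def normalize(v):
    # Scale a vector to unit length.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def similarity(a, b):
    # For unit-length vectors, dot product == cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# Placeholder vectors; real ada-002 output has 1536 dimensions.
a = normalize([1.0, 2.0, 3.0, 4.0])
b = normalize([1.0, 2.0, 3.0, 5.0])
print(round(similarity(a, b), 4))  # close to 1.0 for similar vectors
```

Similarity scores near 1.0 indicate semantically close texts; scores are ranked to power search and recommendation use cases.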
A CLIP-based embedding model hosted on Replicate that generates 768-dimensional embeddings for both images and text using the clip-vit-large-patch14 architecture, at a cost of roughly $0.00022 per run.
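Because the model embeds images and text into the same space, a common use is CLIP-style zero-shot classification: score an image embedding against several text-prompt embeddings by cosine similarity, then softmax the scores into probabilities. The 3-d vectors below are toy stand-ins for the 768-dimensional output, and the temperature value is an illustrative assumption, not CLIP's learned logit scale.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def zero_shot_probs(image_vec, text_vecs, temperature=0.07):
    # Temperature-scaled softmax over image-to-prompt similarities.
    logits = [cosine(image_vec, t) / temperature for t in text_vecs]
    m = max(logits)                      # subtract max for stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy stand-ins for one image embedding and two prompt embeddings.
image = [0.9, 0.1, 0.0]
prompts = [[0.8, 0.2, 0.0],   # e.g. "a photo of a cat"
           [0.0, 0.1, 0.9]]   # e.g. "a photo of a dog"
probs = zero_shot_probs(image, prompts)
print([round(p, 3) for p in probs])
```

The highest-probability prompt is taken as the predicted label; the same shared space also supports text-to-image retrieval.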