beautyyuyanli/multilingual-e5-large

Replicate Explore Models

Summary

The Multilingual E5-large embedding model is now available on Replicate, costing ~$0.00098 per run and typically completing in ~1 second on an Nvidia L40S GPU.

Original Article

Cached at: 04/23/26, 01:44 PM

# beautyyuyanli/multilingual-e5-large – Replicate

Source: [https://replicate.com/beautyyuyanli/multilingual-e5-large](https://replicate.com/beautyyuyanli/multilingual-e5-large)

## Run time and cost

This model costs approximately $0.00098 to run on Replicate, or about 1020 runs per $1, but this varies depending on your inputs. It is also open source and you can [run it on your own computer with Docker](https://replicate.com/beautyyuyanli/multilingual-e5-large/api).

This model runs on [Nvidia L40S GPU hardware](https://replicate.com/docs/billing). Predictions typically complete within 1 second.
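As a rough check on the pricing arithmetic above, the following minimal sketch derives the runs-per-dollar figure from the quoted per-run cost and estimates the cost of a batch. The function names are hypothetical, and the per-run cost is only the approximate figure quoted on the page; actual billing varies with inputs.

```python
import math

# Approximate per-run cost quoted on the model page (assumption: it stays
# roughly constant across runs; the page notes it varies with inputs).
COST_PER_RUN_USD = 0.00098


def runs_per_dollar(cost_per_run: float = COST_PER_RUN_USD) -> int:
    """Whole number of runs that fit in one US dollar at the given cost."""
    return math.floor(1 / cost_per_run)


def estimated_cost(num_runs: int, cost_per_run: float = COST_PER_RUN_USD) -> float:
    """Estimated total cost in USD for num_runs predictions."""
    return num_runs * cost_per_run


if __name__ == "__main__":
    print(f"Runs per $1: {runs_per_dollar()}")
    print(f"Estimated cost of 10,000 runs: ${estimated_cost(10_000):.2f}")
```

At the quoted rate, 1 / 0.00098 is roughly 1020.4, which matches the 1020 runs per $1 stated on the page once rounded down.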

Similar Articles

LiquidAI/LFM2.5-Embedding-350M

Hugging Face Models Trending

Liquid AI releases LFM2.5-Embedding-350M, a dense bi-encoder for multilingual retrieval supporting 11 languages, as a drop-in replacement for RAG pipelines.

krthr/clip-embeddings

Replicate Explore

A CLIP-based embedding model hosted on Replicate that generates 768-dimensional embeddings for both images and text using the clip-vit-large-patch14 architecture, costing ~$0.00022 per run.

@liquidai: Introducing LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M: two multilingual retrieval models built for ultra-fast and a…

X AI KOLs Following

Liquid AI introduces LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M, two multilingual retrieval models optimized for fast and accurate search across 11 languages, with latency as low as 1.5ms.

Nemotron-3-Embed 1B/8B

Reddit r/LocalLLaMA

NVIDIA released Nemotron-3-Embed 1B and 8B models, state-of-the-art multilingual text embedding models for retrieval and semantic similarity, optimized for RAG systems.

Benchmarking Google Embeddings 2 against Open-Source Models for Multilingual Dense Retrieval and RAG Systems

arXiv cs.CL

This paper benchmarks Google Embeddings 2 against five open-source models for multilingual dense retrieval and RAG, finding GE2 top in accuracy but slower, with mE5-L as a competitive low-latency alternative.