zsxkib/embedding-gemma-300m

Replicate Explore Models

Summary

zsxkib/embedding-gemma-300m is a Replicate deployment of Google's EmbeddingGemma-300M model for generating 768-dimensional text embeddings, supporting flexible output dimensions via Matryoshka representation learning.

zsxkib / embedding-gemma-300m
Original Article
View Cached Full Text

Cached at: 06/03/26, 11:38 PM

# zsxkib/embedding-gemma-300m – Replicate Source: [https://replicate.com/zsxkib/embedding-gemma-300m](https://replicate.com/zsxkib/embedding-gemma-300m) ## Run time and cost This model costs approximately $0\.00022 to run on Replicate, or 4545 runs per $1, but this varies depending on your inputs\. It is also open source and you can[run it on your own computer with Docker](https://replicate.com/zsxkib/embedding-gemma-300m/api)\. This model runs on[Nvidia T4 GPU hardware](https://replicate.com/docs/billing)\. Predictions typically complete within 1 seconds\. ## Readme ## EmbeddingGemma\-300M 🧠✨ ## Overview 🔊 **EmbeddingGemma\-300M**is a text\-to\-vector model that transforms any text into 768\-dimensional embeddings\. This tool is built upon the amazing work of[Google DeepMind](https://deepmind.google/)and their[EmbeddingGemma research](https://arxiv.org/abs/2408.10957)\. We’ve wrapped their[embedding\-gemma\-300m model](https://huggingface.co/google/embedding-gemma-300m)to work on Replicate\! Allowing us to generate high\-quality embeddings for search, recommendations, and AI applications\! Support Google DeepMind and learn more about their work through: - [EmbeddingGemma Paper](https://arxiv.org/abs/2408.10957) - [Model Card on Hugging Face](https://huggingface.co/google/embedding-gemma-300m) ## Pre\-loaded Efficiency 🚀 The EmbeddingGemma\-300M comes pre\-loaded with**base64 output format**for immediate use with 3x smaller response sizes compared to arrays\. ## Getting Different Dimensions 💥 You can customize your EmbeddingGemma\-300M’s output using Matryoshka representation learning\. Here’s how: 1. Choose your embedding dimension: 128, 256, 512, or 768 2. Set the`embedding\_dim`parameter to your desired size 3. The model automatically truncates the full 768\-dimensional embedding to your chosen size 4. Example usage: 5. `embedding\_dim = 256`→ Get 256\-dimensional vectors 6. `embedding\_dim = 512`→ Get 512\-dimensional vectors 7. `embedding\_dim = 768`→ Get full 768\-dimensional vectors \(default\) *Note: Smaller dimensions are perfect for storage optimization while maintaining strong performance for most tasks\.* ## Code Examples 📚 **Python:** ``` import replicate, base64, numpy as np # Get base64 embedding (default) b64 = replicate.run("zsxkib/embedding-gemma-300m", input={"text": "Hello world"}) # Decode to numpy array vec = np.frombuffer(base64.b64decode(b64), dtype=np.float32) print(vec.shape) # (768,) ``` **JavaScript:** ``` const b64 = await replicate.run("zsxkib/embedding-gemma-300m", { input: { text: "Hello world" } }); const embedding = new Float32Array(Buffer.from(b64, 'base64').buffer); console.log(embedding.length); // 768 ``` ## Terms of Use 📚 The use of this embedding model for the following purposes is prohibited: - Generating embeddings for harmful or malicious content\. - Creating systems designed to manipulate or deceive users\. - Building applications that violate user privacy or data protection laws\. - Commercial use without proper licensing from Google\. - Redistributing the model weights without permission\. - Using embeddings to create biased or discriminatory systems\. ## Disclaimer ‼️ I am not liable for any direct, indirect, consequential, incidental, or special damages arising out of or in any way connected with the use/misuse or inability to use this software\. --- ⭐ Star the repo on[GitHub](https://github.com/zsxkib/cog-google-embeddinggemma-300m)\! 🐦 Follow[@zsakib\_](https://twitter.com/zsakib_)on X 💻 Check out more projects[@zsxkib](https://github.com/zsxkib)on GitHub Model created8 months, 4 weeks ago

Similar Articles

google/diffusiongemma-26B-A4B-it

Hugging Face Models Trending

Google DeepMind releases DiffusionGemma, a 26B-parameter Mixture-of-Experts model that uses discrete diffusion for faster text generation, supporting multimodal inputs and a 256K token context.

Introducing Gemma 3

Google DeepMind Blog

Google introduces Gemma 3, a collection of lightweight open models (1B, 4B, 12B, 27B) designed to run on single GPUs or TPUs, featuring support for 140+ languages, 128k context window, and multimodal capabilities. The models outperform larger competitors like Llama 3 and DeepSeek-V3 while maintaining efficiency for on-device deployment.

Google Gemma 4 12B

Product Hunt

Google's Gemma 4 12B model enables local multimodal AI using an encoder-free architecture.

google/gemma-4-26B-A4B-it

Hugging Face Models Trending

Google DeepMind releases Gemma 4, a family of open-weight multimodal models ranging from 2.3B to 31B parameters with support for text, image, video, and audio inputs. The models feature 256K context windows, MoE and dense architectures, enhanced reasoning capabilities, and are optimized for deployment across devices from mobile to servers.