MedGemma: Our most capable open models for health AI development

Google DeepMind Blog 10/25/25, 06:02 PM Models

google-deepmind medgemma healthcare-ai open-source multimodal medical-imaging

Summary

Google DeepMind released MedGemma 27B Multimodal and MedSigLIP, expanding their open-source Health AI Developer Foundations to include high-performing, privacy-preserving models for medical text and imaging tasks.

We’re announcing new multimodal models in the MedGemma collection, our most capable open models for health AI development.

Original Article


Cached at: 05/08/26, 09:47 AM

# MedGemma: Our most capable open models for health AI development

Source: [https://research.google/blog/medgemma-our-most-capable-open-models-for-health-ai-development/](https://research.google/blog/medgemma-our-most-capable-open-models-for-health-ai-development/)

Healthcare is increasingly embracing AI to improve workflow management, patient communication, and diagnostic and treatment support. It’s critical that these AI-based systems are not only high-performing, but also efficient and privacy-preserving. It’s with these considerations in mind that we built and recently released [Health AI Developer Foundations](http://goo.gle/hai-def) (HAI-DEF). HAI-DEF is a collection of lightweight open models designed to offer developers robust starting points for their own health research and application development. Because HAI-DEF models are open, developers retain full control over privacy, infrastructure and modifications to the models. In [May](https://research.google/blog/google-research-at-google-io-2025/) of this year, we expanded the HAI-DEF collection with [MedGemma](https://deepmind.google/models/gemma/medgemma/), a collection of generative models based on [Gemma 3](https://deepmind.google/models/gemma/gemma-3/) that are designed to accelerate healthcare and life sciences AI development.

Today, we’re proud to announce two new models in this collection. The first is MedGemma 27B Multimodal, which complements the previously released 4B Multimodal and 27B text-only models by adding support for complex multimodal and longitudinal electronic health record interpretation. The second new model is MedSigLIP, a lightweight image and text encoder for classification, search, and related tasks. MedSigLIP is based on the same image encoder that powers the 4B and 27B MedGemma models.

MedGemma and MedSigLIP are strong starting points for medical research and product development.
MedGemma is useful for medical text or imaging tasks that require generating free text, like report generation or visual question answering. MedSigLIP is recommended for imaging tasks that involve structured outputs like classification or retrieval. All of the above models can be run on a single GPU, and MedGemma 4B and MedSigLIP can even be adapted to run on mobile hardware.

Full details of MedGemma and MedSigLIP development and evaluation can be found in the [MedGemma technical report](https://arxiv.org/abs/2507.05201).

## MedGemma: A multimodal generative model for health

The MedGemma collection includes variants in 4B and 27B sizes, both of which now accept image and text inputs and produce text outputs.

- **MedGemma 4B Multimodal**: MedGemma 4B scores 64.4% on [MedQA](https://arxiv.org/abs/2009.13081), which ranks it among the best very small (<8B) open models. In an unblinded study, 81% of MedGemma 4B–generated chest X-ray reports were judged by a US board-certified radiologist to be of sufficient accuracy to result in similar patient management compared to the original radiologist reports. It additionally achieves performance on medical image classification tasks that is competitive with task-specific state-of-the-art models.
- **MedGemma 27B Text** and **MedGemma 27B Multimodal**: Based on internal and published evaluations, the MedGemma 27B models are among the best performing small open models (<50B) on the MedQA medical knowledge and reasoning benchmark; the text variant scores 87.7%, which is within 3 points of [DeepSeek R1](https://github.com/deepseek-ai/DeepSeek-R1), a leading open model, but at approximately one tenth the inference cost. The MedGemma 27B models are competitive with larger models across a variety of benchmarks, including retrieval and interpretation of electronic health record data.

We developed these models by training a medically optimized image encoder (independently released as MedSigLIP, described below), followed by training the corresponding 4B and 27B versions of the [Gemma 3 model](https://huggingface.co/docs/transformers/main/en/model_doc/gemma3) on medical data. We took care to retain the general (non-medical) capabilities of Gemma throughout this process. This allows MedGemma to perform well on tasks that mix medical and non-medical information and preserve instruction-following and capabilities in non-English languages.

A key aspect of these models is their adaptability. For instance, after fine-tuning, MedGemma 4B is able to achieve state-of-the-art performance on chest X-ray report generation, with a [RadGraph F1](https://arxiv.org/abs/2106.14463) score of 30.3. The straightforward ability for developers to improve performance on their target applications highlights the value of MedGemma as a starting point for developers looking to build AI for healthcare.

## MedSigLIP: A specialized image encoder for healthcare

MedSigLIP is a lightweight image encoder of only 400M parameters that uses the [Sigmoid loss for Language Image Pre-training](https://arxiv.org/abs/2303.15343) (SigLIP) architecture. MedSigLIP was adapted from SigLIP via tuning with diverse medical imaging data, including chest X-rays, [histopathology](https://en.wikipedia.org/wiki/Histopathology) patches, dermatology images, and [fundus images](https://en.wikipedia.org/wiki/Fundus_photography), allowing the model to learn nuanced features specific to these modalities. Importantly, we also took care to ensure that MedSigLIP retains strong performance on the natural images on which the original SigLIP model was trained, maintaining its versatility.

MedSigLIP is designed to bridge the gap between medical images and medical text by encoding them into a common embedding space.
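
To make the idea of a common embedding space concrete, here is a minimal sketch of how an image embedding might be compared against the embeddings of text labels. The vectors are made-up toy values, not MedSigLIP outputs, and `zero_shot_classify` is a hypothetical helper written for this illustration. In practice the embeddings would come from the model's image and text encoders, and the model's own scoring may differ from the plain cosine similarity used here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def zero_shot_classify(image_embedding, label_embeddings):
    """Hypothetical helper: pick the label whose text embedding is closest.

    Returns (best_label, scores), where scores maps each label to its
    cosine similarity with the image embedding.
    """
    scores = {
        label: cosine_similarity(image_embedding, emb)
        for label, emb in label_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy 3-dimensional embeddings, invented for illustration only.
# Real embeddings would be produced by an image/text encoder.
LABEL_EMBEDDINGS = {
    "chest X-ray with pleural effusion": [0.9, 0.1, 0.2],
    "chest X-ray with no acute findings": [0.1, 0.9, 0.3],
}
IMAGE_EMBEDDING = [0.8, 0.2, 0.1]

best_label, scores = zero_shot_classify(IMAGE_EMBEDDING, LABEL_EMBEDDINGS)
print(best_label)
for label, score in scores.items():
    print(f"{label}: {score:.3f}")
```

Because images and text share one space, the same nearest-neighbour comparison can serve classification (compare an image to label texts) and retrieval (compare a query to stored image embeddings) without task-specific training.
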
MedSigLIP achieves similar or improved classification performance compared to task-specific vision embedding models while being far more versatile across medical imaging domains. MedSigLIP is ideal for:

- *Traditional image classification:* Build performant models to classify medical images.
- *Zero-shot image classification:* Classify images without specific training examples by comparing image embeddings to the embeddings of textual class labels.
- *Semantic image retrieval:* Find visually or semantically similar images from large medical image databases.

## The power of open models

Because the MedGemma collection is open, the models can be downloaded, built upon, and fine-tuned to support developers’ specific needs. Particularly in the medical space, this open approach offers several distinct advantages over API-based models:

- *Flexibility and privacy:* Models can be run on proprietary hardware in the developer’s preferred environment, including on Google Cloud Platform or locally, which can address privacy concerns or institutional policies.
- *Customization for high performance:* Models can be fine-tuned and modified to achieve optimal performance on target tasks and datasets.
- *Reproducibility and stability:* Because the models are distributed as snapshots, their parameters are frozen and, unlike an API, will not change unexpectedly over time. This stability is particularly crucial for medical applications where consistency and reproducibility are paramount.

To ensure broad accessibility and ease of use, our [Hugging Face collection](https://huggingface.co/collections/google/medgemma-release-680aade845f90bec6a3f60c4) offers MedSigLIP and MedGemma in the popular [Hugging Face safetensors](https://huggingface.co/docs/safetensors/en/index) format.

## What developers are building with MedGemma & MedSigLIP

Researchers and developers have been exploring the MedGemma models for their use cases and have found the models adept at solving some crucial problems.
Developers at [DeepHealth](https://deephealth.com/) in Massachusetts, USA have been exploring MedSigLIP to improve their chest X-ray triaging and nodule detection. Researchers at [Chang Gung Memorial Hospital](https://www.cgmh.org.tw/eng) in Taiwan noted that MedGemma works well with traditional Chinese-language medical literature and can respond well to medical staff questions. Developers at [Tap Health](https://tap.health/) in Gurgaon, India, remarked on MedGemma’s superior medical grounding, noting its reliability on tasks that require sensitivity to clinical context, such as summarizing progress notes or suggesting guideline-aligned nudges.

We’re excited to continue to learn about these and other use cases from developers as they create the next generation of Health AI tools with MedGemma and MedSigLIP.

## Get started and explore

To help developers get started, we’ve provided detailed notebooks on GitHub for [MedGemma](https://github.com/google-health/medgemma) and [MedSigLIP](https://github.com/google-health/medsiglip) that demonstrate how to create instances of MedSigLIP and MedGemma for both inference and fine-tuning on Hugging Face. When developers are ready to scale, MedGemma and MedSigLIP can be seamlessly deployed in [Vertex AI](https://cloud.google.com/vertex-ai) as dedicated endpoints, and we provide examples in GitHub of how to run inference on these endpoints. We’ve also added a [new demo](https://huggingface.co/spaces/google/appoint-ready) to our HAI-DEF Hugging Face [demo collection](https://huggingface.co/collections/google/hai-def-concept-apps-6837acfccce400abe6ec26c1) that shows how MedGemma can be built into an application to streamline pre-visit information gathering ahead of a patient appointment.

Refer to the following table to understand which model from the MedGemma family is ideal for your use case.

Please visit [the HAI-DEF site](https://goo.gle/hai-def) for these resources and to learn more about the MedGemma collection and other Health AI Developer Foundations models. The [HAI-DEF forum](https://discuss.ai.google.dev/c/hai-def/62) is available for questions or feedback.

## Note on training datasets

Models were trained on a mix of public and private de-identified datasets. Google and its partners utilize datasets that have been rigorously anonymized or de-identified to ensure the protection of individual research participants and patient privacy.

## Disclaimer

MedGemma and MedSigLIP are intended to be used as a starting point that enables efficient development of downstream healthcare applications involving medical text and images. MedGemma and MedSigLIP are not intended to be used without appropriate validation, adaptation and/or making meaningful modification by developers for their specific use case. The outputs generated by these models are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications. Performance benchmarks highlight baseline capabilities on relevant benchmarks, but even for image and text domains that constitute a substantial portion of training data, inaccurate model output is possible. All model outputs should be considered preliminary and require independent verification, clinical correlation, and further investigation through established research and development methodologies.

## Acknowledgements

*MedGemma is the product of a collaboration between Google Research and Google DeepMind. We thank the many people who contributed to this work, including the engineering and cross-functional members of the Google Health AI and Gemma teams, as well as our sponsors in Google Research and Google DeepMind.*
