Google this week unveiled two new artificial intelligence (AI) models, both of which could have a significant impact on the medical imaging industry.

MedGemma 1.5, Google’s latest vision-language model, and MedASR, the company’s new automated speech recognition model, were both designed and trained specifically for healthcare. Google is making each model available publicly so researchers and developers can have access to the new technology. 

The updated version of the tech giant’s vision-language model can analyze both medical images and written text. It has been revised with enhanced multimodal reasoning and improved options for fine-tuning on specialized datasets. MedGemma 1.5 is intended for answering image-based medical questions, drafting radiology reports and extracting pertinent clinical data. The MedGemma applications deployed on Google Cloud also include full DICOM support.

Google says the updated vision-language model classifies disease-related findings more accurately than its predecessor, MedGemma 1. The company believes the model will be helpful for many research-related tasks, and it said the response to the first MedGemma model “has been incredible.”

“The adoption of artificial intelligence in healthcare is accelerating dramatically, with the healthcare industry adopting AI at twice the rate of the broader economy,” a news release from Google notes. “In support of this transformation, last year Google published the MedGemma collection of open medical generative AI models through our Health AI Developer Foundations (HAI-DEF) program. HAI-DEF models like MedGemma are intended as starting points for developers to evaluate and adapt to their medical use cases, and they can be easily scaled on Google Cloud through Vertex AI.” 

MedASR is a speech-to-text model designed for medical settings. It was trained on healthcare-specific language and has been fine-tuned to improve medical dictation accuracy. Google says that, compared with generalist ASR models, MedASR makes 58% fewer errors on general imaging dictations and up to 82% fewer errors on dictations involving rare diseases and diverse speakers. It can be adapted to different medical specialties and workflows as needed.
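For readers who want to see how a figure such as “58% fewer errors” is typically derived, the following is a minimal sketch. It assumes the comparison rests on word error rate (WER), the standard metric for speech recognition; Google’s announcement as summarized here does not specify its exact methodology. The function names and the sample dictation are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline model's errors that the new model avoids."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical dictation, not taken from any Google benchmark.
    truth = "no acute intracranial hemorrhage or mass effect"
    transcript = "no acute intracranial hemorrhage or mask effect"
    print(f"WER: {word_error_rate(truth, transcript):.3f}")
    # Hypothetical WER values chosen so the reduction works out to 58%.
    print(f"Reduction: {relative_error_reduction(0.10, 0.042):.0%}")
```

Under this reading, a relative reduction of 58% means the specialized model makes roughly 42 errors for every 100 made by the baseline on the same dictations. It does not mean the error rate itself falls by 58 percentage points.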

Both models are available now.