A new paper published in the Journal of Medical Internet Research describes how generative models such as DALL-E 2, a novel deep learning model for text-to-image generation, could represent promising future tools for image generation, augmentation, and manipulation in health care.
Do generative models have sufficient medical domain knowledge to provide accurate and useful results? Dr Lisa C Adams and colleagues explore this topic in their latest viewpoint titled “What Does DALL-E 2 Know About Radiology?”
First introduced by OpenAI in April 2022, DALL-E 2 is an artificial intelligence (AI) tool that has gained popularity for generating novel photorealistic images and artwork from textual input. DALL-E 2's generative capabilities are powerful because it has been trained on billions of existing text-image pairs from the internet.
To understand whether these capabilities can be transferred to the medical domain to create or augment data, researchers from Germany and the United States examined DALL-E 2’s radiological knowledge in creating and manipulating x-ray, computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound images.
The study’s authors found that DALL-E 2 has learned relevant representations of x-ray images and shows promising potential for text-to-image generation. Specifically, DALL-E 2 created realistic x-ray images from short text prompts but performed poorly when prompted to generate specific CT, MRI, or ultrasound images. It was also able to plausibly reconstruct missing regions within a radiograph.
The model could do more still: for example, it could create a complete, full-body radiograph using only a single image of the knee as a starting point. However, DALL-E 2 was limited in its ability to generate images with pathological abnormalities.
Synthetic data generated by DALL-E 2 could greatly accelerate the development of new deep learning tools for radiology, as well as address privacy concerns related to data sharing between institutions. The study’s authors note that generated images should be subjected to quality control by domain experts to reduce the risk of incorrect information entering a generated data set.
They also emphasize the need for further research to fine-tune these models on medical data and to incorporate medical terminology, so as to create powerful models for data generation and augmentation in radiology research. Although DALL-E 2 is not publicly available for fine-tuning, other generative models, such as Stable Diffusion, are, and these could be adapted to generate a variety of medical images.
Overall, this viewpoint published by JMIR Publications provides a promising outlook for the future of AI image generation in radiology. Further research and development in this area could lead to exciting new tools for radiologists and medical professionals.
While there are limitations to be addressed, the potential benefits of using tools like DALL-E 2 and ChatGPT in research and medical training and education are significant. To this end, JMIR Medical Education is now inviting submissions for a new e-collection on the use of generative language models in medical education, as announced in a recent editorial by Dr Gunther Eysenbach.
What Does DALL-E 2 Know About Radiology?
Generative models, such as DALL-E 2 (OpenAI), could represent promising future tools for image generation, augmentation, and manipulation for artificial intelligence research in radiology, provided that these models have sufficient medical domain knowledge.
Herein, we show that DALL-E 2 has learned relevant representations of x-ray images, with promising capabilities in terms of zero-shot text-to-image generation of new images, the continuation of an image beyond its original boundaries, and the removal of elements; however, its capabilities for the generation of images with pathological abnormalities (eg, tumors, fractures, and inflammation) or computed tomography, magnetic resonance imaging, or ultrasound images are still limited.
The use of generative models for augmenting and generating radiological data thus seems feasible, although further fine-tuning and adaptation of these models to their respective domains are required first.