Published: October 8, 2025
By Daniel Punch
Researchers at the University of Toronto and the Ontario Institute for Cancer Research (OICR) have developed an artificial intelligence system that can create simulated cancer genomes, paving the way for more accurate cancer diagnoses and more effective treatments without breaching patient confidentiality.
The system, called OncoGAN, uses generative AI to simulate tumour genomes across eight types of cancer, including breast, prostate and pancreatic cancers. The synthetic genomes simulate realistic patterns of genetic alterations and can be used to train and improve the algorithms that drive precision oncology.
OncoGAN is described in a new paper published in the journal Cell Genomics.
"With OncoGAN, we are creating realistic genomes out of nothing, with no connection to any real person, yet a huge amount of value scientifically," says the study's senior author Lincoln Stein, professor of molecular genetics at U of T's Temerty Faculty of Medicine and acting scientific director at OICR.
"These synthetic genomes don't contain any personal health information, and so they can be shared without limitation."
Ander Díaz-Navarro, left, and Lincoln Stein (supplied images)
The analysis of tumour genomes and the variations within their DNA has enabled new discoveries about how cancer develops and led to a surge of cutting-edge tests and medicines. It is the cornerstone of precision oncology, where cancer treatment is personalized to the unique biology of a patient's tumour.
But the algorithms used to analyze genomes are constrained because they have been trained on a small set of cancer genomes, relatively few of which are publicly available. The most commonly used tools were trained on a few dozen legacy genomes and can't fully capture the necessary biological diversity.
Although more recent genome sequencing data exist, access is often restricted because of concerns about the confidentiality of the patients from whom the samples were taken.
Beyond privacy, another advantage of OncoGAN's synthetic genomes is that their 'ground truth' - the full, error-free DNA sequence with all genomic variants identified - is known. In comparison, it is nearly impossible to know the ground truth of real-life genomes because of their complexity and the limits of sequencing technology, which means current genome analysis tools could be flawed because they were trained on imperfect data.
By generating genomes from scratch, OncoGAN gives researchers fully known, verified DNA sequences that can enable better, more precise genomic testing and analysis.
"Knowing the 'ground truth' of the genomes means they can be used to benchmark new algorithms with full knowledge of what the correct answer is," says Ander Díaz-Navarro, a postdoctoral researcher at OICR and first author of the paper.
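To illustrate what benchmarking against a known ground truth can look like, here is a minimal Python sketch. It is not taken from the OncoGAN paper or its software: the `Variant` type, the `benchmark` function and the toy variants are all hypothetical, and the only assumption is that both the true variants and a tool's calls can be reduced to sets of (chromosome, position, reference, alternate) records.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not OncoGAN's own format)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def benchmark(truth: set[Variant], called: set[Variant]) -> dict[str, float]:
    """Score a variant caller's output against a fully known ground truth."""
    tp = len(truth & called)   # variants correctly reported
    fp = len(called - truth)   # reported, but not actually present
    fn = len(truth - called)   # present, but missed by the tool
    precision = tp / (tp + fp) if called else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Toy data, invented purely for illustration.
truth = {
    Variant("chr1", 100, "A", "T"),
    Variant("chr2", 200, "G", "C"),
    Variant("chr3", 300, "C", "A"),
    Variant("chr4", 400, "T", "G"),
}
called = {
    Variant("chr1", 100, "A", "T"),
    Variant("chr2", 200, "G", "C"),
    Variant("chr3", 300, "C", "A"),
    Variant("chr5", 500, "G", "A"),  # false positive
}

result = benchmark(truth, called)
print(result)
```

With a real tumour genome the `truth` set is itself an estimate, so a score like this mixes the tool's errors with errors in the reference. With a synthetic genome the truth set is exact, so every false positive and every miss can be attributed to the tool being tested.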
OncoGAN is publicly available for download. Stein, Díaz-Navarro and colleagues have also generated 800 simulated genomes, which are available with open access and are already being used to train analysis tools in Stein's lab.
With better, more accurately trained tools to analyze cancer genomes, Stein says scientists could unlock more critical insights with the potential to transform cancer care. "The more we know about the biological factors that drive cancer, the better equipped we are to detect it as early as possible, treat it more effectively and even prevent it altogether."