06/17/2026 | Press release | Distributed by Public on 06/17/2026 08:49
National Eye Institute scientists have developed an artificial intelligence (AI) system that can rapidly analyze large-scale gene expression data and generate clear, biologically meaningful explanations, a task that currently demands significant time, expertise, and manual effort from researchers.
The tool, called IAN (Intelligent system for omics data Analysis and discovery), is described in a new report published in Cell Reports Methods.
When researchers study diseases like cancer or autoimmune conditions, they generate lists of thousands of genes that are expressed differently in sick versus healthy people. Making sense of this data (understanding which biological pathways are disrupted, which genes act as key control switches, and what it all means for disease) is one of the most labor-intensive challenges in biology. Current tools can organize and categorize gene data, but they stop short of synthesizing the data into a coherent biological story.
IAN acts like a team of specialized AI analysts working in parallel. It automatically cross-references a researcher's gene list against multiple biological databases (including those housing pathway, regulatory, and protein-protein interaction information) and assigns a dedicated AI "agent" to each database.
Each agent summarizes its findings, and a central large language model then weaves those summaries together into an integrated, interactive, user-friendly report. The report includes a high-level biological narrative, identification of key regulatory genes, proposed disease mechanisms, and testable hypotheses.
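For readers who want a concrete picture of this fan-out and merge pattern, the following is a minimal Python sketch, not IAN's actual code. It assumes three toy in-memory "databases" with placeholder gene names, a hypothetical run_agent function standing in for each per-database agent, and a hypothetical synthesize function that simply concatenates summaries where the real system would call a large language model.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy "databases" (term -> member genes). Placeholder gene names
# are used on purpose; none of this is real biological data.
DATABASES = {
    "pathway": {"Pathway P1": {"GENE_A", "GENE_B", "GENE_C"},
                "Pathway P2": {"GENE_X", "GENE_Y"}},
    "regulatory": {"Regulator R1 targets": {"GENE_B", "GENE_C", "GENE_D"}},
    "interaction": {"Complex C1": {"GENE_A", "GENE_B"}},
}

def run_agent(name, database, genes):
    """Stand-in for one agent: list the terms that overlap the gene list."""
    hits = {}
    for term, members in database.items():
        overlap = sorted(members & genes)
        if overlap:
            hits[term] = overlap
    return {"database": name, "hits": hits}

def synthesize(summaries):
    """Stand-in for the central model: merge agent summaries into text."""
    lines = []
    for summary in sorted(summaries, key=lambda s: s["database"]):
        for term, overlap in sorted(summary["hits"].items()):
            lines.append(f"[{summary['database']}] {term}: {', '.join(overlap)}")
    return "\n".join(lines)

def analyze(gene_list):
    """Run one agent per database in parallel, then combine their summaries."""
    genes = set(gene_list)
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_agent, name, db, genes)
                   for name, db in DATABASES.items()]
        summaries = [future.result() for future in futures]
    return synthesize(summaries)

report = analyze(["GENE_A", "GENE_B", "GENE_C"])
print(report)
```

In this sketch each agent only computes set overlaps, so the parallelism is illustrative; the pattern matters more when each agent makes slow calls to external databases or language models.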
"The goal is to bridge the gap between generating data and understanding what it means," said lead author Vijayaraj Nagarajan, Ph.D., of the Laboratory of Immunology at the NEI, part of the National Institutes of Health. By automating the synthesis step, IAN provides the potential to dramatically accelerate the pace of biological discovery and make sophisticated analysis more accessible to a broader range of researchers.
In a demonstration of its hypothesis-generation potential, IAN analyzed published data from patients with uveitis (a chronic inflammatory eye disease) and independently identified RELB (a key transcription factor implicated in immune regulation) as a likely key regulator of the disease's immune response. The hypothesis is consistent with the immunology literature, but no direct link between RELB and uveitis had previously been established.
In tests of IAN's accuracy, the system achieved a 100% groundedness score, meaning every gene, pathway, and metric it reported was verifiably traceable back to the input data. Expert evaluators rated IAN's outputs an average of 4.75 out of 5 for accuracy and overall satisfaction. Performance was also consistent across multiple underlying AI models.
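One simple way to read a groundedness score is as the fraction of reported items that can be found in the input data. The Python sketch below illustrates that reading only; the release does not specify how the score was actually computed, and the function name, the placeholder gene names, and the empty-report convention are all assumptions made for illustration.

```python
def groundedness(reported_items, input_items):
    """Fraction of reported items that also appear in the input data.

    Assumed definition for illustration; not taken from the IAN paper.
    """
    reported = set(reported_items)
    if not reported:
        # Convention chosen here: an empty report makes no ungrounded claims.
        return 1.0
    return len(reported & set(input_items)) / len(reported)

# Placeholder names, not real data.
input_genes = ["GENE_A", "GENE_B", "GENE_C"]
fully_grounded = groundedness(["GENE_A", "GENE_B"], input_genes)
half_grounded = groundedness(["GENE_A", "GENE_Z"], input_genes)
print(f"{fully_grounded:.0%} {half_grounded:.0%}")
```

Under this reading, a 100% score means no reported gene, pathway, or metric falls outside the input, which is what the release describes.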
IAN is free and open source, available at github.com/NIH-NEI/IAN. A Docker container version is also available for easy installation across different computing environments.
This work was supported by the Intramural Research Program of the National Institutes of Health, NEI.
Reference:
Nagarajan, V., Horai, R., Shi, G., Yu, C.-R., Gopalakrishnan, J., Yadav, M.K., Liew, M.H., Gentilucci, C., and Caspi, R.R. (2026). IAN, an intelligent system for omics data analysis and discovery. Cell Reports Methods, 6, 101503. doi.org/10.1016/j.crmeth.2026.101503