Oak Ridge National Laboratory

Empowering an AI foundation model to accelerate plant research

New method expedites analysis of data from ORNL's Advanced Plant Phenotyping Laboratory on the Frontier supercomputer

Published: January 28, 2026
Updated: January 28, 2026
Hyperspectral imaging in ORNL's Advanced Plant Phenotyping Laboratory captures plant biochemical composition beyond visible light, resulting in massive amounts of data used to train an AI foundation model. Credit: ORNL, U.S. Dept. of Energy

Scientists at the Department of Energy's Oak Ridge National Laboratory have created a new method that more than doubles computer processing speeds while using up to 75 percent less memory to analyze plant imaging data. The advance removes a major computational bottleneck and accelerates AI-guided discoveries for the development of high-performing crops.

The method is a key step in the development of an AI foundation model using data from the Advanced Plant Phenotyping Laboratory (APPL) and run on Frontier, the world's first exascale supercomputer, at ORNL. The research supports projects aligned with the Genesis Mission, DOE's bold new endeavor to build the world's most powerful scientific platform to accelerate discovery science, strengthen national security and drive energy innovation.

Foundation models are large AI systems trained on massive datasets to make predictions across different domains. In this case, they help accelerate the development of hardy bioenergy and food crops using data captured during robotic examination of new plant varieties in APPL.

The new method, Distributed Cross-Channel Hierarchical Aggregation (D-CHAG), expedites analysis of the vast amounts of data generated as plants automatically move through APPL's diverse array of imaging stations. APPL's hyperspectral cameras capture data 24/7 on plant health, chemical makeup and structure, providing early detection of disease and stress and linking genes to desirable traits. The result is a world-class biotechnology capability that can speed the creation of resilient, high-yield crops that supply new fuels and materials and strengthen the nation's food security.

The data processing challenge lies in the nature of hyperspectral images. While traditional cameras use three color channels - red, green and blue - to capture an image, hyperspectral cameras capture hundreds of channels. Each channel represents a specific wavelength of light that can provide crucial data on how plants respond to their environment, how they metabolize nutrients, or how stress and disease affect their performance. Processing hyperspectral images is notoriously difficult: standard methods often try to handle all the channels at once, which consumes a considerable amount of computer memory and time.
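To give a rough sense of scale, the short Python sketch below counts how many image patches (tokens) a model would see for a three-channel photo versus a hyperspectral image. It is an illustration only: the image size, patch size and channel count are made-up values, and the assumption that every channel is cut into patches separately is ours, not a detail confirmed by the release.

```python
# Illustrative arithmetic only; all sizes below are hypothetical.

def tokens_per_image(height, width, patch, channels):
    """Count patch tokens, assuming each channel is tiled on its own."""
    patches_per_channel = (height // patch) * (width // patch)
    return patches_per_channel * channels

rgb_tokens = tokens_per_image(224, 224, 16, channels=3)
hyper_tokens = tokens_per_image(224, 224, 16, channels=300)

print(rgb_tokens)                  # 588
print(hyper_tokens)                # 58800
print(hyper_tokens // rgb_tokens)  # 100
```

Under these assumptions the hyperspectral image produces 100 times as many tokens as the RGB image, which is why handling every channel at once becomes so expensive.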

D-CHAG deploys a two-step process to provide a solution. In the first step, the work of breaking the images into small pieces for analysis is split among many graphics processing units (GPUs) in a technique called distributed tokenization. Each GPU handles only a subset of the channels. Because the work is divided up, no single processor gets overwhelmed, and data are processed much faster.
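The idea of giving each GPU only a subset of channels can be sketched in a few lines of plain Python. This is a single-process simulation under stated assumptions, not ORNL's implementation: the "workers" are ordinary loop iterations standing in for GPUs, and the function names `split_channels`, `tokenize_channel` and `distributed_tokenize` are hypothetical.

```python
# Simulated sketch: loop iterations stand in for GPUs; names are hypothetical.

def split_channels(num_channels, num_workers):
    """Assign contiguous channel ranges to workers as evenly as possible."""
    base, extra = divmod(num_channels, num_workers)
    ranges, start = [], 0
    for rank in range(num_workers):
        size = base + (1 if rank < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges

def tokenize_channel(channel, patch):
    """Cut one 2-D channel (a list of rows) into flattened patch x patch tokens."""
    height, width = len(channel), len(channel[0])
    tokens = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            tokens.append([channel[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens

def distributed_tokenize(image, num_workers, patch):
    """Each simulated worker tokenizes only its own slice of the channels."""
    result = {}
    for rank, chans in enumerate(split_channels(len(image), num_workers)):
        result[rank] = {c: tokenize_channel(image[c], patch) for c in chans}
    return result

# Toy image: 10 channels, each 4 x 4 pixels, split across 4 workers.
image = [[[c * 100 + r * 4 + col for col in range(4)] for r in range(4)]
         for c in range(10)]
per_worker = distributed_tokenize(image, num_workers=4, patch=2)
for rank, channels in per_worker.items():
    print(rank, sorted(channels))
```

In this toy run the four workers receive three, three, two and two channels respectively, so no single worker ever holds the tokens for all ten.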

Next, those smaller groups are merged in stages rather than all at once in a step called hierarchical aggregation, which combines information across the spectral regions. The approach reduces the amount of data to be processed at each stage, with the end result of lower memory requirements and faster computation. This level of efficiency means that larger foundation models can be trained on hyperspectral datasets without compromising their spatial or spectral resolution, making it possible to extract subtle yet significant patterns in plant physiology.
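Merging in stages rather than all at once can be pictured as a tree-shaped reduction. The Python sketch below is a simplified stand-in, not the method from the paper: it uses a plain element-wise average where a trained model would use a learned aggregation, and the `fan_in` parameter and function names are hypothetical.

```python
# Simplified stand-in: an average replaces the learned aggregation step.

def aggregate(group):
    """Combine a small group of feature vectors by element-wise mean."""
    n = len(group)
    return [sum(values) / n for values in zip(*group)]

def hierarchical_aggregate(features, fan_in=2):
    """Merge feature vectors fan_in at a time, stage by stage, until one remains."""
    level = list(features)
    stages = 0
    while len(level) > 1:
        level = [aggregate(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
        stages += 1
    return level[0], stages

# Eight toy per-group feature vectors, e.g. one per spectral region.
features = [[float(i), 2.0 * i] for i in range(8)]
merged, stages = hierarchical_aggregate(features, fan_in=2)
print(merged, stages)  # [3.5, 7.0] 3
```

Each stage only ever combines two vectors at a time here, and the number of vectors halves from eight to four to two to one, which mirrors the description of less data to process at each stage.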

The new method is detailed in a paper that was presented at the prestigious International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC25) held in November 2025.

Training next-generation AI models

"This project demonstrated a solution to the bottleneck that can develop when you have a very large number of parameters, such as hyperspectral data, and need to scale up into foundation models," said Aristeidis Tsaris, a research scientist working with the National Center for Computational Sciences at ORNL. "With D-CHAG, we were able to get significant performance improvements without conceding accuracy."

D-CHAG was successfully demonstrated using APPL hyperspectral data as well as a weather dataset on the Frontier exascale supercomputer at the Oak Ridge Leadership Computing Facility, a DOE Office of Science user facility at ORNL.

Key accomplishments include:

  • Up to a 75 percent reduction in memory usage compared to standard foundation model methods. This means that training that once required many high-end computers can now be done with fewer resources.
  • More than double the processing speed. Faster processing means that scientists can analyze large sets of data far more quickly than before.
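
As a back-of-the-envelope reading of those two figures, the Python sketch below applies them to a made-up workload. The baseline of 64 GB and 10 hours is purely hypothetical, and the result is a best case, since the release says "up to" 75 percent and "more than" double.

```python
# Hypothetical baseline; the percentages come from the figures quoted above.

def after_improvement(baseline_memory_gb, baseline_hours,
                      memory_reduction=0.75, speedup=2.0):
    """Return (memory, hours) after applying the quoted reduction and speedup."""
    memory = baseline_memory_gb * (1 - memory_reduction)
    hours = baseline_hours / speedup
    return memory, hours

print(after_improvement(64, 10))  # (16.0, 5.0)
```

In other words, a job that needed 64 GB and 10 hours would need as little as 16 GB and no more than 5 hours under these assumptions.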

D-CHAG helps plant scientists quickly accomplish tasks like measuring plant photosynthetic activity directly from an image, replacing laborious, time-intensive manual measurements, said Larry York, senior staff scientist in ORNL's Molecular and Cellular Imaging Group. "One of the project's next steps is to refine the model to predict photosynthetic efficiency of plants directly from those images. We're getting ready for a future in which hyperspectral imaging is more common, and the compute power to process it will be more widely available."

"Hyperspectral is the imaging modality that holds a lot of promise for plant transformation research," said John Lagergren, R&D associate staff member in ORNL's Plant Systems Biology Group. "But the computational complexity is a bottleneck that has prevented the training of advanced neural networks to extract meaningful biology from these images. This work is a big step to reducing that complexity and resolving the bottleneck."

Gaining faster insights at larger scales

APPL and its AI-enabled insights have enormous potential to advance the development of new crop varieties and to benefit agricultural practices. By drastically reducing the overhead associated with processing hyperspectral images, researchers can now obtain insights faster and at larger scales.

APPL's advanced phenotyping capabilities and AI foundation model also play a key role in two DOE-supported projects. Both projects are part of the DOE Genesis Mission at ORNL, linking AI with domain science to quickly deliver solutions for national priorities.

  • The Orchestrated Platform for Autonomous Laboratories (OPAL) is a multi-lab initiative combining AI, robotics and automated experimentation to create a network of labs that can learn, adapt, and expedite discoveries. OPAL integrates the work of ORNL and three collaborating DOE national laboratories - Argonne, Lawrence Berkeley and Pacific Northwest - to turn biological discovery into a self-driving process.
  • The Generative Pretrained Transformer for Genomic Photosynthesis project draws on APPL foundation model success to produce simulations of highly accurate genetic modifications in plants for faster development of energy crops with enhanced photosynthesis and productivity.

In a future in which cameras such as those used in APPL are drone-mounted and deployed across croplands, farmers could use the technology to monitor crops in real time, detecting issues such as water stress, nutrient deficiencies, or pest infestations before they become severe.

For plant breeders, AI-aided phenotyping allows researchers to select plants with desirable traits more effectively. This knowledge can be used to develop new varieties of crops that grow faster, use water more efficiently, or produce higher yields. This high-powered method of data analysis could also lead to the discovery of plant compounds useful for medicine or bioengineering.

The integration of hyperspectral imaging from APPL with the power of supercomputers such as Frontier represents a major leap forward in plant transformation research and AI technology. The approach supports innovation for a robust bioeconomy that contributes to the nation's energy security and economic growth.

Other ORNL scientists on the project include Xiao Wang, Isaac Lyngaas, Prasanna Balaprakash, Dan Lu and Feiyi Wang, along with Mohamed Wahib of the RIKEN Center for Computational Science. The project was supported by the Center for Bioenergy Innovation, a Bioenergy Research Center sponsored by the DOE Office of Science Biological and Environmental Research program, as well as by ORNL laboratory-directed research and development funding.

This research supports DOE's Genesis Mission, a national initiative to build the world's most powerful scientific platform to accelerate discovery science, strengthen national security, and drive energy innovation. It does so by enabling AI-driven, exascale-powered advances that enhance America's energy innovation, global competitiveness and security.

UT-Battelle manages ORNL for the Department of Energy's Office of Science, the single largest supporter of basic research in the physical sciences in the United States. The Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit energy.gov/science. - Stephanie Seay

Media Contact
Kimberly A Askey, Communications Lead, Biological and Environmental Systems Science Directorate, 865.576.2841 | [email protected]