The Ohio State University


Teaching robots how to interact with their surroundings

Robots that lack understanding of the real world are often challenged by new and complex tasks. Photo: Getty Images
November 13, 2025 | 3:00 PM America/New_York


Researchers explore how AI models reason, solve spatial problems

Tatyana Woodall
Ohio State News

When it comes to navigating their surroundings, machines have a natural disadvantage compared to humans. To help hone the visual perception abilities they need to understand the world, researchers have developed a novel training dataset for improving spatial awareness in robots.

In new research, experiments showed that robots trained with this dataset, called RoboSpatial, outperformed those trained with baseline models at the same robotic task, demonstrating a complex understanding of both spatial relationships and physical object manipulation.

For humans, visual perception shapes how we interact with the environment, from recognizing different people to maintaining an awareness of our body's movements and position. Despite previous attempts to imbue robots with these skills, efforts have fallen short as most are trained on data that lacks sophisticated spatial understanding.

Because deep spatial comprehension is necessary for intuitive interaction, these spatial reasoning challenges, if left unaddressed, could hinder future AI systems' ability to comprehend complex instructions and operate in dynamic environments, said Luke Song, lead author of the study and a PhD student in engineering at The Ohio State University.

"To have true general-purpose foundation models, a robot needs to understand the 3D world around it," he said. "So spatial understanding is one of the most crucial capabilities for it."

The study was recently given as an oral presentation at the Conference on Computer Vision and Pattern Recognition (CVPR).

To teach robots how to better interpret perspective, RoboSpatial includes more than a million real-world indoor and tabletop images, thousands of detailed 3D scans, and 3 million labels describing rich spatial information relevant to robotics. Using these vast resources, the framework pairs 2D egocentric images with full 3D scans of the same scene so the model learns to pinpoint objects using either flat-image recognition or 3D geometry.
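As a rough illustration of what that pairing might look like in practice, the sketch below bundles a 2D image, a 3D scan of the same scene and a spatial label into a single training example. The class, field names and file paths here are invented for this sketch and are not the released RoboSpatial format.

```python
from dataclasses import dataclass

# Hypothetical illustration of a paired 2D/3D spatial training example.
# The field names below are invented for this sketch; they are not the
# actual RoboSpatial schema.

@dataclass
class SpatialExample:
    image_path: str   # 2D egocentric RGB image of the scene
    scan_path: str    # full 3D scan of the same scene
    question: str     # spatial question posed about the scene
    answer: str       # ground-truth label (e.g., "yes"/"no")

example = SpatialExample(
    image_path="scenes/kitchen_042/rgb.jpg",
    scan_path="scenes/kitchen_042/scan.ply",
    question="Is the mug to the left of the laptop?",
    answer="yes",
)

# A model trained on such pairs can ground the same question in either the
# flat image or the 3D geometry, which is the pairing idea described above.
print(example.question, "->", example.answer)
```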

According to the study, this process closely mimics how visual cues work in the real world.

For instance, while current training datasets might allow a robot to accurately describe a "bowl on the table," the model would lack the ability to discern where on the table it actually is, where it should be placed to remain accessible, or how it might fit in with other objects. In contrast, RoboSpatial could rigorously test these spatial reasoning skills in practical robotic tasks, first by demonstrating object rearrangement and then by examining the models' capacity to generalize to new spatial reasoning scenarios beyond their original training data.

"Not only does this mean improvements on individual actions like picking up and placing things, but also leads to robots interacting more naturally with humans," said Song.

One of the systems the team tested this framework on was a Kinova Jaco robot, an assistive arm that helps people with disabilities connect with their environment.

During training, it was able to correctly answer simple closed-ended spatial questions such as "Can the chair be placed in front of the table?" or "Is the mug to the left of the laptop?"
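To make "closed-ended" concrete, the toy loop below scores yes/no answers to spatial questions against ground-truth labels. The answer_spatial_question function, the example questions and the file paths are placeholders for illustration, not part of the published system.

```python
# Minimal sketch of closed-ended spatial evaluation, assuming a model that
# returns "yes" or "no" for a question about an image.

def answer_spatial_question(image_path: str, question: str) -> str:
    """Stand-in for a trained model; always answers 'yes' in this sketch."""
    return "yes"

eval_set = [
    ("scenes/office_007/rgb.jpg", "Can the chair be placed in front of the table?", "yes"),
    ("scenes/office_007/rgb.jpg", "Is the mug to the left of the laptop?", "no"),
]

# Count how many predicted answers match the ground-truth labels.
correct = sum(answer_spatial_question(img, q) == gold for img, q, gold in eval_set)
print(f"accuracy: {correct / len(eval_set):.2f}")
```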

These promising results suggest that making spatial context a standard part of robotic perception could lead to safer and more reliable AI systems, said Song.

While there are still many unanswered questions about AI development and training, the work concludes that RoboSpatial has the potential to serve as a foundation for broader applications in robotics, noting that further spatial reasoning advances will likely branch from it.

"I think we will see a lot of big improvements and cool capabilities for robots in the next five to ten years," said Song.

Co-authors include Yu Su from Ohio State and Valts Blukis, Jonathan Tremblay, Stephen Tyree and Stan Birchfield from NVIDIA. This work was supported by the Ohio Supercomputer Center.
