Drexel University

06/03/2026 | Press release | Distributed by Public on 06/03/2026 13:12

Getting an Exercise Form Coaching Assist From AI

Researchers demonstrate AI and computer-vision program designed to prevent exercise injuries, provide personalized coaching
June 3, 2026

Researchers from Drexel University and Michigan State University have demonstrated a program designed to use AI and computer vision to provide exercise form coaching in hopes of preventing injuries and improving outcomes.

As any athlete will tell you: perfect practice makes perfect. But for individuals who do not have regular access to coaches or trainers, maintaining good form can be tricky. In fact, during the COVID-19 pandemic, when many people were exercising at home, the U.S. Consumer Product Safety Commission reported a 48% rise in injuries related to at-home exercise. In hopes of preventing some of these injuries and extending the expert guidance of coaches, researchers from Drexel University and Michigan State University have developed a prototype of a program that uses artificial intelligence and computer vision to analyze video and provide form coaching in real time.

The program, which integrates biomechanical modeling with computer vision and a vision-language model, is designed to provide live, personalized feedback and explanations of the guidance it offers during an exercise - a feat that has proven to be challenging for most fitness coaching apps. The researchers published their work ahead of presenting their prototype, called BioCoach, at the Conference on Computer Vision and Pattern Recognition, hosted by the Institute of Electrical and Electronics Engineers and the Computer Vision Foundation in June.

"Many people who exercise at home with videos and apps don't get high-quality assessment of their movements," said Feng Liu, PhD, an assistant professor in Drexel's College of Engineering and Computing, who led the research. "Feedback is often too generic or simply encouragement but no actual form coaching. Our goal with BioCoach is to provide timely, specific cues grounded in body motion, closer to the kind of guidance a knowledgeable coach would give."

Liu's Visual Intelligence Lab at Drexel applies advanced computer vision, machine learning and 3D human-body modeling to study problems in exercise coaching, clinical gait assessment and classroom education.

To prepare BioCoach, the team started with an exercise-video coaching benchmark - the publicly available Qualcomm Exercise Video Dataset (QEVD), which includes hundreds of hours of exercise footage along with time-stamped coaching feedback.

The original feedback included only short coaching comments, such as "lower your body more." So the researchers created a new version of the dataset by re-annotating it with more detailed biomechanical targets, such as "increase elbow flexion to 90 degrees at the bottom." They also added short rationales for the guidance, such as "increase hip/knee flexion to distribute load."

In all, the team added more than 2,400 notes to over 200 videos used to train and test BioCoach. These annotations helped to prepare the large language model that provides coaching and guidance to the user. And because the time stamps were preserved in their annotated dataset, this new benchmark would enable the researchers to evaluate not only the guidance the system was offering, but also whether it responded at the right moment.
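To illustrate what grading "at the right moment" can look like, here is a minimal Python sketch. It is not the benchmark's actual metric: the `Feedback` record, the one-second tolerance, the greedy matching and the F1-style score are all assumptions chosen for illustration, and the timestamps are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feedback:
    """One time-stamped coaching cue (hypothetical record layout)."""
    time_s: float
    text: str


def timing_hits(predicted, reference, tolerance_s=1.0):
    """Count reference cues matched by a distinct predicted cue within tolerance."""
    unused = sorted(p.time_s for p in predicted)
    hits = 0
    for ref in sorted(reference, key=lambda r: r.time_s):
        # Greedily take the earliest unused prediction that is close enough.
        match = next((t for t in unused if abs(t - ref.time_s) <= tolerance_s), None)
        if match is not None:
            unused.remove(match)
            hits += 1
    return hits


def timing_f1(predicted, reference, tolerance_s=1.0):
    """Combine timing precision and recall into a single F1-style score."""
    hits = timing_hits(predicted, reference, tolerance_s)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: two annotated cues and three system responses.
reference = [
    Feedback(3.0, "lower your body more"),
    Feedback(10.0, "increase elbow flexion to 90 degrees at the bottom"),
]
predicted = [
    Feedback(3.4, "go lower"),
    Feedback(7.0, "keep going"),
    Feedback(10.5, "bend your elbows further"),
]
score = timing_f1(predicted, reference)
```

In this sketch, a response that arrives with no nearby annotated cue lowers precision, and an annotated cue with no nearby response lowers recall, so a system is penalized both for speaking at the wrong time and for staying silent.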

With the improved exercise video feedback dataset in hand, the team designed BioCoach to analyze each video through two complementary streams of information in order to access and deliver the proper guidance to the user.

One stream captures visual appearance and motion patterns using a 3D convolutional neural network - a deep-learning program adept at identifying individual objects in images and videos. The other allows BioCoach to estimate 3D skeletal movements and body shape, giving the program access to information about joint angles, ranges of motion and exercise phases.
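As a rough illustration of how a joint angle can be derived from estimated 3D skeleton points, the Python sketch below computes the angle at a joint from three keypoints using the standard dot-product formula. The function name and the keypoint coordinates are hypothetical, and this is not presented as BioCoach's own implementation.

```python
import math


def joint_angle_deg(a, b, c):
    """Return the angle at point b, in degrees, between segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0.0:
        raise ValueError("degenerate segment: two keypoints coincide")
    # Clamp to guard against floating-point drift just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical 3D keypoints (in metres) for one arm in a single video frame.
shoulder = (0.0, 1.4, 0.0)
elbow = (0.0, 1.1, 0.0)
wrist = (0.3, 1.1, 0.0)

# The upper arm and forearm are perpendicular here, so the elbow is at a right angle.
elbow_angle = joint_angle_deg(shoulder, elbow, wrist)
```

Tracking a value like this frame by frame is one way a system could estimate range of motion or detect the bottom of a repetition.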

With these two information streams, BioCoach can draw on structured biomechanical data specific to each joint. Before providing feedback, it first identifies the joints most relevant to each exercise - for example, the hips, knees and ankles for squats or the shoulders, elbows and wrists for push-ups - so that it can provide more detailed guidance.
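The joint-selection step can be pictured as a simple lookup. The Python sketch below uses the two exercise examples given above; the table name, the joint naming scheme (such as `left_knee`) and the angle values are assumptions for illustration, not details from the paper.

```python
# Exercise-to-joint table built from the two examples in the text.
RELEVANT_JOINTS = {
    "squat": ("hip", "knee", "ankle"),
    "push_up": ("shoulder", "elbow", "wrist"),
}


def select_joint_angles(exercise, frame_angles):
    """Keep only the joint angles that matter for the given exercise.

    `frame_angles` maps hypothetical names like "left_knee" to degrees.
    """
    if exercise not in RELEVANT_JOINTS:
        raise KeyError(f"no joint list defined for exercise: {exercise}")
    joints = RELEVANT_JOINTS[exercise]
    # The joint type is taken to be the last part of the name, e.g. "knee".
    return {
        name: angle
        for name, angle in frame_angles.items()
        if name.split("_")[-1] in joints
    }


# Illustrative angles (degrees) for one frame of a squat.
frame_angles = {
    "left_knee": 95.0,
    "right_knee": 97.0,
    "left_elbow": 170.0,
    "left_hip": 88.0,
}
squat_angles = select_joint_angles("squat", frame_angles)
```

Narrowing the data this way means that only the knee and hip values from this frame would be passed along, which is one plausible way to keep the feedback focused on the movement being performed.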

Through this process, the program is also able to use body-shape information and movement-quality analysis to provide the structured information its language model translates into specific, biomechanics-based feedback.

"Our goal was to build a system that does more than look at pixels and generate a generic comment," Liu said. "BioCoach exposes the model to 3D motion, joint angles and exercise-specific constraints, so the feedback can point to a concrete movement issue and explain why it matters."

After preparing the program, the team set out to test it against the top competition - video-language AI programs by research and development teams from NVIDIA, ByteDance, Alibaba, Salesforce, OpenAI, Shanghai Jiao Tong University, Chinese University of Hong Kong, Peking University and Peng Cheng Laboratory in China, and the Massachusetts Institute of Technology.

They tested the programs by showing each program a number of exercise videos - some from the original QEVD set and some that the team had annotated. The responses of each program were compared to those offered in the original QEVD dataset as well as those added by the researchers, with scoring based on how timely, accurate and detailed they were.

In responding to videos from the original dataset, BioCoach outperformed its nearest competition, Stream-VLM - a program created by researchers from MIT and NVIDIA - in text quality and judged correctness, while its timing score was close but slightly lower.

But BioCoach outpaced Stream-VLM across all metrics when its feedback was graded against the researchers' more specific annotations, showing particular improvements in biomechanical correctness and detailed, anatomy-specific feedback.

The researchers suggest that these results show that adding explicit 3D kinematics and biomechanical context can improve the quality and interpretability of real-time exercise feedback without substantially reducing responsiveness.

"It was encouraging to see that BioCoach was able to perform so well against programs made by some of the top researchers and companies in the AI field," Feng said. "This is still a prototype, but it shows how combining computer vision with structured biomechanical reasoning can make AI coaching systems more useful and easier to inspect."

The team plans to continue its work by enhancing the program so that it can estimate joint reaction forces and muscle activation patterns from videos in order to detect slight compensatory movements that could result in injuries during exercise.

"We believe this work could ultimately support exercise and physical-therapy apps that extend the expertise of human coaches and trainers between in-person sessions," Liu said. "A future system could help users receive more specific, timely feedback when they practice on their own, while still keeping human experts in the loop."

This research was supported by the National Science Foundation.

In addition to Liu, Yuyang Ji and Yixuan Shen, also from Drexel, and Shengjie Zhu, PhD, and Yu Kong, PhD, from Michigan State University contributed to this research.

Read the full paper here: https://arxiv.org/abs/2603.26938

Drexel University published this content on June 03, 2026, and is solely responsible for the information contained herein. Distributed via Public Technologies (PUBT), unedited and unaltered, on June 03, 2026 at 19:12 UTC.