NIST - National Institute of Standards and Technology

05/05/2026 | Press release | Archived content

Functional Profiling of Thousands of Sequence-Diverse Protease Homologs with GROQ-seq

Published
May 5, 2026

Author(s)

James McLellan, Svetlana Ikonomova, Shwetha Sreenivasan, Alan Amin, Catherine Baranowski, Amanda Reider Apel, Peter Kelly, David Ross, Aviv Spinner

Abstract

High-quality datasets that span broad sequence diversity are essential for understanding protein sequence-function relationships beyond local mutational landscapes. Here, we applied Growth-based Quantitative Sequencing (GROQ-seq) to measure function across an 11,722-member protease library composed of natural homologs and AI-shrunken variants. The library spans vast sequence diversity, with Levenshtein distances of up to 245 and a mean pairwise sequence identity of 41 % to TEV protease S219V. We identified sequence-divergent TEV protease homologs that preserve function against the native TEV protease substrate. These findings reveal the robustness of protease activity across highly diverse sequences. We further demonstrate that the GROQ-seq assay is well suited to screening large, diverse protein libraries for function, enabling efficient data generation at scale for training machine learning models across broad sequence landscapes.
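
For readers unfamiliar with the Levenshtein distance cited above, the following is a minimal sketch of how such a distance between two protein sequences might be computed. It is not the authors' pipeline. The function names and the toy sequences are hypothetical, and the normalized score is only an illustrative, alignment-free proxy, not the sequence identity measure reported in the abstract.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-residue insertions, deletions,
    or substitutions needed to turn sequence a into sequence b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence in the inner loop
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, res_a in enumerate(a, start=1):
        current = [i]
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            current.append(min(
                previous[j] + 1,         # deletion from a
                current[j - 1] + 1,      # insertion into a
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Hypothetical helper: 1 - distance / longer length.
    An assumption for illustration, NOT the paper's identity metric."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    # Toy four-residue strings, not real protease sequences.
    print(levenshtein("ACDE", "ACDF"))
    print(normalized_similarity("ACDE", "ACDF"))
```

A distance of 245, as quoted in the abstract, would mean that at least 245 single-residue edits separate the two sequences being compared.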

Citation
https://www.biorxiv.org/

Pub Type
Websites

Keywords

AI-Ready Biological Data

Citation

McLellan, J., Ikonomova, S., Sreenivasan, S., Amin, A., Baranowski, C., Reider Apel, A., Kelly, P., Ross, D. and Spinner, A. (2026), Functional Profiling of Thousands of Sequence-Diverse Protease Homologs with GROQ-seq, https://www.biorxiv.org/, [online], https://www.biorxiv.org/ (Accessed May 7, 2026)

NIST - National Institute of Standards and Technology published this content on May 05, 2026, and is solely responsible for the information contained herein. Distributed via Public Technologies (PUBT), unedited and unaltered, on May 07, 2026 at 09:27 UTC. If you believe the information included in the content is inaccurate or outdated and requires editing or removal, please contact us at [email protected]