September 8, 2025 | Press release
News Highlights:
TOKYO - September 8, 2025 - NTT has developed Fast Sparse Modeling Technology, a suite of world-class AI algorithms for information selection that accelerates data analysis by up to 73 times compared with conventional algorithms. This technology provides a theoretical guarantee of achieving the same analytical accuracy as conventional sparse modeling methods, enabling faster data-driven decision-making without compromising quality.
Because the algorithm suite can be applied to a wide range of data formats, it is expected to significantly reduce analysis backlogs in industries such as manufacturing, healthcare, marketing, and energy, where vast amounts of data are generated. Part of this technology is already available through NTT DOCOMO BUSINESS' no-code AI development tool Node-AI.
Some of these results will be presented at ECML PKDD (1) 2025; related results have been published at NeurIPS (2) 2019 and 2024, ICML (3) 2020, and AISTATS (4) 2023.
Background
Recent advances in sensing technologies have made it possible to collect vast amounts of diverse data from people and objects. At NTT, such data is analyzed to enable data-driven decision-making across a wide range of fields. Analyses that identify the essential information in a large dataset are frequently conducted in the early stages of a project. These methods are particularly valuable because their results are highly interpretable, supporting the design of subsequent analytical approaches and the formulation of strategic hypotheses.
Sparse modeling is a widely used machine learning technique for this type of information selection. By assuming sparsity, which means that only a small portion of the acquired information is truly necessary while the majority is not, sparse modeling makes it possible to extract critical pieces of information from massive datasets.
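As a concrete illustration of this idea, the lasso is the textbook form of sparse modeling: an L1 penalty drives most coefficients exactly to zero, so the surviving coefficients identify the essential pieces of information. The sketch below is illustrative only (synthetic data and a standard iterative soft-thresholding solver, not NTT's technology); it recovers 3 relevant features out of 200 candidates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 candidate features, but only 3 actually drive y (sparsity).
n, d = 100, 200
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[[5, 42, 120]] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.standard_normal(n)

def soft_threshold(z, t):
    # Proximal operator of the L1 norm: pulls small values to exactly zero.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista_lasso(X, y, lam, n_iter=1000):
    # Minimize 0.5*||y - X w||^2 + lam*||w||_1 by proximal gradient (ISTA).
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = soft_threshold(w - step * (X.T @ (X @ w - y)), step * lam)
    return w

w = ista_lasso(X, y, lam=60.0)
selected = np.flatnonzero(w)
print("selected features:", selected)
```

Almost all of the 200 coefficients come out exactly zero; the few that survive are the "critical pieces of information" the text refers to.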
However, as datasets continue to grow in scale, the processing time required for sparse modeling also increases, making it difficult to complete analyses within practical timeframes. Prolonged analysis delays PDCA (plan-do-check-act) cycles, for example by making results unavailable in time for weekly meetings, and ultimately slows decision-making. In some cases, measures such as using only a subset of the data are taken to shorten processing time. Yet these measures leave the remaining data unanalyzed, creating idle data and raising the risk of overlooking important insights.
Summary of Results
■ Establishment of a Pruning-Based Acceleration Algorithm:
To enable sparse modeling to analyze massive datasets within practical timeframes, NTT has developed an acceleration algorithm. Sparse modeling achieves information selection by assuming that only a small portion of the data is essential. In other words, this assumption also implies that a large amount of unnecessary information is present. By developing a pruning algorithm that safely skips computations related to such unnecessary information, the company achieved up to 73 times faster performance compared with the original algorithm, without any loss of accuracy. This has been validated both theoretically and experimentally, and related papers have been accepted at top international conferences in AI and machine learning, including NeurIPS (5, 10) and ICML (6) (Figure 1). This breakthrough enables analyses that would normally take more than a month to be completed in less than a day, thereby accelerating decision-making. It also makes it possible to analyze previously unused data that had been left idle due to time constraints.
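The pruning principle can be illustrated with a classic "safe screening" rule from the research literature, the SAFE test of El Ghaoui et al. This is a simpler relative of NTT's pruning algorithm, not the algorithm itself, but it shows the same two properties: features whose coefficients are provably zero at the optimum are discarded before optimization, and the reduced problem yields the same solution as the full one.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 200
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[[5, 42, 120]] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.standard_normal(n)

def ista_lasso(X, y, lam, n_iter=3000):
    # Plain proximal-gradient solver for 0.5*||y - X w||^2 + lam*||w||_1.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = w - step * (X.T @ (X @ w - y))
        w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)
    return w

corr = np.abs(X.T @ y)
lam_max = corr.max()        # smallest lam at which the solution is all zeros
lam = 0.8 * lam_max

# SAFE test: feature j provably has a zero coefficient at the optimum if
# |x_j' y| < lam - ||x_j|| * ||y|| * (lam_max - lam) / lam_max,
# so all computations for it can be skipped without changing the result.
bound = lam - np.linalg.norm(X, axis=0) * np.linalg.norm(y) * (lam_max - lam) / lam_max
keep = corr >= bound

w_full = ista_lasso(X, y, lam)                    # solve with all 200 features
w_pruned = np.zeros(d)
w_pruned[keep] = ista_lasso(X[:, keep], y, lam)   # solve only the survivors

print(f"{keep.sum()} of {d} features survive screening")
print("max |difference| between solutions:", np.abs(w_full - w_pruned).max())
```

The pruned run touches only a handful of columns yet produces the same coefficients as the full run, which is the sense in which such pruning is "safe".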
■ Extension to Diverse Data Formats:
Improving analytical accuracy requires not only the data itself but also effective use of its structure as auxiliary information. For example, communication data in networks has a network-structured format, regional traffic volume data has a group-structured format, and e-commerce product data has a hierarchical (tree-structured) format based on product categories. By incorporating these structural formats into the analysis along with the data, further improvements in accuracy can be achieved. NTT successfully applied the pruning algorithm to group-structured (7) and network-structured (8) data, with related research published at leading conferences such as AISTATS. Most recently, an algorithm adapted to hierarchical (tree-structured) data (9) was accepted at ECML PKDD, a top international conference in machine learning and data mining. As a result, this algorithm suite now covers the majority of practical data formats (Figure 1), offering the potential to eliminate analysis backlogs across industries.
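For group-structured data, the standard formulation is the group lasso, where an L2 penalty on each group zeroes out entire groups at once rather than individual features. The sketch below is a minimal proximal-gradient version on synthetic data (not NTT's accelerated block coordinate descent from the cited papers); it selects 2 relevant groups out of 20:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 120, 100
groups = [np.arange(g * 5, g * 5 + 5) for g in range(20)]  # 20 groups of 5 features
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[groups[3]] = 2.0 * rng.standard_normal(5)    # only groups 3 and 11 matter
w_true[groups[11]] = 2.0 * rng.standard_normal(5)
y = X @ w_true + 0.01 * rng.standard_normal(n)

def group_prox(w, t, groups):
    # Block soft-thresholding: shrinks each group's norm, zeroing whole groups.
    out = w.copy()
    for g in groups:
        nrm = np.linalg.norm(w[g])
        out[g] = 0.0 if nrm <= t else (1.0 - t / nrm) * w[g]
    return out

def group_lasso(X, y, lam, groups, n_iter=2000):
    # Minimize 0.5*||y - X w||^2 + lam * sum_g ||w_g||_2 by proximal gradient.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = group_prox(w - step * (X.T @ (X @ w - y)), step * lam, groups)
    return w

w = group_lasso(X, y, lam=60.0, groups=groups)
active = [gi for gi, g in enumerate(groups) if np.linalg.norm(w[g]) > 1e-8]
print("active groups:", active)
```

Because the penalty acts on group norms, the structural information (which features belong together) is used directly, which is how structure-aware variants improve accuracy over treating features independently.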
Figure 1 Supported data structures, acceleration rates, publication venues, and usage status of the technology
Key Features of the Technology
■ Up to 73 Times Faster Analysis with Sparse Modeling:
By leveraging the sparsity assumption in sparse modeling, which states that only a small portion of the information is essential, NTT introduced a proprietary pruning algorithm that safely skips computations corresponding to unnecessary information. This approach achieved up to 73 times faster performance compared with the original algorithm (Figure 2).
Figure 2 Establishment of a pruning-based algorithm for accelerating sparse modeling
■ Theoretical Guarantee of No Accuracy Loss Compared with Conventional Techniques:
Since this technology only prunes computations corresponding to unnecessary information in conventional methods, it is theoretically guaranteed that accuracy is not compromised compared with existing techniques (Figure 3).
Figure 3 Illustration of the pruning process in this technology
■ Coverage of Practical Data Formats:
This technology can incorporate structural information when data has group, network, or hierarchical (tree) formats. By utilizing these structures in addition to the data itself, it is expected to improve analysis accuracy across various domain-specific applications (Figure 4).
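For the hierarchical (tree) format, a standard construction in the literature is tree-structured group regularization: every node of the tree defines a group containing its whole subtree, so zeroing a node zeroes everything below it. A known result (Jenatton et al., 2011) is that the proximal operator of this penalty can be computed exactly by one pass of block soft-thresholding from the leaves up to the root. The sketch below on a 7-node feature tree is illustrative only, not the ECML PKDD algorithm, which prunes nodes to accelerate exactly this kind of computation:

```python
import numpy as np

# A tiny feature tree: node i has children 2i+1 and 2i+2 (7 nodes in total).
children = {0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [], 4: [], 5: [], 6: []}

def subtree(node):
    # All features in the subtree rooted at `node` (the group for that node).
    out, stack = [], [node]
    while stack:
        i = stack.pop()
        out.append(i)
        stack.extend(children[i])
    return out

def tree_prox(v, t):
    # Exact prox of the sum-of-subtree-norms penalty (Jenatton et al., 2011):
    # block soft-threshold each subtree group, processing leaves before the root.
    w = v.copy()
    for node in sorted(children, key=lambda i: len(subtree(i))):
        g = subtree(node)
        nrm = np.linalg.norm(w[g])
        w[g] = 0.0 if nrm <= t else (1.0 - t / nrm) * w[g]
    return w

v = np.array([0.5, 3.0, 0.2, 2.0, 1.0, 0.1, 0.1])
w = tree_prox(v, t=0.8)
print(np.round(w, 3))
# Hierarchical sparsity: every zeroed node has a fully zeroed subtree.
for node in children:
    if w[node] == 0.0:
        assert all(w[i] == 0.0 for i in subtree(node))
```

Here node 2 and its entire subtree are zeroed together, mirroring how a product category and all of its sub-categories would be discarded as a unit in e-commerce data.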
Figure 4 Illustration of the data structure formats supported by this technology with example data
Applications
In the manufacturing sector, this technology can be used to optimize plant production efficiency by estimating the factors that increase or decrease production from time-series data collected by sensors installed in the plant. If the technology can identify the specific time points that affected production, it provides clues for investigating what operations were performed at those times, helping to develop and refine control strategies for higher production efficiency. For example, a sensor sampling once per second records 86,400 time points in a single day. As the collection period grows, the dataset can reach tens of millions or even hundreds of millions of time points, which may require more than a month of analysis on a standard computer. Depending on the data characteristics and problem settings, this technology can reduce such analysis times to less than a day.
Other potential applications include estimating genetic factors of specific diseases in the medical field to aid in new drug and treatment development, identifying characteristic customer behaviors from purchase data in marketing to optimize advertising and coupons, and estimating equations describing plasma behavior from sensor data in fusion reactors to support control strategies for stable reactor operation (Figure 5).
Figure 5 Examples of applications of this technology across different fields
Overview of Product Integration
Part of the Fast Sparse Modeling Technology has already been integrated into NTT DOCOMO BUSINESS' no-code AI development tool Node-AI (11) and is available for use. Node-AI is a tool that allows users to build data analysis AI without writing code. It is particularly strong in analyzing time-series data and supports tasks such as factor analysis to provide the basis for AI predictions. With the integration of Fast Sparse Modeling Technology, users can perform fast and highly accurate information selection without the need for coding [1, 2].
Future Developments
Since this technology supports a wide range of data formats, it can be applied across various fields. We will promote its practical use, including integration into Node-AI, contributing to the expansion of NTT Group's business areas. In addition, we will continue research and development to advance sparse modeling, including improvements in accuracy and memory efficiency. We will also explore diverse applications in fields where large-scale data analysis is essential, such as natural sciences including fusion research (AI for Science (12)).
1 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
2 Annual Conference on Neural Information Processing Systems
3 International Conference on Machine Learning
4 International Conference on Artificial Intelligence and Statistics
5 Y. Ida, S. Kanai, A. Kumagai, T. Iwata, and Y. Fujiwara, "Fast Iterative Hard Thresholding Methods with Pruning Gradient Computations," Advances in Neural Information Processing Systems (NeurIPS), 2024.
6 Y. Ida, S. Kanai, Y. Fujiwara, T. Iwata, K. Takeuchi, and H. Kashima, "Fast Deterministic CUR Matrix Decomposition with Accuracy Assurance," International Conference on Machine Learning (ICML), 2020.
7 Y. Ida, S. Kanai, and A. Kumagai, "Fast Block Coordinate Descent for Non-Convex Group Regularizations," International Conference on Artificial Intelligence and Statistics (AISTATS), 2023.
8 Y. Ida, Y. Fujiwara, and H. Kashima, "Fast Block Coordinate Descent for Sparse Group Lasso," The Japanese Society for Artificial Intelligence, 2021.
9 Y. Ida, S. Kanai, A. Kumagai, T. Iwata, and Y. Fujiwara, "Fast Proximal Gradient Methods with Node Pruning for Tree-Structured Sparse Regularization," European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2025.
10 Y. Ida, Y. Fujiwara, and H. Kashima, "Fast Sparse Group Lasso," Advances in Neural Information Processing Systems (NeurIPS), 2019.
11 Node-AI: https://nodeai.io/
12 An initiative to accelerate the entire scientific research process using AI.
[1] FastSGL: A Fast, Highly Interpretable Model (Ver. 3.9.6), https://note.com/nodeai/n/nb947c9e692ac
[2] FastGSCAD: A Model Optimized for Learning Speed and Variable Selection Accuracy (Ver. 3.22.0), https://note.com/nodeai/n/nf879b51f36bf
About NTT
NTT contributes to a sustainable society through the power of innovation. We are a leading global technology company providing services to consumers and businesses as a mobile operator and a provider of infrastructure, networks, applications, and consulting. Our offerings include digital business consulting, managed application services, workplace and cloud solutions, and data center and edge computing, all supported by our deep global industry expertise. We have over $90B in revenue, 340,000 employees, and $3B in annual R&D investment. Our operations span 80+ countries and regions, allowing us to serve clients in over 190 of them. We serve over 75% of Fortune Global 100 companies, thousands of other enterprise and government clients, and millions of consumers.