Our client is a global technology firm offering products and services across networking, security, collaboration, applications, and the cloud. With nearly 80,000 employees globally, the company develops, manufactures, and sells telecommunications equipment, networking hardware, software, and other high-technology products and services.
The Challenge
The client’s Business Critical Services team faced a demanding task: managing a network of 4.3 million devices while processing a large volume of weekly device health data. The client needed to identify devices that were about to fail, but its existing machine learning (ML) application took five days to run device analyses. With no real-time or near-real-time prediction capability, proactive decision-making was not possible. Additionally, labeling device reset details was a complex process that demanded considerable time and manual effort. Some devices remained unlabeled, which led to data loss and a critical gap in understanding the reasons behind direct resets.
The Solution
Persistent’s solution used advanced ML models to predict potential disruptions within the client’s network. The models estimated the likelihood of crashes for specific combinations of hardware, software, and features. We categorized devices as high, medium, or low risk, enabling targeted interventions and actionable recommendations to keep services uninterrupted. With these predictive insights, the client could take preemptive measures to mitigate risks. We also automated the labor-intensive task of reset labeling with an ensemble of Seq2Seq models (including BERT, XLNet, MPNet, ELMo, and USE) and a customized N-gram Regex model. This reduced manual effort and improved data accuracy.
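To make the two ideas above concrete, here is a minimal Python sketch of how crash likelihoods might be bucketed into risk tiers and how model predictions might be combined with a regex rule to label a reset. It is an illustration only, not the delivered implementation: the thresholds, label names, regex patterns, and function names are all hypothetical, and the model predictions are passed in as plain strings rather than produced by the actual models.

```python
import re
from collections import Counter

# Hypothetical cut-offs; the real thresholds are not stated in this case study.
HIGH_RISK = 0.7
MEDIUM_RISK = 0.3


def risk_tier(crash_probability: float) -> str:
    """Map a predicted crash likelihood (0.0 to 1.0) to a risk tier."""
    if not 0.0 <= crash_probability <= 1.0:
        raise ValueError("crash_probability must be between 0 and 1")
    if crash_probability >= HIGH_RISK:
        return "high"
    if crash_probability >= MEDIUM_RISK:
        return "medium"
    return "low"


# Hypothetical rules standing in for the customized N-gram Regex model.
RESET_PATTERNS = {
    "power_loss": re.compile(r"\bpower (loss|cycle|failure)\b", re.IGNORECASE),
    "software_crash": re.compile(r"\b(kernel panic|watchdog timeout)\b", re.IGNORECASE),
}


def regex_label(log_line: str):
    """Return the first rule-based label that matches, or None."""
    for label, pattern in RESET_PATTERNS.items():
        if pattern.search(log_line):
            return label
    return None


def ensemble_label(model_votes, log_line: str) -> str:
    """Majority vote over the model predictions plus the regex rule's vote."""
    votes = [vote for vote in model_votes if vote is not None]
    rule_vote = regex_label(log_line)
    if rule_vote is not None:
        votes.append(rule_vote)
    if not votes:
        return "unlabeled"
    return Counter(votes).most_common(1)[0][0]
```

For example, `risk_tier(0.85)` falls in the high tier under these assumed thresholds, and `ensemble_label(["software_crash", "power_loss", "software_crash"], "watchdog timeout detected")` resolves to `software_crash` because that label has the most votes. A simple majority vote is only one way to combine models. Weighted voting or a learned meta-classifier are common alternatives.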
The Outcome
Our solution transformed the client’s approach to network management. The client reduced average prediction time from five days to 16 to 18 hours. Continuous learning raised the model’s accuracy from approximately 85% to 95%. The client also achieved a 92% reduction in manual labeling effort.
Technology Used
- Airflow
- MongoDB
- Python
- Machine Learning
- Deep Learning
- Natural Language Processing