Use Cases

Using Firefly Lab to handle growing demand for machine learning models in cybersecurity

Case Study
Topic: Anomaly Detection

Business Context

Cybersecurity is a $150 billion market worldwide and growing rapidly. It is an ever-changing environment where attackers and defenders are in a constant race to get ahead of one another.

A cybersecurity company with a sophisticated technology edge stays ahead of attackers by applying anomaly detection to network traffic to identify active attacks.

However, with success comes the challenge of rapidly scaling data science productivity to match company growth. Can it be done without compromising machine learning model performance?


Data Science Questions

In this case, there are two primary challenges:

  1. Is it possible to create and maintain hundreds of highly accurate, environment-specific cybersecurity models effectively?
  2. Pinpointing cyber attacks in each specific environment often leads to very small training sets. Can technology help create accurate anomaly detection models based on small training sets?



The data science team adopted Firefly Lab using the following three-step approach:

  1. Test Firefly Lab on known cases to verify performance. Firefly Lab automatically built machine learning models in a matter of hours, with performance comparable to or better than the models the team had laboriously built by hand.
  2. Use Firefly Lab to quickly test potential datasets for training to expedite the data collection process and examine the feasibility of using small training sets.
  3. Build new models and retrain existing models, relying on Firefly’s intelligent preprocessing and F3 search to shorten the process of building models from weeks to hours.
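
To make the second step more concrete, the sketch below shows one generic way to examine whether a small training set is feasible: fit a detector on training sets of increasing size and compare detection quality on held-out data. It does not use Firefly Lab or its API. The z-score detector, the synthetic "bytes per connection" feature, the distribution parameters, and every function name are assumptions chosen purely for illustration.

```python
import random
import statistics


def fit_zscore_detector(train):
    """Estimate mean and standard deviation from normal-only training samples."""
    return statistics.mean(train), statistics.stdev(train)


def is_anomaly(x, mean, stdev, threshold=3.0):
    """Flag a sample whose distance from the mean exceeds `threshold` deviations."""
    return abs(x - mean) / stdev > threshold


def evaluate(train_size, rng):
    """Train on `train_size` normal samples; return (recall, false_positive_rate)."""
    # Hypothetical feature: bytes per connection. Both distributions are made up.
    train = [rng.gauss(500, 50) for _ in range(train_size)]
    mean, stdev = fit_zscore_detector(train)

    normal_test = [rng.gauss(500, 50) for _ in range(1000)]
    attack_test = [rng.gauss(1000, 50) for _ in range(1000)]

    recall = sum(is_anomaly(x, mean, stdev) for x in attack_test) / len(attack_test)
    fpr = sum(is_anomaly(x, mean, stdev) for x in normal_test) / len(normal_test)
    return recall, fpr


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
results = {n: evaluate(n, rng) for n in (10, 50, 200)}

for n, (recall, fpr) in results.items():
    print(f"train_size={n:4d}  recall={recall:.3f}  false_positive_rate={fpr:.3f}")
```

If detection quality at the smallest size is already close to that at the largest, a small training set may be adequate for that environment. Real network traffic is far less cleanly separated than this synthetic data, so the same comparison would need to be run on the actual candidate datasets.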



By relying on Firefly Lab to perform tedious feature engineering and other preprocessing tasks, the cybersecurity data science team was able to significantly increase its throughput of deliverables and keep up with growing demand for its services. The team was also able to focus its data collection effort on researching the precise features that are most relevant for cyber attack detection.