When building a machine learning model, the obvious goal is to build the best model – the one that delivers the best results. To succeed, however, you’ll need to take a step back and consider the business needs that drive your KPIs, and therefore your understanding of the “best” ML model for your business. In the next few paragraphs we will discuss a few of the most common KPIs.

What’s the real KPI?

Your identified business needs will determine how you measure the success of the model. There are a few potential KPIs, primarily accuracy, explainability, and latency. The real KPI is going to vary according to your industry, your business goals, and your specific use case. 

Accuracy, or a low error distance between the predicted and the actual results, is always the goal. However, in some cases you may have constraints that prevent you from achieving the most accurate model. These constraints may take the form of business requirements, such as the necessity to explain the model’s decisions, or other requirements such as prediction speed, training time limits, or hardware limitations (model size or computational requirements).

In some situations you might have multiple KPIs, while in others you will have only one. These are the three most common KPIs for businesses using ML models:


Accuracy

In many use cases, the primary KPI is accuracy. For example, when using machine learning to predict manufacturing machinery maintenance, the ability to accurately predict malfunctions is invaluable. Failure to predict malfunctions before they occur results in machine downtime and extra expenses. Moreover, false alarms increase costs and eventually erode the model’s credibility.

Predictive maintenance data is extremely skewed, so you’ll use the model to search for anomalies (anomaly detection). The challenge is to predict as many of the malfunctions as possible without raising too many false alarms. There are many ways to measure performance for classification-based anomaly detection, including the F1 score, macro-averaged recall, accuracy, and precision; the choice of metric should follow from the use case.
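As a minimal sketch using scikit-learn (the labels below are made up for the example), here is why plain accuracy can be misleading on skewed data, and why precision, recall, and F1 matter:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical labels for a skewed maintenance dataset:
# 1 = malfunction (rare), 0 = normal operation.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]  # one false alarm, one missed malfunction

acc = accuracy_score(y_true, y_pred)    # 0.8 -- looks good, but misleading
prec = precision_score(y_true, y_pred)  # 0.5 -- half of the alarms were false
rec = recall_score(y_true, y_pred)      # 0.5 -- half of the malfunctions were missed
f1 = f1_score(y_true, y_pred)           # 0.5 -- harmonic mean of precision and recall
```

Despite the model getting 80% of samples "right", it caught only half of the malfunctions, which the recall score makes visible immediately.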

Pictured for illustration purposes is the visualized performance of a telecommunications churn prediction ML model

How to meet a KPI for accuracy

If accuracy is your KPI, you’ll choose the pipeline (data preprocessing) and algorithms that work best with your data and produce the best predictions – potentially at the expense of the model’s size, training time, prediction speed, and explainability.

For example, when using a deep neural network, it is almost impossible to explain the cause of a prediction – such models are commonly referred to as “black boxes”. In addition, an ensemble of several algorithms, perhaps using different pipelines, is likely to increase a model’s accuracy, but it will also increase latency and deepen the “black box” effect, making it harder to explain the reasoning behind the predictions.
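To make the trade-off concrete, the sketch below (synthetic data; all names are illustrative) combines two dissimilar models into a soft-voting ensemble with scikit-learn. Each prediction now runs both models, and the averaged decision is harder to explain than either member’s:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real business dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of dissimilar models often beats its members on accuracy,
# but every prediction must run all of them (higher latency), and the
# combined decision is more opaque than any single model's.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",  # average the members' predicted probabilities
)
ensemble.fit(X_train, y_train)
score = ensemble.score(X_test, y_test)
```

Soft voting averages predicted probabilities rather than class labels, which usually helps when the members are well calibrated.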


Explainability

Sometimes the chief KPI is explainability: your ability to interpret and explain the model’s decisions, even at the expense of the model’s accuracy. A common challenge with machine learning is that it often creates a sort of “black box” that runs calculations and produces answers (i.e. predictions) without being able to explain the reasoning that led to those answers. If a company reports to internal or external stakeholders, or responds to a compliance audit, it needs to be able to explain the reasons behind its decisions, and the explanations must be easy to understand. For example, when a financial institution approves or denies a loan application, the ability to explain the prediction model becomes an important KPI.

Toward this end, special explainability features ensure that the development process results in explainable predictions. The data expert will use those features to spot leakage (data in the training set that (a) will not be available at prediction time and (b) is highly correlated with the target) and to understand which features are most important to the model’s success and which are less important. By choosing which features to prioritize, data preparation for the final dataset becomes more specific and relevant to the chosen KPI.
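As an illustration of spotting leakage with feature importances (synthetic data; a deliberately leaky column is injected for the example), a suspiciously dominant feature is a classic warning sign:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.1 * rng.normal(size=500) > 0).astype(int)

# Simulate leakage: feature 3 is an almost perfect copy of the target --
# data that would not actually be available at prediction time.
X[:, 3] = y + 0.01 * rng.normal(size=500)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# One feature dominating the importances this strongly usually means
# the model has latched onto leaked information.
importances = model.feature_importances_
leaky_feature = int(np.argmax(importances))
```

In real projects the telltale sign is the same: a single feature with outsized importance, which should prompt a check of whether it exists at prediction time.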

Explainability features are also key to optimizing the model and to better understanding the data and the resulting business decisions.

The above graph depicts feature importance and sensitivity to missing values.

How to meet a KPI of explainability

For the KPI of explainability, you’ll want to build a simple, understandable model – typically a Decision Tree or Logistic Regression. These are easy to follow, understand, and present if needed. In addition, the pipeline should be kept somewhat simple – no complex data engineering or dimensionality reduction – as the process has to be easily explained.
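For instance, a shallow decision tree’s entire decision logic can be exported as plain-text rules. This sketch uses scikit-learn’s iris dataset purely as a stand-in for a business dataset:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# A shallow tree is deliberately capped at depth 2 so the whole model
# fits in a handful of human-readable rules.
data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

# Print the full decision logic as if/else rules a stakeholder can follow.
rules = export_text(tree, feature_names=list(data.feature_names))
print(rules)
```

The exported rules can be handed directly to an auditor or stakeholder, which is exactly the property this KPI demands.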

Prediction Speed – for one sample

After the model is created, it is usually deployed and implemented in production, where you’ll need to take into account the pipeline that is required to produce a single prediction. In some circumstances, high prediction speed is the KPI motivating a business. This means that you’ll need a model that can deliver an answer in the shortest amount of time possible. It also means that your business is willing to trade a small drop in accuracy in exchange for fast response time.

For example, a large credit card company needs a model that can detect fraud. The company has an extremely strict latency requirement and demands a response within milliseconds. A complex model would take longer than desired to generate an answer, so you compromise with a simpler model that is slightly less accurate but much, much faster.

How to meet a KPI of inference latency

To meet latency KPIs, a few things should be considered:

1. The pipeline (data preprocessing). Each step should be optimized for speed and efficiency, and time-consuming preprocessing steps should be avoided.
2. The algorithm used to train the model. Choose a fast algorithm with a simple architecture, such as Logistic Regression, a Decision Tree, or a small Neural Network.
3. The infrastructure. A multi-core CPU and adequate memory will likely be needed. In addition, network overhead should be as low as possible, so bandwidth and the physical distance between the model-serving mechanism and the consumer should be taken into consideration. Depending on your prediction workload, you should also consider load balancing your model-serving mechanism.
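A rough way to check single-sample prediction time in practice is to benchmark it directly. This is a sketch with synthetic data; real numbers depend on your hardware, preprocessing, and serving stack:

```python
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# A simple model standing in for the deployed fraud detector.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

sample = X[:1]  # one incoming request, e.g. a single transaction

# Average many single-sample predictions to estimate per-request latency.
n = 1000
start = time.perf_counter()
for _ in range(n):
    model.predict(sample)
latency_ms = (time.perf_counter() - start) / n * 1000
print(f"avg single-sample latency: {latency_ms:.3f} ms")
```

Note that this measures model inference only; in production, preprocessing and network round-trips usually dominate, so the full serving path should be benchmarked the same way.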

Go with the KPI that serves your business needs

As you can see, an accurate model may have slightly higher latency and be more difficult to explain; an explainable model might be less accurate since it uses simpler algorithms; and a low-latency model could be marginally less accurate and more difficult to explain. 

It’s impossible to define a global “best” model that applies to every business situation. Your KPIs inevitably inform your ML model choices. For every use case, when building the machine learning model the main KPI must be chosen based on the specific business problem at hand. Once you’ve identified your primary KPI, you can set up the best model to deliver the best results.

Optimize for your KPIs with Firefly.ai

With Firefly.ai’s automated machine learning platform, you can easily optimize your model for all of the above — accuracy, explainability, and prediction speed. All of this, without the involvement of an advanced data science department.

If your KPI is accuracy, the platform will search for the best combination of data preprocessing, machine learning algorithms, and hyperparameter optimization. At the end of the workflow, the platform will choose the top models that work best together and compose an ensemble of several models in order to increase accuracy.

In cases where explainability is the KPI, the platform will optimize to accommodate this need by selecting a more relaxed approach during the preprocessing phase in order to retain a process that is easily explained. Then, it will choose an algorithm that is also easily explainable, accompanied with visualization of the model’s behavior.

For any optimization, sensitivity to missing values and sensitivity to feature values are calculated and presented; these visualizations assist greatly in understanding the data, explaining the model, and optimizing subsequent models.

When speed is the KPI, the platform will optimize for this goal by eliminating models that are deemed too slow; in addition, the search process will only consider models that can potentially meet the speed requirement.