Feature Importance (SHAP)

Visualize SHAP values to explain machine learning model predictions based on feature contributions.

Concept Overview

SHAP (SHapley Additive exPlanations) is a game-theoretic approach to explaining the output of any machine learning model. It connects optimal credit allocation with local explanations using classic Shapley values from cooperative game theory. By calculating the contribution of each feature to the prediction, SHAP lets us interpret complex, black-box models by decomposing a specific prediction into a sum of feature effects.

Mathematical Definition

The SHAP value φ_i for a feature i is the weighted average of its marginal contributions across all possible coalitions of features:

φ_i = Σ_{S ⊆ N \ {i}} [ |S|! (|N| − |S| − 1)! / |N|! ] · [ f_{S∪{i}}(x_{S∪{i}}) − f_S(x_S) ]

where:
  • N = the set of all features
  • S = a coalition (subset) of features that excludes feature i
  • f_S(x_S) = the model's prediction using only the feature values in S
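To make the formula concrete, here is a minimal brute-force sketch in pure Python. It enumerates every coalition and applies the weighting above. One assumption to flag: f_S is defined here by replacing features outside S with a fixed baseline (reference) value, which is one common choice, not the only one. The function name and toy model are illustrative.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating all coalitions.

    Features absent from a coalition are replaced by their baseline
    value -- one common way to define f_S. The double loop over
    coalitions is exponential in the number of features, so this is
    only practical for small n.
    """
    n = len(x)

    def f_S(S):
        # Evaluate f with features outside S masked to the baseline.
        z = [x[j] if j in S else baseline[j] for j in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                # Shapley kernel weight: |S|!(|N|-|S|-1)!/|N|!
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                # Marginal contribution of feature i to coalition S.
                phi[i] += w * (f_S(set(S) | {i}) - f_S(set(S)))
    return phi

# Toy linear model: f(x) = 2*x0 + 3*x1. With a zero baseline, each
# Shapley value is the coefficient times the feature's deviation.
f = lambda z: 2 * z[0] + 3 * z[1]
phi = shapley_values(f, x=[1.0, 1.0], baseline=[0.0, 0.0])
print(phi)  # [2.0, 3.0]
```

For a linear model with no interactions, each feature's marginal contribution is the same in every coalition, so the weighted average simply recovers the coefficient-weighted deviation from the baseline.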

Key Concepts

Additive Property (Local Accuracy)

The sum of the SHAP values for all features equals the difference between the model's prediction for a specific instance and the expected (base) prediction for the dataset.

f(x) = E[f(x)] + Σ_{i=1}^{M} φ_i
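Local accuracy can be checked numerically. The sketch below (a self-contained, illustrative example using baseline masking to define f_S) computes exact Shapley values for a small model with an interaction term and confirms that they sum to the gap between the prediction and the base value:

```python
from itertools import combinations
from math import factorial

# Nonlinear toy model with an interaction term.
f = lambda z: z[0] * z[1] + z[2]
x, base = [2.0, 3.0, 1.0], [0.0, 0.0, 0.0]

def f_S(S):
    # Features outside the coalition S are masked to the baseline.
    return f([x[j] if j in S else base[j] for j in range(3)])

phi = [
    sum(
        factorial(len(S)) * factorial(3 - len(S) - 1) / factorial(3)
        * (f_S(set(S) | {i}) - f_S(set(S)))
        for size in range(3)
        for S in combinations([j for j in range(3) if j != i], size)
    )
    for i in range(3)
]

# Local accuracy: SHAP values sum to f(x) minus the base prediction.
print(sum(phi), f(x) - f(base))  # the two values agree (≈ 7.0)
```

Note that the interaction term z[0]*z[1] is split evenly between the two participating features, which is exactly the "fair allocation" behavior Shapley values are designed for.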

Missingness

If a feature is missing (i.e., its value is unknown or it is not part of the input), it has no impact on the prediction. Therefore, its SHAP value is zero (φi = 0).

Consistency

If a model is changed so that a feature's marginal contribution increases or stays the same (regardless of other features), the feature's SHAP value will not decrease. This property guarantees that we can trust the feature importances to reflect the true behavior of the model.

Historical Context

The underlying mathematical foundation of SHAP comes from cooperative game theory, specifically Shapley values, which were introduced by Lloyd Shapley in 1953. Shapley proposed a method to fairly distribute the payout of a game among players based on their individual contributions to the coalitions.

In 2017, Scott Lundberg and Su-In Lee published "A Unified Approach to Interpreting Model Predictions," which adapted Shapley values for machine learning interpretability. They introduced the SHAP framework, unifying several existing methods (such as LIME and DeepLIFT) under a single, theoretically grounded framework, helping make it one of the most widely used approaches to model explainability.

Real-world Applications

  • Healthcare: Explaining clinical risk predictions to doctors by identifying which patient symptoms or lab results drive the model's diagnosis.
  • Finance: Justifying credit scoring decisions to customers or regulators by highlighting the positive or negative impact of factors like income or credit history.
  • Marketing: Understanding customer churn predictions by determining the specific reasons (e.g., low usage, recent support ticket) a user is likely to cancel a subscription.
  • Fairness & Bias Auditing: Analyzing models to ensure protected attributes (like age or gender) are not disproportionately influencing predictions.

Related Concepts

  • Decision Trees — models that can be natively explained using TreeSHAP, an optimized algorithm for calculating exact SHAP values for tree-based models
  • LIME (Local Interpretable Model-agnostic Explanations) — an alternative interpretability method that builds local surrogate models around the prediction
  • Linear Regression — a naturally interpretable model where the coefficients essentially act as global feature importances
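The linear-regression connection can be made precise: for a linear model with independent features, the SHAP value of feature i at a point x reduces to the closed form w_i · (x_i − E[x_i]) (the "Linear SHAP" result). A short NumPy sketch with synthetic data (all names and values illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))   # synthetic independent features
w = np.array([2.0, -1.0, 0.5])   # "true" coefficients
b = 4.0
f = lambda A: A @ w + b          # a linear model

# Linear SHAP: phi_i = w_i * (x_i - E[x_i]) for independent features.
x = X[0]
phi = w * (x - X.mean(axis=0))

# Local accuracy: base value + SHAP values recovers the prediction.
print(np.isclose(f(x), f(X).mean() + phi.sum()))  # True
```

This is why linear coefficients "essentially act as global feature importances": the per-instance SHAP value is just the coefficient scaled by how far the feature sits from its average.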

Experience it interactively

Adjust parameters, observe in real time, and build deep intuition with Riano’s interactive Feature Importance (SHAP) module.

Try Feature Importance (SHAP) on Riano →
