Note, $f$ is NOT the real function that generates the data; rather, it is e.g. a trained ML model (the model we want to explain).
with $\mathbf{x}'$ being the simplified features that can be mapped to the original features via
$$\mathbf{x} = h_\mathbf{x}(\mathbf{x}'),$$
we try to ensure the property of the explanation model $g$ that
$$g(\mathbf{z}') \approx f(h_\mathbf{x}(\mathbf{z}'))$$
whenever $\mathbf{z}' \approx \mathbf{x}'$.
Additive feature attribution methods use an explanation model that is a linear function of binary variables,
$$g(\mathbf{z}') = \phi_0 + \sum_{i=1}^M \phi_i z_i',$$
where $\mathbf{z}' \in \{0, 1\}^M$, $M$ is the number of simplified input features, and $\phi_i \in \mathbb{R}$. Several existing methods fall into this class:
| Method | Model to explain | Simplified inputs | Note |
|---|---|---|---|
| LIME | blackbox | interpretable inputs | |
| DeepLIFT | DNN | | |
| Layer-wise relevance propagation | DNN | | |
| Classic Shapley Value Estimation | | | Shapley regression values; needs retraining models for all subsets of features |
| Shapley sampling values | | | applies a sampling approximation to Shapley regression values |
| Quantitative Input Influence | | | another way of applying a sampling approximation to Shapley regression values |
The first property, local accuracy, requires
$$f(\mathbf{x}) = g(\mathbf{x}') = \phi_0 + \sum_{i=1}^M \phi_i x_i',$$
which means the explanation model output should match the prediction model output when $\mathbf{z}' = \mathbf{x}'$, and hence $\mathbf{x} = h_\mathbf{x}(\mathbf{x}') = h_\mathbf{x}(\mathbf{z}')$.
The second property, missingness, requires that when $x_i' = 0$, then $\phi_i = 0$, i.e. a feature that's not included in the feature vector shouldn't have any impact on the prediction.
Let $y(\mathbf{z}') = f(h_\mathbf{x}(\mathbf{z}'))$, and let $\mathbf{z}'_{\backslash i}$ denote setting $z_i' = 0$ in the simplified binary feature vector. The third property, consistency, requires that for two models $y_A$ and $y_B$, if
$$y_A(\mathbf{z}') - y_A(\mathbf{z}'_{\backslash i}) \ge y_B(\mathbf{z}') - y_B(\mathbf{z}'_{\backslash i}) \quad \text{for all } \mathbf{z}' \in \{0, 1\}^M,$$
which can be expanded to $f_A(h_\mathbf{x}(\mathbf{z}')) - f_A(h_\mathbf{x}(\mathbf{z}'_{\backslash i})) \ge f_B(h_\mathbf{x}(\mathbf{z}')) - f_B(h_\mathbf{x}(\mathbf{z}'_{\backslash i}))$
then the corresponding impacts of the $i$th feature in the two models should satisfy
$$\phi_{i,A} \ge \phi_{i,B},$$
where the subscripts $_A$ and $_B$ identify which model each $\phi_i$ belongs to.
In words, consistency means that for two models, if the exclusion of a feature results in a larger reduction in the predicted value in model A than in model B, then this feature should have a bigger impact in model A than in model B, too.
Only one explanation model in the class of additive feature attribution methods can satisfy all three properties:
$$\phi_i(f, \mathbf{x}) = \sum_{\mathbf{z}' \subseteq \mathbf{x}'} \frac{|\mathbf{z}'|!\,(M - |\mathbf{z}'| - 1)!}{M!} \left[ f_\mathbf{x}(\mathbf{z}') - f_\mathbf{x}(\mathbf{z}'_{\backslash i}) \right]$$
Note on symbols: $|\mathbf{z}'|$ is the number of non-zero entries in $\mathbf{z}'$; $\mathbf{z}' \subseteq \mathbf{x}'$ ranges over all vectors whose non-zero entries are a subset of the non-zero entries of $\mathbf{x}'$; and $f_\mathbf{x}(\mathbf{z}') = f(h_\mathbf{x}(\mathbf{z}'))$.
The equation can be interpreted as a weighted sum of the marginal contributions brought by feature $i$ over all possible feature vectors with the $i$th feature set to $0$: the contributions at each subset size are averaged, and the factor $\frac{1}{M}$ then averages over the $M$ possible sizes.
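As a sanity check on this formula, here is a minimal brute-force sketch (function and variable names are my own; it enumerates every subset, so it is $O(2^M)$ and for illustration only):

```python
from itertools import combinations
from math import factorial

def shapley_values(f_x, M):
    """Brute-force Shapley values for a set function f_x over M features,
    enumerating every subset S that excludes feature i."""
    phi = [0.0] * M
    for i in range(M):
        others = [j for j in range(M) if j != i]
        for k in range(M):
            for S in combinations(others, k):
                S = set(S)
                # Shapley weight |S|!(M - |S| - 1)!/M! for this subset size
                w = factorial(len(S)) * factorial(M - len(S) - 1) / factorial(M)
                phi[i] += w * (f_x(S | {i}) - f_x(S))
    return phi

# hypothetical additive "model": each present feature adds its own weight
contrib = [1.0, 2.0, 3.0]
f_x = lambda S: sum(contrib[j] for j in S)
print(shapley_values(f_x, 3))  # ≈ [1.0, 2.0, 3.0]: attributions recover the weights
```

For an additive model every marginal contribution of feature $i$ equals its own weight, and the Shapley weights sum to one, so the attributions recover the weights exactly.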
The above equation can also be written as
$$\phi_i = \sum_{S \subseteq N \backslash \{i\}} \frac{|S|!\,(M - |S| - 1)!}{M!} \left[ f_x(S \cup \{i\}) - f_x(S) \right]$$
or
$$\phi_i = \frac{1}{M} \sum_{S \subseteq N \backslash \{i\}} \binom{M-1}{|S|}^{-1} \left[ f_x(S \cup \{i\}) - f_x(S) \right],$$
where $N$ is the set of all $M$ features and $f_x(S) = E[f(\mathbf{x}) \mid \mathbf{x}_S]$ is the expected model output conditioned on the features in $S$.
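For a black-box model, $f_x(S)$ is typically approximated by averaging over background data for the features outside $S$. A sketch (the helper name is my own; it assumes feature independence, as the sampling approximations above do):

```python
def f_x_of_S(f, x, S, background):
    """Estimate f_x(S) = E[f(z) | z_S = x_S]: for each background row,
    overwrite the features in S with their values from x, then average
    the model outputs."""
    total = 0.0
    for row in background:
        z = list(row)
        for i in S:
            z[i] = x[i]
        total += f(z)
    return total / len(background)

# hypothetical model and data: f sums its two inputs
f = lambda z: z[0] + z[1]
background = [[0.0, 0.0], [2.0, 2.0]]
x = [1.0, 5.0]
print(f_x_of_S(f, x, {0}, background))  # → ((1+0) + (1+2)) / 2 = 2.0
```

With $S$ equal to the full feature set this reduces to $f(\mathbf{x})$ itself; with $S = \emptyset$ it is the average model output over the background data.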
Complexity of computing this exhaustively over all feature subsets: $O(TLM2^M)$
Notations:
- $T$: number of trees
- $D$: maximum depth of any tree
- $L$: number of leaves
- $M$: number of features
- $\mathbf{v}$: vector of node values; $v_j \in \mathbb{R} \cup \{\text{internal}\}$, i.e. internal nodes take the placeholder value "internal"
- $\mathbf{a}$: vector of indices representing the left child of each internal node
- $\mathbf{b}$: vector of indices representing the right child of each internal node
- $\mathbf{t}$: vector of thresholds for each internal node
- $\mathbf{d}$: vector of indices of the features used for splitting in each internal node; $d_j \in \text{feature set}$
- $\mathbf{r}$: vector of covers (i.e. how many data points in the training set fall into the corresponding sub-tree) of each node

All vectors are of length $N$, the number of nodes in the tree.
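Under this notation, the conditional expectation $E[f(\mathbf{x}) \mid \mathbf{x}_S]$ for a single tree can be sketched as a short recursion (a minimal Python rendering with my own function names; `None` stands in for the "internal" placeholder, so leaf values live in `v`):

```python
def exp_value(x, S, v, a, b, t, r, d):
    """Estimate E[f(x) | x_S] for a single tree: follow the decision path
    for features in S, and average over both children (weighted by the
    training cover r) for features not in S."""
    def g(j):
        if v[j] is not None:               # leaf node: return its value
            return v[j]
        if d[j] in S:                      # feature known: follow the split
            return g(a[j]) if x[d[j]] <= t[j] else g(b[j])
        # feature unknown: cover-weighted average of both children
        return (r[a[j]] * g(a[j]) + r[b[j]] * g(b[j])) / r[j]
    return g(0)

# hypothetical tiny tree: root splits on feature 0 at threshold 0.5;
# left leaf (cover 4) predicts 1.0, right leaf (cover 6) predicts 3.0
v = [None, 1.0, 3.0]
a = [1, None, None]; b = [2, None, None]
t = [0.5, None, None]; d = [0, None, None]
r = [10, 4, 6]
print(exp_value([0.2], {0}, v, a, b, t, r, d))    # → 1.0 (follows the left branch)
print(exp_value([0.2], set(), v, a, b, t, r, d))  # → (4*1.0 + 6*3.0)/10 = 2.2
```

Running this once per subset for every tree and leaf is what yields the exponential cost above; the faster algorithm below avoids enumerating subsets explicitly.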
Complexity of the faster Tree SHAP algorithm: $O(TLD^2)$
$m$ is the path of unique features we have split on so far; each element of $m$ contains four attributes:
- $p_z$: fraction of "zero" paths (paths where the feature is not in the conditioning set $S$) that are going to extend the subsets
- $p_o$: fraction of "one" paths (paths where the feature is in $S$) that are going to extend the subsets
- $p_i$: index of the feature used to make the last split
- $p_w$ (the fourth attribute; named here by analogy with the others): proportion of the subsets of a given cardinality that are present, used to weight the contributions
hot child: the child followed by the tree when given the input $\mathbf{x}$.
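A minimal sketch of how the hot child of an internal node $j$ might be determined (my own helper, reusing the notation above; the child not followed is the "cold" child):

```python
def hot_cold_children(j, x, a, b, t, d):
    """Return (hot, cold) children of internal node j: the hot child is
    the branch the decision path for input x actually follows."""
    if x[d[j]] <= t[j]:
        return a[j], b[j]
    return b[j], a[j]

# root node splitting on feature 0 at threshold 0.5, children 1 and 2
a = [1]; b = [2]; t = [0.5]; d = [0]
print(hot_cold_children(0, [0.2], a, b, t, d))  # → (1, 2): left child is hot
```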