only the evidence for the actual class.</p>


R = [None]*L + [A[L]*(T[:,None]==numpy.arange(10))]
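<p>
As a small illustration of the masking step above, the following sketch uses hypothetical toy scores and labels (the names <code>scores</code> and <code>labels</code> are stand-ins for the top-layer activations and the targets, and are not part of the original code). It shows how comparing the labels with <code>numpy.arange(10)</code> yields a one-hot mask that keeps only the evidence for the actual class.
</p>

```python
import numpy

# Hypothetical toy values: 2 samples, 10 classes (stand-ins for A[L] and T).
scores = numpy.array([[0.1, 2.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5],
                      [1.5, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.7, 0.0, 0.0]])
labels = numpy.array([1, 7])

# labels[:, None] has shape (2, 1); comparing it with arange(10) broadcasts
# to a (2, 10) boolean one-hot mask.
mask = labels[:, None] == numpy.arange(10)

# Multiplying by the mask zeroes out every score except the true class.
top_relevance = scores * mask
```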






<p>
The LRP-0, LRP-ϵ, and LRP-γ rules described in the <a
href="https://link.springer.com/chapter/10.1007/978-3-030-28954-6_10">LRP overview paper</a>
are special cases of a more general propagation rule:
</p>
<img src="http://latex.codecogs.com/svg.latex?R_j = \sum_k \frac{a_j \rho(w_{jk})}{\epsilon + \sum_{0,j} a_j \rho(w_{jk})} R_k">
<p>
(cf. Section 10.2.2), where ρ is a function that transforms the weights, and ϵ
is a small positive increment. We define below two helper functions that perform
the weight transformation and the incrementation. In practice, we would like to
apply different rules at different layers (cf. Section 10.3). Therefore, we also
give the layer index "<code>l</code>" as argument to these functions.
</p>
def rho(w,l):  return w + [None,0.1,0.0,0.0][l] * numpy.maximum(0,w)
def incr(z,l): return z + [None,0.0,0.1,0.0][l] * (z**2).mean()**.5+1e-9


<p>
In particular, these functions and the layer index they receive as a parameter let
us reduce the general rule to LRP-0 for the top layer, to LRP-ϵ with ϵ = 0.1·std
for the layer just below, and to LRP-γ with γ = 0.1 for the layer before. We now
come to the practical implementation of this general rule. It can be decomposed
as a sequence of four computations:
</p>
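<p>
The four computations can be sketched for a single dense layer as follows. This is a minimal illustration under stated assumptions, not the tutorial's own loop: the helpers <code>rho_gamma</code> and <code>incr_eps</code> take γ and ϵ directly instead of a layer index, the function name <code>lrp_dense</code> is hypothetical, and the activations and weights are random toy values.
</p>

```python
import numpy

def rho_gamma(w, gamma):
    # LRP-gamma weight transformation; gamma = 0 leaves the weights unchanged.
    return w + gamma * numpy.maximum(0, w)

def incr_eps(z, eps):
    # LRP-epsilon increment scaled by the RMS of z; 1e-9 guards against 0/0.
    return z + eps * (z**2).mean()**.5 + 1e-9

def lrp_dense(a, w, b, R_upper, gamma=0.0, eps=0.0):
    """Propagate relevance R_upper through one dense layer (hypothetical helper)."""
    wr, br = rho_gamma(w, gamma), rho_gamma(b, gamma)
    z = incr_eps(a.dot(wr) + br, eps)   # step 1: forward pass
    s = R_upper / z                     # step 2: element-wise division
    c = s.dot(wr.T)                     # step 3: backward pass
    return a * c                        # step 4: element-wise product

# Toy data (assumption): non-negative activations, as after a ReLU, and zero bias.
rng = numpy.random.RandomState(0)
a = numpy.abs(rng.randn(4, 5))
w = rng.randn(5, 3)
b = numpy.zeros(3)
R_upper = numpy.maximum(0, a.dot(w) + b)

R_lrp0 = lrp_dense(a, w, b, R_upper)              # LRP-0
R_eps  = lrp_dense(a, w, b, R_upper, eps=0.1)     # LRP-epsilon
R_gam  = lrp_dense(a, w, b, R_upper, gamma=0.1)   # LRP-gamma
```

<p>
With zero bias and ϵ = 0, the relevance summed over the lower layer should match the relevance summed over the upper layer up to the small stabilizer, which is a convenient sanity check when adapting this sketch.
</p>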

<p>
<img src="http://latex.codecogs.com/svg.latex?

layers, and at each layer, applying this sequence of computations.</p>

@@ 192,20 +191,21 @@ layers, and at each layer, applying this sequence of computations.</p>
<p>
Note that the loop above stops one layer before reaching the pixels. To
propagate relevance scores until the pixels, we need to apply an alternate
propagation rule that properly handles pixel values received as input (cf.
Section 10.3.2). In particular, we apply for this layer the
z<sup>B</sup>-rule given by:
</p>
<img src="http://latex.codecogs.com/svg.latex?R_i = \sum_j \frac{a_i w_{ij} - l_i w_{ij}^+ - h_i w_{ij}^-}{\sum_{i} a_i w_{ij} - l_i w_{ij}^+ - h_i w_{ij}^-} R_j">


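<p>
In this rule, <i>l<sub>i</sub></i> and <i>h<sub>i</sub></i> are the lower and
upper bounds of pixel values, i.e. "−1" and "+1", and (·)<sup>+</sup> and
(·)<sup>−</sup> denote the positive and negative parts, max(0,·) and min(0,·).
The z<sup>B</sup>-rule can again be implemented with a four-step procedure
similar to the one used in the layers above. For this, we create arrays of
pixel values set to <i>l<sub>i</sub></i> and <i>h<sub>i</sub></i> respectively:
</p>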
w = W[0]
...
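<p>
For reference, a self-contained sketch of the z<sup>B</sup>-rule in NumPy is given below. It is an illustration under assumptions rather than the elided code: the function name <code>lrp_zb</code> and its <code>low</code>/<code>high</code> parameters are hypothetical, the bias is ignored, and the inputs are random toy values in [−1, 1].
</p>

```python
import numpy

def lrp_zb(x, w, R_upper, low=-1.0, high=1.0):
    """zB-rule for a first dense layer with inputs bounded in [low, high]."""
    wp = numpy.maximum(0, w)        # positive part of the weights
    wm = numpy.minimum(0, w)        # negative part of the weights
    lb = numpy.full_like(x, low)    # array of lower bounds l_i
    hb = numpy.full_like(x, high)   # array of upper bounds h_i

    z = x.dot(w) - lb.dot(wp) - hb.dot(wm) + 1e-9        # step 1: forward passes
    s = R_upper / z                                      # step 2: element-wise division
    c, cp, cm = s.dot(w.T), s.dot(wp.T), s.dot(wm.T)     # step 3: backward passes
    return x * c - lb * cp - hb * cm                     # step 4: combine the three terms

# Toy data (assumption): 3 inputs with 6 "pixels" each, 4 hidden units.
rng = numpy.random.RandomState(0)
x = rng.uniform(-1, 1, size=(3, 6))
w = rng.randn(6, 4)
R_upper = numpy.maximum(0, x.dot(w))

R_pixels = lrp_zb(x, w, R_upper)
```

<p>
Because every term (a<sub>i</sub> − l<sub>i</sub>)w<sup>+</sup> and (a<sub>i</sub> − h<sub>i</sub>)w<sup>−</sup> is non-negative for inputs within the bounds, the resulting pixel relevances are non-negative whenever the incoming relevance is.
</p>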
c_j = \big[\nabla~\big({\textstyle \sum_k}~z_k(\boldsymbol{a}) \cdot
...
s_k \big)\big]_j
">





<p>








<p><b>Pooling layers:</b>
It is suggested in Section 10.3.2 of the paper to
treat max-pooling layers as average pooling layers in the backward pass.
...
dimensional maps are shown for a selection of VGG16 layers.
...

<p>
this last layer. This rule can again be implemented in terms of forward passes
and gradient computations.
</p>
A[0] = (A[0].data).requires_grad_(True)
...
...
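<p>
A compact sketch of how the z<sup>B</sup>-rule can be expressed with forward passes and gradients in PyTorch follows. It is an assumption-laden illustration, not the elided code: a small <code>torch.nn.Linear</code> layer without bias stands in for the network's first layer, the helper <code>clone_with</code> is hypothetical, and the input is random with values in [−1, 1].
</p>

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the first layer of the network (no bias).
layer = torch.nn.Linear(6, 4, bias=False)
x = torch.rand(3, 6) * 2 - 1                 # toy "pixels" in [-1, 1]
R_upper = layer(x).clamp(min=0).detach()     # relevance arriving from above

def clone_with(layer, fn):
    # Copy of the layer whose weights are transformed by fn (hypothetical helper).
    new = torch.nn.Linear(layer.in_features, layer.out_features, bias=False)
    new.weight = torch.nn.Parameter(fn(layer.weight.data))
    return new

layer_p = clone_with(layer, lambda w: w.clamp(min=0))   # positive weights only
layer_m = clone_with(layer, lambda w: w.clamp(max=0))   # negative weights only

a  = x.clone().requires_grad_(True)
lb = torch.full_like(x, -1.0).requires_grad_(True)      # lower bounds l_i
hb = torch.full_like(x,  1.0).requires_grad_(True)      # upper bounds h_i

z = layer(a) - layer_p(lb) - layer_m(hb) + 1e-9         # step 1: forward passes
s = (R_upper / z).detach()                              # step 2: division, treated as constant
(z * s).sum().backward()                                # step 3: gradients replace backward passes
c, cp, cm = a.grad, lb.grad, hb.grad
R_input = (a * c + lb * cp + hb * cm).detach()          # step 4: the minus signs are carried by cp, cm
```

<p>
Detaching <code>s</code> before the backward call is the design choice that makes the gradient equal to the weighted sum required by the rule, rather than the true derivative of the ratio.
</p>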
