
Model Explainability: Grad-CAM, Feature Maps

Description

Model explainability techniques help us understand and interpret the decisions made by complex deep learning models. Grad-CAM (Gradient-weighted Class Activation Mapping) is a popular visualization method that highlights the regions of an input image that most influenced the model's prediction. It uses the gradients of the target class score flowing into the final convolutional layer to produce a heatmap of relevant areas.
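Concretely, for a class score y^c and the activations A^k of feature-map channel k in the chosen layer, Grad-CAM global-average-pools the gradients to obtain channel weights and then takes a ReLU of the weighted sum of the activations (Selvaraju et al., 2017):

\[
\alpha_k^c = \frac{1}{Z}\sum_i \sum_j \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\Big(\sum_k \alpha_k^c \, A^k\Big)
\]

where Z is the number of spatial locations in the feature map.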

Feature maps are the intermediate outputs of convolutional layers: early layers typically capture low-level features such as edges and textures, while deeper layers respond to shapes and object parts. Visualizing feature maps shows how the model processes and transforms the input at different depths.

Examples

Example of generating a Grad-CAM heatmap with PyTorch and a pretrained ResNet-18:


import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
import matplotlib.pyplot as plt

# Load a pretrained model (the weights= API replaces the deprecated pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.eval()

# Image preprocessing
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

img = Image.open("input.jpg").convert('RGB')
input_tensor = preprocess(img).unsqueeze(0)

# Hooks to capture the target layer's activations and gradients
activations = []
gradients = []

def forward_hook(module, input, output):
    activations.append(output)

def backward_hook(module, grad_input, grad_output):
    gradients.append(grad_output[0])

# Last convolutional layer of ResNet-18; its 7x7 feature maps drive the heatmap
target_layer = model.layer4[1].conv2
target_layer.register_forward_hook(forward_hook)
# register_full_backward_hook replaces the deprecated register_backward_hook
target_layer.register_full_backward_hook(backward_hook)

# Forward pass (fires the forward hook)
output = model(input_tensor)
pred_class = output.argmax(dim=1).item()

# Backward pass from the predicted class score (fires the backward hook)
model.zero_grad()
output[0, pred_class].backward()

# Compute Grad-CAM: global-average-pool the gradients into channel weights,
# take the weighted sum of the activations, then apply ReLU and normalize
grad = gradients[0].detach()
act = activations[0].detach()
weights = grad.mean(dim=[2, 3], keepdim=True)
cam = torch.relu((weights * act).sum(dim=1)).squeeze()
cam = cam / cam.max()

# Upsample the coarse CAM to the input resolution before overlaying
cam = torch.nn.functional.interpolate(
    cam[None, None], size=(224, 224), mode='bilinear', align_corners=False
)[0, 0]

# Overlay the heatmap on the resized input image
plt.imshow(img.resize((224, 224)))
plt.imshow(cam.numpy(), cmap='jet', alpha=0.5)
plt.axis('off')
plt.show()

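Two details in this example matter in practice: the raw heatmap resolution equals the spatial size of the hooked layer (7x7 for the last ResNet-18 block), which is why the bilinear upsampling step is needed before overlaying, and hooking an earlier layer yields finer but less class-discriminative maps.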
Visualizing feature maps can be done by extracting activations from convolutional layers and plotting them as images.
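As a minimal sketch (reusing the model and input_tensor from the Grad-CAM example above; the layer choice and helper names here are illustrative, not the only option):

feature_maps = []

def capture_hook(module, input, output):
    feature_maps.append(output)

# Hook the first convolution of the first residual block (illustrative choice)
hook = model.layer1[0].conv1.register_forward_hook(capture_hook)

with torch.no_grad():
    model(input_tensor)
hook.remove()

# Plot the first 16 of the layer's 64 channels as grayscale images
fmap = feature_maps[0].squeeze(0)  # shape: (channels, H, W)
fig, axes = plt.subplots(4, 4, figsize=(8, 8))
for i, ax in enumerate(axes.flat):
    ax.imshow(fmap[i].cpu(), cmap='gray')
    ax.axis('off')
plt.show()

Early-layer maps tend to act like edge and texture detectors; repeating this on a deeper layer such as model.layer4 typically shows coarser, more abstract responses.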

Real-World Applications

Medical Imaging

Explaining model decisions in detecting diseases from X-rays or MRIs by highlighting critical regions.

Fairness and Bias Detection

Analyzing model attention to ensure predictions do not rely on biased features or spurious correlations.

Autonomous Driving

Understanding what parts of the scene the model focuses on to improve safety and trust.

Robotics and Vision

Interpreting feature maps to optimize object detection and navigation models.

Interview Questions

What is Grad-CAM and how does it help explain CNN predictions?

Grad-CAM uses the gradients of a class score flowing into the last convolutional layer to weight its feature maps, producing a heatmap that highlights the image regions that most influenced the prediction.

What are feature maps in convolutional neural networks?

Feature maps are outputs of convolutional layers that represent detected features like edges or textures at different spatial locations in the input.

How can model explainability techniques improve trust in AI systems?

By showing why models make certain decisions, these techniques help users understand, trust, and validate AI predictions, especially in critical applications.