@rxhxm rxhxm commented May 8, 2025

Grad-CAM Visualization Tool for DonkeyCar

Overview

This PR adds a Gradient-weighted Class Activation Mapping (Grad-CAM) visualization tool to help users better understand their neural network models. The tool shows which parts of each input image most influence the model's predictions, making it easier to debug and improve model performance.
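
For readers unfamiliar with the technique, here is a minimal NumPy sketch of the core Grad-CAM computation. It is not the code from this PR. It assumes the feature maps of one convolutional layer and the gradients of the steering output with respect to those feature maps have already been obtained (for example with a TensorFlow gradient tape); the function name `grad_cam_heatmap` is hypothetical.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Combine conv-layer activations and gradients into a Grad-CAM heatmap.

    activations: array of shape (H, W, K), feature maps of the chosen layer.
    gradients:   array of shape (H, W, K), d(output) / d(activations).
    Returns an (H, W) array scaled to [0, 1].
    """
    activations = np.asarray(activations, dtype=np.float64)
    gradients = np.asarray(gradients, dtype=np.float64)
    if activations.ndim != 3 or activations.shape != gradients.shape:
        raise ValueError("expected two arrays of identical shape (H, W, K)")

    # One weight per channel: the spatial average of that channel's gradients.
    weights = gradients.mean(axis=(0, 1))                        # shape (K,)

    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(activations, weights, axes=([2], [0]))    # shape (H, W)

    # Keep only regions that push the output up, then scale to [0, 1].
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The heatmap has the spatial resolution of the convolutional layer, so it normally has to be resized to the input image before it is overlaid. For a regression output such as steering, the ReLU keeps only regions that increase the predicted value; whether that is the right choice for left/right steering is a design decision this sketch does not settle.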

Features

  • Generate heatmap visualizations showing which regions of input images influence steering predictions
  • Process multiple frames to create visualization videos of model attention during driving
  • Exclude regions from visualization (such as dashboard/hood areas)
  • Support for various DonkeyCar model architectures
  • Command-line interface for easy integration with existing workflows
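
As an illustration of the region-exclusion feature, the sketch below zeroes rectangular areas of a heatmap and rescales what remains. The `(x, y, width, height)` pixel format and the name `exclude_regions` are assumptions made for this example, not necessarily what the PR uses.

```python
import numpy as np


def exclude_regions(heatmap, regions):
    """Zero out rectangles in a 2-D heatmap and rescale the rest to [0, 1].

    regions: iterable of (x, y, width, height) tuples in pixels
             (hypothetical format chosen for this sketch).
    The input heatmap is not modified.
    """
    out = np.array(heatmap, dtype=np.float64, copy=True)
    height, width = out.shape
    for x, y, region_w, region_h in regions:
        # Clip each rectangle to the heatmap bounds before masking.
        x0, y0 = max(0, x), max(0, y)
        x1, y1 = min(width, x + region_w), min(height, y + region_h)
        if x0 < x1 and y0 < y1:
            out[y0:y1, x0:x1] = 0.0
    peak = out.max()
    return out / peak if peak > 0 else out
```

Rescaling after masking means a bright hood or dashboard area no longer compresses the colour range available to the rest of the image.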

Implementation

The implementation consists of:

  • A core utility module with key Grad-CAM visualization functions
  • A command-line interface for batch processing
  • Example scripts and documentation

Why This is Valuable

Unlike existing tools such as makemovie.py, which simply show the model's predictions, Grad-CAM helps users understand why the model makes those predictions by visualizing which regions of the input image matter most to the model. This leads to:

  1. Better debugging of model behavior
  2. Improved model training through better understanding of network attention
  3. Detection of potential biases or issues in training data
  4. Enhanced user understanding of how neural networks process visual information

Testing

The tool has been tested on various DonkeyCar models with TensorFlow 2.15 and works with different model architectures including default DonkeyCar models.

Documentation

Documentation is included with examples in the README.md file within the module directory.

mgagvani (Contributor) commented May 9, 2025

Just curious, how is this different from the salient flag in makemovie (class SalientVis)?

rxhxm (Author) commented May 14, 2025

Great question, @mgagvani!

The Grad-CAM implementation actually differs from the existing salient flag in makemovie.py in several ways:

  1. The existing salient visualization uses a simpler gradient-based technique, while Grad-CAM uses gradient-weighted class activation mapping that specifically targets convolutional layers, providing more precise heatmaps of activation regions.

  2. The new tool offers more control, with options to exclude specific image regions (like dashboard/hood areas) and to fine-tune processing parameters, neither of which is available in the current implementation.

  3. The Grad-CAM version handles a wider range of model architectures more robustly with specific error handling for different model output formats.

  4. The Grad-CAM implementation provides higher quality, more interpretable visualizations with enhanced colormapping and overlay techniques that better highlight the specific image regions.
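
For reference, the standard Grad-CAM definition for a scalar output can be written as below. The comparison formula is plain input-gradient saliency, which is only an assumed stand-in for the "simpler gradient-based technique" mentioned in point 1; the thread does not say exactly what SalientVis computes.

```latex
% Grad-CAM for a scalar output y (e.g. the steering value) and feature maps A^k
% of a chosen convolutional layer with spatial size H x W.
\alpha_k = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \frac{\partial y}{\partial A^k_{ij}}
\qquad
L^{\text{Grad-CAM}}_{ij} = \operatorname{ReLU}\!\Big( \sum_k \alpha_k A^k_{ij} \Big)

% Plain input-gradient saliency on the input image x with channels c
% (assumed baseline for comparison):
S_{ij} = \max_c \left| \frac{\partial y}{\partial x_{ijc}} \right|
```

The practical difference is where the gradient is taken: Grad-CAM differentiates with respect to a convolutional layer's feature maps and averages per channel, which tends to give smoother, region-level maps, while input-gradient saliency works at pixel resolution.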

Let me know if you've got any other questions or want me to dive deeper!

@DocGarbanzo (Contributor) left a comment

@rxhxm - this looks like a good addition to our current salient method. However, we cannot merge the code like this, simply because it has been written without using any of the donkey car framework's objects and methods. Any stand-alone script will quickly stop working as the code evolves over time, so delivering this new functionality as a stand-alone script is not the right thing to do.

However, it will be very easy to implement your new method within the MakeMovie class that currently implements the salient feature. This will integrate your approach directly into the donkey car ecosystem and remove a lot of unnecessary code in your PR that deals with accessing data, creating the movie from the individual frames, etc.

It would be great if you could make that change.

@@ -0,0 +1,109 @@
# DonkeyCar Grad-CAM Visualization

This documentation looks good. But it should go as a PR into the donkey_docs repository, so it will show up in the official documentation at docs.donkeycar.com

@@ -0,0 +1,8 @@
"""

If integrated into donkey, this file will become obsolete

@@ -0,0 +1,432 @@
"""

While the code in this file looks fine, the implementation should be integrated within the current donkey makemovie command and not within a separate script. We can extend the argument list for makemovie to add all the arguments you specify below.
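
One way the extended argument list could look is sketched below with a stand-alone argparse parser. This is not the actual donkey makemovie parser: `--salient` only mirrors the flag mentioned earlier in this thread, and `--gradcam`, `--exclude-region`, and the `x,y,width,height` format are hypothetical names chosen for illustration.

```python
import argparse


def parse_region(text):
    """Parse 'x,y,width,height' into a tuple of four non-negative ints."""
    parts = text.split(",")
    if len(parts) != 4:
        raise argparse.ArgumentTypeError(
            "expected x,y,width,height but got %r" % text)
    try:
        values = tuple(int(part) for part in parts)
    except ValueError:
        raise argparse.ArgumentTypeError(
            "region values must be integers: %r" % text)
    if any(value < 0 for value in values):
        raise argparse.ArgumentTypeError(
            "region values must be non-negative: %r" % text)
    return values


def build_parser():
    """Stand-alone parser for illustration only, not the donkey CLI."""
    parser = argparse.ArgumentParser(prog="makemovie-sketch")
    # Mirrors the salient flag referred to in this thread.
    parser.add_argument("--salient", action="store_true",
                        help="overlay a saliency visualization")
    # Hypothetical switch selecting Grad-CAM instead of the current method.
    parser.add_argument("--gradcam", action="store_true",
                        help="use Grad-CAM for the saliency overlay")
    # Hypothetical repeatable option for areas such as the hood.
    parser.add_argument("--exclude-region", type=parse_region,
                        action="append", default=[],
                        metavar="X,Y,W,H",
                        help="rectangle to leave out of the heatmap")
    return parser
```

Making the region option repeatable keeps each rectangle self-describing on the command line, at the cost of a slightly longer invocation when several areas are excluded.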



def find_images(data_path):
"""

This will not be needed when going through the MakeMovie class as all the data preparation is already in place.



def load_metadata(data_path):
"""

Access to tub metadata and tub records is provided through the Tub class and hence this method will become obsolete.


# Process images
frames = []
for i, img_path in enumerate(image_paths):

The code inside this loop is the core part of the PR. This is the bit that should be translated into MakeMovie such that we can choose between the current and your new version of displaying the salient features. At that point all the tub data loading and file loading has already happened. This will greatly simplify your PR.
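
A rough sketch of what that per-frame step could reduce to once data loading is handled elsewhere: a heatmap is computed by whichever method is selected and blended onto the frame. The names `overlay_heatmap` and `draw_frame`, the red tint, and the dictionary of heatmap functions are all assumptions for illustration, not existing MakeMovie members.

```python
import numpy as np


def overlay_heatmap(frame, heatmap, alpha=0.4):
    """Blend a [0, 1] heatmap into an RGB uint8 frame as a red tint.

    frame:   array of shape (H, W, 3), values 0..255.
    heatmap: array of shape (H, W), already resized to the frame.
    """
    frame = np.asarray(frame, dtype=np.float64)
    heat = np.clip(np.asarray(heatmap, dtype=np.float64), 0.0, 1.0)
    if frame.shape[:2] != heat.shape:
        raise ValueError("heatmap must match the frame's height and width")

    color = np.zeros_like(frame)
    color[..., 0] = 255.0 * heat           # red channel carries the heat

    # Per-pixel blend weight: pixels with zero heat stay unchanged.
    weight = (alpha * heat)[..., np.newaxis]
    blended = (1.0 - weight) * frame + weight * color
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)


def draw_frame(frame, heatmap_fns, mode, alpha=0.4):
    """Pick a heatmap method by name and overlay its result on the frame.

    heatmap_fns: dict mapping a mode name to a callable(frame) -> heatmap.
    """
    if mode not in heatmap_fns:
        raise ValueError("unknown visualization mode: %r" % (mode,))
    return overlay_heatmap(frame, heatmap_fns[mode](frame), alpha)
```

With this shape, the existing salient method and Grad-CAM would just be two entries in `heatmap_fns`, and the frame loop itself would not need to know which one is active.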

return frames


def create_video(frames, output_path, fps=20):

This will become obsolete

return output_path


def save_frames(frames, output_dir):

As does this...

return output_dir


def main():

And this...

return regions


def create_video(frames, output_path, fps=30):

This will become obsolete.

3 participants