mozilla-ai/prompt-saliency

Prompt Saliency Analyzer

A smol tool for visualizing the importance (saliency) of different words in prompts submitted to Large Language Models (LLMs). Following the definition of $Perb_{Sim}$ in PromptExp: Multi-granularity Prompt Explanation of Large Language Models (Dong et al., 2024), the saliency of a word is measured by how semantically different the LLM response becomes when that word is masked or removed from the input prompt.
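Concretely (paraphrasing the definition above rather than quoting the paper): if $r$ is the response to prompt $p$, and $r_{\setminus i}$ is the response to $p$ with word $w_i$ masked, then

$$saliency(w_i) = 1 - \cos\big(e(r),\, e(r_{\setminus i})\big)$$

where $e(\cdot)$ denotes the sentence-embedding model and $\cos$ is cosine similarity.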

Understanding which parts of a prompt most significantly affect the model's response can help detect model biases, understand causal relationships between model inputs and outputs, and ultimately improve your prompts.

How It Works

The tool measures the importance of each token in your prompt by:

  1. Getting a baseline response from the LLM with your original prompt
  2. Creating variations of your prompt by masking one token at a time
  3. Measuring how much the model's response changes for each masked version, as the distance between the semantic embeddings of the responses
  4. Visualizing the importance of each token using a color gradient (blue = less important, red = more important)
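The four steps above can be sketched as follows. This is a minimal, self-contained sketch rather than the tool's actual code: `query_llm` and `embed` are placeholders that a real run would back with `litellm.completion` and a SentenceTransformers model.

```python
from math import sqrt

def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

def token_saliency(prompt, query_llm, embed):
    """Score each whitespace-separated token in `prompt`.

    The saliency of a token is the distance between the embedding of the
    baseline response and the embedding of the response to the prompt with
    that token removed. Returns a list of (token, score) pairs.
    """
    tokens = prompt.split()
    baseline = embed(query_llm(prompt))          # step 1: baseline response
    scores = []
    for i in range(len(tokens)):
        masked = " ".join(tokens[:i] + tokens[i + 1:])  # step 2: mask one token
        response = embed(query_llm(masked))             # step 3a: re-query
        scores.append(cosine_distance(baseline, response))  # step 3b: compare
    return list(zip(tokens, scores))
```

In the real tool, `query_llm` would wrap a LiteLLM completion call and `embed` would wrap `SentenceTransformer.encode`; the scores then feed the color-gradient visualization of step 4.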

Installation

Prerequisites

  • Python 3.7+
  • pip (Python package manager)

Setup

  1. Clone this repository:

    git clone https://github.com/mozilla-ai/prompt-saliency.git
    cd prompt-saliency
  2. Create a virtual environment (recommended):

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install the required packages:

    pip install -r requirements.txt

Environment Setup

Setting up API Keys

This tool relies on LiteLLM so you can query any model supported by it. To use various LLM providers, you'll need to set up the appropriate API keys as environment variables:

For OpenAI models (default)

# On Linux/macOS
export OPENAI_API_KEY=your_openai_api_key

# On Windows (Command Prompt)
set OPENAI_API_KEY=your_openai_api_key

# On Windows (PowerShell)
$env:OPENAI_API_KEY="your_openai_api_key"

For other providers (Anthropic, Azure, etc.)

# Set the appropriate environment variables based on the provider
export ANTHROPIC_API_KEY=your_anthropic_api_key
export AZURE_API_KEY=your_azure_api_key
# etc.

Using local models

For locally hosted models (e.g. using Ollama):

# Example for using local models
python prompt_saliency.py "Your prompt here" --model ollama/gemma3:latest --api-base http://localhost:11434

Usage

Basic Usage

python prompt_saliency.py "The opposite of 'small' is (one word):"

This queries openai/gpt-4o by default, and uses the all-MiniLM-L6-v2 model locally to compute embeddings.

Advanced Options

python prompt_saliency.py "Your prompt here" \
  --model openai/gpt-4o \
  --embedding-model all-MiniLM-L6-v2 \
  --log DEBUG \
  --api-base https://your-api-endpoint

Command Line Arguments

Argument           Short  Description                                      Default
--model            -m     Model to send the prompt to                      openai/gpt-4o
--embedding-model  -e     Embedding model used for similarity comparison   all-MiniLM-L6-v2
--log              -l     Log level (DEBUG, INFO, WARNING, ERROR);         INFO
                          use DEBUG to see all requests and responses
--api-base         -a     API base URL for locally hosted models           None (LiteLLM defaults)

Example Results

A screenshot showing the output of the tool when the input prompt is "The opposite of 'small' is (ONE word):". After some logging, the output looks like the following: The[green] opposite[blue] of[blue] 'small'[red] is[light blue] (ONE[light blue] word):[green], with colors mapping to the following saliency ranges: blue 0.0~0.2, light blue 0.2~0.4, green 0.4~0.6, yellow 0.6~0.8, red 0.8~1.0.

A screenshot showing the output of the tool when the input prompt is "Write a C function that returns the factorial of an integer number (provide code only)". After some logging, the output looks like the following: Write[blue] a[blue] C[light blue] function[blue] that[blue] returns[blue] the[blue] factorial[yellow] of[blue] an[blue] integer[blue] number[blue] (provide[blue] code[blue] only)[light blue], with colors mapping to the following saliency ranges: blue 0.0~0.2, light blue 0.2~0.4, green 0.4~0.6, yellow 0.6~0.8, red 0.8~1.0.
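The score-to-color bucketing can be sketched as below. This is a guess at the mapping based on the ranges listed above, not the tool's actual implementation; plain ANSI escape codes stand in for the colorama constants the tool depends on.

```python
RESET = "\033[0m"

# Five equal-width saliency buckets, least to most important:
# (exclusive upper bound, ANSI color code)
BUCKETS = [
    (0.2, "\033[34m"),   # blue:       0.0~0.2
    (0.4, "\033[94m"),   # light blue: 0.2~0.4
    (0.6, "\033[32m"),   # green:      0.4~0.6
    (0.8, "\033[33m"),   # yellow:     0.6~0.8
    (1.01, "\033[31m"),  # red:        0.8~1.0 (bound > 1.0 so 1.0 lands here)
]

def colorize(token: str, score: float) -> str:
    """Wrap a token in the ANSI color for its saliency bucket."""
    for upper, color in BUCKETS:
        if score < upper:
            return f"{color}{token}{RESET}"
    return f"{BUCKETS[-1][1]}{token}{RESET}"  # anything above 1.0 stays red

def render(scored_tokens) -> str:
    """scored_tokens: iterable of (token, saliency) pairs."""
    return " ".join(colorize(tok, s) for tok, s in scored_tokens)
```

With this mapping, a token scored 0.9 prints red and a token scored 0.05 prints blue, matching the examples above.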

Supported Models

Inference Models

  • OpenAI models (default): openai/gpt-4o, openai/gpt-4-turbo, etc.
  • Mistral models: mistral/open-mistral-nemo, etc.
  • Ollama models: ollama/gemma3:latest, ollama/phi4:latest, ollama/gemma3:27b, etc.
  • Any other model supported by LiteLLM

Embedding Models

  • Default: all-MiniLM-L6-v2
  • Any model supported by SentenceTransformers

How to Interpret Results

  • Red tokens: Highly important - masking them significantly changes the model's response
  • Yellow/Green tokens: Moderately important
  • Blue tokens: Less important - masking them has minimal effect on the response

Troubleshooting

Common Issues

  1. API Key Errors:

    • Ensure you've set the correct environment variables for your chosen model provider
    • Check that your API key is valid and has not expired
  2. Model Access Issues:

    • Verify you have access to the model you are trying to use
    • For local models, ensure your API base URL is correct and the service is running
  3. Dependency Issues:

    • If you encounter dependency errors, try updating your packages:
      pip install --upgrade sentence-transformers litellm numpy colorama loguru
