
feat: add cpu/cuda config for prompt guard #2194


Open · wants to merge 1 commit into base: main

Conversation


@mhdawson mhdawson commented May 16, 2025

What does this PR do?

Previously, prompt guard was hard-coded to require CUDA, which prevented it from being used on an instance without CUDA support.

This PR allows prompt guard to be configured to use either CPU or CUDA.

Closes #2133

Test Plan

  1. Started the stack on a system without a GPU, with prompt guard configured as follows:

```yaml
safety:
  - provider_id: prompt-guard
    provider_type: inline::prompt-guard
    config:
      guard_execution_type: cpu
```

     and validated that prompt guard could be used through the APIs.

  2. Started the stack on a system without a GPU, with prompt guard configured as follows:

```yaml
safety:
  - provider_id: prompt-guard
    provider_type: inline::prompt-guard
    config:
      guard_execution_type: cuda
```

     and validated that it indicated it could not run because the packages were not compiled with CUDA support. This is the same behavior as before the change.

  3. Ran the unit tests as per https://github.com/meta-llama/llama-stack/blob/main/tests/unit/README.md
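The two test-plan configurations differ only in the `guard_execution_type` key; a minimal sketch of resolving that key into a device string (the helper name is hypothetical, not from the PR):

```python
def resolve_device(config: dict) -> str:
    # "guard_execution_type" is the key shown in the test-plan YAML;
    # defaulting to "cuda" preserves the pre-change behavior.
    device = config.get("guard_execution_type", "cuda")
    if device not in ("cpu", "cuda"):
        raise ValueError(f"unsupported guard_execution_type: {device!r}")
    return device
```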

Signed-off-by: Michael Dawson <[email protected]>
@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on May 16, 2025
```diff
@@ -75,7 +75,7 @@ def __init__(
         self.temperature = temperature
         self.threshold = threshold

-        self.device = "cuda"
+        self.device = self.config.guard_execution_type
```
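For `self.config.guard_execution_type` to exist, the provider's config class needs a corresponding field. A stdlib sketch under stated assumptions (the real project likely uses pydantic, and this class name is hypothetical):

```python
from dataclasses import dataclass


@dataclass
class PromptGuardConfig:
    # Mirrors the YAML key from the test plan; "cuda" keeps the old default.
    guard_execution_type: str = "cuda"

    def __post_init__(self) -> None:
        # Reject values other than the two the PR supports.
        if self.guard_execution_type not in ("cpu", "cuda"):
            raise ValueError(
                f"guard_execution_type must be 'cpu' or 'cuda', "
                f"got {self.guard_execution_type!r}"
            )
```

Validating at construction time means a bad YAML value fails when the stack starts rather than when the model is first loaded.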
Contributor: can we just check if cuda is available and use that, otherwise use CPU? no need for a specific configuration like this to be added.
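The reviewer's auto-detection suggestion can be sketched as follows (`torch.cuda.is_available()` is the standard PyTorch check; the helper name is hypothetical, and the import is guarded so the sketch also runs where torch is absent):

```python
def pick_device() -> str:
    # Prefer CUDA when the local torch build supports it; fall back to CPU.
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"
```

With this approach the `guard_execution_type` key would be unnecessary, at the cost of losing the ability to force CPU on a CUDA-capable machine.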

@ashwinb left a comment:
requesting changes for my inline comment

Successfully merging this pull request may close these issues.

Not possible to use CPU inferance with prompt-guard - intentional?