
feat: add cpu/cuda config for prompt guard #2194

Open · wants to merge 1 commit into main

llama_stack/providers/inline/safety/prompt_guard/config.py (14 additions, 0 deletions):
```diff
@@ -15,8 +15,14 @@ class PromptGuardType(Enum):
     jailbreak = "jailbreak"
 
 
+class PromptGuardExecutionType(Enum):
+    cpu = "cpu"
+    cuda = "cuda"
+
+
 class PromptGuardConfig(BaseModel):
     guard_type: str = PromptGuardType.injection.value
+    guard_execution_type: str = PromptGuardExecutionType.cuda.value
 
     @classmethod
     @field_validator("guard_type")
@@ -25,8 +31,16 @@ def validate_guard_type(cls, v):
             raise ValueError(f"Unknown prompt guard type: {v}")
         return v
 
+    @classmethod
+    @field_validator("guard_execution_type")
+    def validate_guard_execution_type(cls, v):
+        if v not in [t.value for t in PromptGuardExecutionType]:
+            raise ValueError(f"Unknown prompt guard execution type: {v}")
+        return v
+
     @classmethod
     def sample_run_config(cls, __distro_dir__: str, **kwargs: Any) -> dict[str, Any]:
         return {
             "guard_type": "injection",
+            "guard_execution_type": "cuda",
         }
```
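
For illustration, a minimal sketch of how the new field behaves once this PR is applied, assuming the import path shown in the diff header above:

```python
from llama_stack.providers.inline.safety.prompt_guard.config import (
    PromptGuardConfig,
    PromptGuardExecutionType,
)

# The default preserves the previously hard-coded behavior: CUDA.
cfg = PromptGuardConfig()
assert cfg.guard_execution_type == PromptGuardExecutionType.cuda.value

# Opt into CPU execution explicitly, e.g. on hosts without a GPU.
cpu_cfg = PromptGuardConfig(guard_execution_type="cpu")
assert cpu_cfg.guard_execution_type == "cpu"
```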
In the shield implementation (the second file in the diff), the hard-coded device is replaced by the new config value:

```diff
@@ -75,7 +75,7 @@ def __init__(
         self.temperature = temperature
         self.threshold = threshold
 
-        self.device = "cuda"
+        self.device = self.config.guard_execution_type
 
         # load model and tokenizer
         self.tokenizer = AutoTokenizer.from_pretrained(model_dir)
```

Contributor, commenting on the `self.device` change:

> Can we just check if CUDA is available and use that, otherwise use CPU? No need for a specific configuration like this to be added.
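
A minimal sketch of the reviewer's suggestion, using torch's built-in device detection; `resolve_device` is a hypothetical helper for illustration, not code from this PR:

```python
import torch


def resolve_device() -> str:
    """Prefer CUDA when a GPU is visible, otherwise fall back to CPU."""
    return "cuda" if torch.cuda.is_available() else "cpu"


# Inside __init__, instead of reading a config knob:
# self.device = resolve_device()
```

This would make the provider work out of the box on CPU-only hosts without adding a `guard_execution_type` option, at the cost of losing the ability to force CPU execution on a machine that does have a GPU.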