
Add run_moderation to the remote provider#21

Merged
m-misiura merged 3 commits into trustyai-explainability:main from
m-misiura:lls-0.2.19-no-openai-in-moderations
Sep 4, 2025

Conversation

@m-misiura
Collaborator

@m-misiura m-misiura commented Aug 29, 2025

What does this PR do?

With the changes in upstream llama stack >= 0.2.18, the provider must implement the run_moderation method or it will break (see this PR and this discussion).

To ensure backward compatibility (e.g. with llama stack == 0.2.14), the imports of ModerationObject and ModerationObjectResults are wrapped in a try-except statement.

Test plan

I added unit tests for run_moderation using mocks, and also tested manually against a live server.

Here is a run_moderation response from an inline provider (codeshield):

curl -X POST http://localhost:8321/v1/openai/v1/moderations \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["You dotard, I really hate this", "My email is test@email.com", "This is a test message"],
    "model": "code-scanner"
  }' | jq

{
  "id": "7cdca466-2ce5-4a16-93f3-840a0c6977f8",
  "model": "code-scanner",
  "results": [
    {
      "flagged": false,
      "categories": {},
      "category_applied_input_types": {},
      "category_scores": {},
      "user_message": null,
      "metadata": {}
    },
    {
      "flagged": false,
      "categories": {},
      "category_applied_input_types": {},
      "category_scores": {},
      "user_message": null,
      "metadata": {}
    },
    {
      "flagged": false,
      "categories": {},
      "category_applied_input_types": {},
      "category_scores": {},
      "user_message": null,
      "metadata": {}
    }
  ]
}

Here is a run_moderation response from the trustyai_fms provider:

curl -X POST http://localhost:8321/v1/openai/v1/moderations \
  -H "Content-Type: application/json" \
  -d '{
    "input": ["You dotard, I really hate this", "My email is test@email.com", "This is a test message"],
    "model": "composite_shield"
  }' | jq

{
  "id": "862c7d2a-14a8-4587-8f30-aebc8b996233",
  "model": "composite_shield",
  "results": [
    {
      "flagged": true,
      "categories": {
        "LABEL_1": true
      },
      "category_applied_input_types": {
        "LABEL_1": [
          "text"
        ]
      },
      "category_scores": {
        "LABEL_1": 0.9750116467475892
      },
      "user_message": "You dotard, I really hate this",
      "metadata": {
        "message_index": 0,
        "text": "You dotard, I really hate this",
        "status": "violation",
        "score": 0.9750116467475892,
        "detection_type": "LABEL_1",
        "individual_detector_results": [
          {
            "detector_id": "hap",
            "status": "violation",
            "score": 0.9750116467475892,
            "detection_type": "LABEL_1"
          },
          {
            "detector_id": "regex",
            "status": "pass",
            "score": null,
            "detection_type": null
          }
        ]
      }
    },
    {
      "flagged": true,
      "categories": {
        "pii": true
      },
      "category_applied_input_types": {
        "pii": [
          "text"
        ]
      },
      "category_scores": {
        "pii": 1.0
      },
      "user_message": "My email is test@email.com",
      "metadata": {
        "message_index": 1,
        "text": "My email is test@email.com",
        "status": "violation",
        "score": 1.0,
        "detection_type": "pii",
        "individual_detector_results": [
          {
            "detector_id": "hap",
            "status": "pass",
            "score": null,
            "detection_type": null
          },
          {
            "detector_id": "regex",
            "status": "violation",
            "score": 1.0,
            "detection_type": "pii"
          }
        ]
      }
    },
    {
      "flagged": false,
      "categories": {},
      "category_applied_input_types": {},
      "category_scores": {},
      "user_message": "This is a test message",
      "metadata": {
        "message_index": 2,
        "text": "This is a test message",
        "status": "pass",
        "score": null,
        "detection_type": null,
        "individual_detector_results": [
          {
            "detector_id": "hap",
            "status": "pass",
            "score": null,
            "detection_type": null
          },
          {
            "detector_id": "regex",
            "status": "pass",
            "score": null,
            "detection_type": null
          }
        ]
      }
    }
  ]
}
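The same request can be issued from Python with only the standard library (endpoint, payload, and model name are taken verbatim from the curl example above; the actual send is commented out since it requires a running server):

```python
import json
import urllib.request

payload = {
    "input": [
        "You dotard, I really hate this",
        "My email is test@email.com",
        "This is a test message",
    ],
    "model": "composite_shield",
}
req = urllib.request.Request(
    "http://localhost:8321/v1/openai/v1/moderations",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Requires a live llama stack server on localhost:8321:
# with urllib.request.urlopen(req) as resp:
#     print(json.dumps(json.load(resp), indent=2))
```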

Summary by Sourcery

Enable moderation support in the trustyai_fms provider by implementing run_moderation with backward compatibility fallbacks and accompanying unit tests

New Features:

  • Add run_moderation method to the remote provider to support content moderation using llama-stack shields

Enhancements:

  • Log a warning when moderation support is unavailable on older llama-stack versions
  • Introduce helper methods to map model names to shield IDs and convert input texts to message objects

Tests:

  • Add unit tests for run_moderation covering flagged results, error handling, empty inputs, and single-string inputs

Chores:

  • Bump package version to 0.2.1

@m-misiura m-misiura requested a review from ruivieira August 29, 2025 13:45
@sourcery-ai

sourcery-ai bot commented Aug 29, 2025

Reviewer's Guide

This PR adds moderation support to the remote safety provider by implementing a new run_moderation method with helper functions for shield resolution and input conversion, includes a warning for backward compatibility, bumps the package version, and adds comprehensive unit tests for the new moderation workflow.

Sequence diagram for the new moderation workflow in run_moderation

sequenceDiagram
    participant Client
    participant DetectorProvider
    participant ShieldsService
    participant Shield
    participant ModerationObject
    Client->>DetectorProvider: run_moderation(input, model)
    DetectorProvider->>DetectorProvider: _get_shield_id_from_model(model)
    DetectorProvider->>ShieldsService: list_shields()
    ShieldsService-->>DetectorProvider: shields_response
    DetectorProvider->>DetectorProvider: _convert_input_to_messages(input)
    DetectorProvider->>Shield: run_shield(shield_id, messages)
    Shield-->>DetectorProvider: shield_response
    DetectorProvider->>ModerationObject: Build ModerationObject with results
    DetectorProvider-->>Client: ModerationObject
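The sequence above can be sketched as runnable code. The control flow (resolve shield, convert inputs, call run_shield, assemble a moderation object) follows the diagram; the data classes and the toy run_shield are stand-ins, not the real llama-stack types or provider logic:

```python
import asyncio
import uuid
from dataclasses import dataclass, field


@dataclass
class Result:
    flagged: bool
    user_message: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Moderation:
    id: str
    model: str
    results: list


class DetectorProviderSketch:
    def __init__(self, model_to_shield: dict[str, str]):
        self._model_to_shield_id = model_to_shield  # cached mapping

    async def _get_shield_id_from_model(self, model: str) -> str:
        return self._model_to_shield_id[model]

    def _convert_input_to_messages(self, texts):
        return [texts] if isinstance(texts, str) else list(texts)

    async def run_shield(self, shield_id: str, messages: list):
        # Toy detector: flag any message containing "hate".
        return [{"flagged": "hate" in m, "user_message": m} for m in messages]

    async def run_moderation(self, texts, model: str) -> Moderation:
        shield_id = await self._get_shield_id_from_model(model)
        messages = self._convert_input_to_messages(texts)
        raw = await self.run_shield(shield_id, messages)
        results = [
            Result(flagged=r["flagged"], user_message=r["user_message"]) for r in raw
        ]
        return Moderation(id=str(uuid.uuid4()), model=model, results=results)


provider = DetectorProviderSketch({"composite_shield": "shield-1"})
mod = asyncio.run(
    provider.run_moderation(["I really hate this", "ok"], "composite_shield")
)
```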

Entity relationship diagram for ModerationObject and ModerationObjectResults

erDiagram
    MODERATION_OBJECT {
        string id
        string model
    }
    MODERATION_OBJECT_RESULTS {
        boolean flagged
        object categories
        object category_applied_input_types
        object category_scores
        string user_message
        object metadata
    }
    MODERATION_OBJECT ||--o{ MODERATION_OBJECT_RESULTS : contains

Class diagram for new and updated moderation-related types

classDiagram
    class DetectorProvider {
        +run_moderation(input: str | list[str], model: str): ModerationObject
        +_get_shield_id_from_model(model: str): str
        +_convert_input_to_messages(texts: str | list[str]): List[Message]
    }
    class ModerationObject {
        +id: str
        +model: str
        +results: List[ModerationObjectResults]
    }
    class ModerationObjectResults {
        +flagged: bool
        +categories: dict
        +category_applied_input_types: dict
        +category_scores: dict
        +user_message: str
        +metadata: dict
    }
    class UserMessage {
        +content: str
    }
    DetectorProvider --> ModerationObject
    ModerationObject --> ModerationObjectResults
    DetectorProvider --> UserMessage

File-Level Changes

Warn when moderation support is unavailable
  • Emit warning if llama-stack lacks ModerationObject support
llama_stack_provider_trustyai_fms/detectors/base.py
Extend DetectorProvider with moderation workflow
  • Add run_moderation method to fetch shield IDs, convert inputs, call run_shield, and map results
  • Cache model-to-shield IDs for performance
  • Handle exceptions by returning unflagged results with error metadata
llama_stack_provider_trustyai_fms/detectors/base.py
Introduce moderation helper methods
  • Implement _get_shield_id_from_model to resolve model names to shield identifiers
  • Implement _convert_input_to_messages to wrap strings into UserMessage objects
llama_stack_provider_trustyai_fms/detectors/base.py
Bump package version
  • Update project version from 0.2.0 to 0.2.1
pyproject.toml
Add unit tests for run_moderation
  • Cover flagged and non-flagged responses
  • Exercise error fallback behavior
  • Test handling of empty and single-string inputs
tests/test_moderation.py


Bandit code scanning flagged every assert in the new tests with the notice "Use of assert detected. The enclosed code will be removed when compiling to optimised byte code." (expected in test code, which is not run with optimizations). The flagged snippets, consolidated:

provider.run_shield = AsyncMock(return_value=FakeShieldResponse())

result = await provider.run_moderation(["bad message", "good message"], "test_model")
assert len(result.results) == 2
assert result.results[0].flagged is True
assert result.results[1].flagged is False
assert result.results[0].user_message == "bad message"
assert result.results[1].user_message == "good message"

provider._convert_input_to_messages = MagicMock(return_value=[MagicMock(content="msg")])

result = await provider.run_moderation(["msg"], "test_model")
assert len(result.results) == 1

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes - here's some feedback:

  • Avoid using input as a parameter name in run_moderation to prevent shadowing Python’s built‐in; consider renaming it to something like texts or inputs.
  • run_moderation currently calls list_shields on every invocation and does a linear scan of results_metadata for each message; consider caching shield_id per model and indexing results_metadata by message_index to improve performance.
  • The run_moderation method is quite large in base.py—extracting it into a dedicated helper or service class would improve readability and maintainability.
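One way to apply the caching and indexing suggestions (illustrative only; list_shields, _model_to_shield_id, and results_metadata mirror names used in the review, not the actual provider source):

```python
import asyncio


class ShieldIdCache:
    def __init__(self, list_shields):
        self._list_shields = list_shields  # async callable returning shield dicts
        self._model_to_shield_id: dict = {}

    async def shield_id_for(self, model: str) -> str:
        # Populate the cache once instead of calling list_shields per invocation.
        if model not in self._model_to_shield_id:
            for shield in await self._list_shields():
                self._model_to_shield_id[shield["model"]] = shield["id"]
        return self._model_to_shield_id[model]


def index_results_by_message(results_metadata: list) -> dict:
    """O(1) per-message lookup instead of a linear scan for every message."""
    return {m["message_index"]: m for m in results_metadata}


async def _demo():
    calls = []

    async def list_shields():
        calls.append(1)
        return [{"model": "composite_shield", "id": "shield-1"}]

    cache = ShieldIdCache(list_shields)
    a = await cache.shield_id_for("composite_shield")
    b = await cache.shield_id_for("composite_shield")
    return a, b, len(calls)


shield_a, shield_b, n_calls = asyncio.run(_demo())
```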


Comment on lines +4 to +5
@pytest.mark.asyncio
async def test_run_moderation_flagged():

suggestion (testing): Missing test for empty input and single string input edge cases.

Please add tests for empty list input and single string input to ensure run_moderation handles these cases correctly.

Comment on lines +1888 to +1891
if isinstance(input, str):
    inputs = [input]
else:
    inputs = input

suggestion (code-quality): Replace if statement with if expression (assign-if-exp)

Suggested change:
inputs = [input] if isinstance(input, str) else input

Collaborator Author


@sourcery-ai review


Sure! I'm generating a new review now.


Hey @m-misiura, I've posted a new review for you!

@m-misiura m-misiura requested a review from adolfo-ab August 29, 2025 15:21

Further Bandit notices of the same kind ("Use of assert detected...") cover the remaining assertions, consolidated:

result = await provider.run_moderation(["msg"], "test_model")
assert len(result.results) == 1
assert result.results[0].flagged is False
assert "fail" in result.results[0].metadata["error"]

provider._convert_input_to_messages = MagicMock(return_value=[])
provider.run_shield = AsyncMock()
result = await provider.run_moderation([], "test_model")
assert len(result.results) == 0

result = await provider.run_moderation("one message", "test_model")
assert len(result.results) == 1
assert result.results[0].user_message == "one message"
…on` and avoiding shadowing Python’s built-in function input inside the method body

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes - here's some feedback:

  • Instead of lazily creating _model_to_shield_id via hasattr checks, initialize that cache in the provider’s __init__ to make the code clearer and avoid repeated attribute lookups.
  • Catching all exceptions in run_moderation and hiding them in result metadata can make debugging harder—consider logging unexpected errors or narrowing the except clause to known failure modes.
  • The run_moderation method combines shield lookup, input conversion, and result assembly in one block—extracting the result-building logic into a helper would improve readability and maintainability.
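A sketch of the "log and narrow the except clause" suggestion. The error-in-metadata shape mirrors what the tests assert (metadata["error"]); the specific exception types chosen here are assumptions, not taken from the provider source:

```python
import asyncio
import logging

logger = logging.getLogger("trustyai_fms.moderation")


async def moderate_with_fallback(run_moderation, texts: list, model: str):
    try:
        return await run_moderation(texts, model)
    except (KeyError, ValueError) as exc:  # known failure modes, not bare Exception
        # Log before degrading to unflagged results, so failures stay debuggable.
        logger.warning("moderation failed for model %s: %s", model, exc)
        return [{"flagged": False, "metadata": {"error": str(exc)}} for _ in texts]


async def _failing(texts, model):
    raise KeyError("shield not found")


fallback = asyncio.run(moderate_with_fallback(_failing, ["msg"], "test_model"))
```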


Comment on lines +1806 to +1815
if _HAS_MODERATION:
    async def run_moderation(self, input: str | list[str], model: str) -> ModerationObject:
        """
        Runs moderation for each input message.
        Returns a ModerationObject with one ModerationObjectResults per input.
        """
        texts = input  # Avoid shadowing the built-in 'input'
        try:
            # Shield ID caching for performance
            if not hasattr(self, "_model_to_shield_id"):

issue: Conditional method definition may lead to missing attributes.

Since the presence of run_moderation depends on _HAS_MODERATION, code that expects this method may fail in environments where moderation is unavailable. To ensure a consistent class interface, define run_moderation unconditionally and raise NotImplementedError when moderation is not supported.
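A minimal sketch of that suggestion: define run_moderation unconditionally and raise when the moderation types are unavailable, so the class interface stays consistent across llama-stack versions. _HAS_MODERATION stands in for the flag set by the guarded import in base.py:

```python
import asyncio

_HAS_MODERATION = False  # e.g. on llama stack == 0.2.14


class DetectorProviderSketch:
    async def run_moderation(self, texts, model: str):
        if not _HAS_MODERATION:
            raise NotImplementedError(
                "run_moderation requires llama-stack >= 0.2.18 "
                "(ModerationObject unavailable)"
            )
        ...  # normal moderation workflow


try:
    asyncio.run(DetectorProviderSketch().run_moderation(["msg"], "test_model"))
    raised = False
except NotImplementedError:
    raised = True
```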

if result:
    cat = result.get("detection_type")
    score = result.get("score")
    if isinstance(cat, str) and score is not None:

suggestion: Only one category per message is supported.

Currently, only one detection_type and score are processed per message. If the API can return multiple categories, update the logic to handle all relevant categories.

Suggested implementation:

                    if result:
                        # Support multiple categories per message
                        detected_categories = result.get("categories")
                        detected_scores = result.get("scores")
                        detected_statuses = result.get("statuses")
                        # Fallback for single category format
                        if detected_categories and isinstance(detected_categories, dict):
                            for cat, status in detected_statuses.items():
                                score = detected_scores.get(cat)
                                if isinstance(cat, str) and score is not None:
                                    is_violation = status == "violation"
                                    categories[cat] = is_violation
                                    category_scores[cat] = float(score)
                                    category_applied_input_types[cat] = ["text"]
                                    if is_violation:
                                        flagged = True
                        else:
                            cat = result.get("detection_type")
                            score = result.get("score")
                            if isinstance(cat, str) and score is not None:
                                is_violation = result.get("status") == "violation"
                                categories[cat] = is_violation
                                category_scores[cat] = float(score)
                                category_applied_input_types[cat] = ["text"]
                                flagged = is_violation
                        meta = result
  • You may need to adjust the keys (categories, scores, statuses) to match the actual API response format if they differ.
  • If the API returns a list of category objects instead of dicts, iterate accordingly.
  • Ensure that the rest of the code (e.g., how ModerationObjectResults uses these dicts) supports multiple categories.

from llama_stack_provider_trustyai_fms.detectors.base import DetectorProvider

provider = DetectorProvider(detectors={})
provider._get_shield_id_from_model = AsyncMock(return_value="test_shield")

suggestion (testing): Consider adding a test for multiple shields found for a model.

Please add a test that triggers the multiple shields exception and verifies the error is correctly reflected in the moderation results metadata.

Suggested implementation:

import pytest
from unittest.mock import AsyncMock, MagicMock

class MultipleShieldsFoundError(Exception):
    pass

@pytest.mark.asyncio
async def test_run_moderation_multiple_shields_error():
    from llama_stack_provider_trustyai_fms.detectors.base import DetectorProvider

    provider = DetectorProvider(detectors={})
    # Simulate multiple shields found by raising the error
    provider._get_shield_id_from_model = AsyncMock(side_effect=MultipleShieldsFoundError("Multiple shields found for model"))
    provider._convert_input_to_messages = MagicMock(return_value=[
        MagicMock(content="test message")
    ])

    # Run moderation and check error in metadata
    result = await provider.run_moderation("test_model", "test input")
    assert result["metadata"]["error"] == "Multiple shields found for model"

@pytest.mark.asyncio
async def test_run_moderation_flagged():
  • If MultipleShieldsFoundError is defined elsewhere in your codebase, import it instead of defining it in the test file.
  • Ensure that provider.run_moderation correctly catches the exception and sets the error in result["metadata"]["error"]. If not, you may need to update the implementation to handle this case.

)
)
if _HAS_MODERATION:
    async def run_moderation(self, input: str | list[str], model: str) -> ModerationObject:

issue (code-quality): We've found these issues:


Explanation

The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

  • Reduce the function length by extracting pieces of functionality out into
    their own functions. This is the most important thing you can do - ideally a
    function should be less than 10 lines.
  • Reduce nesting, perhaps by introducing guard clauses to return early.
  • Ensure that variables are tightly scoped, so that code using related concepts
    sits together within the function rather than being scattered.

Comment on lines +1898 to +1901
if isinstance(texts, str):
    inputs = [texts]
else:
    inputs = texts

suggestion (code-quality): Replace if statement with if expression (assign-if-exp)

Suggested change:
inputs = [texts] if isinstance(texts, str) else texts

@ruivieira
Member

@m-misiura m-misiura merged commit 923324f into trustyai-explainability:main Sep 4, 2025
5 of 7 checks passed