
feat(ts-sdk): add trust score persistence #585

Merged
imran-siddique merged 6 commits into microsoft:main from imran-siddique:feat/ts-sdk-parity
Mar 29, 2026


@imran-siddique
Member

Addresses #577 - TypeScript SDK parity. 98 lines, 153 tests passing (4 new).

Add persistPath option to TrustConfig that enables saving trust scores
to a JSON file. Scores are loaded on construction and saved after each
recordSuccess/recordFailure call. Missing or corrupt files are handled
gracefully.
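
The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `TrustConfig`, `persistPath`, `recordSuccess`, and `recordFailure` come from the description, while the `AgentTrustState` fields, the `getScore` accessor, and the scoring formula are assumptions made only for the example.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Field names are assumed for illustration; the real AgentTrustState may differ.
interface AgentTrustState {
  successes: number;
  failures: number;
  score: number;
}

interface TrustConfig {
  persistPath?: string; // when omitted, scores live in memory only
}

class TrustManager {
  private readonly agents = new Map<string, AgentTrustState>();
  private readonly persistPath: string | undefined;

  constructor(config: TrustConfig = {}) {
    this.persistPath = config.persistPath;
    this.loadFromDisk(); // scores are loaded on construction
  }

  recordSuccess(agentId: string): void {
    this.update(agentId, true);
  }

  recordFailure(agentId: string): void {
    this.update(agentId, false);
  }

  // Hypothetical accessor, used here only to observe the stored score.
  getScore(agentId: string): number | undefined {
    return this.agents.get(agentId)?.score;
  }

  private update(agentId: string, ok: boolean): void {
    const state = this.agents.get(agentId) ?? { successes: 0, failures: 0, score: 0 };
    if (ok) state.successes += 1;
    else state.failures += 1;
    // Placeholder formula: fraction of successful interactions.
    state.score = state.successes / (state.successes + state.failures);
    this.agents.set(agentId, state);
    this.saveToDisk(); // saved after every recordSuccess/recordFailure call
  }

  private saveToDisk(): void {
    if (!this.persistPath) return;
    try {
      const data: Record<string, AgentTrustState> = {};
      this.agents.forEach((state, id) => { data[id] = state; });
      fs.writeFileSync(this.persistPath, JSON.stringify(data), 'utf-8');
    } catch {
      // best-effort: ignore write errors
    }
  }

  private loadFromDisk(): void {
    if (!this.persistPath) return;
    try {
      const raw = fs.readFileSync(this.persistPath, 'utf-8');
      const data = JSON.parse(raw) as Record<string, AgentTrustState>;
      for (const [id, state] of Object.entries(data)) this.agents.set(id, state);
    } catch {
      // best-effort: a missing or corrupt file leaves the manager empty
    }
  }
}

// Usage: a second instance pointed at the same file sees the earlier score.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'trust-'));
const file = path.join(dir, 'scores.json');
const first = new TrustManager({ persistPath: file });
first.recordSuccess('agent-a');
first.recordFailure('agent-a');
const restoredScore = new TrustManager({ persistPath: file }).getScore('agent-a');

// A corrupt file is tolerated: the manager simply starts empty.
fs.writeFileSync(file, '{not json', 'utf-8');
const scoreAfterCorruption = new TrustManager({ persistPath: file }).getScore('agent-a');
```

Because every update rewrites the whole file synchronously, this shape is simple but trades write throughput for durability.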

Closes microsoft#577

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions github-actions bot added the tests, agent-mesh (agent-mesh package), and size/M (Medium PR, < 200 lines) labels Mar 29, 2026

@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Review of feat(ts-sdk): add trust score persistence

This PR introduces trust score persistence functionality to the TrustManager class in the TypeScript SDK. The feature allows trust scores to be saved to and loaded from a file, ensuring they persist across application restarts. The implementation includes changes to the TrustManager class, updates to the TrustConfig interface, and new tests for the persistence functionality.


🔴 CRITICAL

  1. Unencrypted Storage of Sensitive Data:

    • The trust scores are being stored in plaintext in a file. If the file is accessed by an unauthorized user, it could lead to a compromise of sensitive trust data.
    • Recommendation: Encrypt the trust score data before writing it to disk. Use a secure encryption algorithm (e.g., AES-256) and ensure the encryption key is securely managed (e.g., via a secure key management system or environment variables).
  2. Lack of File Permissions Management:

    • The file is created without explicitly setting restrictive permissions. This could allow unauthorized users on the same system to access or modify the file.
    • Recommendation: Use fs.writeFileSync with the mode option to set restrictive permissions (e.g., 0o600 for read/write access only to the owner). Similarly, verify and set permissions when reading the file.
  3. Error Handling in Persistence:

    • The saveToDisk and loadFromDisk methods silently ignore all errors. This could lead to undetected failures in saving or loading trust scores, potentially causing data loss or inconsistent behavior.
    • Recommendation: Log errors when file operations fail, and consider exposing these errors to the caller (e.g., via events or return values) so they can handle them appropriately.
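
The file-permission recommendation in item 2 could look roughly like the sketch below. `writeOwnerOnly` is a hypothetical helper, not part of the SDK. One detail worth noting: the `mode` option of `fs.writeFileSync` only takes effect when the file is created, so the sketch also calls `fs.chmodSync` to tighten a file that already exists. On Windows, POSIX permission bits are largely ignored.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper: write a file readable and writable by its owner only.
function writeOwnerOnly(filePath: string, contents: string): void {
  // `mode` applies only if this call creates the file...
  fs.writeFileSync(filePath, contents, { encoding: 'utf-8', mode: 0o600 });
  // ...so also tighten a file that already existed with looser permissions.
  fs.chmodSync(filePath, 0o600);
}

// Usage: start from a world-readable file and confirm it ends up owner-only.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'trust-perm-'));
const file = path.join(dir, 'scores.json');
fs.writeFileSync(file, '{}', { mode: 0o644 });
writeOwnerOnly(file, JSON.stringify({ 'agent-a': { score: 1 } }));
const permissionBits = fs.statSync(file).mode & 0o777;
```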

🟡 WARNING

  1. Breaking Change in TrustConfig:
    • The persistPath property was added to the TrustConfig interface. While this is an optional property, it could still cause issues for existing consumers of the SDK if they are using strict type checking or if they extend the TrustConfig interface.
    • Recommendation: Clearly document this addition in the changelog and release notes. Consider versioning the SDK appropriately (e.g., a minor version bump if following semantic versioning).

💡 SUGGESTIONS

  1. Concurrency and Thread Safety:

    • The TrustManager class uses a Map to store agent trust states, but there is no explicit handling of concurrent access. If the class is used in a multi-threaded or multi-process environment, this could lead to race conditions or data corruption.
    • Recommendation: Use a locking mechanism (e.g., Mutex) to ensure thread-safe access to the agents map and file operations.
  2. Atomic File Writes:

    • The saveToDisk method writes directly to the file, which could result in a partially written or corrupted file if the process is interrupted.
    • Recommendation: Write to a temporary file first, then atomically rename it to the target file. This ensures that the file is either fully written or not modified at all.
  3. Validation of Loaded Data:

    • The loadFromDisk method assumes that the data in the file matches the AgentTrustState structure. If the file contains unexpected or malicious data, it could lead to runtime errors or undefined behavior.
    • Recommendation: Validate the loaded data against the expected structure (e.g., using a schema validation library or manual checks).
  4. Test Coverage for Edge Cases:

    • The tests cover basic scenarios (e.g., missing or corrupt files), but additional edge cases could be tested:
      • Large files with many agents.
      • Concurrent access to the same file by multiple TrustManager instances.
      • File system errors (e.g., insufficient permissions, disk full).
    • Recommendation: Add tests for these scenarios to ensure robustness.
  5. Documentation:

    • The new persistPath feature is not documented in the code comments or external documentation.
    • Recommendation: Update the SDK documentation to explain how to use the persistPath feature, including any security considerations (e.g., file permissions, encryption).
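
The write-then-rename approach from suggestion 2 might be sketched as follows. `writeFileAtomic` is a hypothetical helper name. The sketch assumes a POSIX-style filesystem where `rename` within one directory replaces the target atomically, and it does not call `fsync`, so it guards against interrupted writes rather than power loss.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical helper: readers see either the old file or the new one, never a partial write.
function writeFileAtomic(target: string, contents: string): void {
  // Keep the temp file in the same directory so the rename stays on one filesystem.
  const tmp = `${target}.${process.pid}.tmp`;
  try {
    fs.writeFileSync(tmp, contents, { encoding: 'utf-8', mode: 0o600 });
    fs.renameSync(tmp, target); // replaces the target in a single step
  } catch (err) {
    fs.rmSync(tmp, { force: true }); // do not leave a stray temp file behind
    throw err;
  }
}

// Usage: two consecutive saves leave only the final contents and no temp files.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'trust-atomic-'));
const file = path.join(dir, 'scores.json');
writeFileAtomic(file, 'first');
writeFileAtomic(file, 'second');
const finalContents = fs.readFileSync(file, 'utf-8');
const entries = fs.readdirSync(dir);
```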

Summary of Actions

Critical

  • Encrypt trust score data before writing to disk.
  • Set restrictive file permissions when creating the persistence file.
  • Improve error handling in saveToDisk and loadFromDisk.

Warning

  • Document the addition of persistPath in the changelog and release notes.

Suggestions

  • Ensure thread safety for agents map and file operations.
  • Use atomic file writes to prevent partial writes.
  • Validate loaded data against the expected structure.
  • Expand test coverage for edge cases.
  • Update SDK documentation to include the new feature.

By addressing these issues, the trust score persistence feature can be made more secure, robust, and user-friendly. Let me know if you need further clarification or assistance!

@github-actions

github-actions bot commented Mar 29, 2026

🤖 AI Agent: security-scanner

Security Review of PR: feat(ts-sdk): add trust score persistence

This PR introduces a feature to persist trust scores in the TypeScript SDK of the microsoft/agent-governance-toolkit. While the feature is useful, it introduces several potential security risks that need to be addressed.


Findings

🔴 CRITICAL: Deserialization Attack Vector

  • Issue: The loadFromDisk method directly parses JSON data from a file using JSON.parse without validating its structure or content. This introduces a potential deserialization attack vector where a maliciously crafted JSON file could inject unexpected data or exploit the system.
  • Attack Vector: An attacker who can write to the persistPath location could replace the JSON file with a crafted payload. JSON.parse does not execute code by itself, but unvalidated content could still inject attacker-chosen trust states or cause runtime errors, depending on how the data is used.
  • Fix: Validate the structure and content of the parsed JSON data before using it. For example:
    private loadFromDisk(): void {
      if (!this.persistPath) return;
      try {
        const raw = fs.readFileSync(this.persistPath, 'utf-8');
        const data: unknown = JSON.parse(raw);
        if (this.isValidTrustState(data)) {
          for (const [key, value] of Object.entries(data)) {
            this.agents.set(key, value);
          }
        }
      } catch {
        // best-effort: ignore missing or corrupt files
      }
    }
    
    private isValidTrustState(data: unknown): data is Record<string, AgentTrustState> {
      // Reject null, arrays, and primitives: `typeof null` is also 'object'
      if (typeof data !== 'object' || data === null || Array.isArray(data)) return false;
      // Every entry must be a non-null object with numeric fields
      return Object.values(data).every(value =>
        typeof value === 'object' && value !== null &&
        typeof value.successes === 'number' &&
        typeof value.failures === 'number' &&
        typeof value.score === 'number' &&
        typeof value.lastUpdate === 'number'
      );
    }

🟠 HIGH: Credential Exposure via Persistence

  • Issue: The persistPath file may contain sensitive trust-related data, such as agent identities and their trust scores. If this file is not adequately protected, it could be read or tampered with by unauthorized users.
  • Attack Vector: An attacker with access to the file system could read or modify the trust scores, potentially undermining the trust model or gaining unauthorized access to sensitive data.
  • Fix:
    • Ensure the file is stored in a secure location with restricted permissions (e.g., chmod 600).
    • Use encryption to protect the contents of the file. For example:
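      // Assumes this.config.encryptionKey is a 32-byte key (required by aes-256-cbc).
      private encrypt(data: string): string {
        // crypto.createCipher is deprecated; use createCipheriv with a fresh random IV
        const iv = crypto.randomBytes(16);
        const cipher = crypto.createCipheriv('aes-256-cbc', this.config.encryptionKey, iv);
        // Prefix the ciphertext with the IV (32 hex characters) so decrypt can recover it
        return iv.toString('hex') + cipher.update(data, 'utf8', 'hex') + cipher.final('hex');
      }
      
      private decrypt(data: string): string {
        const iv = Buffer.from(data.slice(0, 32), 'hex');
        const decipher = crypto.createDecipheriv('aes-256-cbc', this.config.encryptionKey, iv);
        return decipher.update(data.slice(32), 'hex', 'utf8') + decipher.final('utf8');
      }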

🟠 HIGH: Race Condition in File Access

  • Issue: The saveToDisk and loadFromDisk methods perform file operations without any locking mechanism. This could lead to race conditions if multiple instances of TrustManager attempt to read/write the same file simultaneously.
  • Attack Vector: Concurrent access to the file could corrupt the data or cause partial writes, leading to inconsistent trust scores.
  • Fix: Use file locks to ensure atomic read/write operations. For example, use a library like proper-lockfile to manage file locks:
    import * as lockfile from 'proper-lockfile';
    
    private saveToDisk(): void {
      if (!this.persistPath) return;
      try {
        // realpath: false allows the lock to be taken before the file exists (first save)
        const release = lockfile.lockSync(this.persistPath, { realpath: false });
        try {
          const data: Record<string, AgentTrustState> = {};
          for (const [key, value] of this.agents) {
            data[key] = value;
          }
          fs.writeFileSync(this.persistPath, JSON.stringify(data), 'utf-8');
        } finally {
          release(); // always release the lock, even if the write throws
        }
      } catch {
        // best-effort: ignore lock and write errors
      }
    }

🟡 MEDIUM: Error Handling and Logging

  • Issue: The saveToDisk and loadFromDisk methods silently ignore errors. While this is a "best-effort" approach, it could mask critical issues, such as file system corruption or permission errors.
  • Attack Vector: If the persistence mechanism fails silently, users may operate under the false assumption that trust scores are being persisted, leading to potential security gaps.
  • Fix: Log errors (without exposing sensitive information) to provide visibility into failures. For example:
    private saveToDisk(): void {
      if (!this.persistPath) return;
      try {
        const data: Record<string, AgentTrustState> = {};
        for (const [key, value] of this.agents) {
          data[key] = value;
        }
        fs.writeFileSync(this.persistPath, JSON.stringify(data), 'utf-8');
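      } catch (error) {
        // `error` is `unknown` under strict settings, so narrow it before reading .message
        const message = error instanceof Error ? error.message : String(error);
        console.error('Failed to save trust scores to disk:', message);
      }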
    }

🔵 LOW: Optional Persistence Path

  • Issue: The persistPath is optional, which is fine, but there is no explicit warning or documentation about the implications of not setting it.
  • Attack Vector: Users may unintentionally omit the persistPath, leading to non-persistent trust scores without realizing it.
  • Fix: Add a warning or documentation note to clarify the behavior when persistPath is not set.

Summary of Findings

| Finding | Severity | Fix |
|---|---|---|
| Deserialization attack vector | 🔴 CRITICAL | Validate JSON structure before using it. |
| Credential exposure via persistence | 🟠 HIGH | Encrypt the file contents and restrict file permissions. |
| Race condition in file access | 🟠 HIGH | Use file locks to ensure atomic read/write operations. |
| Error handling and logging | 🟡 MEDIUM | Log errors to provide visibility into failures. |
| Optional persistence path lacks documentation | 🔵 LOW | Add documentation or warnings about the implications of not setting persistPath. |

Recommendations

  1. Address the critical deserialization issue immediately by validating the JSON structure.
  2. Implement encryption and file locking to secure the persistence mechanism.
  3. Improve error handling and logging to ensure visibility into failures.
  4. Update documentation to clarify the behavior of the persistPath option.

This PR introduces a useful feature, but the identified issues must be addressed to ensure the security and reliability of the TrustManager class.


@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces trust score persistence to the TrustManager class in the TypeScript SDK. The feature allows trust scores to be saved to and loaded from a file specified by a persistPath configuration option. The implementation includes methods for saving and loading trust scores and corresponding unit tests to validate the functionality.

While the implementation is generally solid and well-tested, there are a few areas of concern and improvement opportunities, particularly around security, error handling, and potential edge cases.


🔴 CRITICAL

  1. Unencrypted Trust Score Persistence

    • Issue: The trust scores are being saved to disk in plaintext JSON format. This could expose sensitive information about agents if the file is accessed by unauthorized users.
    • Impact: This could lead to unauthorized access or tampering with trust scores, potentially compromising the integrity of the trust system.
    • Recommendation: Encrypt the trust score data before writing it to disk. Use a secure encryption algorithm (e.g., AES-256) and ensure the encryption key is securely managed (e.g., via environment variables or a secure key management system).
  2. Lack of File Permission Management

    • Issue: The file is written to disk without explicitly setting restrictive file permissions. This could allow other processes or users on the same system to read or modify the file.
    • Impact: Unauthorized access to the trust score file could compromise the security of the system.
    • Recommendation: Use fs.writeFileSync with a mode argument to set restrictive file permissions (e.g., 0o600 for read/write access only to the file owner).
  3. No Integrity Check for Loaded Data

    • Issue: The loadFromDisk method does not verify the integrity of the loaded data. A malicious actor could tamper with the file to inject invalid or malicious data.
    • Impact: This could lead to incorrect trust scores or even a denial-of-service attack if the data is malformed.
    • Recommendation: Implement a mechanism to verify the integrity of the file, such as using a cryptographic hash or digital signature.

🟡 WARNING

  1. Breaking Change in TrustConfig
    • Issue: The TrustConfig interface has been modified to include a new optional property, persistPath. While this is backward-compatible in TypeScript (due to the optional nature), it could cause runtime issues if existing configurations are passed to the TrustManager without accounting for this new property.
    • Impact: Potential runtime errors or unexpected behavior in downstream consumers of the SDK.
    • Recommendation: Clearly document this change in the release notes and ensure that downstream consumers are aware of the new configuration option.

💡 SUGGESTIONS

  1. Error Logging

    • Observation: The saveToDisk and loadFromDisk methods silently ignore errors (e.g., file write or read failures).
    • Recommendation: Log errors using a logging library or mechanism to aid in debugging and monitoring. This will help identify issues during runtime without failing silently.
  2. Concurrency Safety

    • Observation: The saveToDisk and loadFromDisk methods are not thread-safe. Concurrent access to the agents map or the file could lead to race conditions or data corruption.
    • Recommendation: Use a locking mechanism or atomic file operations to ensure thread safety when reading from or writing to the file.
  3. Test Coverage

    • Observation: The tests cover basic scenarios, such as persistence across instances, handling missing or corrupt files, and the absence of a persistPath. However, edge cases like concurrent access to the same file or large-scale data (e.g., thousands of agents) are not tested.
    • Recommendation: Add tests for concurrent access to the same persistPath and for handling large datasets to ensure the implementation scales and handles edge cases gracefully.
  4. Validation of Loaded Data

    • Observation: The loadFromDisk method does not validate the structure of the loaded data.
    • Recommendation: Add validation to ensure that the loaded data conforms to the AgentTrustState interface. This can prevent runtime errors caused by malformed or unexpected data.
  5. Documentation

    • Observation: The new persistPath option is not documented in the code comments or any accompanying documentation.
    • Recommendation: Update the documentation to explain the purpose of persistPath, how to use it, and any security considerations (e.g., file permissions, encryption).
  6. Error Handling Granularity

    • Observation: The try-catch blocks in saveToDisk and loadFromDisk catch all errors but do not differentiate between recoverable and non-recoverable errors.
    • Recommendation: Differentiate between recoverable errors (e.g., file not found) and non-recoverable errors (e.g., permission denied) to handle them appropriately.
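
One way to read suggestion 6 is sketched below: treat a missing file (`ENOENT`) and unparseable JSON as recoverable, and let other I/O errors such as `EACCES` propagate to the caller. `loadScores`, `readIfPresent`, and the `LoadResult` type are hypothetical names chosen for the example.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

type LoadResult =
  | { kind: 'loaded'; data: unknown }
  | { kind: 'missing' }
  | { kind: 'corrupt' };

// Returns undefined when the file does not exist; rethrows every other I/O error.
function readIfPresent(filePath: string): string | undefined {
  try {
    return fs.readFileSync(filePath, 'utf-8');
  } catch (err) {
    // ENOENT is expected on first run, before anything has been saved.
    if ((err as NodeJS.ErrnoException).code === 'ENOENT') return undefined;
    throw err; // e.g. EACCES: surface it instead of hiding it
  }
}

function loadScores(filePath: string): LoadResult {
  const raw = readIfPresent(filePath);
  if (raw === undefined) return { kind: 'missing' };
  try {
    return { kind: 'loaded', data: JSON.parse(raw) };
  } catch {
    return { kind: 'corrupt' }; // recoverable: caller can start from an empty state
  }
}

// Usage: the same path reports three different outcomes over its lifetime.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'trust-load-'));
const file = path.join(dir, 'scores.json');
const beforeWrite = loadScores(file).kind;
fs.writeFileSync(file, '{not json', 'utf-8');
const afterCorruption = loadScores(file).kind;
fs.writeFileSync(file, '{"agent-a":{"score":1}}', 'utf-8');
const afterValidWrite = loadScores(file).kind;
```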

Summary of Changes Needed

Security

  • 🔴 Encrypt trust score data before saving to disk.
  • 🔴 Set restrictive file permissions when writing the file.
  • 🔴 Add integrity checks for loaded data.

Backward Compatibility

  • 🟡 Document the new persistPath property in TrustConfig and notify downstream consumers.

Improvements

  • 💡 Add error logging for saveToDisk and loadFromDisk.
  • 💡 Ensure thread safety for file operations and agents map access.
  • 💡 Add tests for concurrent access and large datasets.
  • 💡 Validate the structure of loaded data.
  • 💡 Differentiate between recoverable and non-recoverable errors in try-catch blocks.

By addressing these issues and suggestions, the feature can be made more robust, secure, and reliable. Let me know if you need further clarification or assistance!


@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Review Summary

This pull request introduces trust score persistence to the TrustManager class in the TypeScript SDK. The implementation includes methods to save and load trust scores to/from a file, and corresponding tests to validate the functionality. While the implementation is generally well-structured and the tests are comprehensive, there are a few areas that require attention to ensure security, correctness, and maintainability.


🔴 CRITICAL

  1. Unvalidated File Path (Potential Path Traversal Vulnerability):

    • The persistPath is directly used in fs.writeFileSync and fs.readFileSync without validation. This could lead to path traversal vulnerabilities if an attacker can control the persistPath value.
    • Action: Validate the persistPath to ensure it is within an expected directory or follows a specific pattern. For example:
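      import * as path from 'path';
      const baseDir = path.resolve('/expected/base/directory');
      const resolvedPath = path.resolve(baseDir, persistPath);
      // Compare against baseDir plus a separator; a bare startsWith(baseDir)
      // would also accept siblings such as '/expected/base/directory-evil'
      if (!resolvedPath.startsWith(baseDir + path.sep)) {
        throw new Error('Invalid persistPath: Path traversal detected');
      }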
  2. Unencrypted Storage of Sensitive Data:

    • Trust scores are being written to disk in plaintext. If the file is accessed by unauthorized users, it could compromise the trust model.
    • Action: Encrypt the data before writing it to disk. Use a secure encryption library (e.g., crypto module) and ensure proper key management. Example:
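      import * as crypto from 'crypto';
      // createCipher is deprecated; use createCipheriv with a 32-byte key and a fresh IV
      const iv = crypto.randomBytes(16);
      const cipher = crypto.createCipheriv('aes-256-cbc', encryptionKey, iv);
      // Store the IV in front of the ciphertext so it can be read back for decryption
      const encrypted = Buffer.concat([iv, cipher.update(JSON.stringify(data), 'utf8'), cipher.final()]);
      fs.writeFileSync(this.persistPath, encrypted);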
  3. Error Handling in Persistence Methods:

    • The saveToDisk and loadFromDisk methods silently ignore errors. This could lead to undetected failures in saving/loading trust scores, potentially causing security issues or data loss.
    • Action: Log errors or provide a mechanism to surface them. For example:
      try {
        fs.writeFileSync(this.persistPath, JSON.stringify(data), 'utf-8');
      } catch (error) {
        console.error('Failed to save trust scores:', error);
      }

🟡 WARNING

  1. Potential Breaking Change in TrustConfig:
    • The TrustConfig interface now includes an optional persistPath property. If existing consumers of the SDK are using TrustConfig and are not expecting this property, it could lead to unexpected behavior.
    • Action: Document this change clearly in the release notes and ensure that it is backward-compatible. Consider adding runtime validation to ensure that existing configurations without persistPath continue to work as expected.

💡 SUGGESTIONS

  1. Concurrency and Thread Safety:

    • The TrustManager class uses a Map to store agent trust states. If the TrustManager is used in a multi-threaded or asynchronous context, concurrent access to the agents map could lead to race conditions.
    • Action: Use a synchronization mechanism (e.g., Mutex or Lock) to ensure thread-safe access to the agents map and the saveToDisk/loadFromDisk methods.
  2. Test Coverage for Edge Cases:

    • While the tests cover basic scenarios (e.g., missing or corrupt files), additional edge cases should be tested:
      • What happens if the file is locked by another process?
      • What happens if the disk is full or the file system is read-only?
    • Action: Add tests to simulate these scenarios and ensure the TrustManager behaves as expected.
  3. Documentation:

    • The new persistPath feature should be documented in the SDK's README or API documentation. Include examples of how to use it and any security considerations (e.g., encrypting the file, setting appropriate file permissions).
  4. File Permissions:

    • The file created for persistence should have restrictive permissions to prevent unauthorized access.
    • Action: Use fs.chmodSync to set permissions (e.g., 0o600 for owner read/write only).
  5. Decouple Persistence Logic:

    • The persistence logic is tightly coupled with the TrustManager class. This could make it harder to test or extend in the future.
    • Action: Consider abstracting the persistence logic into a separate class or module (e.g., FilePersistence) that the TrustManager can use.

Final Assessment

  • The feature is a valuable addition to the SDK and aligns with the goal of improving trust score management.
  • However, the current implementation has critical security issues (path traversal, lack of encryption) that must be addressed before merging.
  • Addressing the concurrency concerns and improving test coverage will further enhance the robustness of the implementation.

Please address the critical issues and consider the suggestions before merging this pull request.

@imran-siddique imran-siddique enabled auto-merge (squash) March 29, 2026 16:37

@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Review Summary

This PR introduces trust score persistence to the TrustManager class in the TypeScript SDK. It allows trust scores to be saved to and loaded from a file, ensuring continuity across application restarts. The implementation includes new methods (saveToDisk and loadFromDisk) and updates to the TrustManager class to handle persistence. Unit tests are provided to validate the functionality.

While the feature is well-implemented and tested, there are some areas of concern and opportunities for improvement.


🔴 CRITICAL

  1. Unencrypted Trust Data Storage:

    • The trust scores are being stored in plaintext JSON files. This could expose sensitive data if the file is accessed by unauthorized users.
    • Recommendation: Encrypt the trust data before writing it to disk. Use a secure encryption library (e.g., AES encryption) and ensure the encryption key is securely managed (e.g., via environment variables or a secure key management service).
  2. No File Permission Hardening:

    • The persistPath file is created without explicitly setting restrictive file permissions. This could allow unauthorized users on the same system to read or modify the file.
    • Recommendation: Use fs.writeFileSync with the mode option to set restrictive permissions (e.g., 0o600 for owner-only read/write access).

🟡 WARNING

  1. Backward Compatibility:
    • The addition of the persistPath property to the TrustConfig interface is a potential breaking change for users who rely on strict type checking. Although it's optional, existing code that uses TrustConfig may need to be updated to handle this new property.
    • Recommendation: Clearly document this change in the release notes and consider bumping the minor version to indicate the addition of new functionality.

💡 SUGGESTIONS

  1. Error Handling for File Operations:

    • The saveToDisk and loadFromDisk methods silently ignore errors during file operations. While this is acceptable for a best-effort approach, it may lead to silent failures that are difficult to debug.
    • Recommendation: Log errors using a logging framework or provide a callback mechanism to notify the user of failures.
  2. Concurrency Concerns:

    • The saveToDisk and loadFromDisk methods do not handle concurrent access to the persistPath file. If multiple instances of TrustManager are running simultaneously, they could overwrite each other's changes or corrupt the file.
    • Recommendation: Implement file locking or use an atomic write library to prevent race conditions.
  3. Data Validation:

    • The loadFromDisk method assumes that the JSON file contains valid AgentTrustState objects. If the file is corrupted or tampered with, this could lead to runtime errors or incorrect behavior.
    • Recommendation: Validate the parsed data against the expected structure before using it. Consider using a schema validation library or manually check the properties.
  4. Test Coverage:

    • While the tests cover the main scenarios, they do not test edge cases like:
      • Concurrent access to the persistPath file.
      • Large numbers of agents being persisted and loaded.
      • Performance impact of frequent saveToDisk calls.
    • Recommendation: Add tests for these scenarios to ensure robustness.
  5. Documentation:

    • The new persistPath option in TrustConfig is not documented in the code comments or the PR description.
    • Recommendation: Update the documentation to explain how to use this feature, including any security considerations (e.g., file permissions, encryption).
  6. Decoupling Persistence Logic:

    • The persistence logic is tightly coupled with the TrustManager class. This makes it harder to test and extend.
    • Recommendation: Consider abstracting the persistence logic into a separate class or interface (e.g., TrustPersistence) that the TrustManager can use. This would also make it easier to support alternative storage mechanisms (e.g., databases, cloud storage) in the future.
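
The decoupling in suggestion 6 could take a shape like the sketch below. `TrustPersistence` is the name proposed in the suggestion; the `load`/`save` methods, `FilePersistence`, `InMemoryPersistence`, and the simplified `ScoreBook` stand-in for `TrustManager` are all assumptions for illustration.

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

type Scores = Record<string, number>;

// Storage contract; method names are assumed, not taken from the SDK.
interface TrustPersistence {
  load(): Scores;
  save(scores: Scores): void;
}

class FilePersistence implements TrustPersistence {
  private readonly filePath: string;
  constructor(filePath: string) {
    this.filePath = filePath;
  }
  load(): Scores {
    try {
      return JSON.parse(fs.readFileSync(this.filePath, 'utf-8')) as Scores;
    } catch {
      return {}; // best-effort: missing or corrupt file means no saved scores
    }
  }
  save(scores: Scores): void {
    fs.writeFileSync(this.filePath, JSON.stringify(scores), 'utf-8');
  }
}

// Useful in tests, or as a template for other backends.
class InMemoryPersistence implements TrustPersistence {
  private scores: Scores = {};
  load(): Scores {
    return { ...this.scores };
  }
  save(scores: Scores): void {
    this.scores = { ...scores };
  }
}

// Simplified stand-in for TrustManager: it depends only on the interface.
class ScoreBook {
  private readonly scores: Scores;
  private readonly store: TrustPersistence;
  constructor(store: TrustPersistence) {
    this.store = store;
    this.scores = store.load();
  }
  set(agentId: string, score: number): void {
    this.scores[agentId] = score;
    this.store.save(this.scores);
  }
  get(agentId: string): number | undefined {
    return this.scores[agentId];
  }
}

// Usage: the same ScoreBook code runs against either backend.
const memory = new InMemoryPersistence();
new ScoreBook(memory).set('agent-a', 0.9);
const fromMemory = new ScoreBook(memory).get('agent-a');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'trust-store-'));
const fileStore = new FilePersistence(path.join(dir, 'scores.json'));
new ScoreBook(fileStore).set('agent-b', 0.4);
const fromFile = new ScoreBook(fileStore).get('agent-b');
```

With this split, unit tests can use the in-memory store and avoid touching the filesystem.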

Additional Notes

  • The use of try/catch blocks for best-effort error handling is reasonable for this context but should be complemented with logging or user notifications.
  • The tests are well-written and cover the primary use cases. The use of temporary files ensures that tests do not interfere with each other.

Actionable Summary

🔴 CRITICAL

  • Encrypt trust data before writing to disk.
  • Set restrictive file permissions for the persistPath file.

🟡 WARNING

  • Document the addition of the persistPath property as a potential breaking change.

💡 SUGGESTIONS

  • Log errors during file operations.
  • Handle concurrent access to the persistPath file.
  • Validate data loaded from disk.
  • Add tests for edge cases and performance.
  • Document the persistPath option and its security implications.
  • Decouple persistence logic into a separate class or interface.

By addressing these issues, the feature can be made more secure, robust, and maintainable.

@imran-siddique imran-siddique disabled auto-merge March 29, 2026 18:41

@github-actions github-actions bot left a comment


🤖 AI Agent: code-reviewer

Review Summary

This PR introduces trust score persistence to the TrustManager class in the TypeScript SDK. It allows trust scores to be saved to and loaded from a file, ensuring state continuity across application restarts. The implementation includes new methods (saveToDisk and loadFromDisk) and updates the TrustConfig interface to include an optional persistPath parameter. The PR also includes comprehensive tests to validate the persistence functionality.

While the implementation is generally well-structured and adheres to the project's coding standards, there are some areas of concern and opportunities for improvement.


🔴 CRITICAL

  1. Unencrypted Storage of Sensitive Data
    The trust scores are being stored in plaintext JSON files. If the persistPath is set to a location accessible by unauthorized users, this could lead to sensitive data exposure or tampering.
    Recommendation: Encrypt the trust scores before writing them to disk and decrypt them when reading. Use a secure encryption library and ensure the encryption key is managed securely.

  2. No Integrity Check for Persisted Data
    The implementation does not verify the integrity of the persisted data. This could allow an attacker to tamper with the file and inject malicious or incorrect trust scores.
    Recommendation: Use a cryptographic hash (e.g., HMAC) to verify the integrity of the file. Store the hash securely and validate it before loading the data.

  3. Error Handling in Persistence Methods
    The saveToDisk and loadFromDisk methods silently ignore all errors. This could lead to undetected issues, such as partial writes or corrupted files, which might compromise the trust system.
    Recommendation: Log errors when they occur, and consider re-throwing critical exceptions. At a minimum, provide a mechanism to alert the user or system administrator when persistence fails.
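
The HMAC check from item 2 might be sketched as follows. `seal`, `unseal`, and `sign` are hypothetical helpers, and the envelope format (payload plus hex MAC in one JSON object) is an assumption. The sketch stores the MAC beside the payload and relies on the key staying secret; how the key is provisioned is out of scope here.

```typescript
import * as crypto from 'crypto';

// HMAC-SHA256 over the payload, hex-encoded.
function sign(payload: string, key: Buffer): string {
  return crypto.createHmac('sha256', key).update(payload).digest('hex');
}

// Wrap the payload together with its MAC before writing it to disk.
function seal(payload: string, key: Buffer): string {
  return JSON.stringify({ payload, mac: sign(payload, key) });
}

// Return the payload if the MAC verifies, otherwise null.
function unseal(sealed: string, key: Buffer): string | null {
  let parsed: { payload?: unknown; mac?: unknown };
  try {
    parsed = JSON.parse(sealed);
  } catch {
    return null; // not even valid JSON
  }
  if (parsed === null || typeof parsed !== 'object') return null;
  if (typeof parsed.payload !== 'string' || typeof parsed.mac !== 'string') return null;
  const expected = Buffer.from(sign(parsed.payload, key), 'hex');
  const actual = Buffer.from(parsed.mac, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (actual.length !== expected.length) return null;
  return crypto.timingSafeEqual(actual, expected) ? parsed.payload : null;
}

// Usage: an untouched envelope verifies; a forged payload or wrong key does not.
const key = crypto.randomBytes(32);
const original = '{"agent-a":{"score":1}}';
const sealed = seal(original, key);
const intact = unseal(sealed, key);
const forged = JSON.stringify({ ...JSON.parse(sealed), payload: '{"agent-a":{"score":0}}' });
const tampered = unseal(forged, key);
const wrongKey = unseal(sealed, crypto.randomBytes(32));
```

An HMAC detects tampering by anyone who lacks the key, but it does not hide the scores; confidentiality would still need encryption.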


🟡 WARNING

  1. Breaking Change in TrustConfig Interface
    The TrustConfig interface now includes an optional persistPath property. While this is backward-compatible (since it's optional), any existing implementations that use TrustConfig might need to be updated to handle this new property.
    Recommendation: Clearly document this change in the release notes and ensure that all dependent code is updated accordingly.

💡 SUGGESTIONS

  1. Concurrency and Thread Safety
    The saveToDisk and loadFromDisk methods are not thread-safe. Concurrent access to the agents map or the file system could lead to race conditions or data corruption.
    Recommendation: Use a locking mechanism or a queue to serialize access to these methods. Alternatively, consider using an atomic file write library to ensure safe file operations.

  2. File Path Validation
    The persistPath is directly used without validation. If the path is invalid or points to a restricted location, it could cause unexpected behavior.
    Recommendation: Validate the persistPath during initialization to ensure it is a valid, writable path.

  3. Test Coverage for Edge Cases
    While the tests cover basic scenarios, additional edge cases should be tested:

    • What happens if the file system is full during a write operation?
    • What if the file is deleted or modified while the application is running?
    • What if the persistPath points to a directory instead of a file?
  4. Performance Considerations
    The saveToDisk method writes the entire agents map to disk on every update. For large datasets, this could become a performance bottleneck.
    Recommendation: Implement a batching mechanism or a debounce strategy to reduce the frequency of writes. Alternatively, consider using a database or a more efficient storage mechanism for large-scale use cases.

  5. Documentation
    The new persistPath feature is not documented in the code comments or the PR description.
    Recommendation: Add documentation to explain how to use this feature, including any security considerations (e.g., file permissions, encryption).

  6. Error Logging
    The catch blocks in saveToDisk and loadFromDisk do not log any errors. This could make debugging difficult if issues arise.
    Recommendation: Use a logging library to log errors with sufficient context for debugging.
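
The debounce strategy mentioned under Performance Considerations might look something like the sketch below. `DebouncedSaver` is a hypothetical helper, not part of the SDK; it assumes the caller passes in whatever function actually serializes the agents map.

```typescript
// HYPOTHETICAL helper: coalesces many save requests into one write
// after a quiet period.
class DebouncedSaver {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly write: () => void;
  private readonly delayMs: number;

  constructor(write: () => void, delayMs: number = 250) {
    this.write = write; // e.g. a function that serializes the map to disk
    this.delayMs = delayMs;
  }

  // Call after every score update; restarts the quiet-period timer.
  schedule(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.write();
    }, this.delayMs);
  }

  // Write immediately if a save is pending (for example on shutdown).
  flush(): void {
    if (this.timer === undefined) return;
    clearTimeout(this.timer);
    this.timer = undefined;
    this.write();
  }
}
```

The trade-off is durability: updates made inside the delay window are lost if the process crashes before the timer fires, so a shutdown hook that calls `flush()` is worth considering.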


Final Assessment

The feature is a valuable addition to the TypeScript SDK, enhancing its functionality and aligning it with the stated goal of parity. However, the current implementation has critical security flaws that must be addressed before merging. Additionally, there are minor concerns about backward compatibility, thread safety, and performance that should be considered.

Actionable Steps Before Merging:

  1. Implement encryption and integrity checks for persisted data. 🔴
  2. Improve error handling and logging in saveToDisk and loadFromDisk. 🔴
  3. Validate the persistPath during initialization. 💡
  4. Address potential concurrency issues in persistence methods. 💡
  5. Document the new persistPath feature and its usage. 💡
  6. Add tests for edge cases related to file system errors and invalid paths. 💡

Once these issues are resolved, the PR will be ready for merging.

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Review Summary

This PR introduces trust score persistence to the TrustManager class in the TypeScript SDK. The feature allows trust scores to be saved to and loaded from disk via an optional persistPath configuration. The implementation includes new methods (saveToDisk and loadFromDisk) and corresponding tests to ensure functionality and robustness.

While the implementation is generally solid, there are some areas of concern and opportunities for improvement, particularly around security, error handling, and potential breaking changes.


🔴 CRITICAL

  1. Unencrypted Trust Score Storage:

    • The trust scores are being stored in plaintext JSON files. If the file is accessed by an unauthorized user or process, sensitive information about agent interactions could be exposed.
    • Recommendation: Encrypt the trust score data before writing it to disk. Use a secure encryption library (e.g., crypto in Node.js) and ensure the encryption key is securely managed.
  2. Lack of File Permission Hardening:

    • The fs.writeFileSync method does not specify file permissions. This could result in the file being created with default permissions, potentially allowing unauthorized access.
    • Recommendation: Use fs.writeFileSync with a mode argument (e.g., 0o600) to restrict file access to the owner only.
  3. Silent Error Handling:

    • Both saveToDisk and loadFromDisk silently ignore errors. This could lead to undetected issues, such as corrupted or missing trust score data.
    • Recommendation: Log errors or provide a mechanism to notify the user when an error occurs. At the very least, differentiate between expected errors (e.g., file not found) and unexpected ones (e.g., permission issues).
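
For the encryption recommendation in item 1, a minimal sketch with Node's built-in crypto module is shown below. It assumes a 32-byte key supplied by the caller; the function names and the `iv | tag | ciphertext` layout are hypothetical choices for illustration, not something this PR defines.

```typescript
import * as crypto from "node:crypto";

// Encrypt with AES-256-GCM.
// HYPOTHETICAL output layout: iv (12 bytes) | auth tag (16 bytes) | ciphertext.
function encryptScores(plaintext: string, key: Buffer): Buffer {
  const iv = crypto.randomBytes(12); // fresh random nonce for every write
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Decrypt and authenticate. final() throws if the data or tag was modified.
function decryptScores(blob: Buffer, key: Buffer): string {
  const iv = blob.subarray(0, 12);
  const tag = blob.subarray(12, 28);
  const ciphertext = blob.subarray(28);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because GCM is an authenticated mode, a tampered file fails decryption instead of yielding altered scores. Where the key comes from (environment variable, key management service) is deliberately out of scope here.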

🟡 WARNING

  1. Breaking Change in TrustConfig:
    • The TrustConfig interface now includes an optional persistPath property. While this is a non-breaking change for existing code, it could lead to unexpected behavior if users are not aware of the new feature.
    • Recommendation: Clearly document this change in the release notes and update any relevant documentation to explain how persistPath works.

💡 SUGGESTIONS

  1. Concurrency Safety:

    • The saveToDisk and loadFromDisk methods are not thread-safe. Concurrent calls to these methods could result in race conditions or corrupted files.
    • Recommendation: Use a locking mechanism (e.g., a lock file created with fs.promises.open and the exclusive 'wx' flag, which maps to O_CREAT | O_EXCL) to ensure that only one process can write to the file at a time.
  2. Validation of Loaded Data:

    • The loadFromDisk method assumes that the JSON data is valid and directly casts it to Record<string, AgentTrustState>. This could lead to runtime errors if the file contains unexpected data.
    • Recommendation: Validate the loaded data against the AgentTrustState structure before using it. Consider using a schema validation library like zod or io-ts.
  3. Test Coverage:

    • While the tests cover basic functionality and edge cases (e.g., missing or corrupt files), they do not test concurrent access or large datasets.
    • Recommendation: Add tests for concurrent access to the persistence file and scenarios with a large number of agents to ensure scalability and robustness.
  4. Performance Optimization:

    • The saveToDisk method serializes the entire agents map to JSON on every update. This could become a performance bottleneck for large datasets.
    • Recommendation: Implement a batching mechanism or debounce writes to reduce the frequency of disk I/O operations.
  5. Documentation:

    • The new persistPath feature is not documented in the code comments or external documentation.
    • Recommendation: Add detailed comments and update the SDK documentation to explain how to use the persistPath feature, including security considerations.
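
The validation suggested in item 2 does not strictly require a schema library; a hand-written type guard is one dependency-free option. In the sketch below, the `AgentTrustStateSketch` fields are an assumption made for illustration, since the real AgentTrustState definition is not shown in this thread.

```typescript
// ASSUMPTION: the real AgentTrustState fields are not visible here;
// this stand-in shape is hypothetical.
interface AgentTrustStateSketch {
  score: number;
  successes: number;
  failures: number;
}

function isFiniteNumber(v: unknown): v is number {
  return typeof v === "number" && Number.isFinite(v);
}

function isAgentTrustState(v: unknown): v is AgentTrustStateSketch {
  if (typeof v !== "object" || v === null) return false;
  const r = v as Record<string, unknown>;
  return (
    isFiniteNumber(r["score"]) &&
    isFiniteNumber(r["successes"]) &&
    isFiniteNumber(r["failures"])
  );
}

// Parse JSON and keep only entries that pass the guard.
// Malformed input yields an empty map instead of throwing.
function parseTrustFile(text: string): Map<string, AgentTrustStateSketch> {
  const result = new Map<string, AgentTrustStateSketch>();
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return result;
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return result;
  }
  for (const [id, state] of Object.entries(parsed)) {
    if (isAgentTrustState(state)) result.set(id, state);
  }
  return result;
}
```

Whether invalid entries should be dropped silently, logged, or cause the whole file to be rejected is a policy decision for the SDK; this sketch simply drops them.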

Additional Notes

  • The use of try...catch blocks in saveToDisk and loadFromDisk is appropriate for handling file system errors, but the lack of logging makes debugging difficult.
  • The tests are well-structured and cover a variety of scenarios, but they could benefit from additional assertions to verify file permissions and encryption (if implemented).

Action Items

  1. 🔴 Encrypt trust score data before writing to disk.
  2. 🔴 Harden file permissions when creating the persistence file.
  3. 🔴 Improve error handling in saveToDisk and loadFromDisk.
  4. 🟡 Document the persistPath feature and its implications.
  5. 💡 Add thread safety to disk operations.
  6. 💡 Validate loaded data against the expected schema.
  7. 💡 Optimize disk I/O operations for large datasets.
  8. 💡 Expand test coverage to include concurrency and performance scenarios.

By addressing these issues and suggestions, the feature will be more secure, robust, and user-friendly. Let me know if you need further clarification or assistance!

@github-actions github-actions bot left a comment

🤖 AI Agent: code-reviewer

Feedback on the Pull Request

🔴 CRITICAL: Security Issues

  1. Unvalidated Input in loadFromDisk:

    • The loadFromDisk method directly parses JSON data from the file without validating its structure. This could lead to potential security risks, such as deserialization attacks or runtime errors if the file is tampered with or corrupted.
    • Recommendation: Use a schema validation library (e.g., zod or ajv) to validate the structure of the parsed JSON data before using it. Ensure that the data matches the expected AgentTrustState structure.
  2. Silent Failures in File Operations:

    • Both saveToDisk and loadFromDisk suppress all errors by catching exceptions and doing nothing. This could lead to silent failures, making it difficult to debug issues or detect tampering.
    • Recommendation: Log errors when file operations fail. For example, use console.error or integrate with a logging framework to capture these issues.
  3. Potential Path Traversal Vulnerability:

    • The persistPath is directly used for file operations without sanitization or validation. If an attacker can control the persistPath, they could exploit this to perform path traversal attacks and overwrite or read sensitive files.
    • Recommendation: Validate and sanitize the persistPath to ensure it does not contain malicious paths. Resolve it with Node's path.resolve and verify that the result stays within a safe base directory.
  4. Lack of File Permissions Control:

    • The saveToDisk method does not set restrictive file permissions when creating the file. This could allow unauthorized users to read or modify the file.
    • Recommendation: Use fs.writeFileSync with appropriate file permissions (e.g., 0o600 for owner-only read/write access).
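
The path check from item 3 could be approximated as below. `resolveInside` is a hypothetical helper, and the idea of a fixed base directory is an assumption; the PR itself does not define one.

```typescript
import * as path from "node:path";

// HYPOTHETICAL helper: resolve `candidate` against `baseDir` and reject
// anything that escapes it (or that names the base directory itself).
function resolveInside(baseDir: string, candidate: string): string {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, candidate);
  const rel = path.relative(base, target);
  const escapes = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (rel === "" || escapes) {
    throw new Error(`persistPath must name a file inside ${base}`);
  }
  return target;
}
```

This is a purely lexical check: it does not follow symlinks, so a stricter version would also compare real paths (for example via fs.realpathSync) before trusting the result.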

🟡 WARNING: Potential Breaking Changes

  1. Change in TrustConfig Interface:
    • The TrustConfig interface now includes an optional persistPath property. While this is a non-breaking change for existing code, it could lead to unexpected behavior if existing configurations inadvertently include a persistPath property.
    • Recommendation: Clearly document this change in the release notes and ensure that any existing configurations are reviewed for compatibility.

💡 Suggestions for Improvement

  1. Concurrency and Thread Safety:

    • The TrustManager class uses a Map to store agent trust states, with no explicit handling of concurrent access. Synchronous updates within a single Node.js event loop cannot interleave, but interleaved asynchronous operations or multiple processes sharing the same persistPath could still race.
    • Recommendation: Serialize access (e.g., with a mutex or a lock file) if the manager or its persistence file is shared across asynchronous tasks or processes.
  2. Atomic File Writes:

    • The saveToDisk method writes directly to the file, which could result in a corrupted file if the process is interrupted.
    • Recommendation: Write to a temporary file first, then rename it to the target file. This ensures atomicity and prevents partial writes.
  3. Test Coverage:

    • The new tests cover basic persistence scenarios, but additional edge cases should be tested:
      • What happens if the persistPath points to a directory instead of a file?
      • What happens if the file is locked or inaccessible due to permissions?
      • What happens if the disk is full during a write operation?
  4. Documentation:

    • The persistPath feature is not documented in the code comments or the PR description. This could lead to confusion for developers using the TrustManager.
    • Recommendation: Add detailed comments and update any relevant documentation to explain how the persistPath feature works, including its limitations and security considerations.
  5. Error Handling Granularity:

    • The catch blocks in saveToDisk and loadFromDisk are too broad. They suppress all errors, including programming errors (e.g., TypeError) and operational errors (e.g., ENOENT).
    • Recommendation: Narrow the scope of the catch blocks to handle only expected errors (e.g., ENOENT for missing files) and rethrow unexpected ones.
  6. Backward Compatibility Tests:

    • While the changes appear backward-compatible, it would be prudent to add tests that verify the behavior of the TrustManager when persistPath is not set, ensuring no regressions are introduced.
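
Items 2 and 5 (atomic writes and narrower catch blocks) can be illustrated together. The sketch below is an assumption-laden example rather than the SDK's code: `saveAtomic` and `loadOrNull` are hypothetical names, and the temp-file naming scheme is arbitrary.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Write to a sibling temp file, then rename it over the target.
// A rename within one filesystem replaces the target in a single step on
// POSIX systems, so readers see either the old file or the new one.
function saveAtomic(filePath: string, data: string): void {
  const tmp = `${filePath}.${process.pid}.tmp`;
  // mode 0o600 requests owner-only read/write when the file is created.
  fs.writeFileSync(tmp, data, { encoding: "utf8", mode: 0o600 });
  fs.renameSync(tmp, filePath);
}

// Treat only a missing file as expected; rethrow everything else.
function loadOrNull(filePath: string): string | null {
  try {
    return fs.readFileSync(filePath, "utf8");
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return null;
    throw err;
  }
}

// Usage: round-trip through a scratch directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "trust-"));
const file = path.join(dir, "scores.json");
saveAtomic(file, '{"agent-1":0.9}');
console.log(loadOrNull(file));
console.log(loadOrNull(path.join(dir, "missing.json")));
```

The `mode` option has limited effect on Windows, and the rename guarantee assumes the temp file and the target live on the same filesystem, which is why the temp file is created next to the target.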

Summary of Actionable Items

  1. 🔴 Add schema validation for the parsed JSON data in loadFromDisk.
  2. 🔴 Log errors in saveToDisk and loadFromDisk instead of silently ignoring them.
  3. 🔴 Validate and sanitize the persistPath to prevent path traversal attacks.
  4. 🔴 Set restrictive file permissions (e.g., 0o600) when writing the persistence file.
  5. 🟡 Document the addition of persistPath in the release notes and ensure compatibility with existing configurations.
  6. 💡 Add thread-safety mechanisms for the agents map.
  7. 💡 Implement atomic file writes in saveToDisk.
  8. 💡 Expand test coverage to include edge cases for file operations.
  9. 💡 Add detailed documentation for the persistPath feature.
  10. 💡 Narrow the scope of catch blocks in file operations.

By addressing these issues and suggestions, the code will be more secure, robust, and maintainable. Let me know if you need further clarification or assistance!

@imran-siddique imran-siddique merged commit c63fb93 into microsoft:main Mar 29, 2026
50 of 51 checks passed
Labels

agent-mesh agent-mesh package size/M Medium PR (< 200 lines) tests
