Skip to content

Conversation

yogeshselvarajan
Copy link

@yogeshselvarajan yogeshselvarajan commented Sep 9, 2025

📝 Description

This PR introduces a new Azure Blob Storage connector to the IO.Storage subsystem. This addition expands SmythOS's cloud provider support, offering users a highly scalable and secure storage option within the Microsoft Azure ecosystem.

The new AzureBlobStorage class implements the complete StorageConnector interface, providing full CRUD (read, write, delete, exists), metadata, and ACL management. It seamlessly integrates with the SmythOS security model by storing custom ACLs in the blob's metadata.

Key features include:

  • Secure authentication using an access key (StorageSharedKeyCredential).
  • An expire method that uses Azure Blob Index Tags, allowing for automatic object deletion via Azure's Lifecycle Management rules.

🔗 Related Issues

  • Fixes #
  • Relates to #

🔧 Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 📚 Documentation update
  • 🔧 Code refactoring (no functional changes)
  • 🧪 Test improvements
  • 🔨 Build/CI changes

✅ Checklist

  • Self-review performed
  • Tests added/updated
  • Documentation updated (if needed)

Summary by CodeRabbit

  • New Features

    • Added Azure Blob Storage as a supported storage backend and made its connector available for direct use.
  • Documentation

    • Expanded storage docs with Azure Blob Storage setup and examples.
    • Updated service usage examples to include Azure alongside existing providers.
    • Clarified README to reference AWS S3, Azure Blob Storage, and other providers.
    • Added a runnable Azure Blob Storage example.
  • Chores

    • Added Azure Blob Storage SDK as a dependency and registered the connector with the storage service.

Copy link

coderabbitai bot commented Sep 9, 2025

Walkthrough

Adds an Azure Blob Storage connector: implements AzureBlobStorage with ACL/metadata/lifecycle methods, registers and exports it from core, adds the Azure SDK dependency, provides docs updates and a new example demonstrating write/read verification against Azure Blob Storage.

Changes

Cohort / File(s) Summary
Docs: storage provider mentions
README.md, packages/core/docs/connectors/storage.md, packages/sdk/docs/07-services.md
README broadened storage provider wording; core docs add an Azure Blob Storage connector section with config, usage, and security notes; SDK docs sample lists AzureBlobStorage() alongside S3/LocalStorage in examples.
Core: Azure connector implementation and wiring
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts, packages/core/src/subsystems/IO/Storage.service/index.ts, packages/core/src/index.ts
New AzureBlobStorage connector and AzureBlobConfig type implemented (read/write/delete/exists/metadata/ACL/expire, metadata/ACL migration helpers, initialization/lifecycle); connector registered with Storage service and re-exported from core index.
Dependency
packages/core/package.json
Adds runtime dependency @azure/storage-blob ^12.28.0.
Example
examples/06-Storage-no-agent/03-AzureBlobStorage.ts
New TypeScript example: loads env, validates Azure credentials/container, writes smythos-azure-test.txt, reads and verifies contents, logs progress and errors.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Dev as Developer Script
  participant SDK as @smythos/sdk Storage API
  participant Conn as AzureBlobStorage Connector
  participant Az as Azure Blob SDK

  Dev->>SDK: storage.write(resourceId, data, opts)
  SDK->>Conn: write(acRequest, resourceId, value, acl?, metadata?)
  Note over Conn: ensure container initialized (cached promise)
  Conn->>Az: uploadBlob + set headers & metadata
  Az-->>Conn: 201 Created
  Conn-->>SDK: resolve write
  SDK-->>Dev: write complete

  Dev->>SDK: storage.read(resourceId)
  SDK->>Conn: read(acRequest, resourceId)
  Conn->>Az: downloadToBuffer
  alt blob found
    Az-->>Conn: 200 OK (buffer)
    Conn-->>SDK: data
    SDK-->>Dev: content returned
  else not found
    Az-->>Conn: 404
    Conn-->>SDK: undefined
    SDK-->>Dev: not found
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks (2 passed, 1 warning)

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description is missing actual issue references in the “Related Issues” section, leaving placeholder entries instead of linking to any relevant issue numbers. Update the “Related Issues” section to reference the appropriate issue(s) using “Fixes #” or “Relates to #”, or explicitly state “None” if there are no related issues.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly and clearly captures the main change by indicating the addition of an Azure Blob Storage connector to the storage subsystem, following the conventional commit style and matching the changes in the pull request.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Poem

I hop between the clouds with glee,
From S3’s trees to Azure’s sea,
A blob of bytes, a carrot’s crunch,
I write, I read—then lunchy-lunch! 🥕
Connectors bloom, my whiskers twirl.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@yogeshselvarajan
Copy link
Author

@coderabbitai review

Copy link

coderabbitai bot commented Sep 9, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (7)
packages/core/docs/connectors/storage.md (2)

85-86: Fix markdown horizontal rule to satisfy markdownlint (MD035).

Replace the 5-dash rule with the standard 3-dash rule.

Apply:

------
+---

118-128: Normalize unordered list style/indentation (MD004, MD007).

Switch asterisks to dashes and remove extra indentation.

Apply:

-  * Production environments, especially those hosted on Microsoft Azure
-  * Applications requiring durable, high-availability, and geo-redundant storage
-  * Integration with the Azure ecosystem (Azure Functions, Logic Apps, etc.)
-  * Storing large-scale unstructured data like images, videos, and documents
+ - Production environments, especially those hosted on Microsoft Azure
+ - Applications requiring durable, high-availability, and geo-redundant storage
+ - Integration with the Azure ecosystem (Azure Functions, Logic Apps, etc.)
+ - Storing large-scale unstructured data like images, videos, and documents

And:

-  * Use Managed Identities when running on Azure infrastructure for the most secure, keyless authentication.
-  * Store credentials securely using environment variables or Azure Key Vault.
-  * Configure container access policies and use Shared Access Signatures (SAS) for fine-grained, temporary access.
-  * Enable encryption at rest and in transit for sensitive data.
+ - Use Managed Identities when running on Azure infrastructure for the most secure, keyless authentication.
+ - Store credentials securely using environment variables or Azure Key Vault.
+ - Configure container access policies and use Shared Access Signatures (SAS) for fine-grained, temporary access.
+ - Enable encryption at rest and in transit for sensitive data.
examples/06-Storage-no-agent/03-AzureBlobStorage.ts (1)

45-47: Exit with non-zero on failure to aid CI signal.

Return keeps the process "successful." Prefer process.exit(1).

Apply:

-main().catch(error => {
-    console.error("An error occurred:", error.message);
-});
+main().catch(error => {
+    console.error("An error occurred:", (error as Error).message ?? error);
+    process.exit(1);
+});
packages/core/src/index.ts (1)

165-165: Missing semicolon.

Keep style consistent with surrounding exports.

Apply:

-export * from './subsystems/IO/Storage.service/connectors/AzureBlobStorage.class'
+export * from './subsystems/IO/Storage.service/connectors/AzureBlobStorage.class';
packages/core/src/subsystems/IO/Storage.service/index.ts (1)

7-14: Formatting + semicolons for consistency.

Minor style nits on import spacing and missing semicolons.

Apply:

-import { AzureBlobStorage} from './connectors/AzureBlobStorage.class'
+import { AzureBlobStorage } from './connectors/AzureBlobStorage.class';
@@
-        ConnectorService.register(TConnectorService.Storage, 'AzureBlobStorage', AzureBlobStorage);
+        ConnectorService.register(TConnectorService.Storage, 'AzureBlobStorage', AzureBlobStorage);
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (2)

54-59: Consider sovereign clouds support (optional endpoint suffix).

Hardcoding blob.core.windows.net excludes Azure Gov/China. Allow overriding endpoint suffix.

Example:

-        const endpointUrl = `https://${_settings.storageAccountName}.blob.core.windows.net`;
+        const endpointSuffix = (_settings as any).endpointSuffix ?? 'blob.core.windows.net';
+        const endpointUrl = `https://${_settings.storageAccountName}.${endpointSuffix}`;

17-18: Naming: avoid shadowing the global console.

Logger('AzureBlobStorage') assigned to const console is confusing. Prefer const logger = Logger(...).

If you want, I can prep a follow-up PR to rename usages.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between a60550a and 9cdf68e.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (8)
  • README.md (1 hunks)
  • examples/06-Storage-no-agent/03-AzureBlobStorage.ts (1 hunks)
  • packages/core/docs/connectors/storage.md (1 hunks)
  • packages/core/package.json (1 hunks)
  • packages/core/src/index.ts (1 hunks)
  • packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (1 hunks)
  • packages/core/src/subsystems/IO/Storage.service/index.ts (1 hunks)
  • packages/sdk/docs/07-services.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
packages/core/docs/connectors/storage.md

85-85: Horizontal rule style
Expected: ---; Actual: -----

(MD035, hr-style)


118-118: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


118-118: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


119-119: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


119-119: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


120-120: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


120-120: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


121-121: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


121-121: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


125-125: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


125-125: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


126-126: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


126-126: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


127-127: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


127-127: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


128-128: Unordered list style
Expected: dash; Actual: asterisk

(MD004, ul-style)


128-128: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)

🔇 Additional comments (5)
packages/core/package.json (1)

63-63: Verified Node ≥20 compatibility; review Rollup bundle size

@azure/storage-blob@^12.28.0 (latest 12.29.0) declares "engines": { "node": ">=20.0.0" } and its transitive deps only bump @azure/storage-common from beta to ^12.1.0. Confirm this integrates under your Rollup config without inflating bundle size significantly.

packages/sdk/docs/07-services.md (1)

46-46: Docs update is clear and accurate.

Including AzureBlobStorage alongside LocalStorage and S3 improves discoverability. No further changes needed.

README.md (1)

28-32: Nice clarity bump on storage backends.

The added Azure reference reads well and aligns with the new connector.

packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (2)

439-456: TTL to days: document rounding behavior.

You ceil to days; clarify in docs or accept an option to force minimum 1 day. Implementation is fine otherwise.

Would you like to expose an option like { rounding: 'ceil' | 'floor' } for parity with S3 lifecycle TTL handling?


528-546: Azure metadata constraints: ensure string values are ASCII-safe and under size limits.

JSON.stringify is fine, but consider sanitizing keys/values if user-provided to meet Azure’s metadata rules (ASCII only; limited size).

If upstream passes arbitrary metadata, add a lightweight sanitizer or validator in _serializeAzureMetadata.

@yogeshselvarajan
Copy link
Author

@coderabbitai generate docstrings

Copy link

coderabbitai bot commented Sep 9, 2025

✅ Actions performed

Docstrings generation started.

coderabbitai bot added a commit that referenced this pull request Sep 9, 2025
Docstrings generation was requested by @yogeshselvarajan.

* #126 (comment)

The following files were modified:

* `examples/06-Storage-no-agent/03-AzureBlobStorage.ts`
Copy link

coderabbitai bot commented Sep 9, 2025

Note

Generated docstrings for this pull request at #127

@yogeshselvarajan
Copy link
Author

@coderabbitai review

Copy link

coderabbitai bot commented Sep 9, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (5)
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (5)

31-44: Constructor now fails fast on invalid config — LGTM.

Matches prior guidance to throw on missing config; safer than early return.


318-324: getACL now returns undefined when missing — LGTM.

Aligns with JSDoc and prior feedback.


499-506: Metadata migration avoids mutating ACL — LGTM.

Using an 'acl_migrated' metadata flag is cleaner and type-safe.


595-603: Safe 404 handling in _getAzureMetadata — LGTM.

Narrowing status and returning undefined is correct.


183-193: Delete is not idempotent; use deleteIfExists().

delete() throws on 404; contradicts the method contract and comment.

Apply:

-      // The Azure SDK for Blob Storage handles the "not found" case gracefully
-      // by design, so no preliminary `exists()` check is needed.
-      await blockBlobClient.delete();
+      // Idempotent delete: succeeds even if the blob is already absent.
+      await blockBlobClient.deleteIfExists();
🧹 Nitpick comments (4)
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (4)

65-79: Clear stale initializationPromise in finally.

Prevents retaining a resolved promise reference after success.

Apply:

     private async initialize(): Promise<void> {
       if (this.isInitialized) {
         return;
       }
       try{
         await this.containerClient.createIfNotExists();
         console.log(`Container "${this._settings.blobContainerName}" is ready.`);
         this.isInitialized = true;
       }
       catch(error){
         console.error('Failed to initialize AzureBlobStorage:', error);
-        this.initializationPromise = null;
         throw error;
-      }
+      } finally {
+        // Avoid holding onto a resolved/rejected promise reference.
+        this.initializationPromise = null;
+      }
     }

158-161: Harden error logging: avoid assuming error.message.

Use safe fallback to handle unknown errors and TS’s useUnknownInCatchVariables.

Apply:

-      console.error(`Failed to write blob "${resourceId}":`, error.message);
+      console.error(`Failed to write blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to delete blob "${resourceId}":`, error.message);
+      console.error(`Failed to delete blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to check existence for blob "${resourceId}":`, error.message);
+      console.error(`Failed to check existence for blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to get metadata for blob "${resourceId}":`, error.message);
+      console.error(`Failed to get metadata for blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to set metadata for blob "${resourceId}":`, error.message);
+      console.error(`Failed to set metadata for blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to get ACL for blob "${resourceId}":`, error.message);
+      console.error(`Failed to get ACL for blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to set ACL for blob "${resourceId}":`, error.message);
+      console.error(`Failed to set ACL for blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to get resource ACL for blob "${resourceId}":`, error.message);
+      console.error(`Failed to get resource ACL for blob "${resourceId}":`, (error as any)?.message ?? error);
-      console.error(`Failed to set expiration for blob "${resourceId}":`, error.message);
+      console.error(`Failed to set expiration for blob "${resourceId}":`, (error as any)?.message ?? error);

Also applies to: 189-193, 219-221, 247-249, 287-289, 326-328, 370-372, 408-410, 453-455


98-116: Consider streaming/timeout for large reads.

downloadToBuffer loads entire blobs into memory; add AbortController timeout and/or expose a streaming read to avoid OOM on large objects.

Sample (30s timeout):

-      return await blockBlobClient.downloadToBuffer();
+      const c = new AbortController();
+      const t = setTimeout(() => c.abort(), 30_000);
+      try {
+        return await blockBlobClient.downloadToBuffer(undefined, undefined, undefined, { abortSignal: c.signal });
+      } finally {
+        clearTimeout(t);
+      }

If desired, I can add a readStream() API variant.


45-50: Optional: support custom endpoint.

Azure sovereign clouds/private stacks use non-public endpoints. Consider an optional endpoint override in AzureBlobConfig.

Proposed type tweak:

 export type AzureBlobConfig = {
   storageAccountName: string;
   storageAccountAccessKey: string;
   blobContainerName: string;
+  endpointUrl?: string; // optional custom endpoint
 };

And:

- const endpointUrl = `https://${_settings.storageAccountName}.blob.core.windows.net`;
+ const endpointUrl = _settings.endpointUrl ?? `https://${_settings.storageAccountName}.blob.core.windows.net`;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 9cdf68e and 5d9c223.

📒 Files selected for processing (2)
  • examples/06-Storage-no-agent/03-AzureBlobStorage.ts (1 hunks)
  • packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/06-Storage-no-agent/03-AzureBlobStorage.ts
🔇 Additional comments (1)
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (1)

403-405: ACL.from handles undefined and defaults to a restrictive ACL. The constructor treats undefined as empty entries and applies the default hash algorithm, resulting in no granted permissions—no additional guard is needed.

@alaa-eddine
Copy link
Collaborator

Hey @yogeshselvarajan thank you for your contribution we'll review it and let you know if we need any changes from you.

@zubairirfan96 please take a look at this, check if the provided example works properly, and that the connector respects the Storage interface.

@yogeshselvarajan
Copy link
Author

Hey @yogeshselvarajan thank you for your contribution we'll review it and let you know if we need any changes from you.

@zubairirfan96 please take a look at this, check if the provided example works properly, and that the connector respects the Storage interface.

Hi @zubairirfan96 Did you get any chance to review this PR?

@alaa-eddine-k
Copy link
Contributor

Hi @yogeshselvarajan Zubair is out this week, but we're reviewing and testing all the PRs.
This one is a very important one and we want to release it asap.

Please keep in mind that the review process requires extensive compatibility tests with the SRE code but also with our SaaS distribution, this is why it takes time.

The code looks good so far, but I will let you know if we need any changes.

@alaa-eddine-k
Copy link
Contributor

@yogeshselvarajan I tested your implementation and it's failing.

few hints : first the constructor should not throw an exception, this is currently required for graceful errors handling by SmythOS , just show a warning message and return (check S3Storage)

the other issue is with write() function
when you call this

await blockBlobClient.uploadData(value, uploadOptions);

here value is of type StorageData which can by any of the following types : string | Uint8Array | Buffer | Readable
but Azure expects one of these types : Buffer | Blob | ArrayBuffer | ArrayBufferView

so string would fail here.

these are two obvious issues I found there are maybe others.
did you actually test the PR before submitting ?

@yogeshselvarajan
Copy link
Author

@yogeshselvarajan I tested your implementation and it's failing.

few hints : first the constructor should not throw an exception, this is currently required for graceful errors handling by SmythOS , just show a warning message and return (check S3Storage)

the other issue is with write() function when you call this

await blockBlobClient.uploadData(value, uploadOptions);

here value is of type StorageData which can by any of the following types : string | Uint8Array | Buffer | Readable but Azure expects one of these types : Buffer | Blob | ArrayBuffer | ArrayBufferView

so string would fail here.

these are two obvious issues I found there are maybe others. did you actually test the PR before submitting ?

Hi @alaa-eddine-k Thanks for the detailed feedback and for catching these critical issues. My apologies for the errors.

I mistakenly assumed the validation was handled entirely by a CI pipeline and didn't perform the necessary local integration tests. I'd appreciate it if you could guide me on the recommended local testing setup. I will fix the issues you pointed out right away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants