-
Notifications
You must be signed in to change notification settings - Fork 167
feat(storage): add Azure Blob Storage connector. #126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(storage): add Azure Blob Storage connector. #126
Conversation
Signed-off-by: Yogesh Selvarajan <[email protected]>
WalkthroughAdds an Azure Blob Storage connector: implements AzureBlobStorage with ACL/metadata/lifecycle methods, registers and exports it from core, adds the Azure SDK dependency, provides docs updates and a new example demonstrating write/read verification against Azure Blob Storage. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Dev as Developer Script
participant SDK as @smythos/sdk Storage API
participant Conn as AzureBlobStorage Connector
participant Az as Azure Blob SDK
Dev->>SDK: storage.write(resourceId, data, opts)
SDK->>Conn: write(acRequest, resourceId, value, acl?, metadata?)
Note over Conn: ensure container initialized (cached promise)
Conn->>Az: uploadBlob + set headers & metadata
Az-->>Conn: 201 Created
Conn-->>SDK: resolve write
SDK-->>Dev: write complete
Dev->>SDK: storage.read(resourceId)
SDK->>Conn: read(acRequest, resourceId)
Conn->>Az: downloadToBuffer
alt blob found
Az-->>Conn: 200 OK (buffer)
Conn-->>SDK: data
SDK-->>Dev: content returned
else not found
Az-->>Conn: 404
Conn-->>SDK: undefined
SDK-->>Dev: not found
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Pre-merge checks (2 passed, 1 warning)❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Poem
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Nitpick comments (7)
packages/core/docs/connectors/storage.md (2)
85-86
: Fix markdown horizontal rule to satisfy markdownlint (MD035).Replace the 5-dash rule with the standard 3-dash rule.
Apply:
------ +---
118-128
: Normalize unordered list style/indentation (MD004, MD007).Switch asterisks to dashes and remove extra indentation.
Apply:
- * Production environments, especially those hosted on Microsoft Azure - * Applications requiring durable, high-availability, and geo-redundant storage - * Integration with the Azure ecosystem (Azure Functions, Logic Apps, etc.) - * Storing large-scale unstructured data like images, videos, and documents + - Production environments, especially those hosted on Microsoft Azure + - Applications requiring durable, high-availability, and geo-redundant storage + - Integration with the Azure ecosystem (Azure Functions, Logic Apps, etc.) + - Storing large-scale unstructured data like images, videos, and documentsAnd:
- * Use Managed Identities when running on Azure infrastructure for the most secure, keyless authentication. - * Store credentials securely using environment variables or Azure Key Vault. - * Configure container access policies and use Shared Access Signatures (SAS) for fine-grained, temporary access. - * Enable encryption at rest and in transit for sensitive data. + - Use Managed Identities when running on Azure infrastructure for the most secure, keyless authentication. + - Store credentials securely using environment variables or Azure Key Vault. + - Configure container access policies and use Shared Access Signatures (SAS) for fine-grained, temporary access. + - Enable encryption at rest and in transit for sensitive data.examples/06-Storage-no-agent/03-AzureBlobStorage.ts (1)
45-47
: Exit with non-zero on failure to aid CI signal.Return keeps the process "successful." Prefer process.exit(1).
Apply:
-main().catch(error => { - console.error("An error occurred:", error.message); -}); +main().catch(error => { + console.error("An error occurred:", (error as Error).message ?? error); + process.exit(1); +});packages/core/src/index.ts (1)
165-165
: Missing semicolon.Keep style consistent with surrounding exports.
Apply:
-export * from './subsystems/IO/Storage.service/connectors/AzureBlobStorage.class' +export * from './subsystems/IO/Storage.service/connectors/AzureBlobStorage.class';packages/core/src/subsystems/IO/Storage.service/index.ts (1)
7-14
: Formatting + semicolons for consistency.Minor style nits on import spacing and missing semicolons.
Apply:
-import { AzureBlobStorage} from './connectors/AzureBlobStorage.class' +import { AzureBlobStorage } from './connectors/AzureBlobStorage.class'; @@ - ConnectorService.register(TConnectorService.Storage, 'AzureBlobStorage', AzureBlobStorage); + ConnectorService.register(TConnectorService.Storage, 'AzureBlobStorage', AzureBlobStorage);packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (2)
54-59
: Consider sovereign clouds support (optional endpoint suffix).Hardcoding blob.core.windows.net excludes Azure Gov/China. Allow overriding endpoint suffix.
Example:
- const endpointUrl = `https://${_settings.storageAccountName}.blob.core.windows.net`; + const endpointSuffix = (_settings as any).endpointSuffix ?? 'blob.core.windows.net'; + const endpointUrl = `https://${_settings.storageAccountName}.${endpointSuffix}`;
17-18
: Naming: avoid shadowing the global console.Logger('AzureBlobStorage') assigned to const console is confusing. Prefer const logger = Logger(...).
If you want, I can prep a follow-up PR to rename usages.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (8)
README.md
(1 hunks)examples/06-Storage-no-agent/03-AzureBlobStorage.ts
(1 hunks)packages/core/docs/connectors/storage.md
(1 hunks)packages/core/package.json
(1 hunks)packages/core/src/index.ts
(1 hunks)packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
(1 hunks)packages/core/src/subsystems/IO/Storage.service/index.ts
(1 hunks)packages/sdk/docs/07-services.md
(1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
packages/core/docs/connectors/storage.md
85-85: Horizontal rule style
Expected: ---; Actual: -----
(MD035, hr-style)
118-118: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
118-118: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
119-119: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
119-119: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
120-120: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
120-120: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
121-121: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
121-121: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
125-125: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
125-125: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
126-126: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
126-126: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
127-127: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
127-127: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
128-128: Unordered list style
Expected: dash; Actual: asterisk
(MD004, ul-style)
128-128: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
🔇 Additional comments (5)
packages/core/package.json (1)
63-63
: Verified Node ≥20 compatibility; review Rollup bundle size
@azure/storage-blob@^12.28.0
(latest 12.29.0) declares"engines": { "node": ">=20.0.0" }
and its transitive deps only bump@azure/storage-common
from beta to^12.1.0
. Confirm this integrates under your Rollup config without inflating bundle size significantly.packages/sdk/docs/07-services.md (1)
46-46
: Docs update is clear and accurate.Including AzureBlobStorage alongside LocalStorage and S3 improves discoverability. No further changes needed.
README.md (1)
28-32
: Nice clarity bump on storage backends.The added Azure reference reads well and aligns with the new connector.
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (2)
439-456
: TTL to days: document rounding behavior.You ceil to days; clarify in docs or accept an option to force minimum 1 day. Implementation is fine otherwise.
Would you like to expose an option like { rounding: 'ceil' | 'floor' } for parity with S3 lifecycle TTL handling?
528-546
: Azure metadata constraints: ensure string values are ASCII-safe and under size limits.JSON.stringify is fine, but consider sanitizing keys/values if user-provided to meet Azure’s metadata rules (ASCII only; limited size).
If upstream passes arbitrary metadata, add a lightweight sanitizer or validator in _serializeAzureMetadata.
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
Outdated
Show resolved
Hide resolved
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
Show resolved
Hide resolved
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
Show resolved
Hide resolved
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
Show resolved
Hide resolved
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
Show resolved
Hide resolved
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
Show resolved
Hide resolved
@coderabbitai generate docstrings |
✅ Actions performedDocstrings generation started. |
Docstrings generation was requested by @yogeshselvarajan. * #126 (comment) The following files were modified: * `examples/06-Storage-no-agent/03-AzureBlobStorage.ts`
Note Generated docstrings for this pull request at #127 |
Signed-off-by: Yogesh Selvarajan <[email protected]>
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (5)
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (5)
31-44
: Constructor now fails fast on invalid config — LGTM.Matches prior guidance to throw on missing config; safer than early return.
318-324
: getACL now returns undefined when missing — LGTM.Aligns with JSDoc and prior feedback.
499-506
: Metadata migration avoids mutating ACL — LGTM.Using an 'acl_migrated' metadata flag is cleaner and type-safe.
595-603
: Safe 404 handling in _getAzureMetadata — LGTM.Narrowing status and returning undefined is correct.
183-193
: Delete is not idempotent; use deleteIfExists().delete() throws on 404; contradicts the method contract and comment.
Apply:
- // The Azure SDK for Blob Storage handles the "not found" case gracefully - // by design, so no preliminary `exists()` check is needed. - await blockBlobClient.delete(); + // Idempotent delete: succeeds even if the blob is already absent. + await blockBlobClient.deleteIfExists();
🧹 Nitpick comments (4)
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (4)
65-79
: Clear stale initializationPromise in finally.Prevents retaining a resolved promise reference after success.
Apply:
private async initialize(): Promise<void> { if (this.isInitialized) { return; } try{ await this.containerClient.createIfNotExists(); console.log(`Container "${this._settings.blobContainerName}" is ready.`); this.isInitialized = true; } catch(error){ console.error('Failed to initialize AzureBlobStorage:', error); - this.initializationPromise = null; throw error; - } + } finally { + // Avoid holding onto a resolved/rejected promise reference. + this.initializationPromise = null; + } }
158-161
: Harden error logging: avoid assuming error.message.Use safe fallback to handle unknown errors and TS’s useUnknownInCatchVariables.
Apply:
- console.error(`Failed to write blob "${resourceId}":`, error.message); + console.error(`Failed to write blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to delete blob "${resourceId}":`, error.message); + console.error(`Failed to delete blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to check existence for blob "${resourceId}":`, error.message); + console.error(`Failed to check existence for blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to get metadata for blob "${resourceId}":`, error.message); + console.error(`Failed to get metadata for blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to set metadata for blob "${resourceId}":`, error.message); + console.error(`Failed to set metadata for blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to get ACL for blob "${resourceId}":`, error.message); + console.error(`Failed to get ACL for blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to set ACL for blob "${resourceId}":`, error.message); + console.error(`Failed to set ACL for blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to get resource ACL for blob "${resourceId}":`, error.message); + console.error(`Failed to get resource ACL for blob "${resourceId}":`, (error as any)?.message ?? error);- console.error(`Failed to set expiration for blob "${resourceId}":`, error.message); + console.error(`Failed to set expiration for blob "${resourceId}":`, (error as any)?.message ?? error);Also applies to: 189-193, 219-221, 247-249, 287-289, 326-328, 370-372, 408-410, 453-455
98-116
: Consider streaming/timeout for large reads.downloadToBuffer loads entire blobs into memory; add AbortController timeout and/or expose a streaming read to avoid OOM on large objects.
Sample (30s timeout):
- return await blockBlobClient.downloadToBuffer(); + const c = new AbortController(); + const t = setTimeout(() => c.abort(), 30_000); + try { + return await blockBlobClient.downloadToBuffer(undefined, undefined, undefined, { abortSignal: c.signal }); + } finally { + clearTimeout(t); + }If desired, I can add a readStream() API variant.
45-50
: Optional: support custom endpoint.Azure sovereign clouds/private stacks use non-public endpoints. Consider an optional endpoint override in AzureBlobConfig.
Proposed type tweak:
export type AzureBlobConfig = { storageAccountName: string; storageAccountAccessKey: string; blobContainerName: string; + endpointUrl?: string; // optional custom endpoint };
And:
- const endpointUrl = `https://${_settings.storageAccountName}.blob.core.windows.net`; + const endpointUrl = _settings.endpointUrl ?? `https://${_settings.storageAccountName}.blob.core.windows.net`;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
examples/06-Storage-no-agent/03-AzureBlobStorage.ts
(1 hunks)packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- examples/06-Storage-no-agent/03-AzureBlobStorage.ts
🔇 Additional comments (1)
packages/core/src/subsystems/IO/Storage.service/connectors/AzureBlobStorage.class.ts (1)
403-405
: ACL.from handles undefined and defaults to a restrictive ACL. The constructor treatsundefined
as empty entries and applies the default hash algorithm, resulting in no granted permissions—no additional guard is needed.
Hey @yogeshselvarajan thank you for your contribution we'll review it and let you know if we need any changes from you. @zubairirfan96 please take a look at this, check if the provided example works properly, and that the connector respects the Storage interface. |
Hi @zubairirfan96 Did you get any chance to review this PR? |
Hi @yogeshselvarajan Zubair is out this week, but we're reviewing and testing all the PRs. Please keep in mind that the review process requires extensive compatibility tests with the SRE code but also with our SaaS distribution, this is why it takes time. The code looks good so far, but I will let you know if we need any changes. |
@yogeshselvarajan I tested your implementation and it's failing. few hints : first the constructor should not throw an exception, this is currently required for graceful errors handling by SmythOS , just show a warning message and return (check S3Storage) the other issue is with write() function
here value is of type StorageData which can by any of the following types : string | Uint8Array | Buffer | Readable so string would fail here. these are two obvious issues I found there are maybe others. |
Hi @alaa-eddine-k Thanks for the detailed feedback and for catching these critical issues. My apologies for the errors. I mistakenly assumed the validation was handled entirely by a CI pipeline and didn't perform the necessary local integration tests. I'd appreciate it if you could guide me on the recommended local testing setup. I will fix the issues you pointed out right away. |
📝 Description
This PR introduces a new Azure Blob Storage connector to the
IO.Storage
subsystem. This addition expands SmythOS's cloud provider support, offering users a highly scalable and secure storage option within the Microsoft Azure ecosystem.The new AzureBlobStorage class implements the complete
StorageConnector
interface, providing full CRUD (read
,write
,delete
,exists
), metadata, and ACL management. It seamlessly integrates with the SmythOS security model by storing custom ACLs in the blob's metadata.Key features include:
expire
method that uses Azure Blob Index Tags, allowing for automatic object deletion via Azure's Lifecycle Management rules.🔗 Related Issues
🔧 Type of Change
✅ Checklist
Summary by CodeRabbit
New Features
Documentation
Chores