[Epic]: Shopify Source Plugin - Product Sourcing #53

@dawidurbanski

Vision Statement

Create a production-ready Shopify source plugin for Universal Data Layer that enables developers to source all products from their Shopify stores into the UDL node store. This plugin will follow the established patterns from the Contentful plugin, providing seamless GraphQL querying of Shopify product data with full type safety and reference resolution.

The Shopify plugin will be the second major source plugin for UDL, validating our plugin architecture's flexibility and establishing patterns for e-commerce data sourcing. This enables developers building headless Shopify storefronts to leverage UDL's powerful GraphQL layer, caching, and codegen capabilities.

Background & Context

  • Current State: UDL has a mature Contentful source plugin that demonstrates the full plugin lifecycle: configuration, data fetching, node transformation, and reference resolution. The core plugin infrastructure supports custom indexes, reference resolvers ($entries/$ref pattern), and type generation.

  • Problem: Developers building headless Shopify stores currently need to either use Shopify's Storefront API directly or build custom integration layers. There's no standardized way to unify Shopify product data with other data sources in a type-safe GraphQL layer.

  • Opportunity: By creating a Shopify source plugin, we enable:

    • Unified GraphQL queries across Shopify products and other data sources
    • Type-safe queries with full codegen support
    • Caching and performance optimizations through UDL's node store
    • Reference resolution between Shopify entities (products ↔ collections ↔ variants)
    • Consistent developer experience across content sources

Goals & Success Metrics

Primary Goals

  1. Source all products, variants, collections, and images from Shopify into UDL
  2. Provide full reference resolution between Shopify entities using the existing $entries/$ref patterns
  3. Generate accurate TypeScript types for all Shopify node types
  4. Follow established Contentful plugin patterns for consistency

Success Metrics

  • All Shopify products sync correctly with variants, images, and metafields
  • Collections are sourced with product references properly linked
  • Generated types provide full autocomplete for Shopify fields
  • Test coverage: 100% for core functionality
  • Example project working with real Shopify store

Technical Strategy

Architecture Overview

The plugin will follow the established Contentful plugin patterns:

  1. Configuration: Export config, onLoad, sourceNodes, referenceResolver, entityKeyConfig from udl.config.ts
  2. Authentication: Use Shopify Storefront Access Token (or Admin API token for full data)
  3. Data Fetching: GraphQL queries with cursor-based pagination
  4. Transformation: Convert Shopify objects to UDL nodes with shopifyId index
  5. Reference Resolution: Use marker field pattern (_shopifyRef) with existing core $entries/$ref infrastructure
  6. Sync Strategy: Tiered approach with Bulk Operations API + Delta sync (see Sync Strategy section)

Key Design Decisions

  • API Choice: Storefront API vs Admin API

    • Decision: Start with Storefront API for public product data
    • Rationale: Storefront API is public-facing, has simpler auth, and covers most product sourcing needs. Admin API can be added later for inventory/order data.
  • Node Types Structure

    • Products → ShopifyProduct
    • Variants → ShopifyProductVariant
    • Collections → ShopifyCollection
    • Images → embedded in product nodes (not separate)
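
The node structure above can be sketched as a small transform. This is a minimal illustration under stated assumptions: the `RawProduct` payload shape, the node interfaces, and the `transformProduct` helper are all hypothetical and are not taken from the UDL or Contentful plugin APIs. It only shows images staying embedded in the product node while variants become separate nodes linked through `_shopifyRef` markers.

```typescript
// Hypothetical input and node shapes, for illustration only.
interface RawVariant { id: string; title: string; price: string }
interface RawImage { url: string; altText: string | null }
interface RawProduct {
  id: string;
  title: string;
  handle: string;
  variants: RawVariant[];
  images: RawImage[];
}

// Marker object that a reference resolver could later match
// against the shopifyId index.
interface ShopifyRef { _shopifyRef: string }

interface ShopifyProductNode {
  type: "ShopifyProduct";
  shopifyId: string;
  title: string;
  handle: string;
  images: RawImage[]; // embedded, not separate nodes
  variants: ShopifyRef[]; // references to ShopifyProductVariant nodes
}

interface ShopifyProductVariantNode {
  type: "ShopifyProductVariant";
  shopifyId: string;
  title: string;
  price: string;
  product: ShopifyRef; // back-reference to the owning product
}

function transformProduct(raw: RawProduct): {
  product: ShopifyProductNode;
  variants: ShopifyProductVariantNode[];
} {
  const variants: ShopifyProductVariantNode[] = raw.variants.map((v) => ({
    type: "ShopifyProductVariant",
    shopifyId: v.id,
    title: v.title,
    price: v.price,
    product: { _shopifyRef: raw.id },
  }));

  const product: ShopifyProductNode = {
    type: "ShopifyProduct",
    shopifyId: raw.id,
    title: raw.title,
    handle: raw.handle,
    images: raw.images,
    variants: raw.variants.map((v) => ({ _shopifyRef: v.id })),
  };

  return { product, variants };
}
```

Keeping the marker as a plain object means the transform does not need access to the node store; resolution can happen later, once every node has been indexed by shopifyId.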

Data Layer Implications

  • Plugin Architecture: Follows established patterns - validates plugin architecture flexibility
  • Caching Strategy: Nodes cached by shopifyId with content digest for change detection
  • Framework Adapters: No changes needed - works with existing Next.js/Vite adapters
  • Reference Resolution: Uses existing core $entries/$ref infrastructure with shopifyId index

Sync Strategy

Dependency: This section requires Remote UDL - WebSocket-Based Sync Infrastructure to be completed first.

Overview

Implement a tiered sync strategy for Shopify data (products, collections, assets) that balances efficiency with freshness. The strategy uses the Bulk Operations API for initial and stale syncs, and incremental queries for recent changes.

Architecture Diagram

┌─────────────────────────────────────────────────────────┐
│  Local UDL (.udl-cache/shopify/)                        │
├─────────────────────────────────────────────────────────┤
│  meta.json: { lastSync, epoch }                         │
│  nodes.json (via existing CacheStorage)                 │
└──────────────────┬──────────────────────────────────────┘
                   │
     ┌─────────────┴─────────────┐
     │ Sync Decision             │
     ├───────────────────────────┤
     │ No cache?      → Bulk Op  │
     │ lastSync > 30d → Bulk Op  │
     │ lastSync ≤ 30d → Delta    │
     └─────────────────────────┬─┘
                               │
        ┌──────────────────────┴───────────────────┐
        │                                          │
   ┌────▼─────┐                           ┌────────▼────────┐
   │ Bulk Op  │                           │ Delta Sync      │
   │ (JSONL)  │                           │ updated_at:>X   │
   │          │                           │ + deletions API │
   └────┬─────┘                           └────────┬────────┘
        │                                          │
        └──────────────────┬───────────────────────┘
                           │
              ┌────────────▼────────────┐
              │ Remote UDL (deployed)   │
              │ • Webhook receiver      │
              │ • Deletion log (30d)    │
              │ • WebSocket broadcast   │
              └─────────────────────────┘

Sync Decision Logic

| Condition | Action |
| --- | --- |
| No .udl-cache/shopify exists | Full sync (Bulk Operations) |
| lastSync > 30 days | Full sync (Bulk Operations) |
| lastSync ≤ 30 days | Delta sync (GraphQL query) |
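
The decision rules above could be expressed as a small pure function. This is a sketch under assumptions: `SyncMeta`, `decideSync`, and the choice to treat an unparseable timestamp as a missing cache are illustrative and not part of any existing UDL API. A `null` meta stands for "no .udl-cache/shopify exists".

```typescript
type SyncAction = "bulk" | "delta";

// Hypothetical shape of meta.json; only lastSync matters for the decision.
interface SyncMeta {
  lastSync: string; // ISO 8601 timestamp
}

const STALE_AFTER_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

function decideSync(meta: SyncMeta | null, now: Date = new Date()): SyncAction {
  // No cache on disk: full sync.
  if (meta === null) return "bulk";

  const last = Date.parse(meta.lastSync);
  // Assumption: a corrupt timestamp is handled like a missing cache.
  if (Number.isNaN(last)) return "bulk";

  // Older than 30 days: full sync. Otherwise: delta sync.
  return now.getTime() - last > STALE_AFTER_MS ? "bulk" : "delta";
}
```

Passing `now` as a parameter keeps the function deterministic, which makes the 30-day boundary easy to unit test.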

Components

  1. Full Sync (Bulk Operations API)

    • Submit bulkOperationRunQuery mutation
    • Poll for completion (currentBulkOperation query)
    • Download and parse JSONL (reconstruct parent/child relationships via __parentId)
    • Handle existing operation conflict (cancel stale ops before starting)
    • Write to cache + update meta.json with lastSync timestamp
  2. Delta Sync

    • Query nodes with updated_at:>'${lastSync}' filter
    • Use @shopify/shopify-api for automatic rate limit handling
    • Merge updated nodes into cache
  3. Deletion Handling

    • Via Remote UDL: Receive deletions from WebSocket broadcast
    • Deletion log with 30-day retention on Remote UDL
    • Local cache cleanup based on deletion events
  4. Remote UDL Integration

    • Same UDL deployed to hosting provider (Vercel, AWS, etc.)
    • Webhook receiver for Shopify events (products/update, products/delete, etc.)
    • WebSocket broadcast to connected local instances
    • See Remote UDL Epic for infrastructure details
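
For the full sync step, the JSONL result lists each object on its own line, and nested objects point at their parent through `__parentId`. A minimal sketch of rebuilding that hierarchy is shown below. `BulkRow`, `TreeNode`, and `parseBulkJsonl` are hypothetical names, and the code assumes that a parent row appears before its children in the file and that the whole payload fits in memory (a streaming parser would process lines as they arrive instead).

```typescript
// One parsed JSONL line. Fields other than id and __parentId are opaque here.
interface BulkRow {
  id?: string;
  __parentId?: string;
  [key: string]: unknown;
}

interface TreeNode {
  data: BulkRow;
  children: TreeNode[];
}

function parseBulkJsonl(jsonl: string): TreeNode[] {
  const byId = new Map<string, TreeNode>();
  const roots: TreeNode[] = [];

  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines, e.g. a trailing newline

    const row = JSON.parse(line) as BulkRow;
    const node: TreeNode = { data: row, children: [] };
    if (row.id !== undefined) byId.set(row.id, node);

    const parent =
      row.__parentId !== undefined ? byId.get(row.__parentId) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      // Top-level object, or (assumption) a child whose parent was not seen:
      // such orphans are kept as roots rather than dropped.
      roots.push(node);
    }
  }

  return roots;
}
```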

Cache Structure

.udl-cache/
  shopify/
    meta.json          # { lastSync: ISO8601, shopDomain: string }
  nodes.json           # Existing CacheStorage (all plugin nodes)

Implementation Roadmap

Phase 1: Foundation

Goal: Set up plugin structure and basic authentication

Issue 1.1: Plugin Scaffold and Configuration

  • Description: Create the plugin package structure following Contentful plugin patterns. Set up TypeScript configuration, exports, and basic plugin lifecycle hooks.
  • Deliverables:
    • packages/plugin-source-shopify/ package structure
    • udl.config.ts with config, onLoad, sourceNodes, referenceResolver, entityKeyConfig exports
    • Options interface with required fields (storeDomain, storefrontAccessToken)
    • Options validation in onLoad
    • Error classes for Shopify-specific errors
    • Package.json with peer dependencies
    • Unit tests for options validation
  • Dependencies: None
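
As an illustration of the options validation deliverable, the check performed in onLoad might look like the sketch below. Only the two field names (storeDomain, storefrontAccessToken) come from this epic. The `ShopifyConfigError` class, the `validateOptions` helper, and the error message format are assumptions.

```typescript
interface ShopifyPluginOptions {
  storeDomain: string;
  storefrontAccessToken: string;
}

// Hypothetical Shopify-specific error class.
class ShopifyConfigError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ShopifyConfigError";
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

function validateOptions(
  options: Partial<ShopifyPluginOptions> | undefined,
): ShopifyPluginOptions {
  const storeDomain = options?.storeDomain?.trim() ?? "";
  const storefrontAccessToken = options?.storefrontAccessToken?.trim() ?? "";

  // Collect every missing field so the user can fix them in one pass.
  const missing: string[] = [];
  if (storeDomain === "") missing.push("storeDomain");
  if (storefrontAccessToken === "") missing.push("storefrontAccessToken");

  if (missing.length > 0) {
    throw new ShopifyConfigError(
      `Missing required Shopify plugin options: ${missing.join(", ")}`,
    );
  }

  return { storeDomain, storefrontAccessToken };
}
```

Reporting all missing fields at once, rather than failing on the first, tends to shorten the configuration loop for new users.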

Issue 1.2: Shopify GraphQL Client and Authentication

  • Description: Create a GraphQL client for Shopify Storefront API with proper authentication, error handling, and retry logic.
  • Deliverables:
    • GraphQL client wrapper with Storefront Access Token auth
    • Query complexity awareness (Shopify uses complexity-based limits)
    • Error handling for 4xx/5xx responses
    • Retry logic for transient failures
    • Unit tests for client functionality
  • Dependencies: Issue 1.1

Phase 2: Core Data Sourcing

Goal: Implement product, variant, and collection sourcing

Issue 2.1: Product Fetching and Transformation

  • Description: Implement the core product fetching logic with cursor-based pagination and transform Shopify products into UDL nodes.
  • Deliverables:
    • Products query with all required fields (title, handle, description, vendor, productType, tags, status, pricing, SEO)
    • Cursor-based pagination handling (fetch all products)
    • Product transformation to ShopifyProduct nodes
    • shopifyId field for reference resolution
    • Content digest generation for change detection
    • Unit tests for transformation logic
  • Dependencies: Issue 1.2
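
The "fetch all products" loop can be sketched independently of the GraphQL client. In the example below, `Page`, `FetchPage`, and `fetchAll` are hypothetical names. The only part taken from Shopify's connection model is the `pageInfo { hasNextPage, endCursor }` shape. The page fetcher is injected so that the loop can be tested without network access.

```typescript
interface Page<T> {
  nodes: T[];
  pageInfo: { hasNextPage: boolean; endCursor: string | null };
}

// Fetches one page; `after` is null for the first page.
type FetchPage<T> = (after: string | null) => Promise<Page<T>>;

async function fetchAll<T>(fetchPage: FetchPage<T>): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;

  while (true) {
    const page: Page<T> = await fetchPage(cursor);
    for (const node of page.nodes) all.push(node);

    // Stop when the API reports no further pages, or when it gives no
    // cursor to continue from (defensive guard against an endless loop).
    if (!page.pageInfo.hasNextPage || page.pageInfo.endCursor === null) break;
    cursor = page.pageInfo.endCursor;
  }

  return all;
}
```

In the plugin, the injected function would wrap the products query and pass `after` as the query's cursor variable. For large catalogs, yielding pages from an async generator instead of accumulating an array would keep memory use flat.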

Issue 2.2: Variant and Image Sourcing

  • Description: Extend product sourcing to include variants and images as related data.
  • Deliverables:
    • Variants query nested within products (or as separate nodes with references)
    • Variant data transformation (price, compareAtPrice, sku, inventory, selectedOptions)
    • Image data extraction (url, altText, width, height)
    • Reference creation for variant-to-product relationships using _shopifyRef marker
    • Handle variant limit (2048 per product) with pagination
    • Unit tests for variant/image transformation
  • Dependencies: Issue 2.1

Issue 2.3: Collection Sourcing with Product References

  • Description: Source Shopify collections and create proper references to products within each collection.
  • Deliverables:
    • Collections query with cursor-based pagination
    • Collection transformation to ShopifyCollection nodes
    • Product reference creation within collections using _shopifyRef marker
    • Bidirectional reference support (product.collections, collection.products)
    • Unit tests for collection transformation
  • Dependencies: Issue 2.1

Phase 2.5: Shopify Sync

Goal: Implement efficient sync strategy for Shopify data

Dependency: Requires Remote UDL - WebSocket-Based Sync Infrastructure to be completed first.

Issue 2.5.1: Shopify Bulk Operations Integration

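  • Description: Implement Shopify Bulk Operations API for full sync (initial and stale cache scenarios). Bulk operations are part of the Admin GraphQL API, so this issue needs an Admin API access token in addition to the Storefront token.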
  • Deliverables:
    • bulkOperationRunQuery mutation for products, collections, files
    • Polling for completion with exponential backoff
    • JSONL streaming parser (reconstruct __parentId relationships)
    • Cancel stale operations before starting new ones
    • Progress reporting callback
  • Dependencies: Issue 1.2, Remote UDL Issue 2 (Sync State Management)
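
The polling deliverable could start from a helper that computes the wait between status checks. The sketch below is illustrative: `backoffDelaysMs`, `isTerminalStatus`, and the default base and cap values are assumptions. The status names follow the Admin API's BulkOperationStatus enum as I understand it, so they should be checked against the API version in use.

```typescript
// Delay before each poll attempt: base, 2x base, 4x base, ... capped at maxMs.
function backoffDelaysMs(
  attempts: number,
  baseMs: number = 1000,
  maxMs: number = 30000,
): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}

// Assumed status values; polling stops once one of the terminal ones is seen.
type BulkStatus =
  | "CREATED"
  | "RUNNING"
  | "CANCELING"
  | "COMPLETED"
  | "CANCELED"
  | "EXPIRED"
  | "FAILED";

function isTerminalStatus(status: BulkStatus): boolean {
  return (
    status === "COMPLETED" ||
    status === "CANCELED" ||
    status === "EXPIRED" ||
    status === "FAILED"
  );
}
```

With the defaults, `backoffDelaysMs(6)` yields 1000, 2000, 4000, 8000, 16000 and 30000 milliseconds. The cap keeps long-running operations from being polled too rarely.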

Issue 2.5.2: Shopify Delta Sync

  • Description: Implement delta sync using updated_at filtering for incremental updates
  • Deliverables:
    • GraphQL queries with updated_at:>'${lastSync}' filter
    • Rate limit handling via @shopify/shopify-api client
    • Merge logic for updated nodes
    • Sync state update after successful sync
  • Dependencies: Issue 2.5.1
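
The filter and merge deliverables can be sketched as two pure helpers. `CachedNode`, `buildDeltaFilter`, and `mergeNodes` are hypothetical names, and the sketch assumes the cache is keyed by shopifyId, as the caching strategy in this epic suggests.

```typescript
// Hypothetical cached node shape; only shopifyId is needed for merging.
interface CachedNode {
  shopifyId: string;
  [key: string]: unknown;
}

// Builds the search filter string for nodes changed since the last sync.
function buildDeltaFilter(lastSync: string): string {
  const parsed = new Date(lastSync);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Invalid lastSync timestamp: ${lastSync}`);
  }
  // Normalising through toISOString() avoids interpolating arbitrary
  // cache contents into the query string.
  return `updated_at:>'${parsed.toISOString()}'`;
}

// Returns a new map where updated nodes replace cached nodes with the same id.
function mergeNodes(
  cache: Map<string, CachedNode>,
  updated: CachedNode[],
): Map<string, CachedNode> {
  const merged = new Map(cache);
  for (const node of updated) merged.set(node.shopifyId, node);
  return merged;
}
```

Returning a new map rather than mutating the cache makes it straightforward to write the merged result and the new lastSync value together, only after the whole delta has succeeded.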

Issue 2.5.3: Shopify Webhook Handler

  • Description: Implement webhook receiver for real-time updates on Remote UDL
  • Deliverables:
    • Webhook topics: products/*, collections/*, inventory_levels/*
    • HMAC signature verification
    • Event-to-node transformation (reuse existing transform logic)
    • Integration with deletion log for delete events
    • Export handler using core WebhookHandler interface
  • Dependencies: Remote UDL Issue 7 (Webhook Handler Interface), Issues 2.1-2.3
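
Shopify signs each webhook with an HMAC-SHA256 of the raw request body, base64-encoded in the `X-Shopify-Hmac-Sha256` header. A minimal verification sketch using Node's built-in crypto module follows. The `verifyShopifyHmac` name and its signature are assumptions, and how the raw body and secret reach this function depends on the core WebhookHandler interface, which is not specified here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

function verifyShopifyHmac(
  rawBody: string | Buffer, // must be the unparsed request body
  hmacHeader: string, // value of the X-Shopify-Hmac-Sha256 header
  secret: string, // the app's webhook signing secret
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(hmacHeader, "base64");

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return (
    received.length === expected.length && timingSafeEqual(received, expected)
  );
}
```

The check has to run before any JSON parsing middleware touches the body, because re-serialised JSON generally does not reproduce the signed bytes.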

Phase 3: Production Readiness

Goal: Documentation and example integration

Issue 3.1: Documentation and Example

  • Description: Create comprehensive documentation and a working example demonstrating Shopify plugin usage.
  • Deliverables:
    • Plugin README with configuration options
    • Documentation page in docs/content/ (similar to Contentful docs)
    • Example queries for common use cases
    • examples/nextjs-shopify/ or extend existing example
    • Troubleshooting guide for common issues
    • Sync strategy documentation
  • Dependencies: Issues 2.2, 2.3, 2.5.*

Risks & Mitigation

Technical Risks

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Shopify API query complexity limits | Medium | Implement complexity tracking, optimize queries |
| Large product catalogs (10k+) | High | Cursor-based pagination, chunked processing, progress reporting |
| Variant limits (2048/product) | Low | Handle pagination within products, document limits |
| Storefront API field limitations | Medium | Document limitations, plan Admin API support for future |
| Bulk Operation timeout (>10M products) | High | Chunked queries, timeout handling, resume support |
| Webhook delivery failures | Medium | Retry logic, manual sync trigger fallback |

Dependencies

  • External: Shopify Storefront API stability (versioned API, low risk)
  • Internal: Core plugin infrastructure (stable, low risk)
  • Internal: Remote UDL infrastructure ([Epic]: UDL as Production Data Layer #57) - required for sync features

Testing Strategy

Test Coverage Goals

  • Unit test coverage: >80%
  • Tests written alongside each feature (not at the end)

Test Approach

  • Unit Tests: Written with each issue for transformation logic, client, references
  • Integration Tests: Full sync flow tests with MSW mocks
  • Manual Testing: Real Shopify store integration in example project

Documentation Requirements

  • Plugin README with quick start guide
  • Configuration options reference
  • Example GraphQL queries
  • Troubleshooting guide (common issues)
  • API limitations documentation
  • Sync strategy documentation
  • Example project with Next.js

Definition of Done

  • All 9 issues completed and merged (6 original + 3 sync issues)
  • Test coverage >80%
  • Documentation complete
  • No critical bugs
  • Example project working with real Shopify store
  • Plugin published to npm
  • Sync with Remote UDL working end-to-end

Future Considerations

This epic enables future enhancements:

  • Admin API Support: Access to inventory, orders, customers
  • Metafield Support: Custom metafield extraction and typing
  • Multi-currency/Multi-language: Internationalized product data
  • Media Support: 3D models, videos beyond images
  • Selling Plans: Subscription product support

Related Work

  • Depends on: Remote UDL infrastructure ([Epic]: UDL as Production Data Layer #57), required for sync features
  • Parallel with: Contentful plugin serves as reference implementation
  • Enables: Future e-commerce plugins (WooCommerce, BigCommerce)

Issue Summary

| # | Issue | Phase | Dependencies |
| --- | --- | --- | --- |
| 1.1 | Plugin Scaffold and Configuration | 1 | None |
| 1.2 | Shopify GraphQL Client and Authentication | 1 | 1.1 |
| 2.1 | Product Fetching and Transformation | 2 | 1.2 |
| 2.2 | Variant and Image Sourcing | 2 | 2.1 |
| 2.3 | Collection Sourcing with Product References | 2 | 2.1 |
| 2.5.1 | Shopify Bulk Operations Integration | 2.5 | 1.2, Remote UDL #57 |
| 2.5.2 | Shopify Delta Sync | 2.5 | 2.5.1 |
| 2.5.3 | Shopify Webhook Handler | 2.5 | Remote UDL #57, 2.1-2.3 |
| 3.1 | Documentation and Example | 3 | 2.2, 2.3, 2.5.* |
