
Conversation

@jaclync
Contributor

@jaclync jaclync commented Sep 9, 2025

For WOOMOB-1298

Description

This PR implements persistence capabilities for the POS catalog incremental sync service. The changes enable the service to save incremental catalog updates (products and variations) to the database after fetching them from the API.

Key changes:

  • Added persistIncrementalCatalogData method to POSCatalogPersistenceService that handles upsert operations for products and variations (a rough sketch follows the lists below)
  • Updated POSCatalogIncrementalSyncService to integrate with the persistence service and save data after successful API fetches
  • Made database column properties public in persisted model classes to support filtering operations
  • Added test coverage for both incremental persistence functionality and error handling scenarios

The implementation ensures that:

  • Products and variations are inserted or updated as needed
  • Related attributes and images are properly replaced during updates
  • Database operations are atomic and handle failures gracefully
  • Sync dates are only updated after successful persistence to maintain data consistency
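Roughly, the new method has the shape sketched below. This is illustrative only: the record initializers and helper property names are assumptions rather than the PR's exact code; the transaction boundary and the "persist first, then advance the sync date" ordering are what the bullets above describe.

```swift
// Illustrative sketch of persistIncrementalCatalogData – the PersistedProduct initializer
// is an assumed name, not the PR's exact implementation.
func persistIncrementalCatalogData(_ catalog: POSCatalog, siteID: Int64) async throws {
    try await grdbManager.databaseConnection.write { db in
        // One GRDB write transaction: if anything throws, the whole incremental update rolls back.
        for product in catalog.products {
            // Insert new products, replace existing ones.
            try PersistedProduct(product: product, siteID: siteID).insert(db, onConflict: .replace)

            // Replace related rows wholesale: delete the old attributes/images for this product…
            try PersistedProductAttribute.filter { $0.productID == product.productID }.deleteAll(db)
            try PersistedProductImage.filter { $0.productID == product.productID }.deleteAll(db)
        }
        // …then insert the fresh ones from the fetched catalog.
        for image in catalog.productImagesToPersist {
            try image.insert(db, onConflict: .replace)
        }
        // Variations and their attributes/images follow the same pattern.
    }
    // Callers only advance the last-sync date after this returns without throwing,
    // which keeps the stored catalog and the sync date consistent.
}
```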

Steps to reproduce

Just CI is sufficient, as the full/incremental services aren't integrated with the app yet.

Testing information

I tested that incremental sync worked after a full sync on a big site, making changes to at least one product and one variation while the full sync was in progress:

```swift
Task {
    do {
        let date = Date()
        let catalog = try await syncService.startFullSync(for: siteID)
        print("Full sync completed with \(catalog.products.count) products and \(catalog.variations.count) variations for siteID \(siteID)")

        try await incrementalSyncService.startIncrementalSync(for: siteID, lastFullSyncDate: date)
        print("Incremental sync completed")
    } catch {
        print("Sync failed for siteID \(siteID) with error: \(error)")
    }
}
```

  • I have considered if this change warrants user-facing release notes and have added them to RELEASE-NOTES.txt if necessary.

@jaclync jaclync changed the base branch from trunk to feat/WOOMOB-1097-incremental-sync-service September 9, 2025 02:07
@jaclync jaclync added type: task An internally driven task. feature: POS labels Sep 9, 2025
@jaclync jaclync added this to the 23.3 milestone Sep 9, 2025
@jaclync jaclync changed the title Add persistence for POS incremental sync service [Local Catalog] Incremental sync: persist incrementally changed products/variations to storage Sep 9, 2025
@wpmobilebot
Collaborator

wpmobilebot commented Sep 9, 2025

📲 You can test the changes from this Pull Request in WooCommerce iOS Prototype by scanning the QR code below to install the corresponding build.

  • App Name: WooCommerce iOS Prototype
  • Build Number: pr16106-7162e20
  • Version: 23.2
  • Bundle ID: com.automattic.alpha.woocommerce
  • Commit: 7162e20
  • Installation URL: 422ctu5qvsboo

Automatticians: You can use our internal self-serve MC tool to give yourself access to those builds if needed.

```swift
}

try PersistedProductVariationAttribute
    .filter(sql: "\(PersistedProductVariationAttribute.Columns.productVariationID.name) IN (\(variationIDs))")
```
Contributor Author

This is an example of the last resort in p1757321561703349/1757321341.670339-slack-C070SJRA8DP. Please feel free to suggest a better way to do this. The corresponding foreign key column constant needs to be marked as public.

What I've tried:

  • .filter { variationIDs.contains($0.productVariationID) }
  • .filter { variationIDs.contains(PersistedProductVariationAttribute.Columns.productVariationID) }

The issue is that $0.productVariationID or PersistedProductVariationAttribute.Columns.productVariationID is of Column type, and only basic operators like == are supported. .filter { $0.productVariationID == 2 } works.

We could delete all existing attributes per product/variation in a prettier way, but deleting with a single productVariationID IN (variationIDs) query would be the most performant. WDYT? I personally think it's fine having raw SQL + unit tests if it helps with performance.

Contributor

From what I've read and tried, it should be perfectly quick to do it in a loop, as long as that loop is inside the transaction, i.e.

```swift
for variation in catalog.variations {
    try PersistedProductVariationAttribute
        .filter { $0.productVariationID == variation.productVariationID }
        .deleteAll(db)
}
```

That would probably be ok for an incremental sync, which should be a relatively small change set. If it's not a small change, we might be better off doing a full sync anyway?

That said, there's absolutely nothing wrong with using SQL directly. We need to take care to correctly escape it (though type safety helps a lot when we're using Int64 IDs!)
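For what it's worth, if we keep raw SQL anywhere, one way to sidestep the escaping concern (a sketch, not what this PR does) is to bind the IDs as statement arguments instead of interpolating them, using GRDB's databaseQuestionMarks(count:) helper to build the placeholder list:

```swift
// Sketch: bind IDs as arguments rather than interpolating them into the SQL string.
// databaseQuestionMarks(count:) produces "?,?,?" for use in IN clauses.
let ids = catalog.variations.map { $0.productVariationID }
try PersistedProductVariationAttribute
    .filter(sql: "\(PersistedProductVariationAttribute.Columns.productVariationID.name) IN (\(databaseQuestionMarks(count: ids.count)))",
            arguments: StatementArguments(ids))
    .deleteAll(db)
```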

Perhaps see whether you can detect any performance difference between the two?

Contributor Author

> Perhaps see whether you can detect any performance difference between the two?

With the test case at the bottom, for 1000 products each with 3 attributes and 2000 variations each with 2 attributes to replace, the average runtime from 100 runs was:

  • SQL version to delete all attributes in a set of IDs: 0.168967 seconds
  • For loop to delete attributes for each product/variation ID: 0.284567 seconds

=> there is a ~0.12s difference when deleting 7k attributes in the test case, not a big difference.

> That would probably be ok for an incremental sync, which should be a relatively small change set. If it's not a small change, we might be better off doing a full sync anyway?

Yea, given that it's quite an edge case with a big data set from an incremental sync, the performance difference is negligible for most cases. I will update the implementation to delete existing image(s)/attributes in a for loop.

Test function:

```swift
@Test func persistIncrementalCatalogData_performance_with_large_dataset() async throws {
    // Given - create a large dataset with existing products and variations that will be replaced
    let existingProductCount = 1000
    let existingVariationCount = 2000
    // Create existing products with attributes
    var existingProducts: [POSProduct] = []
    for i in 1...existingProductCount {
        let attributes = [
            ProductAttribute.fake().copy(name: "Color", options: ["Red", "Blue", "Green"]),
            ProductAttribute.fake().copy(name: "Size", options: ["S", "M", "L", "XL"]),
            ProductAttribute.fake().copy(name: "Material", options: ["Cotton", "Polyester"])
        ]
        let product = POSProduct.fake().copy(
            siteID: sampleSiteID,
            productID: Int64(i),
            name: "Product \(i)",
            attributes: attributes
        )
        existingProducts.append(product)
    }
    
    // Create existing variations with attributes
    var existingVariations: [POSProductVariation] = []
    for i in 1...existingVariationCount {
        let productID = Int64((i % existingProductCount) + 1)
        let attributes = [
            ProductVariationAttribute.fake().copy(name: "Color", option: "Red"),
            ProductVariationAttribute.fake().copy(name: "Size", option: "M")
        ]
        let variation = POSProductVariation.fake().copy(
            siteID: sampleSiteID,
            productID: productID,
            productVariationID: Int64(i),
            attributes: attributes,
            price: "\(10 + i).00"
        )
        existingVariations.append(variation)
    }
    
    // Insert existing data
    let existingCatalog = POSCatalog(products: existingProducts, variations: existingVariations)
    try await sut.replaceAllCatalogData(existingCatalog, siteID: sampleSiteID)
    
    // Create update catalog with modified data
    var updatedProducts: [POSProduct] = []
    for i in 1...existingProductCount {
        let attributes = [
            ProductAttribute.fake().copy(name: "Color", options: ["Purple", "Orange", "Yellow"]),
            ProductAttribute.fake().copy(name: "Size", options: ["XS", "S", "M", "L", "XL", "XXL"]),
            ProductAttribute.fake().copy(name: "Style", options: ["Casual", "Formal"])
        ]
        let product = POSProduct.fake().copy(
            siteID: sampleSiteID,
            productID: Int64(i),
            name: "Updated Product \(i)",
            attributes: attributes
        )
        updatedProducts.append(product)
    }
    
    var updatedVariations: [POSProductVariation] = []
    for i in 1...existingVariationCount {
        let productID = Int64((i % existingProductCount) + 1)
        let attributes = [
            ProductVariationAttribute.fake().copy(name: "Color", option: "Purple"),
            ProductVariationAttribute.fake().copy(name: "Size", option: "L"),
            ProductVariationAttribute.fake().copy(name: "Style", option: "Casual")
        ]
        let variation = POSProductVariation.fake().copy(
            siteID: sampleSiteID,
            productID: productID,
            productVariationID: Int64(i),
            attributes: attributes,
            price: "\(20 + i).00"
        )
        updatedVariations.append(variation)
    }
    
    let updateCatalog = POSCatalog(products: updatedProducts, variations: updatedVariations)
    
    // When - measure performance of incremental update
    let startTime = Date()
    try await sut.persistIncrementalCatalogData(updateCatalog, siteID: sampleSiteID)
    let endTime = Date()
    let timeElapsed = endTime.timeIntervalSince(startTime)
    
    // Then - verify the operation completed in reasonable time and data is correct
    print("persistIncrementalCatalogData with \(existingProductCount) products and \(existingVariationCount) variations took \(timeElapsed) seconds")
    
    try await grdbManager.databaseConnection.read { db in
        let productCount = try PersistedProduct.fetchCount(db)
        let variationCount = try PersistedProductVariation.fetchCount(db)
        let attributeCount = try PersistedProductAttribute.fetchCount(db)
        let variationAttributeCount = try PersistedProductVariationAttribute.fetchCount(db)
        
        #expect(productCount == existingProductCount)
        #expect(variationCount == existingVariationCount)
        #expect(attributeCount == existingProductCount * 3) // 3 attributes per product
        #expect(variationAttributeCount == existingVariationCount * 3) // 3 attributes per variation
        
        // Verify data was actually updated by checking a sample
        let sampleProduct = try PersistedProduct.filter(sql: "\(PersistedProduct.Columns.id.name) = 1").fetchOne(db)
        #expect(sampleProduct?.name == "Updated Product 1")
        
        let sampleVariation = try PersistedProductVariation.filter(sql: "\(PersistedProductVariation.Columns.id.name) = 1").fetchOne(db)
        #expect(sampleVariation?.price == "21.00")
        
        // Verify attributes were replaced correctly
        let productAttributes = try PersistedProductAttribute.filter(sql: "\(PersistedProductAttribute.Columns.productID.name) = 1").fetchAll(db)
        let attributeNames = Set(productAttributes.map { $0.name })
        #expect(attributeNames == Set(["Color", "Size", "Style"]))
    }
    
    // Performance assertion - should complete within 10 seconds for this dataset size
    #expect(timeElapsed < 10.0, "Performance test failed: operation took \(timeElapsed) seconds, expected < 10 seconds")
}
```

Contributor Author

Replaced the SQL filtering with for loops comparing by ID in 7dbd5b3.

Contributor

@joshheald joshheald left a comment

Some comments in line, but looks good, thanks!

```diff
 public enum Columns {
     static let id = Column(CodingKeys.id)
-    static let productVariationID = Column(CodingKeys.productVariationID)
+    public static let productVariationID = Column(CodingKeys.productVariationID)
```
Contributor
We can make all these public but I'll do it in a future PR.

I'm sure that periphery would moan about it, but actually it's part of the public interface for the persisted classes. Since the error messages are unclear, it's a better experience to make them public ahead of use.

Comment on lines 89 to 100
```swift
let productIDs = catalog.products.map { $0.productID }.map { String($0) }.joined(separator: ",")

try PersistedProductImage
    .filter(sql: "\(PersistedProductImage.Columns.productID.name) IN (\(productIDs))")
    .deleteAll(db)
for image in catalog.productImagesToPersist {
    try image.insert(db, onConflict: .replace)
}

try PersistedProductAttribute
    .filter(sql: "\(PersistedProductAttribute.Columns.productID.name) IN (\(productIDs))")
    .deleteAll(db)
```
Contributor

For deletions, the alternative approach is to do PersistedProductImage.filter { $0.productID == product.productID }.deleteAll(db) in the product loop above.

I think when we move to compound keys (including the site ID) this might be an easier approach than constructing SQL strings for compound keys.
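For instance, with compound keys the deletion inside the existing product loop could look something like this (a sketch, assuming the persisted image records also carry a siteID column):

```swift
// Sketch: per-product deletion keyed on (siteID, productID), run inside the product loop.
for product in catalog.products {
    try PersistedProductImage
        .filter { $0.siteID == siteID && $0.productID == product.productID }
        .deleteAll(db)
}
```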

Contributor Author

Similar to the above, I replaced the SQL filtering with for loops comparing by ID in 7dbd5b3.

```swift
    .filter(sql: "\(PersistedProductVariationImage.Columns.productVariationID.name) IN (\(variationIDs))")
    .deleteAll(db)
for image in catalog.variationImagesToPersist {
    try image.insert(db, onConflict: .replace)
```
Contributor

For the full sync, I used onConflict: .ignore. We should be consistent – it doesn't matter which we go with for images or attributes (when we support global attributes in future). Either first wins, or last wins.

For variations on an incremental sync, it does matter because we'll definitely want to replace... so perhaps we should use replace everywhere?
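In other words (a quick sketch of the two conflict policies):

```swift
// .ignore – first wins: if a row with the same key already exists, the new insert is dropped.
try image.insert(db, onConflict: .ignore)

// .replace – last wins: the existing row is deleted and the new one inserted in its place.
try image.insert(db, onConflict: .replace)
```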

Contributor Author

Yea, incremental sync requires replace on conflict. For full sync, like you said, either works as we delete all the data at the beginning. It only matters if there are duplicate products/variations in the catalog (which is not impossible on the "Large fun testing" test site) with different values, but there's no way to know whether the first or last is correct in this edge case.

I replaced the ignore with the replace on-conflict strategy in full sync in 7162e20.


```swift
try await grdbManager.databaseConnection.write { db in
    let site = PersistedSite(id: siteID)
    try site.insert(db, onConflict: .ignore)
```
Contributor

Should we ever insert a Site for an incremental persist? Perhaps this won't ever do anything, but it's good to have for safety.

Contributor Author

Good catch - no, it's not necessary to insert a Site in an incremental sync as a full sync is a prerequisite. I removed it and updated the test cases that depended on the Site insertion in cdd580d.

```swift
    }
}

@Test func persistIncrementalCatalogData_upserts_existing_and_new_products() async throws {
```
Contributor

Strictly, I don't think it does.

Upsert is INSERT x ON CONFLICT DO UPDATE SET x.name = name {etc} – there is an upsert(db) function we could use for this, at least for mutable persistable records.

Currently we use insert(db, onConflict: .replace), which generates INSERT OR REPLACE.

AIUI, the main difference is the hooks that are triggered. With replace, records are deleted and recreated, so the deletions will cascade. For upsert, they're just updated so none of the deletion hooks run.
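A minimal sketch of the two variants, assuming PersistedProduct conforms to MutablePersistableRecord (upsert(db) needs a GRDB/SQLite version with ON CONFLICT DO UPDATE support):

```swift
// INSERT OR REPLACE: the conflicting row is deleted and re-inserted,
// so ON DELETE CASCADE foreign keys fire and dependent rows are removed.
try persistedProduct.insert(db, onConflict: .replace)

// True upsert (INSERT … ON CONFLICT DO UPDATE): the existing row is updated in place,
// so no deletion hooks or cascades run.
var record = persistedProduct
try record.upsert(db)
```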

Contributor Author

Good to know about the difference. I renamed the products/variations test cases to mention "replace" instead of "upsert" in 7b37191 and 20b5910.

Base automatically changed from feat/WOOMOB-1097-incremental-sync-service to trunk September 10, 2025 03:46
@jaclync
Contributor Author

jaclync commented Sep 10, 2025

Thanks for the review and GRDB insights! I believe I responded to all the comments. Merging now, but feel free to leave further suggestions and I can make changes separately.

@jaclync jaclync merged commit 3914fc5 into trunk Sep 10, 2025
14 checks passed
@jaclync jaclync deleted the feat/WOOMOB-1298-incremental-sync-persistence branch September 10, 2025 06:28