[Local Catalog] Incremental sync: persist incrementally changed products/variations to storage #16106
Conversation
```swift
try PersistedProductVariationAttribute
    .filter(sql: "\(PersistedProductVariationAttribute.Columns.productVariationID.name) IN (\(variationIDs))")
```
This is an example of the last resort in p1757321561703349/1757321341.670339-slack-C070SJRA8DP. Please feel free to suggest a better way to do this. The corresponding foreign key column constant needs to be marked as public.
What I've tried:

```swift
.filter { variationIDs.contains($0.productVariationID) }
.filter { variationIDs.contains(PersistedProductVariationAttribute.Columns.productVariationID) }
```

The issue is that `$0.productVariationID` or `PersistedProductVariationAttribute.Columns.productVariationID` is of `Column` type, and only basic operators like `==` are supported. `.filter { $0.productVariationID == 2 }` works.
We could delete the existing attributes per product/variation in a prettier way, but deleting with `productVariationID IN (variationIDs)` would be the most performant. WDYT? I personally think raw SQL + unit tests is fine if it helps with performance.
From what I've read and tried, it should be perfectly quick to do it in a loop, as long as that loop is inside the transaction, i.e.
```swift
for variation in catalog.variations {
    try PersistedProductVariationAttribute
        .filter { $0.productVariationID == variation.productVariationID }
        .deleteAll(db)
}
```
That would probably be ok for an incremental sync, which should be a relatively small change set. If it's not a small change, we might be better off doing a full sync anyway?
That said, there's absolutely nothing wrong with using SQL directly. We need to take care to correctly escape it (though type safety helps a lot when we're using Int64 IDs!).
Perhaps see whether you can detect any performance difference between the two?
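If the raw-SQL route does win on performance, one way to avoid hand-escaping is to bind the IDs as statement arguments instead of interpolating them into the string. A minimal sketch, assuming `variationIDs` is an `[Int64]` rather than a pre-joined string (`databaseQuestionMarks(count:)` and `StatementArguments` are GRDB's):

```swift
import GRDB

func deleteAttributes(for variationIDs: [Int64], in db: Database) throws {
    // Build "?,?,?" placeholders so the IDs are bound, never interpolated.
    let placeholders = databaseQuestionMarks(count: variationIDs.count)
    try db.execute(
        sql: """
        DELETE FROM \(PersistedProductVariationAttribute.databaseTableName)
        WHERE \(PersistedProductVariationAttribute.Columns.productVariationID.name) IN (\(placeholders))
        """,
        arguments: StatementArguments(variationIDs)
    )
}
```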
> Perhaps see whether you can detect any performance difference between the two?
With the test case at the bottom, for 1000 products each with 3 attributes and 2000 variations each with 2 attributes to replace, the average runtime from 100 runs was:
- SQL version to delete all attributes in a set of IDs: 0.168967 seconds
- For loop to delete attributes for each product/variation ID: 0.284567 seconds
=> there is a ~0.12s difference when deleting 7k attributes in the test case, not a big difference.
> That would probably be ok for an incremental sync, which should be a relatively small change set. If it's not a small change, we might be better off doing a full sync anyway?
Yeah, given that a big data set from an incremental sync is quite an edge case, the performance difference is negligible for most cases. I will update the implementation to delete existing image(s)/attributes in a for loop.
<details>
<summary>Test function</summary>
```swift
@Test func persistIncrementalCatalogData_performance_with_large_dataset() async throws {
    // Given - create a large dataset with existing products and variations that will be replaced
    let existingProductCount = 1000
    let existingVariationCount = 2000

    // Create existing products with attributes
    var existingProducts: [POSProduct] = []
    for i in 1...existingProductCount {
        let attributes = [
            ProductAttribute.fake().copy(name: "Color", options: ["Red", "Blue", "Green"]),
            ProductAttribute.fake().copy(name: "Size", options: ["S", "M", "L", "XL"]),
            ProductAttribute.fake().copy(name: "Material", options: ["Cotton", "Polyester"])
        ]
        let product = POSProduct.fake().copy(
            siteID: sampleSiteID,
            productID: Int64(i),
            name: "Product \(i)",
            attributes: attributes
        )
        existingProducts.append(product)
    }

    // Create existing variations with attributes
    var existingVariations: [POSProductVariation] = []
    for i in 1...existingVariationCount {
        let productID = Int64((i % existingProductCount) + 1)
        let attributes = [
            ProductVariationAttribute.fake().copy(name: "Color", option: "Red"),
            ProductVariationAttribute.fake().copy(name: "Size", option: "M")
        ]
        let variation = POSProductVariation.fake().copy(
            siteID: sampleSiteID,
            productID: productID,
            productVariationID: Int64(i),
            attributes: attributes,
            price: "\(10 + i).00"
        )
        existingVariations.append(variation)
    }

    // Insert existing data
    let existingCatalog = POSCatalog(products: existingProducts, variations: existingVariations)
    try await sut.replaceAllCatalogData(existingCatalog, siteID: sampleSiteID)

    // Create update catalog with modified data
    var updatedProducts: [POSProduct] = []
    for i in 1...existingProductCount {
        let attributes = [
            ProductAttribute.fake().copy(name: "Color", options: ["Purple", "Orange", "Yellow"]),
            ProductAttribute.fake().copy(name: "Size", options: ["XS", "S", "M", "L", "XL", "XXL"]),
            ProductAttribute.fake().copy(name: "Style", options: ["Casual", "Formal"])
        ]
        let product = POSProduct.fake().copy(
            siteID: sampleSiteID,
            productID: Int64(i),
            name: "Updated Product \(i)",
            attributes: attributes
        )
        updatedProducts.append(product)
    }

    var updatedVariations: [POSProductVariation] = []
    for i in 1...existingVariationCount {
        let productID = Int64((i % existingProductCount) + 1)
        let attributes = [
            ProductVariationAttribute.fake().copy(name: "Color", option: "Purple"),
            ProductVariationAttribute.fake().copy(name: "Size", option: "L"),
            ProductVariationAttribute.fake().copy(name: "Style", option: "Casual")
        ]
        let variation = POSProductVariation.fake().copy(
            siteID: sampleSiteID,
            productID: productID,
            productVariationID: Int64(i),
            attributes: attributes,
            price: "\(20 + i).00"
        )
        updatedVariations.append(variation)
    }

    let updateCatalog = POSCatalog(products: updatedProducts, variations: updatedVariations)

    // When - measure performance of incremental update
    let startTime = Date()
    try await sut.persistIncrementalCatalogData(updateCatalog, siteID: sampleSiteID)
    let endTime = Date()
    let timeElapsed = endTime.timeIntervalSince(startTime)

    // Then - verify the operation completed in reasonable time and data is correct
    print("persistIncrementalCatalogData with \(existingProductCount) products and \(existingVariationCount) variations took \(timeElapsed) seconds")

    try await grdbManager.databaseConnection.read { db in
        let productCount = try PersistedProduct.fetchCount(db)
        let variationCount = try PersistedProductVariation.fetchCount(db)
        let attributeCount = try PersistedProductAttribute.fetchCount(db)
        let variationAttributeCount = try PersistedProductVariationAttribute.fetchCount(db)

        #expect(productCount == existingProductCount)
        #expect(variationCount == existingVariationCount)
        #expect(attributeCount == existingProductCount * 3) // 3 attributes per product
        #expect(variationAttributeCount == existingVariationCount * 3) // 3 attributes per variation

        // Verify data was actually updated by checking a sample
        let sampleProduct = try PersistedProduct.filter(sql: "\(PersistedProduct.Columns.id.name) = 1").fetchOne(db)
        #expect(sampleProduct?.name == "Updated Product 1")

        let sampleVariation = try PersistedProductVariation.filter(sql: "\(PersistedProductVariation.Columns.id.name) = 1").fetchOne(db)
        #expect(sampleVariation?.price == "21.00")

        // Verify attributes were replaced correctly
        let productAttributes = try PersistedProductAttribute.filter(sql: "\(PersistedProductAttribute.Columns.productID.name) = 1").fetchAll(db)
        let attributeNames = Set(productAttributes.map { $0.name })
        #expect(attributeNames == Set(["Color", "Size", "Style"]))
    }

    // Performance assertion - should complete within 10 seconds for this dataset size
    #expect(timeElapsed < 10.0, "Performance test failed: operation took \(timeElapsed) seconds, expected < 10 seconds")
}
```
</details>
Updated the SQL-based filtering to use for loops that compare by ID in 7dbd5b3.
joshheald left a comment
Some comments in line, but looks good, thanks!
```diff
 public enum Columns {
     static let id = Column(CodingKeys.id)
-    static let productVariationID = Column(CodingKeys.productVariationID)
+    public static let productVariationID = Column(CodingKeys.productVariationID)
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make all these public, but I'll do it in a future PR.
I'm sure that Periphery would moan about it, but actually it's part of the public interface for the persisted classes. Since the error messages are unclear, it's a better experience to make them public ahead of use.
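A minimal sketch of how the enum might read once everything is public (assuming the same `CodingKeys` pattern shown above):

```swift
public enum Columns {
    public static let id = Column(CodingKeys.id)
    public static let productVariationID = Column(CodingKeys.productVariationID)
    // ...remaining column constants made public the same way.
}
```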
```swift
let productIDs = catalog.products.map { $0.productID }.map { String($0) }.joined(separator: ",")

try PersistedProductImage
    .filter(sql: "\(PersistedProductImage.Columns.productID.name) IN (\(productIDs))")
    .deleteAll(db)
for image in catalog.productImagesToPersist {
    try image.insert(db, onConflict: .replace)
}

try PersistedProductAttribute
    .filter(sql: "\(PersistedProductAttribute.Columns.productID.name) IN (\(productIDs))")
    .deleteAll(db)
```
For deletions, the alternative approach is to do `PersistedProductImage.filter { $0.productID == product.productID }.deleteAll(db)` in the product loop above.
I think when we move to compound keys (including the site ID), this might be an easier approach than constructing SQL strings for compound keys. See the sketch below.
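A minimal sketch of that shape, assuming the persisted records gain a `siteID` column as part of the compound key (hypothetical until that change lands):

```swift
// Inside the write transaction's product loop; the siteID comparison is
// hypothetical until the compound-key migration exists.
for product in catalog.products {
    try PersistedProductImage
        .filter { $0.siteID == product.siteID && $0.productID == product.productID }
        .deleteAll(db)
}
```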
Similar to the above, I updated the SQL-based filtering to use for loops that compare by ID in 7dbd5b3.
```swift
try PersistedProductVariationImage
    .filter(sql: "\(PersistedProductVariationImage.Columns.productVariationID.name) IN (\(variationIDs))")
    .deleteAll(db)
for image in catalog.variationImagesToPersist {
    try image.insert(db, onConflict: .replace)
}
```
For the full sync, I used `onConflict: .ignore`. We should be consistent – it doesn't matter which we go with for images or attributes (when we support global attributes in the future): either first wins, or last wins.
For variations on an incremental sync, it does matter because we'll definitely want to replace... so perhaps we should use replace everywhere?
Yeah, incremental sync requires replace on conflict. For full sync, like you said, either works as we delete all the data at the beginning. It only matters if there are duplicate products/variations in the catalog (which is not impossible from the "Large fun testing" test site) with different values, but there's no way to know whether the first or last is correct in this edge case.
I replaced the ignore strategy with the replace-on-conflict strategy in full sync in 7162e20.
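For reference, a minimal sketch of the two conflict policies side by side (both use GRDB's standard `insert(_:onConflict:)` API, seen elsewhere in this PR):

```swift
// .ignore – first wins: if a row with the same primary key already
// exists, the new insert is silently dropped.
try image.insert(db, onConflict: .ignore)

// .replace – last wins: the conflicting row is deleted and the new row
// inserted, so ON DELETE CASCADE foreign keys fire on the old row.
try image.insert(db, onConflict: .replace)
```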
```swift
try await grdbManager.databaseConnection.write { db in
    let site = PersistedSite(id: siteID)
    try site.insert(db, onConflict: .ignore)
```
Should we ever insert a Site for an incremental persist? Perhaps this won't ever do anything, but it's good to have for safety.
Good catch - no, it's not necessary to insert a Site in an incremental sync as a full sync is a prerequisite. I removed it and updated test cases that depend on the Site insertion in cdd580d.
```swift
    }
}

@Test func persistIncrementalCatalogData_upserts_existing_and_new_products() async throws {
```
Strictly, I don't think it does.
Upsert is `INSERT x ON CONFLICT DO UPDATE SET x.name = name` {etc} – there is an `upsert(db)` function we could use for this, at least for mutable persistable records.
Currently we use `insert(db, onConflict: .replace)`, which generates `INSERT OR REPLACE`.
AIUI, the main difference is the hooks that are triggered. With replace, records are deleted and recreated, so the deletions will cascade. For upsert, they're just updated, so none of the deletion hooks run.
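A minimal sketch of the two calls side by side, assuming the variation record from this PR (`upsert(db)` is GRDB's UPSERT API for persistable records and needs a SQLite version with UPSERT support):

```swift
// Replace: deletes the conflicting row and re-inserts it, so deletion
// hooks and ON DELETE CASCADE (e.g. a variation's attributes) fire.
try variation.insert(db, onConflict: .replace)

// Upsert: INSERT ... ON CONFLICT DO UPDATE SET ... – the existing row
// is updated in place; no deletion hooks or cascades run.
try variation.upsert(db)
```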
…ntal sync is always called after a full sync.
…emental sync for consistency.
Thanks for the review and GRDB insights! I believe I responded to all the comments. Merging now, but feel free to leave further suggestions; I can make changes separately.

For WOOMOB-1298
Description
This PR implements persistence capabilities for the POS catalog incremental sync service. The changes enable the service to save incremental catalog updates (products and variations) to the database after fetching them from the API.
Key changes:
- Added a `persistIncrementalCatalogData` method to `POSCatalogPersistenceService` that handles upsert operations for products and variations
- Updated `POSCatalogIncrementalSyncService` to integrate with the persistence service and save data after successful API fetches

The implementation ensures that:
Steps to reproduce
Just CI is sufficient, as the full/incremental services aren't integrated with the app yet.
Testing information
I tested that incremental sync worked after a full sync on a big site, making changes to at least one product and one variation while the full sync was in progress:
RELEASE-NOTES.txt if necessary.