fix(core): apply pessimistic locking to prevent order modification race conditions #3398 #4071
base: master
Conversation
Walkthrough

Adds pessimistic locking to OrderService methods to prevent concurrent-write races, introduces an e2e race-condition test performing parallel AddItemToOrder mutations, and updates CI/e2e readiness, DB connection settings, and MariaDB startup defaults for more reliable end-to-end testing.
🚥 Pre-merge checks: ✅ 3 passed, ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/core/src/service/services/order.service.ts (1)
1214-1233: Remove redundant order fetch in setShippingMethod.

The method fetches the order with a pessimistic write lock at line 1219, then calls `orderModifier.setShippingMethods`, and immediately fetches the same order again with the same lock at lines 1228-1230. The second fetch appears unnecessary and could impact performance. Consider whether `orderModifier.setShippingMethods` already updates the order entity in place, making the second fetch redundant.

🔎 Possible simplification

```diff
 async setShippingMethod(
     ctx: RequestContext,
     orderId: ID,
     shippingMethodIds: ID[],
 ): Promise<ErrorResultUnion<SetOrderShippingMethodResult, Order>> {
     const order = await this.getOrderOrThrow(ctx, orderId, undefined, { mode: 'pessimistic_write' });
     const validationError = this.assertAddingItemsState(order);
     if (validationError) {
         return validationError;
     }
     const result = await this.orderModifier.setShippingMethods(ctx, order, shippingMethodIds);
     if (isGraphQlErrorResult(result)) {
         return result;
     }
-    const updatedOrder = await this.getOrderOrThrow(ctx, orderId, undefined, {
-        mode: 'pessimistic_write',
-    });
-    await this.applyPriceAdjustments(ctx, updatedOrder);
-    return this.connection.getRepository(ctx, Order).save(updatedOrder);
+    await this.applyPriceAdjustments(ctx, order);
+    return this.connection.getRepository(ctx, Order).save(order);
 }
```

Verify whether `orderModifier.setShippingMethods` modifies the order in place or returns a modified order before applying this change.
🧹 Nitpick comments (2)
packages/core/e2e/race-condition.e2e-spec.ts (2)
46-56: Verify individual mutation results to detect partial failures.

The test only checks the final `totalQuantity` but doesn't verify that all individual `addItemToOrder` mutations succeeded. If some mutations fail with errors, the test might pass incorrectly if the successful ones happen to produce the expected total.

🔎 Recommended enhancement

```diff
-const promises: Array<Promise<any>> = [];
+const results: Array<any> = [];
 for (let i = 0; i < concurrency; i++) {
-    promises.push(
-        shopClient.query(ADD_ITEM_TO_ORDER, {
-            productVariantId: variantId,
-            quantity: quantityPerRequest,
-        }),
-    );
+    const result = await shopClient.query(ADD_ITEM_TO_ORDER, {
+        productVariantId: variantId,
+        quantity: quantityPerRequest,
+    });
+    results.push(result);
 }
-await Promise.all(promises);
+// Verify all mutations succeeded
+for (const result of results) {
+    expect(result.addItemToOrder.__typename).not.toBe('ErrorResult');
+    expect(result.addItemToOrder.id).toBeDefined();
+}
```

Note: this changes the test to sequential execution, which defeats the race-condition testing purpose. For concurrent execution with verification, consider:

```diff
 const promises: Array<Promise<any>> = [];
 for (let i = 0; i < concurrency; i++) {
     promises.push(
         shopClient.query(ADD_ITEM_TO_ORDER, {
             productVariantId: variantId,
             quantity: quantityPerRequest,
         }),
     );
 }
-await Promise.all(promises);
+const results = await Promise.all(promises);
+
+// Verify all mutations succeeded
+for (const result of results) {
+    expect(result.addItemToOrder.__typename).not.toBe('ErrorResult');
+    expect(result.addItemToOrder.id).toBeDefined();
+}
```
46-46: Replace `Array<Promise<any>>` with a properly typed array.

Using `any` bypasses type safety. Consider defining a proper type for the mutation result or using the generated GraphQL types.
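As a hedged illustration of the typed-array suggestion (the result shape below is assumed for the sketch, not taken from Vendure's generated GraphQL types, which should be preferred in the real test):

```typescript
// Hypothetical result shape for the addItemToOrder mutation; in the actual
// test suite the generated *Document types should replace this.
interface AddItemToOrderResult {
    addItemToOrder: { __typename: string; id?: string };
}

// Fire `concurrency` requests in parallel while keeping full type
// information, so each result can be asserted on without casts.
async function runConcurrent(
    query: () => Promise<AddItemToOrderResult>,
    concurrency: number,
): Promise<AddItemToOrderResult[]> {
    const promises: Array<Promise<AddItemToOrderResult>> = [];
    for (let i = 0; i < concurrency; i++) {
        promises.push(query());
    }
    return Promise.all(promises);
}
```

The array type documents the mutation contract and lets the post-`Promise.all` verification loop access `__typename` and `id` without `any`.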
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- packages/core/e2e/race-condition.e2e-spec.ts (1 hunk)
- packages/core/src/service/services/order.service.ts (28 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-21T18:29:30.413Z
Learnt from: DanielBiegler
Repo: vendure-ecommerce/vendure PR: 3815
File: packages/core/e2e/api-key.e2e-spec.ts:160-224
Timestamp: 2025-11-21T18:29:30.413Z
Learning: In the vendure-ecommerce/vendure repository, hand-written `gql` template literals (e.g., `CREATE_API_KEY`, `API_KEY`, etc.) in test files like `packages/core/e2e/api-key.e2e-spec.ts` are the source documents for GraphQL code generation. The code generator processes these to produce typed `*Document` exports (e.g., `CreateApiKeyDocument`, `ApiKeyDocument`) that are then imported and used in the tests. These hand-written gql constants should not be considered unused or dead code—they are essential inputs to the code generation process.
Applied to files:
packages/core/e2e/race-condition.e2e-spec.ts
📚 Learning: 2025-06-25T11:20:48.133Z
Learnt from: dlhck
Repo: vendure-ecommerce/vendure PR: 3615
File: packages/dashboard/src/app/routes/_authenticated/_products/components/product-bulk-actions.tsx:161-214
Timestamp: 2025-06-25T11:20:48.133Z
Learning: Using `useMutation` with multiple concurrent `mutate` calls (e.g., in a `map` operation) causes race conditions and unpredictable behavior. This pattern has caused issues in the Vendure dashboard codebase. Instead, use `mutateAsync` with sequential processing (for loop with await) or proper chaining to ensure only one mutation runs at a time.
Applied to files:
packages/core/e2e/race-condition.e2e-spec.ts
📚 Learning: 2025-11-10T14:23:14.480Z
Learnt from: gabriellbui
Repo: vendure-ecommerce/vendure PR: 3953
File: packages/dev-server/graphql/graphql-env.d.ts:335-335
Timestamp: 2025-11-10T14:23:14.480Z
Learning: Repo vendure-ecommerce/vendure: packages/dev-server/graphql/graphql-env.d.ts is a generated dev-server introspection file and should be gitignored; prefer regeneration and/or removing it from PRs rather than reviewing its contents.
Applied to files:
packages/core/e2e/race-condition.e2e-spec.ts
🧬 Code graph analysis (2)
packages/core/e2e/race-condition.e2e-spec.ts (2)
e2e-common/test-config.ts (1)
- TEST_SETUP_TIMEOUT_MS (28-28)

packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)

- activeOrder (99-116)
packages/core/src/service/services/order.service.ts (3)
packages/core/src/event-bus/events/order-event.ts (1)
- order (33-35)

packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)

- order (75-95)

packages/core/src/api/resolvers/admin/order.resolver.ts (1)

- order (66-73)
🔇 Additional comments (4)
packages/core/src/service/services/order.service.ts (4)
1980-2003: LGTM!

The `getOrderOrThrow` method correctly propagates the lock parameter to `findOne` and maintains proper error handling.
214-296: Apply pessimistic lock to OrderLines query to prevent concurrent modifications.

The pessimistic lock on the Order query (line 255) does not extend to the separately-fetched OrderLines (lines 268-279). Each query requires explicit locking: when the main Order is locked, the separate OrderLines query lacks a lock parameter and can be modified by concurrent transactions. Either apply the lock to the `linesQb` as well, or document that this method must be called within a transaction where the caller is responsible for ensuring the lines' consistency.
214-218: Comprehensive pessimistic locking is correctly implemented across all order modification paths.

The OrderService implements consistent `pessimistic_write` locking for all methods that modify orders: updateCustomFields, updateOrderCustomer, addItemsToOrder, adjustOrderLine, removeItemsFromOrder, addSurchargeToOrder, removeSurchargeFromOrder, applyCouponCode, removeCouponCode, setShippingAddress, setBillingAddress, unsetShippingAddress, unsetBillingAddress, setShippingMethods, transitionToState, and multiple state transition handlers. New order creation (create/createDraft) correctly omits initial locks since they're new records, then immediately calls transitionToState which applies locking. OrderModifier helper methods operate on already-locked orders, and event bus publications occur within locked transactions without bypassing service methods. No unprotected modification paths were identified.
214-218: Transaction wrapping for pessimistic locks is already in place via the `@Transaction` decorator.

While the `findOne` method doesn't explicitly wrap queries in transactions at the service level, the codebase implements transaction handling at the resolver boundary layer. The `@Transaction()` decorator on resolver methods (e.g., `addItemsToOrder` at line 465) creates a transactional QueryRunner that's attached to the RequestContext. Service methods called within these decorated resolvers automatically use the transactional context. All methods using pessimistic locks (via `getOrderOrThrow`) are invoked exclusively from `@Transaction`-decorated resolver methods, ensuring proper transaction boundaries. This architectural pattern (transactions at the boundary rather than the service layer) is valid and correctly implemented in this codebase.

Likely an incorrect or invalid review comment.
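To make the lost-update mechanics behind this discussion concrete, here is a minimal, self-contained simulation (plain TypeScript, not Vendure code): two concurrent read-modify-write operations on a shared record lose an update unless they are serialized, which is the role the pessimistic write lock plays at the database level.

```typescript
// A toy async mutex standing in for SELECT ... FOR UPDATE serialization.
class Mutex {
    private tail: Promise<void> = Promise.resolve();
    runExclusive<T>(fn: () => Promise<T>): Promise<T> {
        const result = this.tail.then(fn);
        this.tail = result.then(() => undefined, () => undefined);
        return result;
    }
}

interface OrderRow { totalQuantity: number }

const delay = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

// Read-modify-write with an artificial gap, mimicking "load order, add
// line, recompute, save" across an await boundary.
async function addItem(row: OrderRow): Promise<void> {
    const read = row.totalQuantity; // stale-read window opens here
    await delay(10);
    row.totalQuantity = read + 1;   // overwrites any concurrent write
}

async function demo(): Promise<[number, number]> {
    // Without serialization: both reads see 0, both write 1 (lost update).
    const unlocked: OrderRow = { totalQuantity: 0 };
    await Promise.all([addItem(unlocked), addItem(unlocked)]);

    // With serialization: the second write sees the first one's result.
    const locked: OrderRow = { totalQuantity: 0 };
    const mutex = new Mutex();
    await Promise.all([
        mutex.runExclusive(() => addItem(locked)),
        mutex.runExclusive(() => addItem(locked)),
    ]);
    return [unlocked.totalQuantity, locked.totalQuantity];
}
```

The unlocked record ends at 1 despite two additions, which is exactly the symptom the e2e test asserts against via `totalQuantity`.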
```ts
const quantityPerRequest = 1;
const concurrency = 5;

// Ejecutamos 5 peticiones simultáneas para añadir el item
```
🛠️ Refactor suggestion | 🟠 Major
Replace Spanish comments with English.
The codebase should maintain English as the primary language for comments to ensure consistency and accessibility for all contributors.
🔎 Proposed fix

```diff
-// Ejecutamos 5 peticiones simultáneas para añadir el item
+// Execute 5 concurrent requests to add the item
 const promises: Array<Promise<any>> = [];
```

```diff
-// Si hay condición de carrera, el total será menor a 5 (algunas escrituras se sobrescribieron)
+// If there's a race condition, the total will be less than 5 (some writes were overwritten)
 expect(activeOrder.totalQuantity).toBe(concurrency * quantityPerRequest);
```

Also applies to: 62-62
🤖 Prompt for AI Agents
In packages/core/e2e/race-condition.e2e-spec.ts around lines 45 and 62, the
inline comments are written in Spanish; replace them with clear English
equivalents to keep the codebase language consistent. Update the comment at line
45 (and the similar one at line 62) to concisely describe the action being
performed in English, e.g., "Execute 5 concurrent requests to add the item" or
similar wording that matches the test intent.
Force-pushed af20480 to 3833bbe (Compare)
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/core/src/service/services/order.service.ts (1)
1228-1230: Remove redundant Order reload.

After calling `setShippingMethods` (line 1224), the Order is immediately reloaded with `getOrderOrThrow` (lines 1228-1230) before calling `applyPriceAdjustments`. This reload appears unnecessary since `applyPriceAdjustments` will reload the Order at line 2151.

🔎 Proposed refactor

```diff
 const result = await this.orderModifier.setShippingMethods(ctx, order, shippingMethodIds);
 if (isGraphQlErrorResult(result)) {
     return result;
 }
-const updatedOrder = await this.getOrderOrThrow(ctx, orderId, undefined, {
-    mode: 'pessimistic_write',
-});
-await this.applyPriceAdjustments(ctx, updatedOrder);
-return this.connection.getRepository(ctx, Order).save(updatedOrder);
+await this.applyPriceAdjustments(ctx, order);
+return this.connection.getRepository(ctx, Order).save(order);
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- .github/workflows/build_and_test.yml
- docker-compose.yml
- e2e-common/test-config.ts
- packages/core/e2e/race-condition.e2e-spec.ts
- packages/core/src/service/services/order.service.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/core/e2e/race-condition.e2e-spec.ts
🧰 Additional context used
🧬 Code graph analysis (1)
packages/core/src/service/services/order.service.ts (3)
packages/core/src/event-bus/events/order-event.ts (1)
- order (33-35)

packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)

- order (75-95)

packages/core/src/api/resolvers/admin/order.resolver.ts (1)

- order (66-73)
🔇 Additional comments (6)
packages/core/src/service/services/order.service.ts (2)
214-296: Pessimistic locking implementation looks solid.

The addition of the `lock` parameter to `findOne` and its propagation through `getOrderOrThrow` to all order modification methods effectively prevents the race condition described in issue #3398.

Key strengths:

- Consistent application of `pessimistic_write` locks across all state-modifying operations
- Lock options properly threaded through TypeORM's `setFindOptions`
- Private `getOrderOrThrow` helper ensures lock usage is centralized
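A simplified sketch of that centralization (the signatures below are assumed for illustration and do not match Vendure's actual `getOrderOrThrow` or TypeORM's repository API): the helper accepts an optional lock option and forwards it unchanged to the repository's `findOne`, so callers opt into locking with a single argument.

```typescript
// Illustrative lock option, mirroring the shape of TypeORM's lock modes.
type OrderLock = { mode: 'pessimistic_write' | 'pessimistic_read' };

interface FindOptions { where: { id: string }; lock?: OrderLock }

// Fake repository that records the options it receives, so the
// propagation can be verified without a database.
function makeRepository() {
    const calls: FindOptions[] = [];
    return {
        calls,
        findOne: async (options: FindOptions) =>
            options.where.id === 'missing' ? undefined : { id: options.where.id },
    };
}

// Hypothetical helper: threads the lock through and throws on a miss.
async function getOrderOrThrow(
    repo: ReturnType<typeof makeRepository>,
    orderId: string,
    lock?: OrderLock,
): Promise<{ id: string }> {
    const options: FindOptions = { where: { id: orderId }, ...(lock ? { lock } : {}) };
    repo.calls.push(options);
    const order = await repo.findOne(options);
    if (!order) {
        throw new Error(`Order ${orderId} not found`);
    }
    return order;
}
```

Centralizing the lookup this way means every mutation path acquires its lock through one audited code path instead of repeating lock options at each call site.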
257-257: This change is safe.

All Orders are assigned to at least one Channel via `assignToCurrentChannel()` during creation (lines 456-475), so the `innerJoin` will not exclude any valid orders. The query is intentionally scoped to the current context's channel via the `andWhere` clause, consistent with other similar queries in this file.

Likely an incorrect or invalid review comment.
e2e-common/test-config.ts (1)
107-107: LGTM! Username changes align with docker-compose configuration.

The username change from `root` to `vendure` for MariaDB and MySQL correctly matches the `MARIADB_USER` and `MYSQL_USER` environment variables defined in docker-compose.yml.

Also applies to: 116-116
.github/workflows/build_and_test.yml (3)
148-181: Excellent addition of service readiness check.

The new "Wait for services to be ready" step ensures Redis, databases, and Elasticsearch are accepting connections before tests begin. This prevents flaky test failures due to service startup timing, which is especially important for the new race-condition test scenarios.
Implementation uses appropriate timeouts (60 seconds) and checks the correct ports for each service.
115-116: Reduced Elasticsearch memory allocation for CI environment.

The change from 512MB to 256MB for `ES_JAVA_OPTS` and disabling `bootstrap.memory_lock` reduces resource consumption in the shared CI environment. This is a reasonable optimization for test workloads.
87-89: Explicit database credentials improve configuration clarity.

Adding explicit `MARIADB_USER`, `MARIADB_PASSWORD`, `MYSQL_USER`, and `MYSQL_PASSWORD` environment variables (alongside the existing `_DATABASE` variables) makes the service configuration more explicit and aligns with the updated health checks that reference these credentials.

Also applies to: 97-99
docker-compose.yml (Outdated)

```diff
 POSTGRES_DB: vendure-dev
 POSTGRES_USER: vendure
-POSTGRES_PASSWORD: password
+POSTGRES_PASSWORD: postgres
```
Password inconsistency across Postgres containers and test config.
The postgres_16 container now uses password `postgres`, but:

- The `postgres_12` container (line 48) still uses password `password`
- The e2e test config at `e2e-common/test-config.ts` (line 99) uses password `password`

This inconsistency will cause connection failures when tests attempt to use postgres_16.
🔎 Recommended fix

```diff
-POSTGRES_PASSWORD: postgres
+POSTGRES_PASSWORD: password
```

📝 Committable suggestion

‼️ IMPORTANT: carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🤖 Prompt for AI Agents
In docker-compose.yml line 61 (postgres_16) and docker-compose.yml line 48
(postgres_12) plus e2e-common/test-config.ts line 99, the Postgres passwords are
inconsistent (postgres_16 uses "postgres" while postgres_12 and the e2e test
config use "password"); update the values so all three use the same password
(preferably change postgres_12 and e2e-common/test-config.ts to "postgres" to
match postgres_16) and ensure any related test or container env vars reference
that same value so tests can connect successfully.
Force-pushed 3833bbe to 6c8b57d (Compare)
Actionable comments posted: 0
🧹 Nitpick comments (2)
.github/workflows/build_and_test.yml (1)
148-181: Consider adding timeout handling and progress feedback.

The wait logic is a good practice to prevent test flakiness. However, consider these improvements:

- Explicit failure on timeout: currently, if a service doesn't become ready within 60 seconds, the script continues silently. Consider failing explicitly.
- Progress feedback: add periodic echo statements to show which service is being waited for and how long it's taking.
- Verify nc availability: the script assumes `nc` (netcat) is installed. Consider adding a check or using a more portable approach.

Example improvement with timeout handling:

```diff
 - name: Wait for services to be ready
   run: |
     echo "Waiting for required services..."
     # Wait for redis
+    echo "Checking Redis..."
     for i in {1..60}; do
       if nc -z localhost $E2E_REDIS_PORT; then break; fi
+      if [ $i -eq 60 ]; then
+        echo "ERROR: Redis failed to become ready within 60 seconds"
+        exit 1
+      fi
       sleep 1
     done
+    echo "Redis is ready"
```

packages/core/src/service/services/order.service.ts (1)
1214-1233: Consider refactoring the double-fetch pattern.

The order is fetched twice with a `pessimistic_write` lock within the same method:

- Line 1219: initial fetch with lock
- Lines 1228-1230: second fetch with lock after `setShippingMethods`

While this ensures you have the latest state after `setShippingMethods` modifies the order, it's inefficient. Since you already hold the pessimistic write lock from the first fetch (maintained throughout the transaction), consider these alternatives:

- Return the updated order from `orderModifier.setShippingMethods` instead of fetching again
- Refresh the existing order entity using TypeORM's refresh mechanism if needed
- Document why the second fetch is necessary if there's a specific reason

This pattern also appears in `modifyOrder` (lines 1330 and 1351), suggesting it might be worth addressing in multiple places.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/build_and_test.yml
- e2e-common/test-config.ts
- packages/core/e2e/race-condition.e2e-spec.ts
- packages/core/src/service/services/order.service.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/core/e2e/race-condition.e2e-spec.ts
- e2e-common/test-config.ts
🧰 Additional context used
🧬 Code graph analysis (1)
packages/core/src/service/services/order.service.ts (3)
packages/core/src/event-bus/events/order-event.ts (1)
- order (33-35)

packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)

- order (75-95)

packages/core/src/api/resolvers/admin/order.resolver.ts (1)

- order (66-73)
🔇 Additional comments (9)
.github/workflows/build_and_test.yml (4)
87-92: LGTM! Good security practice.

Using a dedicated `vendure` user instead of root for database access is a better security practice. The credentials are consistent between the environment variables and the health-check command.
97-102: LGTM! Improved reliability.

The MySQL configuration follows the same security improvements as MariaDB. The increased health-check timeout (20s vs 5s) should help prevent flaky test failures if MySQL takes longer to become ready.
110-110: LGTM! Proper health check.

Using `pg_isready -U vendure` is the correct approach for PostgreSQL health checks and aligns with the configured `POSTGRES_USER`.
115-116: LGTM! Resource optimization for CI.

Reducing Elasticsearch memory allocation and disabling `bootstrap.memory_lock` helps prevent OOM issues in the shared CI environment while still providing sufficient resources for e2e tests.

packages/core/src/service/services/order.service.ts (5)
214-219: LGTM! Backward-compatible API extension.

Adding the optional `lock` parameter maintains backward compatibility while enabling pessimistic locking for race condition prevention. The parameter correctly follows TypeORM's pessimistic locking API.
1980-2003: LGTM! Correct lock propagation.

The private `getOrderOrThrow` method correctly accepts and forwards the lock parameter to `findOne`, maintaining proper encapsulation while enabling pessimistic locking throughout the service.
506-506: Excellent! Comprehensive and consistent pessimistic locking.

The pessimistic locking strategy is correctly and consistently applied across all order mutation operations:

- ✅ Write operations use `pessimistic_write`: all methods that modify order state acquire a write lock, preventing concurrent modifications.
- ✅ Read operations remain lock-free: query methods like `getEligibleShippingMethods` and `getEligiblePaymentMethods` correctly don't use locks, maintaining good read performance.
- ✅ Addresses the root cause: this implementation directly solves the "lost update" race condition described in issue #3398, where parallel requests could overwrite each other's changes.

This is a solid solution to the concurrency problem.

Also applies to: 522-524, 612-612, 759-759, 887-887, 930-930, 963-963, 983-983, 1005-1005, 1033-1033, 1084-1084, 1108-1108, 1134-1134, 1158-1158, 1219-1219, 1244-1244, 1330-1332, 1376-1376, 1445-1447, 1696-1698, 1790-1790, 1816-1816
214-219: No changes required; Vendure's transaction architecture properly handles pessimistic locks.

Pessimistic locks in this codebase are correctly used within database transactions through Vendure's architecture. The `@Transaction()` decorator is applied at the API resolver layer (e.g., in `order.resolver.ts`), and the `RequestContext` passed to service methods carries transaction state. The `TransactionalConnection.getRepository(ctx, ...)` call respects ongoing transactions from the context, so service methods do not require their own `@Transaction()` decorators. The e2e test validates that concurrent order modifications are handled correctly without lost updates.

Likely an incorrect or invalid review comment.
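The boundary-layer pattern described here can be sketched in isolation (all names below are illustrative and do not correspond to Vendure's actual API): the resolver layer opens a transaction, attaches its runner to the request context, and every service call made with that context reuses the same runner instead of opening its own.

```typescript
// Illustrative stand-ins for a QueryRunner and RequestContext.
interface QueryRunner { id: string }
interface Ctx { transaction?: QueryRunner }

let nextRunnerId = 0;

// Boundary-layer helper, analogous in spirit to a @Transaction() decorator:
// opens a runner, attaches it to the context, and runs the work inside it.
async function withTransaction<T>(ctx: Ctx, work: (ctx: Ctx) => Promise<T>): Promise<T> {
    const runner: QueryRunner = { id: `txn-${nextRunnerId++}` };
    try {
        return await work({ ...ctx, transaction: runner });
    } finally {
        // Commit/rollback and runner release would happen here.
    }
}

// "Service method": records which runner each call actually used.
const usedRunners: string[] = [];
async function serviceMethod(ctx: Ctx): Promise<void> {
    const runner = ctx.transaction ?? { id: 'implicit' };
    usedRunners.push(runner.id);
}

// "Resolver": two service calls inside one boundary share one transaction.
async function resolverCall(): Promise<void> {
    await withTransaction({}, async ctx => {
        await serviceMethod(ctx);
        await serviceMethod(ctx);
    });
}
```

This is why a pessimistic lock taken by one service method holds for the rest of the request: every call in the chain operates on the same transaction carried by the context.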
252-259: This is not a breaking change.

The `innerJoin` on `order.channels` is the correct and intentional pattern for Vendure's multi-tenant channel-scoped access control. Orders are inherently created within a channel context, and the `innerJoin` with the channel filter ensures proper tenant isolation. This pattern is used consistently throughout the codebase in similar queries (e.g., `getActiveOrderForUser`, product variant lookups, and the generic list query builder). No change from `leftJoin` appears to have been made.

Likely an incorrect or invalid review comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 156-210: The step "Wait for services to be ready" references env
vars ($E2E_REDIS_PORT, $E2E_MARIADB_PORT, $E2E_MYSQL_PORT, $E2E_POSTGRES_PORT,
$E2E_ELASTIC_PORT) that are only defined in the later "e2e tests" step, causing
them to be undefined; fix by moving the env: block that defines those E2E_*
variables to the job level (so all steps including "Wait for services to be
ready" and "Configure database permissions" can see them) or alternatively
duplicate that env: block into both the "Wait for services to be ready" and
"Configure database permissions" steps so each step has the needed environment
variables.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/build_and_test.yml
- docker-compose.yml
- e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- e2e-common/test-config.ts
🔇 Additional comments (9)
docker-compose.yml (2)
48-49: Past password inconsistency has been resolved.

The previous review flagged that `postgres_16` and `postgres_12` had different passwords. Both containers now correctly use `POSTGRES_PASSWORD: password`, resolving the inconsistency.

Also applies to: 62-62
14-14: LGTM: InnoDB configuration enables pessimistic locking.

Explicitly setting InnoDB as the default storage engine and enabling per-table files is essential for the pessimistic locking (`SELECT ... FOR UPDATE`) introduced in this PR. InnoDB supports row-level locking required by the race condition fix, whereas MyISAM does not.

.github/workflows/build_and_test.yml (7)
87-95: LGTM: MariaDB configuration aligns with pessimistic locking requirements.

The added credentials (`vendure`/`password`) and InnoDB-specific startup options (`--innodb_file_per_table=1 --default-storage-engine=InnoDB`) ensure the test environment supports row-level locking needed for the race condition fix. The performance tuning parameters (`innodb_flush_log_at_trx_commit=2`, `innodb_log_buffer_size=32M`) are appropriate for test environments.
100-105: LGTM: MySQL configuration with appropriate startup resilience.

Credentials are consistent with other services, and the extended health check timeout (20s) and retries (10) appropriately account for MySQL 8's longer startup time.
113-113: LGTM: PostgreSQL health check updated for vendure user.

The health check correctly references the `vendure` user, and increased retries improve startup reliability.
118-119: Verify Elasticsearch tests with reduced memory allocation.

The heap size is reduced to 256MB (from typically 1GB+) and memory locking is disabled. While this optimizes CI resource usage, ensure Elasticsearch-dependent tests don't fail due to memory constraints or increased GC pressure.
129-129: LGTM: Extended health check timeout compensates for reduced memory.

The increased timeout (10s) and retries (5) appropriately account for slower Elasticsearch startup with reduced heap allocation.
151-155: LGTM: System tuning supports concurrent test scenarios.

The `vm.overcommit_memory=1` setting allows more aggressive memory allocation for multiple database containers, and the increased connection queue size supports the high-concurrency race condition tests introduced in this PR.
211-231: Critical: Same environment variable scoping issue.

This step has the same problem as the "Wait for services to be ready" step: the variables `$E2E_MARIADB_PORT`, `$E2E_MYSQL_PORT`, and `$E2E_POSTGRES_PORT` are undefined here. Apply the same fix by adding the `env:` block to this step.

Likely an incorrect or invalid review comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/workflows/build_and_test.yml (1)
156-217: Previous critical issue resolved: environment variables are now properly defined.

The environment variables (`E2E_REDIS_PORT`, `E2E_MARIADB_PORT`, etc.) are now correctly defined in the step-level `env:` block (lines 157-163), resolving the critical issue from the previous review. The readiness checks use appropriate connection tests rather than simple port checks, which is good.

Optional: consider explicit timeout failures.

The polling loops continue execution even if services don't become ready within 120 seconds. While this may provide more context through subsequent test failures, explicitly failing with a clear error message when timeouts are reached could speed up debugging in CI.

♻️ Example: add explicit timeout failure

```diff
 for i in {1..60}; do
   if nc -z localhost $E2E_REDIS_PORT && redis-cli -p $E2E_REDIS_PORT ping &>/dev/null; then
     echo "✓ Redis is ready"
     break
   fi
   echo "Waiting for Redis... ($i/60)"
   sleep 2
+  if [ $i -eq 60 ]; then
+    echo "❌ Redis failed to become ready within 120 seconds"
+    exit 1
+  fi
 done
```

Apply similar logic to the other service wait loops.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/build_and_test.yml
🔇 Additional comments (3)
.github/workflows/build_and_test.yml (3)
82-95: LGTM! InnoDB configuration supports pessimistic locking.

The MariaDB service configuration correctly adds InnoDB-specific startup options (`--default-storage-engine=InnoDB`, `--innodb_file_per_table=1`) which are essential for the pessimistic locking implementation (`SELECT ... FOR UPDATE`) introduced in this PR. The explicit database and user setup, along with the health check using the vendure user, aligns well with the E2E test requirements.
151-155: LGTM! System tuning appropriate for concurrent services.

The sysctl configurations (`vm.overcommit_memory=1` and `net.core.somaxconn=65535`) optimize the CI environment for running multiple database services and handling concurrent connections, which is particularly relevant for testing race condition scenarios.
218-243: LGTM! Database permissions properly configured for E2E tests.

The permission grants are appropriate for the E2E test environment:

- MariaDB/MySQL: grants all privileges on wildcard databases and `e2e_%` pattern databases
- PostgreSQL: grants CREATEDB privilege to allow test database creation
- Error handling with `|| echo` prevents failures if permissions already exist

The environment variables are properly defined in this step's `env:` block, consistent with the previous step.
Hi! Can you review that part and make sure this PR contains only those changes that are absolutely necessary to address the issue at hand? |
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 244-247: Remove the "Free up ports" step that runs sudo fuser -k
3000/tcp; specifically delete the GitHub Actions job step with name "Free up
ports" (the block that echoes "Ensuring port 3000 is free..." and calls sudo
fuser -k 3000/tcp || true). If port conflicts are observed afterward, open a
separate issue to investigate why test processes launched by the workflow (or
tests themselves) are not terminating cleanly rather than reintroducing a
teardown workaround.
- Around line 151-155: The CI step named "Configure system memory for Docker"
that runs sudo sysctl -w vm.overcommit_memory=1 and sudo sysctl -w
net.core.somaxconn=65535 should be removed from
.github/workflows/build_and_test.yml unless you can prove it's required for the
race condition test; either delete that entire step block or move it into a
separate infrastructure-focused PR/workflow and reference its necessity in the
race-condition test PR. Ensure the workflow no longer contains the run commands
echo "Configuring system memory settings..." / sudo sysctl -w
vm.overcommit_memory=1 / sudo sysctl -w net.core.somaxconn=65535 if you opt to
remove them.
- Around line 218-243: The "Configure database permissions" CI step grants
overly broad privileges and appears unnecessary; remove this entire step block
(the job step named "Configure database permissions" that uses env vars
E2E_MYSQL_PORT/E2E_MARIADB_PORT/E2E_POSTGRES_PORT and DB) unless the specific
race-condition test requires dynamic DB creation; if dynamic creation is
required, replace the broad GRANT/ALTER commands with minimal, database-scoped
grants (or move them into the service startup/env config) and add a comment
explaining why elevated privileges are needed for the race-condition test.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/build_and_test.yml
🔇 Additional comments (2)
.github/workflows/build_and_test.yml (2)
90-95: Clarify which MariaDB configuration flags are required for pessimistic locking.
The added InnoDB-specific flags include:
- `--default-storage-engine=InnoDB`: Likely necessary for pessimistic locking support (`SELECT ... FOR UPDATE` requires InnoDB)
- `--innodb_file_per_table=1`: Probably unnecessary for locking
- `--innodb_flush_log_at_trx_commit=2`: Performance tuning, unrelated to locking
- `--innodb_log_buffer_size=32M`: Performance tuning, unrelated to locking
The comment mentions v11.6.2+ default changes causing test failures, but doesn't explain which specific default is problematic.
Please clarify:
- Is `--default-storage-engine=InnoDB` the only flag actually required for the race condition test?
- What specific v11.6.2+ default change causes failures, and which flag addresses it?
- Can the performance tuning flags be removed?
If only the storage engine flag is needed, simplify to just that flag and document why it's required.
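As background on why the InnoDB engine matters at all here: `SELECT ... FOR UPDATE` serializes concurrent read-modify-write transactions on the same row, which is exactly the lost-update race this PR targets. The sketch below is a self-contained TypeScript illustration of that race (not Vendure code; `OrderRow`, `Mutex`, and the `addItem*` helpers are invented for this example), with an async mutex standing in for the row-level write lock:

```typescript
// Simulated "row" that two concurrent transactions read-modify-write.
interface OrderRow { itemCount: number }

// Minimal async mutex standing in for a row-level pessimistic write lock.
class Mutex {
    private tail: Promise<void> = Promise.resolve();
    run<T>(fn: () => Promise<T>): Promise<T> {
        const result = this.tail.then(fn);
        // Keep the chain alive even if fn rejects.
        this.tail = result.then(() => undefined, () => undefined);
        return result;
    }
}

const delay = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

// Unlocked: both "transactions" read itemCount=0, then both write 1 (lost update).
async function addItemUnlocked(row: OrderRow): Promise<void> {
    const read = row.itemCount;      // plain SELECT, no lock
    await delay(10);                 // window for the concurrent writer
    row.itemCount = read + 1;        // UPDATE based on a stale read
}

// Locked: the second caller blocks until the first finishes, like FOR UPDATE.
async function addItemLocked(row: OrderRow, lock: Mutex): Promise<void> {
    await lock.run(async () => {
        const read = row.itemCount;  // SELECT ... FOR UPDATE equivalent
        await delay(10);
        row.itemCount = read + 1;
    });
}

async function main() {
    const unlocked: OrderRow = { itemCount: 0 };
    await Promise.all([addItemUnlocked(unlocked), addItemUnlocked(unlocked)]);

    const locked: OrderRow = { itemCount: 0 };
    const lock = new Mutex();
    await Promise.all([addItemLocked(locked, lock), addItemLocked(locked, lock)]);

    console.log(unlocked.itemCount, locked.itemCount); // → 1 2
    return { unlocked: unlocked.itemCount, locked: locked.itemCount };
}

main();
```

The unlocked variant deterministically loses one of the two increments, which mirrors the concurrent `AddItemToOrder` scenario the e2e test exercises.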
87-247: The workflow file `.github/workflows/build_and_test.yml` has not been modified in this PR. All the infrastructure components mentioned (database service configuration with InnoDB flags, system memory configuration, service readiness checks, database permission granting, and port cleanup) already exist in the repository from previous commits. The only new addition in this PR is the race condition test file (`packages/core/e2e/race-condition.e2e-spec.ts`), which properly leverages the existing workflow infrastructure designed to support pessimistic locking. No workflow scope issues apply to this PR.
Likely an incorrect or invalid review comment.
.github/workflows/build_and_test.yml
Outdated
```yaml
- name: Configure system memory for Docker
  run: |
    echo "Configuring system memory settings..."
    sudo sysctl -w vm.overcommit_memory=1
    sudo sysctl -w net.core.somaxconn=65535
```
🛠️ Refactor suggestion | 🟠 Major
System memory configuration appears unrelated to the race condition fix.
Configuring `vm.overcommit_memory` and `net.core.somaxconn` is OS-level kernel tuning that doesn't directly relate to testing database pessimistic locking for race conditions. If these settings are addressing memory pressure or connection issues in CI, they should be handled in a separate infrastructure-focused PR.
Remove this step unless you can demonstrate it's specifically required for the race condition test to pass.
🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 151 - 155, The CI step
named "Configure system memory for Docker" that runs sudo sysctl -w
vm.overcommit_memory=1 and sudo sysctl -w net.core.somaxconn=65535 should be
removed from .github/workflows/build_and_test.yml unless you can prove it's
required for the race condition test; either delete that entire step block or
move it into a separate infrastructure-focused PR/workflow and reference its
necessity in the race-condition test PR. Ensure the workflow no longer contains
the run commands echo "Configuring system memory settings..." / sudo sysctl -w
vm.overcommit_memory=1 / sudo sysctl -w net.core.somaxconn=65535 if you opt to
remove them.
.github/workflows/build_and_test.yml
Outdated
```yaml
- name: Wait for services to be ready
  env:
    E2E_MYSQL_PORT: ${{ job.services.mysql.ports['3306'] }}
    E2E_MARIADB_PORT: ${{ job.services.mariadb.ports['3306'] }}
    E2E_POSTGRES_PORT: ${{ job.services.postgres.ports['5432'] }}
    E2E_ELASTIC_PORT: ${{ job.services.elastic.ports['9200'] }}
    E2E_REDIS_PORT: ${{ job.services.redis.ports['6379'] }}
    DB: ${{ matrix.db }}
  run: |
    echo "Waiting for required services..."
    # Wait for redis with connection test
    for i in {1..60}; do
      if nc -z localhost $E2E_REDIS_PORT && redis-cli -p $E2E_REDIS_PORT ping &>/dev/null; then
        echo "✓ Redis is ready"
        break
      fi
      echo "Waiting for Redis... ($i/60)"
      sleep 2
    done
    # Wait for DB with actual connection test
    if [ "$DB" = "mariadb" ]; then
      for i in {1..60}; do
        if mysqladmin ping -h 127.0.0.1 -P $E2E_MARIADB_PORT -u vendure -ppassword --silent 2>/dev/null; then
          echo "✓ MariaDB is ready and accepting connections"
          break
        fi
        echo "Waiting for MariaDB... ($i/60)"
        sleep 2
      done
    elif [ "$DB" = "mysql" ]; then
      for i in {1..60}; do
        if mysqladmin ping -h 127.0.0.1 -P $E2E_MYSQL_PORT -u vendure -ppassword --silent 2>/dev/null; then
          echo "✓ MySQL is ready and accepting connections"
          break
        fi
        echo "Waiting for MySQL... ($i/60)"
        sleep 2
      done
    elif [ "$DB" = "postgres" ]; then
      for i in {1..60}; do
        if pg_isready -h 127.0.0.1 -p $E2E_POSTGRES_PORT -U vendure &>/dev/null; then
          echo "✓ PostgreSQL is ready and accepting connections"
          break
        fi
        echo "Waiting for PostgreSQL... ($i/60)"
        sleep 2
      done
    fi
    # Wait for Elasticsearch
    for i in {1..60}; do
      if curl --silent --fail http://localhost:$E2E_ELASTIC_PORT/_cluster/health >/dev/null 2>&1; then
        echo "✓ Elasticsearch is ready"
        break
      fi
      echo "Waiting for Elasticsearch... ($i/60)"
      sleep 2
    done
    echo "Services appear reachable."
```
🛠️ Refactor suggestion | 🟠 Major
Manual service readiness polling duplicates existing health checks.
GitHub Actions already performs health checks for each service (defined in the options parameter of each service). These health checks ensure services are ready before job steps execute. Adding manual polling with custom retry logic:
- Duplicates functionality that should already work
- Adds unnecessary complexity and execution time
- Masks potential issues with the service health check definitions
If the existing health checks are insufficient, fix them in the service definitions (lines 95, 105, 113, 129) rather than adding this workaround step.
Remove this step and ensure the service-level health checks are properly configured. If services are genuinely not ready when health checks pass, that indicates a problem with the health check commands themselves that should be addressed.
.github/workflows/build_and_test.yml
Outdated
```yaml
- name: Configure database permissions
  env:
    E2E_MYSQL_PORT: ${{ job.services.mysql.ports['3306'] }}
    E2E_MARIADB_PORT: ${{ job.services.mariadb.ports['3306'] }}
    E2E_POSTGRES_PORT: ${{ job.services.postgres.ports['5432'] }}
    DB: ${{ matrix.db }}
  run: |
    echo "Granting database permissions to vendure user..."
    # Grant permissions for MariaDB
    if [ "$DB" = "mariadb" ]; then
      mysql -h 127.0.0.1 -P $E2E_MARIADB_PORT -u root -ppassword -e "GRANT ALL PRIVILEGES ON \`%\`.* TO 'vendure'@'%'; GRANT ALL PRIVILEGES ON \`e2e_%\`.* TO 'vendure'@'%'; FLUSH PRIVILEGES;" 2>/dev/null || echo "MariaDB permissions already configured"
      echo "✓ MariaDB permissions granted to vendure user"
    fi
    # Grant permissions for MySQL
    if [ "$DB" = "mysql" ]; then
      mysql -h 127.0.0.1 -P $E2E_MYSQL_PORT -u root -ppassword -e "GRANT ALL PRIVILEGES ON \`%\`.* TO 'vendure'@'%'; GRANT ALL PRIVILEGES ON \`e2e_%\`.* TO 'vendure'@'%'; FLUSH PRIVILEGES;" 2>/dev/null || echo "MySQL permissions already configured"
      echo "✓ MySQL permissions granted to vendure user"
    fi
    # Grant permissions for PostgreSQL
    if [ "$DB" = "postgres" ]; then
      PGPASSWORD=password psql -h 127.0.0.1 -p $E2E_POSTGRES_PORT -U vendure -d postgres -c "ALTER USER vendure CREATEDB;" 2>/dev/null || echo "PostgreSQL permissions already configured"
      echo "✓ PostgreSQL permissions granted to vendure user"
    fi
```
🛠️ Refactor suggestion | 🟠 Major
Database permission configuration step appears unnecessary.
This step grants broad privileges to the vendure user, but:
- Existing e2e tests already work: The workflow has been running e2e tests against these databases, so the current permission setup should be sufficient.
- Overly broad grants: `GRANT ALL PRIVILEGES ON \`%\`.* TO 'vendure'@'%'` grants access to all databases, which is excessive even for testing.
- Should be handled in service initialization: If specific privileges are needed, configure them in the service `env` or startup commands, not in a separate step.
Remove this step unless the race condition test specifically requires dynamic database creation or privileges beyond what existing e2e tests need. If it does, document why in a comment.
🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 218 - 243, The "Configure
database permissions" CI step grants overly broad privileges and appears
unnecessary; remove this entire step block (the job step named "Configure
database permissions" that uses env vars
E2E_MYSQL_PORT/E2E_MARIADB_PORT/E2E_POSTGRES_PORT and DB) unless the specific
race-condition test requires dynamic DB creation; if dynamic creation is
required, replace the broad GRANT/ALTER commands with minimal, database-scoped
grants (or move them into the service startup/env config) and add a comment
explaining why elevated privileges are needed for the race-condition test.
.github/workflows/build_and_test.yml
Outdated
```yaml
- name: Free up ports
  run: |
    echo "Ensuring port 3000 is free..."
    sudo fuser -k 3000/tcp || true
```
🛠️ Refactor suggestion | 🟠 Major
Port cleanup step suggests underlying process cleanup issues.
Explicitly freeing port 3000 before tests indicates either:
- Previous test runs aren't properly terminating processes
- Concurrent jobs are conflicting (shouldn't happen in GitHub Actions)
- The CI environment has stale processes
This is an infrastructure/test cleanup issue, not a requirement for testing race conditions. If port conflicts occur, investigate why test processes aren't cleaning up properly rather than working around it.
Remove this step. If port conflicts genuinely occur, file a separate issue to investigate why test processes don't clean up correctly.
🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 244 - 247, Remove the
"Free up ports" step that runs sudo fuser -k 3000/tcp; specifically delete the
GitHub Actions job step with name "Free up ports" (the block that echoes
"Ensuring port 3000 is free..." and calls sudo fuser -k 3000/tcp || true). If
port conflicts are observed afterward, open a separate issue to investigate why
test processes launched by the workflow (or tests themselves) are not
terminating cleanly rather than reintroducing a teardown workaround.
40f5ed1 to 6c8b57d (Compare)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
.github/workflows/build_and_test.yml (1)
81-96: Critical: MariaDB service `options` uses invalid `--command=...` (likely breaks container startup).
Docker doesn't support a `--command` flag; if Actions passes this through to `docker create`/`run`, the MariaDB service may fail to start (which matches the "workflow failures" note in the PR objectives).
Proposed fix (remove invalid flag; rely on defaults + initSql in tests):

```diff
- options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3 --command="mysqld --innodb_file_per_table=1 --default-storage-engine=InnoDB --innodb_flush_log_at_trx_commit=2 --innodb_log_buffer_size=32M"
+ options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3
```

e2e-common/test-config.ts (1)
92-110: Change `connectTimeout` to `connectTimeoutMS` for TypeORM PostgreSQL compatibility.
The option `connectTimeout` is not recognized by TypeORM's PostgreSQL connection options. TypeORM expects `connectTimeoutMS` instead. The current code will fail type-checking against TypeORM's `DataSourceOptions` interface and the timeout will be silently ignored at runtime. The `connectionTimeoutMillis` in the `extra` object is correct for configuring the underlying node-postgres driver pool.
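For illustration, the corrected shape could look like the following fragment (a sketch under the reviewer's reading of the TypeORM types; the host, port, and timeout values are placeholders, not the actual test-config.ts contents):

```typescript
// Hypothetical Postgres connection options with the timeout fields named as
// the reviewer suggests: TypeORM-level `connectTimeoutMS`, plus the
// node-postgres pool's `connectionTimeoutMillis` under `extra`.
const postgresConnectionOptions = {
    type: 'postgres' as const,
    host: '127.0.0.1',
    port: 5432,
    // Was `connectTimeout`, which TypeORM's postgres options do not recognize.
    connectTimeoutMS: 60_000,
    extra: {
        // Driver/pool-level timeout; this part was already correct.
        connectionTimeoutMillis: 60_000,
    },
};

console.log(postgresConnectionOptions.connectTimeoutMS); // → 60000
```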
🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 151-189: The wait loops for Redis, DB (postgres/mariadb/mysql) and
Elasticsearch currently only break on success and never fail on timeout; update
each loop (or replace with a shared helper like wait_for_port) to detect when
the loop finishes without success and then echo a clear error and exit 1 so the
job fails fast; specifically ensure checks using E2E_REDIS_PORT,
E2E_POSTGRES_PORT, E2E_MARIADB_PORT, E2E_MYSQL_PORT and the Elasticsearch curl
health check cause a non-zero exit when their respective wait loops time out.
🧹 Nitpick comments (2)
.github/workflows/build_and_test.yml (1)
134-139: Scope creep risk: expanded e2e DB matrix may be better as a separate PR.
Given the PR objective/comment about extra workflow changes causing failures, consider splitting "CI hardening + matrix expansion" into a dedicated PR, keeping this one focused on the OrderService locking fix + minimal test coverage.
e2e-common/test-config.ts (1)
111-154: Make MySQL/MariaDB credentials environment-configurable (avoid local e2e breakage).
Hard-coded `username: 'vendure'` and `password: 'password'` can make local e2e runs fail unless developers use identical credentials. Consider reading `E2E_DB_USER` and `E2E_DB_PASSWORD` from environment variables (defaulting to current values) to improve portability across developer setups.
The driver-specific options (`initSql` under `extra`, `acquireTimeout` at top-level) are already correctly placed per TypeORM's MySQL/MariaDB DataSourceOptions structure.
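That suggestion could be sketched as follows (illustrative only; `E2E_DB_USER` and `E2E_DB_PASSWORD` are the reviewer's proposed variable names, not existing configuration):

```typescript
// Resolve test DB credentials from the environment, falling back to the
// current hard-coded values so existing setups keep working unchanged.
function getE2eDbCredentials(env: Record<string, string | undefined>) {
    return {
        username: env.E2E_DB_USER ?? 'vendure',
        password: env.E2E_DB_PASSWORD ?? 'password',
    };
}

console.log(getE2eDbCredentials({})); // → { username: 'vendure', password: 'password' }
console.log(getE2eDbCredentials({ E2E_DB_USER: 'ci_user' }).username); // → ci_user
```

In the real config this would be called with `process.env` when building the MySQL/MariaDB connection options.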
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/build_and_test.yml
- docker-compose.yml
- e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- docker-compose.yml
57db869 to ce5d68a (Compare)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 87-95: The workflow adds redundant MariaDB InnoDB configuration
beyond the required version pin; remove the extra environment variables
MARIADB_MYSQL_LOCALHOST_USER and MARIADB_INITDB_SKIP_TZINFO and drop the
non-essential mysqld options (--innodb_flush_log_at_trx_commit and
--innodb_log_buffer_size) from the options string, keeping only the explicit
version pin to 11.5 and any minimal command flags required for tests (e.g.,
--innodb_file_per_table=1 and --default-storage-engine=InnoDB if you decide to
keep them), since InnoDB is already configured in docker-compose.yml and
e2e-common/test-config.ts.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/build_and_test.yml
- docker-compose.yml
- e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- docker-compose.yml
- e2e-common/test-config.ts
🔇 Additional comments (3)
.github/workflows/build_and_test.yml (3)
100-105: Verify whether MySQL configuration changes are necessary for the race-condition fix.
Similar to the MariaDB changes, these MySQL service configuration updates appear to be infrastructure changes that may not be directly related to implementing pessimistic locking. The maintainer specifically requested restricting the PR to only changes necessary for the race-condition fix.
Based on learnings from the maintainer's comment about extra workflow changes causing CI failures, please confirm whether these MySQL environment variables and health check modifications are essential for the pessimistic locking feature to function correctly in the e2e tests.
118-119: The review comment is incorrect; this is a newly created workflow file, not a modification to existing configuration.
The commit `ce5d68ae2` created `.github/workflows/build_and_test.yml` as a new file (indicated by `new file mode 100644` in the git diff). The Elasticsearch settings at lines 118-119 (`bootstrap.memory_lock: "false"` and `ES_JAVA_OPTS: -Xms256m -Xmx256m`) are the initial configuration for the new CI setup, not changes from a prior state of 512m or true values. There is no previous version to compare against.
These settings are part of the complete e2e testing infrastructure being established in this commit. The commit message appropriately identifies the focus as "Fix race condition with InnoDB and postgres password," which addresses database-level locking concerns through InnoDB configuration changes (visible in the MariaDB service options).
Likely an incorrect or invalid review comment.
151-194: The wait step addresses a gap in health checks but could be implemented more cleanly.
The review comment's claim about existing health checks is incorrect: MariaDB, MySQL, Postgres, and Elasticsearch have `--health-cmd` options configured (lines 95, 105, 113, 129), but Redis has no health check defined (lines 130-133). This means the wait step is not purely redundant as stated; the Redis readiness check fills a genuine gap.
However, instead of adding a separate 44-line wait step, the proper solution would be to add a health check to the Redis service configuration itself (e.g., `--health-cmd="redis-cli ping"`). This would:
- Leverage the built-in health check mechanism consistently across all services
- Eliminate the need for a separate wait loop
- Reduce workflow complexity
If this wait step was added as a workaround rather than addressing the root cause (missing Redis health check), consider reverting it and properly configuring Redis health checks instead.
Likely an incorrect or invalid review comment.
.github/workflows/build_and_test.yml
Outdated
```diff
  MARIADB_DATABASE: vendure
  MARIADB_USER: vendure
  MARIADB_PASSWORD: password
  # Ensure InnoDB is used for locking support
  MARIADB_MYSQL_LOCALHOST_USER: 1
  MARIADB_INITDB_SKIP_TZINFO: 1
  ports:
    - 3306
- options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3
+ options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3 --command="mysqld --innodb_file_per_table=1 --default-storage-engine=InnoDB --innodb_flush_log_at_trx_commit=2 --innodb_log_buffer_size=32M"
```
🧩 Analysis chain
🏁 Script executed:
# First, get a general overview of the repository structure
git ls-files | head -30
Repository: vendurehq/vendure
Length of output: 1389
🏁 Script executed:
# Look for docker-compose and test configuration files
fd -e yml -e yaml | grep -E "(docker|compose|test)" | head -20
Repository: vendurehq/vendure
Length of output: 79
🏁 Script executed:
# Search for existing MariaDB or database configuration
fd -e yml -e yaml -o -e js -o -e ts | xargs rg -l -i "mariadb|innodb" 2>/dev/null | head -10
Repository: vendurehq/vendure
Length of output: 700
🏁 Script executed:
# Read the docker-compose.yml to see existing database configuration
cat docker-compose.yml
Repository: vendurehq/vendure
Length of output: 4762
🏁 Script executed:
# Read the test configuration file
cat e2e-common/test-config.ts
Repository: vendurehq/vendure
Length of output: 5265
🏁 Script executed:
# Read the workflow file around the MariaDB service definition (lines 70-100)
sed -n '70,110p' .github/workflows/build_and_test.yml
Repository: vendurehq/vendure
Length of output: 1999
🏁 Script executed:
# Search for OrderService changes and pessimistic locking implementation
fd -e ts | xargs rg -l "OrderService|SELECT.*FOR UPDATE|pessimistic" | head -10
Repository: vendurehq/vendure
Length of output: 678
🏁 Script executed:
# Check git diff or PR context for the actual code changes
git log --oneline --all --grep="race" | head -10
Repository: vendurehq/vendure
Length of output: 132
🏁 Script executed:
# Get the full commit to see all changes
git show ce5d68ae2 --stat
Repository: vendurehq/vendure
Length of output: 50374
🏁 Script executed:
# Check the full diff for that commit (focusing on summary)
git show ce5d68ae2 | head -200
Repository: vendurehq/vendure
Length of output: 5806
🏁 Script executed:
# Search for postgres password changes in the workflow
sed -n '105,125p' .github/workflows/build_and_test.yml
Repository: vendurehq/vendure
Length of output: 1408
The MariaDB version pinning to 11.5 is necessary, but the InnoDB configuration changes appear redundant.
The workflow comments correctly identify that MariaDB v11.6.2+ introduced a breaking change requiring the version pin. However, InnoDB support is already configured elsewhere:
- The local `docker-compose.yml` has identical InnoDB options in the MariaDB command
- The `e2e-common/test-config.ts` already sets `default_storage_engine=InnoDB` via `initSql` in the TypeORM connection config
The additional environment variables (MARIADB_MYSQL_LOCALHOST_USER, MARIADB_INITDB_SKIP_TZINFO) and extra mysqld options (innodb_flush_log_at_trx_commit, innodb_log_buffer_size) appear to be stability improvements rather than requirements for pessimistic locking support. Consider whether these changes are necessary for the race-condition fix or if they exceed the scope identified by the maintainer.
🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 87 - 95, The workflow adds
redundant MariaDB InnoDB configuration beyond the required version pin; remove
the extra environment variables MARIADB_MYSQL_LOCALHOST_USER and
MARIADB_INITDB_SKIP_TZINFO and drop the non-essential mysqld options
(--innodb_flush_log_at_trx_commit and --innodb_log_buffer_size) from the options
string, keeping only the explicit version pin to 11.5 and any minimal command
flags required for tests (e.g., --innodb_file_per_table=1 and
--default-storage-engine=InnoDB if you decide to keep them), since InnoDB is
already configured in docker-compose.yml and e2e-common/test-config.ts.
ce5d68a to 307283e (Compare)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/build_and_test.yml (1)
138-138: Harden the wait step to eliminate nc dependency and improve service readiness checks.The
nc -zchecks on lines 163, 171, 177, 183 are problematic:netcatis not guaranteed on ubuntu-latest runners (it's a virtual package requiring explicit installation), and port-open checks don't guarantee the service is actually ready to accept connections. The Elasticsearch check (line 191) usingcurlis more robust and should be the model.Replace
ncwith bash's built-in/dev/tcpto eliminate the dependency, and improve database checks to verify actual readiness:Improved wait step
```diff
  - name: Wait for services to be ready
    env:
      E2E_MYSQL_PORT: ${{ job.services.mysql.ports['3306'] }}
      E2E_MARIADB_PORT: ${{ job.services.mariadb.ports['3306'] }}
      E2E_POSTGRES_PORT: ${{ job.services.postgres.ports['5432'] }}
      E2E_ELASTIC_PORT: ${{ job.services.elastic.ports['9200'] }}
      E2E_REDIS_PORT: ${{ job.services.redis.ports['6379'] }}
      DB: ${{ matrix.db }}
    run: |
+     set -euo pipefail
      echo "Waiting for required services..."
      # Wait for redis
      for i in {1..30}; do
-       if nc -z localhost $E2E_REDIS_PORT; then echo "Redis Ready"; break; fi
+       if (echo > /dev/tcp/localhost/$E2E_REDIS_PORT) >/dev/null 2>&1; then echo "Redis Ready"; break; fi
        if [ $i -eq 30 ]; then echo "Redis Timeout"; exit 1; fi
        sleep 1
      done
      # Wait for DB ports depending on matrix value
      if [ "$DB" = "postgres" ]; then
        for i in {1..30}; do
-         if nc -z localhost $E2E_POSTGRES_PORT; then echo "Postgres Ready"; break; fi
+         if (echo > /dev/tcp/localhost/$E2E_POSTGRES_PORT) >/dev/null 2>&1; then echo "Postgres Ready"; break; fi
          if [ $i -eq 30 ]; then echo "Postgres Timeout"; exit 1; fi
          sleep 1
        done
      elif [ "$DB" = "mariadb" ]; then
        for i in {1..30}; do
-         if nc -z localhost $E2E_MARIADB_PORT; then echo "MariaDB Ready"; break; fi
+         if (echo > /dev/tcp/localhost/$E2E_MARIADB_PORT) >/dev/null 2>&1; then echo "MariaDB Ready"; break; fi
          if [ $i -eq 30 ]; then echo "MariaDB Timeout"; exit 1; fi
          sleep 1
        done
      elif [ "$DB" = "mysql" ]; then
        for i in {1..30}; do
-         if nc -z localhost $E2E_MYSQL_PORT; then echo "MySQL Ready"; break; fi
+         if (echo > /dev/tcp/localhost/$E2E_MYSQL_PORT) >/dev/null 2>&1; then echo "MySQL Ready"; break; fi
          if [ $i -eq 30 ]; then echo "MySQL Timeout"; exit 1; fi
          sleep 1
        done
      fi
```

Regarding the DB matrix expansion at line 138: this adds legitimate test coverage. However, combined with this new wait step, the PR introduces meaningful CI infrastructure changes. If the maintainer workflow concerns mentioned in the PR are significant, consider whether these should be bundled together or separated.
🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 118-120: The current readiness probe hitting "/_cluster/health"
can return HTTP 200 while cluster status is RED; update the check to wait for at
least yellow by calling the health API with wait_for_status=yellow (e.g. GET
/_cluster/health?wait_for_status=yellow&timeout=60s) or by parsing the returned
JSON and asserting status !== "red"; modify the workflow step that currently
hits "/_cluster/health" to use the wait_for_status=yellow query (and add a
reasonable timeout) so tests only proceed when the cluster is yellow or green.
- Line 95: The services.mariadb.options block contains an invalid Docker flag
`--command="mysqld ..."` which GitHub Actions services do not support; remove
the `--command` entry from the options and instead rely on the image's default
CMD or supply initialization via a proper workaround (e.g., use a custom image,
entrypoint script, or a dedicated job step) and tighten the healthcheck by
keeping `--health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword"`
and its interval/timeout/retries; update the `services.mariadb.options` to drop
`--command` and ensure the MariaDB container starts with the intended
configuration via a supported method.
In @e2e-common/test-config.ts:
- Around line 119-126: Replace the unrecognized option name initSql with the
correct init_command in the configuration objects inside the extra block (the
object that currently contains initSql, connectionLimit, waitForConnections,
queueLimit); update both occurrences used for MariaDB and MySQL so the storage
engine is actually set (change initSql: "SET default_storage_engine=InnoDB;" to
init_command: "SET default_storage_engine=INNODB").
🧹 Nitpick comments (1)
.github/workflows/build_and_test.yml (1)
100-105: DB service healthchecks: consider consistency + avoid relying on non-root users for ping.
For MySQL, the healthcheck pings as `vendure` (Line 105). That can be brittle if user creation is delayed; using `root` tends to be more stable for readiness signaling. For Postgres (Line 113), `pg_isready` is fine.
Optional tweak (MySQL healthcheck):

```diff
- options: --health-cmd="mysqladmin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=20s --health-retries=10
+ options: --health-cmd="mysqladmin ping -h localhost -u root -ppassword" --health-interval=10s --health-timeout=20s --health-retries=10
```

Also applies to: 113-113
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- .github/workflows/build_and_test.yml
- docker-compose.yml
- e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- docker-compose.yml
🔇 Additional comments (4)
e2e-common/test-config.ts (3)
100-109: LGTM! Connection pool settings are appropriate for concurrent e2e tests.
The extended timeouts (60s), connection pool configuration (max 10 connections), and `keepConnectionAlive` setting are well-suited for testing concurrent order modifications under the new pessimistic locking implementation.
119-131: Connection pool and timeout settings look reasonable for concurrent e2e tests.
The configuration includes:
- Connection pooling (10 connections) to handle concurrent test operations
- Extended timeouts (60s) to accommodate slower test environments
- `queueLimit: 0` allows unlimited queuing, which is acceptable for tests but note this could cause memory issues in production if many connections wait simultaneously
- `keepConnectionAlive: true` helps maintain stable connections during concurrent operations
These settings appropriately support the new race-condition e2e tests that exercise concurrent order modifications under pessimistic locking.
Also applies to: 141-153
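To make the `connectionLimit`/`queueLimit` semantics concrete, here is a toy pool model (an illustration of the convention described above, not mysql2's actual implementation): once `connectionLimit` holders are active, new acquirers wait in a queue; a non-zero `queueLimit` caps that queue, while `0` leaves it unbounded.

```typescript
// Toy connection pool: `connectionLimit` bounds concurrent holders,
// `queueLimit` bounds waiters (0 = unbounded, per the mysql2 convention).
class ToyPool {
    private inUse = 0;
    private waiters: Array<() => void> = [];
    constructor(private connectionLimit: number, private queueLimit: number) {}

    acquire(): Promise<void> {
        if (this.inUse < this.connectionLimit) {
            this.inUse++;
            return Promise.resolve();
        }
        if (this.queueLimit !== 0 && this.waiters.length >= this.queueLimit) {
            return Promise.reject(new Error('queue limit reached'));
        }
        return new Promise(resolve => {
            this.waiters.push(() => { this.inUse++; resolve(); });
        });
    }

    release(): void {
        this.inUse--;
        const next = this.waiters.shift();
        if (next) next();
    }

    get queuedCount(): number { return this.waiters.length; }
}

// With limit 1 and queueLimit 1: one holder, one waiter, third caller rejected.
const pool = new ToyPool(1, 1);
pool.acquire();                                            // granted immediately
pool.acquire().then(() => console.log('waiter granted'));  // queued behind the holder
pool.acquire().catch(() => console.log('rejected: queue full'));
pool.release();                                            // hands the connection to the waiter
```

With `queueLimit: 0`, the rejection branch never triggers, which is why unbounded queuing is harmless in short-lived tests but can accumulate waiters under sustained production load.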
117-117: No action needed. The username change to `vendure` is properly coordinated with docker-compose configuration. Both the MariaDB and MySQL services in docker-compose.yml are configured with `MARIADB_USER: vendure` and `MYSQL_USER: vendure` respectively, which matches the credentials used in test-config.ts. The official Docker images automatically create and configure these users during container initialization.
.github/workflows/build_and_test.yml (1)
87-92: No action required — MariaDB environment variables are supported.
`MARIADB_MYSQL_LOCALHOST_USER` and `MARIADB_INITDB_SKIP_TZINFO` are documented, supported environment variables in the official `mariadb:11.5` Docker image. Both are applied only during initial database initialization; the image will initialize correctly with these settings. The healthcheck using the `vendure` user credentials (matching `MARIADB_USER` and `MARIADB_PASSWORD`) is appropriate and will function as expected once the database initializes.
```ts
extra: {
    // Ensure tables use InnoDB for locking support
    initSql: "SET default_storage_engine=InnoDB;",
    // Connection pool settings
    connectionLimit: 10,
    waitForConnections: true,
    queueLimit: 0,
},
```
🏁 Script executed:
# Check if initSql is used anywhere in the codebase
rg -i "initSql" --type ts --type jsRepository: vendurehq/vendure
Length of output: 246
🏁 Script executed:

```shell
# Check the surrounding code context for mysql2 configuration patterns
cd e2e-common && head -150 test-config.ts | tail -50
```

Repository: vendurehq/vendure
Length of output: 2087
🌐 Web query:
mysql2 connection options initSql documentation
💡 Result:
Short answer: mysql2 has no built‑in connection option named initSql / initSQL. To run initialization SQL when a connection is created, acquire the connection and run the statements (or listen for pool events and run them on new connections). See mysql2 docs for createPool/createConnection and examples, and the classic node-mysql pool "connection" event pattern for running session init statements. [1][2]
References
- mysql2 docs (createConnection / createPool examples). [1]
- node-mysql (pool events) — use pool.on('connection', conn => conn.query('SET ...')) as the pattern to run init SQL on new pooled connections. [2]
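As a pseudocode-level illustration of that pattern (the pool options and SQL here are assumptions for the sketch, not taken from this repository's config):

```ts
// Sketch, assuming the mysql2 callback API: since mysql2 has no initSql
// option, run session-init SQL on every new pooled connection instead.
import mysql from 'mysql2';

const pool = mysql.createPool({
    host: 'localhost',
    user: 'vendure',
    database: 'vendure-e2e',
    connectionLimit: 10,
});

pool.on('connection', connection => {
    // Runs once per physical connection, before it serves queries.
    connection.query('SET default_storage_engine=InnoDB');
});
```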
🌐 Web query:
TypeORM mysql2 driver initialization commands SET storage engine
💡 Result:
Use the TypeORM DataSource option extra to pass the driver's initial SQL command (the underlying mysql/mysql2 client will run it on connect). Example — set default storage engine to InnoDB:
TypeScript:

```typescript
const AppDataSource = new DataSource({
    type: "mysql",
    host: "localhost",
    port: 3306,
    username: "user",
    password: "pw",
    database: "db",
    extra: {
        init_command: "SET default_storage_engine=INNODB"
    }
});
```
Notes:
- TypeORM forwards extra to the mysql/mysql2 client. [1][2]
- The driver/client executes init_command on each new connection (so remove it after migrations if you don't want it run for every session). [2][3]
Sources:
[1] TypeORM — MySQL driver docs (data-source / connection options).
[2] TypeORM — DataSource / DataSourceOptions (extra passes options to underlying driver).
[3] mysql client docs (init_command / initial command executed on connect).
Replace initSql with init_command in both MariaDB and MySQL configurations.
The option initSql is not recognized by mysql2 or TypeORM. The correct option name is init_command. Without this correction, the storage engine setting is silently ignored, and tables may default to MyISAM on some MySQL configurations, which does not support row-level locking and will break the pessimistic locking (SELECT ... FOR UPDATE) implementation.
Change both occurrences from:

```ts
initSql: "SET default_storage_engine=InnoDB;",
```

to:

```ts
init_command: "SET default_storage_engine=INNODB",
```

This applies to lines 119-126 (MariaDB) and 141-148 (MySQL).
🤖 Prompt for AI Agents
In @e2e-common/test-config.ts around lines 119-126, replace the unrecognized
option name initSql with the correct init_command in the configuration objects
inside the extra block (the object that currently contains initSql,
connectionLimit, waitForConnections, queueLimit); update both occurrences used
for MariaDB and MySQL so the storage engine is actually set (change initSql:
"SET default_storage_engine=InnoDB;" to init_command: "SET
default_storage_engine=INNODB").
force-pushed from 307283e to 5491b0f
Description
This PR addresses a critical race condition ("lost update" anomaly) that occurs when multiple concurrent requests attempt to modify the same Order simultaneously.
For example, if two `addItemToOrder` requests arrive at the same time for the same order, they both read the initial state, calculate the new state, and save it. The last transaction to commit overwrites the changes made by the first one, leading to incorrect order totals or missing items.

Solution
I have implemented pessimistic locking (`SELECT ... FOR UPDATE`, specifically `pessimistic_write` mode) in the `OrderService`:

- `getOrderOrThrow` (and the underlying `findOne`) now accept a `lock` option.
- The lock is applied in the `OrderService` methods that modify order state (e.g., `addItemsToOrder`, `adjustOrderLines`, `applyCouponCode`, `setShippingAddress`, etc.).
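At the TypeORM level, the change described above amounts to passing a lock option through to the find call inside the request's transaction. A pseudocode-level sketch (names follow the PR description, not the actual diff):

```ts
// Illustrative only. Pessimistic locks require an open transaction;
// TypeORM otherwise throws PessimisticLockTransactionRequiredError.
const order = await this.connection.getRepository(ctx, Order).findOne({
    where: { id: orderId },
    // Emits SELECT ... FOR UPDATE: any concurrent transaction requesting
    // the same row lock blocks until this transaction commits, so the
    // subsequent read-modify-write on the order is serialized.
    lock: { mode: 'pessimistic_write' },
});
```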
Fixes #3398
This test simulates concurrent `addItemToOrder` operations and asserts that the final order quantity is consistent with the number of requests sent. Ran `OrderService.race-condition.e2e-spec.ts` locally.
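The lost-update behavior this test guards against can be reproduced in miniature without a database. The sketch below is a stand-in, not the actual spec: a delay simulates the DB round-trip, and a small promise-chain lock plays the role that `pessimistic_write` row locking plays in the real fix:

```typescript
// Two concurrent handlers read the same quantity, compute, and write back.
type OrderLike = { quantity: number };

const delay = (ms: number) => new Promise<void>(res => setTimeout(res, ms));

async function addItemUnsafe(order: OrderLike): Promise<void> {
    const current = order.quantity; // read current state
    await delay(10);                // simulated DB round-trip
    order.quantity = current + 1;   // write back: may clobber a concurrent write
}

// Minimal promise-chain lock: callers run one at a time, mimicking the
// serialization that SELECT ... FOR UPDATE provides on the order row.
function makeLock() {
    let tail: Promise<unknown> = Promise.resolve();
    return (fn: () => Promise<void>) => (tail = tail.then(fn, fn));
}

async function demo() {
    const lost: OrderLike = { quantity: 0 };
    await Promise.all([addItemUnsafe(lost), addItemUnsafe(lost)]);

    const safe: OrderLike = { quantity: 0 };
    const withLock = makeLock();
    await Promise.all([
        withLock(() => addItemUnsafe(safe)),
        withLock(() => addItemUnsafe(safe)),
    ]);

    return { lost: lost.quantity, safe: safe.quantity };
}

demo().then(r => console.log(r)); // { lost: 1, safe: 2 }
```

Without the lock, both writers read quantity 0 and both write 1, so one update is lost; with the lock, the writes are serialized and the final quantity matches the number of requests.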