
Conversation

@RubenDarioGuerreroNeira

Description

This PR addresses a critical race condition ("lost update" anomaly) that occurs when multiple concurrent requests attempt to modify the same Order simultaneously.

For example, if two addItemToOrder requests arrive at the same time for the same order, they both read the initial state, calculate the new state, and save it. The last transaction to commit overwrites the changes made by the first one, leading to incorrect order totals or missing items.

Solution

I have implemented Pessimistic Locking (SELECT ... FOR UPDATE, specifically pessimistic_write mode) in the OrderService.

  • Updated getOrderOrThrow (and the underlying findOne) to accept a lock option.
  • Applied this lock to all critical OrderService methods that modify order state (e.g., addItemsToOrder, adjustOrderLines, applyCouponCode, setShippingAddress, etc.).
  • This ensures that when a transaction fetches an order for modification, it locks the row in the database. Any concurrent transaction attempting to read the same order for modification will block and wait until the first transaction releases the lock.

Fixes

Fixes #3398

Testing

The new race-condition.e2e-spec.ts test simulates concurrent addItemToOrder operations and asserts that the final order quantity is consistent with the number of requests sent.

  • Verified that pessimistic locking is applied to all modification paths in OrderService.
  • Tested with race-condition.e2e-spec.ts locally.

@vercel

vercel bot commented Dec 20, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Ready | Ready (Preview) | Jan 9, 2026 1:51am |
| vendure-storybook | Ready | Ready (Preview, Comment) | Jan 9, 2026 1:51am |

@coderabbitai
Contributor

coderabbitai bot commented Dec 20, 2025

Walkthrough

Adds pessimistic locking to OrderService methods to prevent concurrent-write races, introduces an e2e race-condition test performing parallel AddItemToOrder mutations, and updates CI/e2e readiness, DB connection settings, and MariaDB startup defaults for more reliable end-to-end testing.

Changes

  • Order service locking (packages/core/src/service/services/order.service.ts): Added an optional `lock?: { mode: 'pessimistic_read' | 'pessimistic_write' }` option to `findOne`, propagated through `getOrderOrThrow` and applied (as `pessimistic_write`) across the order-modification methods.
  • E2E race-condition test (packages/core/e2e/race-condition.e2e-spec.ts): New end-to-end test that starts a Vendure server, logs in, runs multiple parallel AddItemToOrder mutations, and asserts that the final activeOrder.totalQuantity equals the expected concurrent total; includes setup/teardown and handles ErrorResult/Order responses.
  • CI workflow & readiness (.github/workflows/build_and_test.yml): Expanded the e2e matrix (sqljs, mariadb, mysql, postgres); added service readiness waits and improved DB/Redis/Elasticsearch health checks, timeouts, and retries; adjusted ES JVM options and exposed DB env vars for the wait step.
  • E2E DB connection config (e2e-common/test-config.ts): Added connectTimeout, acquireTimeout, keepConnectionAlive, and pool settings for Postgres/MariaDB/MySQL; changed the MariaDB/MySQL username to vendure; added init SQL/pool options for MySQL/MariaDB to stabilize connections.
  • Docker Compose: MariaDB defaults (docker-compose.yml): Added command: ["mysqld", "--innodb_file_per_table=1", "--default-storage-engine=InnoDB"] to the mariadb service to enforce InnoDB defaults at startup.
  • Manifest / package (package.json): Manifest reference updated; no exported/public API signature changes.
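The connection-stability settings described for e2e-common/test-config.ts might look roughly like the fragment below. This is a hedged sketch only: the option names shown (connectTimeout, connectionLimit, waitForConnections) are standard mysql2/TypeORM-style settings used for illustration, and the values are not copied from the actual file.

```typescript
// Illustrative MySQL/MariaDB data source options for e2e runs; exact
// names and values in e2e-common/test-config.ts may differ.
const mysqlConnectionOptions = {
    type: 'mysql' as const,
    host: '127.0.0.1',
    port: 3306,
    username: 'vendure',    // dedicated user instead of root
    password: 'password',
    database: 'vendure-dev',
    connectTimeout: 20_000, // fail fast if the container is not accepting connections
    extra: {
        connectionLimit: 10,      // pool size for the parallel e2e mutations
        waitForConnections: true, // queue requests instead of erroring when the pool is busy
    },
};
```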

Possibly related issues

  • Issue #3398 — Race condition when performing parallel order actions: the introduction of pessimistic locks in OrderService and the new e2e race-condition test target the concurrent-update behavior described and could address this issue.
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Out of Scope Changes check: ❓ Inconclusive. While the core locking changes are in scope, the PR includes significant workflow and database configuration changes (.github/workflows, docker-compose.yml, e2e-common/test-config.ts) that appear tangential to the race-condition fix. Resolution: review whether the workflow and database configuration changes (GitHub Actions, Docker Compose, e2e test config) are necessary for the race-condition fix or should be in a separate PR.

✅ Passed checks (3 passed)

  • Title check: ✅ Passed. The title clearly summarizes the main change: applying pessimistic locking to prevent order modification race conditions.
  • Description check: ✅ Passed. The description comprehensively explains the race condition problem, the pessimistic locking solution, and includes references to testing with race-condition.e2e-spec.ts.
  • Linked Issues check: ✅ Passed. The PR implements pessimistic locking across all critical OrderService modification methods, directly addressing issue #3398's requirement to prevent lost updates during concurrent order operations.


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 307283e and 5491b0f.

📒 Files selected for processing (3)
  • .github/workflows/build_and_test.yml
  • docker-compose.yml
  • e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • .github/workflows/build_and_test.yml
  • e2e-common/test-config.ts
  • docker-compose.yml


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/core/src/service/services/order.service.ts (1)

1214-1233: Remove redundant order fetch in setShippingMethod.

The method fetches the order with a pessimistic write lock at Line 1219, then calls orderModifier.setShippingMethods, and immediately fetches the same order again with the same lock at lines 1228-1230. The second fetch appears unnecessary and could impact performance. Consider whether orderModifier.setShippingMethods already updates the order entity in place, making the second fetch redundant.

🔎 Possible simplification
     async setShippingMethod(
         ctx: RequestContext,
         orderId: ID,
         shippingMethodIds: ID[],
     ): Promise<ErrorResultUnion<SetOrderShippingMethodResult, Order>> {
         const order = await this.getOrderOrThrow(ctx, orderId, undefined, { mode: 'pessimistic_write' });
         const validationError = this.assertAddingItemsState(order);
         if (validationError) {
             return validationError;
         }
         const result = await this.orderModifier.setShippingMethods(ctx, order, shippingMethodIds);
         if (isGraphQlErrorResult(result)) {
             return result;
         }
-        const updatedOrder = await this.getOrderOrThrow(ctx, orderId, undefined, {
-            mode: 'pessimistic_write',
-        });
-        await this.applyPriceAdjustments(ctx, updatedOrder);
-        return this.connection.getRepository(ctx, Order).save(updatedOrder);
+        await this.applyPriceAdjustments(ctx, order);
+        return this.connection.getRepository(ctx, Order).save(order);

Verify whether orderModifier.setShippingMethods modifies the order in place or returns a modified order before applying this change.

🧹 Nitpick comments (2)
packages/core/e2e/race-condition.e2e-spec.ts (2)

46-56: Verify individual mutation results to detect partial failures.

The test only checks the final totalQuantity but doesn't verify that all individual addItemToOrder mutations succeeded. If some mutations fail with errors, the test might pass incorrectly if the successful ones happen to produce the expected total.

🔎 Recommended enhancement
-        const promises: Array<Promise<any>> = [];
+        const results: Array<any> = [];
         for (let i = 0; i < concurrency; i++) {
-            promises.push(
-                shopClient.query(ADD_ITEM_TO_ORDER, {
-                    productVariantId: variantId,
-                    quantity: quantityPerRequest,
-                }),
-            );
+            const result = await shopClient.query(ADD_ITEM_TO_ORDER, {
+                productVariantId: variantId,
+                quantity: quantityPerRequest,
+            });
+            results.push(result);
         }
 
-        await Promise.all(promises);
+        // Verify all mutations succeeded
+        for (const result of results) {
+            expect(result.addItemToOrder.__typename).not.toBe('ErrorResult');
+            expect(result.addItemToOrder.id).toBeDefined();
+        }

Note: This changes the test to sequential execution, which defeats the race condition testing purpose. For concurrent execution with verification, consider:

         const promises: Array<Promise<any>> = [];
         for (let i = 0; i < concurrency; i++) {
             promises.push(
                 shopClient.query(ADD_ITEM_TO_ORDER, {
                     productVariantId: variantId,
                     quantity: quantityPerRequest,
                 }),
             );
         }
 
-        await Promise.all(promises);
+        const results = await Promise.all(promises);
+        
+        // Verify all mutations succeeded
+        for (const result of results) {
+            expect(result.addItemToOrder.__typename).not.toBe('ErrorResult');
+            expect(result.addItemToOrder.id).toBeDefined();
+        }

46-46: Replace Array<Promise<any>> with a properly typed array.

Using any bypasses type safety. Consider defining a proper type for the mutation result or using the generated GraphQL types.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e235d85 and af20480.

📒 Files selected for processing (2)
  • packages/core/e2e/race-condition.e2e-spec.ts (1 hunks)
  • packages/core/src/service/services/order.service.ts (28 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-11-21T18:29:30.413Z
Learnt from: DanielBiegler
Repo: vendure-ecommerce/vendure PR: 3815
File: packages/core/e2e/api-key.e2e-spec.ts:160-224
Timestamp: 2025-11-21T18:29:30.413Z
Learning: In the vendure-ecommerce/vendure repository, hand-written `gql` template literals (e.g., `CREATE_API_KEY`, `API_KEY`, etc.) in test files like `packages/core/e2e/api-key.e2e-spec.ts` are the source documents for GraphQL code generation. The code generator processes these to produce typed `*Document` exports (e.g., `CreateApiKeyDocument`, `ApiKeyDocument`) that are then imported and used in the tests. These hand-written gql constants should not be considered unused or dead code—they are essential inputs to the code generation process.

Applied to files:

  • packages/core/e2e/race-condition.e2e-spec.ts
📚 Learning: 2025-06-25T11:20:48.133Z
Learnt from: dlhck
Repo: vendure-ecommerce/vendure PR: 3615
File: packages/dashboard/src/app/routes/_authenticated/_products/components/product-bulk-actions.tsx:161-214
Timestamp: 2025-06-25T11:20:48.133Z
Learning: Using `useMutation` with multiple concurrent `mutate` calls (e.g., in a `map` operation) causes race conditions and unpredictable behavior. This pattern has caused issues in the Vendure dashboard codebase. Instead, use `mutateAsync` with sequential processing (for loop with await) or proper chaining to ensure only one mutation runs at a time.

Applied to files:

  • packages/core/e2e/race-condition.e2e-spec.ts
📚 Learning: 2025-11-10T14:23:14.480Z
Learnt from: gabriellbui
Repo: vendure-ecommerce/vendure PR: 3953
File: packages/dev-server/graphql/graphql-env.d.ts:335-335
Timestamp: 2025-11-10T14:23:14.480Z
Learning: Repo vendure-ecommerce/vendure: packages/dev-server/graphql/graphql-env.d.ts is a generated dev-server introspection file and should be gitignored; prefer regeneration and/or removing it from PRs rather than reviewing its contents.

Applied to files:

  • packages/core/e2e/race-condition.e2e-spec.ts
🧬 Code graph analysis (2)
packages/core/e2e/race-condition.e2e-spec.ts (2)
e2e-common/test-config.ts (1)
  • TEST_SETUP_TIMEOUT_MS (28-28)
packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)
  • activeOrder (99-116)
packages/core/src/service/services/order.service.ts (3)
packages/core/src/event-bus/events/order-event.ts (1)
  • order (33-35)
packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)
  • order (75-95)
packages/core/src/api/resolvers/admin/order.resolver.ts (1)
  • order (66-73)
🔇 Additional comments (4)
packages/core/src/service/services/order.service.ts (4)

1980-2003: LGTM!

The getOrderOrThrow method correctly propagates the lock parameter to findOne and maintains proper error handling.


214-296: Apply pessimistic lock to OrderLines query to prevent concurrent modifications.

The pessimistic lock on the Order query (line 255) does not extend to the separately-fetched OrderLines (lines 268-279). Each query requires explicit locking—when the main Order is locked, the separate OrderLines query lacks a lock parameter and can be modified by concurrent transactions. Either apply the lock to the linesQb as well, or document that this method must be called within a transaction where the caller is responsible for ensuring the lines' consistency.


214-218: Comprehensive pessimistic locking is correctly implemented across all order modification paths.

The OrderService implements consistent pessimistic_write locking for all methods that modify orders: updateCustomFields, updateOrderCustomer, addItemsToOrder, adjustOrderLine, removeItemsFromOrder, addSurchargeToOrder, removeSurchargeFromOrder, applyCouponCode, removeCouponCode, setShippingAddress, setBillingAddress, unsetShippingAddress, unsetBillingAddress, setShippingMethods, transitionToState, and multiple state transition handlers. New order creation (create/createDraft) correctly omits initial locks since they're new records, then immediately calls transitionToState which applies locking. OrderModifier helper methods operate on already-locked orders, and event bus publications occur within locked transactions without bypassing service methods. No unprotected modification paths were identified.


214-218: Transaction wrapping for pessimistic locks is already in place via the @transaction decorator.

While the findOne method doesn't explicitly wrap queries in transactions at the service level, the codebase implements transaction handling at the resolver boundary layer. The @Transaction() decorator on resolver methods (e.g., addItemsToOrder at line 465) creates a transactional QueryRunner that's attached to the RequestContext. Service methods called within these decorated resolvers automatically use the transactional context. All methods using pessimistic locks (via getOrderOrThrow) are invoked exclusively from @Transaction-decorated resolver methods, ensuring proper transaction boundaries. This architectural pattern—transactions at the boundary rather than service layer—is valid and correctly implemented in this codebase.

Likely an incorrect or invalid review comment.

const quantityPerRequest = 1;
const concurrency = 5;

// Ejecutamos 5 peticiones simultáneas para añadir el item
Contributor


🛠️ Refactor suggestion | 🟠 Major

Replace Spanish comments with English.

The codebase should maintain English as the primary language for comments to ensure consistency and accessibility for all contributors.

🔎 Proposed fix
-        // Ejecutamos 5 peticiones simultáneas para añadir el item
+        // Execute 5 concurrent requests to add the item
         const promises: Array<Promise<any>> = [];
-        // Si hay condición de carrera, el total será menor a 5 (algunas escrituras se sobrescribieron)
+        // If there's a race condition, the total will be less than 5 (some writes were overwritten)
         expect(activeOrder.totalQuantity).toBe(concurrency * quantityPerRequest);

Also applies to: 62-62

🤖 Prompt for AI Agents
In packages/core/e2e/race-condition.e2e-spec.ts around lines 45 and 62, the
inline comments are written in Spanish; replace them with clear English
equivalents to keep the codebase language consistent. Update the comment at line
45 (and the similar one at line 62) to concisely describe the action being
performed in English, e.g., "Execute 5 concurrent requests to add the item" or
similar wording that matches the test intent.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/core/src/service/services/order.service.ts (1)

1228-1230: Remove redundant Order reload.

After calling setShippingMethods (Line 1224), the Order is immediately reloaded with getOrderOrThrow (Lines 1228-1230) before calling applyPriceAdjustments. This reload appears unnecessary since applyPriceAdjustments will reload the Order at Line 2151.

🔎 Proposed refactor
         const result = await this.orderModifier.setShippingMethods(ctx, order, shippingMethodIds);
         if (isGraphQlErrorResult(result)) {
             return result;
         }
-        const updatedOrder = await this.getOrderOrThrow(ctx, orderId, undefined, {
-            mode: 'pessimistic_write',
-        });
-        await this.applyPriceAdjustments(ctx, updatedOrder);
-        return this.connection.getRepository(ctx, Order).save(updatedOrder);
+        await this.applyPriceAdjustments(ctx, order);
+        return this.connection.getRepository(ctx, Order).save(order);
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af20480 and 3833bbe.

📒 Files selected for processing (5)
  • .github/workflows/build_and_test.yml
  • docker-compose.yml
  • e2e-common/test-config.ts
  • packages/core/e2e/race-condition.e2e-spec.ts
  • packages/core/src/service/services/order.service.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/core/e2e/race-condition.e2e-spec.ts
🧰 Additional context used
🧬 Code graph analysis (1)
packages/core/src/service/services/order.service.ts (3)
packages/core/src/event-bus/events/order-event.ts (1)
  • order (33-35)
packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)
  • order (75-95)
packages/core/src/api/resolvers/admin/order.resolver.ts (1)
  • order (66-73)
🔇 Additional comments (6)
packages/core/src/service/services/order.service.ts (2)

214-296: Pessimistic locking implementation looks solid.

The addition of the lock parameter to findOne and its propagation through getOrderOrThrow to all order modification methods effectively prevents the race condition described in issue #3398.

Key strengths:

  • Consistent application of pessimistic_write locks across all state-modifying operations
  • Lock options properly threaded through TypeORM's setFindOptions
  • Private getOrderOrThrow helper ensures lock usage is centralized

257-257: This change is safe. All Orders are assigned to at least one Channel via assignToCurrentChannel() during creation (lines 456-475), so the innerJoin will not exclude any valid orders. The query is intentionally scoped to the current context's channel via the andWhere clause, consistent with other similar queries in this file.

Likely an incorrect or invalid review comment.

e2e-common/test-config.ts (1)

107-107: LGTM! Username changes align with docker-compose configuration.

The username change from root to vendure for MariaDB and MySQL correctly matches the MARIADB_USER and MYSQL_USER environment variables defined in docker-compose.yml.

Also applies to: 116-116

.github/workflows/build_and_test.yml (3)

148-181: Excellent addition of service readiness check.

The new "Wait for services to be ready" step ensures Redis, databases, and Elasticsearch are accepting connections before tests begin. This prevents flaky test failures due to service startup timing, which is especially important for the new race-condition test scenarios.

Implementation uses appropriate timeouts (60 seconds) and checks the correct ports for each service.


115-116: Reduced Elasticsearch memory allocation for CI environment.

The change from 512MB to 256MB for ES_JAVA_OPTS and disabling bootstrap.memory_lock reduces resource consumption in the shared CI environment. This is a reasonable optimization for test workloads.


87-89: Explicit database credentials improve configuration clarity.

Adding explicit MARIADB_USER, MARIADB_PASSWORD, MYSQL_USER, and MYSQL_PASSWORD environment variables (alongside the existing _DATABASE variables) makes the service configuration more explicit and aligns with the updated health checks that reference these credentials.

Also applies to: 97-99

POSTGRES_DB: vendure-dev
POSTGRES_USER: vendure
POSTGRES_PASSWORD: password
POSTGRES_PASSWORD: postgres
Contributor


⚠️ Potential issue | 🔴 Critical

Password inconsistency across Postgres containers and test config.

The postgres_16 container now uses password postgres, but:

  • The postgres_12 container (Line 48) still uses password password
  • The e2e test config at e2e-common/test-config.ts (Line 99) uses password password

This inconsistency will cause connection failures when tests attempt to use postgres_16.

🔎 Recommended fix
-            POSTGRES_PASSWORD: postgres
+            POSTGRES_PASSWORD: password
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
POSTGRES_PASSWORD: postgres
POSTGRES_PASSWORD: password
🤖 Prompt for AI Agents
In docker-compose.yml line 61 (postgres_16) and docker-compose.yml line 48
(postgres_12) plus e2e-common/test-config.ts line 99, the Postgres passwords are
inconsistent (postgres_16 uses "postgres" while postgres_12 and the e2e test
config use "password"); update the values so all three use the same password
(preferably change postgres_12 and e2e-common/test-config.ts to "postgres" to
match postgres_16) and ensure any related test or container env vars reference
that same value so tests can connect successfully.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/build_and_test.yml (1)

148-181: Consider adding timeout handling and progress feedback.

The wait logic is a good practice to prevent test flakiness. However, consider these improvements:

  1. Explicit failure on timeout: Currently, if a service doesn't become ready within 60 seconds, the script continues silently. Consider failing explicitly.
  2. Progress feedback: Add periodic echo statements to show which service is being waited for and how long it's taking.
  3. Verify nc availability: The script assumes nc (netcat) is installed. Consider adding a check or using a more portable approach.
Example improvement with timeout handling
+            - name: Wait for services to be ready
+              run: |
                  echo "Waiting for required services..."
                  # Wait for redis
-                 for i in {1..60}; do
+                 echo "Checking Redis..."
+                 for i in {1..60}; do
                    if nc -z localhost $E2E_REDIS_PORT; then break; fi
+                   if [ $i -eq 60 ]; then
+                     echo "ERROR: Redis failed to become ready within 60 seconds"
+                     exit 1
+                   fi
                    sleep 1
                  done
+                 echo "Redis is ready"
packages/core/src/service/services/order.service.ts (1)

1214-1233: Consider refactoring the double-fetch pattern.

The order is fetched twice with pessimistic_write lock within the same method:

  1. Line 1219: Initial fetch with lock
  2. Lines 1228-1230: Second fetch with lock after setShippingMethods

While this ensures you have the latest state after setShippingMethods modifies the order, it's inefficient. Since you already hold the pessimistic write lock from the first fetch (maintained throughout the transaction), consider these alternatives:

  1. Return the updated order from orderModifier.setShippingMethods instead of fetching again
  2. Refresh the existing order entity using TypeORM's refresh mechanism if needed
  3. Document why the second fetch is necessary if there's a specific reason

This pattern also appears in modifyOrder (lines 1330 and 1351), suggesting it might be worth addressing in multiple places.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3833bbe and 6c8b57d.

📒 Files selected for processing (4)
  • .github/workflows/build_and_test.yml
  • e2e-common/test-config.ts
  • packages/core/e2e/race-condition.e2e-spec.ts
  • packages/core/src/service/services/order.service.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/core/e2e/race-condition.e2e-spec.ts
  • e2e-common/test-config.ts
🧰 Additional context used
🧬 Code graph analysis (1)
packages/core/src/service/services/order.service.ts (3)
packages/core/src/event-bus/events/order-event.ts (1)
  • order (33-35)
packages/core/src/api/resolvers/shop/shop-order.resolver.ts (1)
  • order (75-95)
packages/core/src/api/resolvers/admin/order.resolver.ts (1)
  • order (66-73)
🔇 Additional comments (9)
.github/workflows/build_and_test.yml (4)

87-92: LGTM! Good security practice.

Using a dedicated vendure user instead of root for database access is a better security practice. The credentials are consistent between the environment variables and the health-check command.


97-102: LGTM! Improved reliability.

The MySQL configuration follows the same security improvements as MariaDB. The increased health-check timeout (20s vs 5s) should help prevent flaky test failures if MySQL takes longer to become ready.


110-110: LGTM! Proper health check.

Using pg_isready -U vendure is the correct approach for PostgreSQL health checks and aligns with the configured POSTGRES_USER.


115-116: LGTM! Resource optimization for CI.

Reducing Elasticsearch memory allocation and disabling bootstrap.memory_lock helps prevent OOM issues in the shared CI environment while still providing sufficient resources for e2e tests.

packages/core/src/service/services/order.service.ts (5)

214-219: LGTM! Backward-compatible API extension.

Adding the optional lock parameter maintains backward compatibility while enabling pessimistic locking for race condition prevention. The parameter correctly follows TypeORM's pessimistic locking API.


1980-2003: LGTM! Correct lock propagation.

The private getOrderOrThrow method correctly accepts and forwards the lock parameter to findOne, maintaining proper encapsulation while enabling pessimistic locking throughout the service.


506-506: Excellent! Comprehensive and consistent pessimistic locking.

The pessimistic locking strategy is correctly and consistently applied across all order mutation operations:

Write operations use pessimistic_write: All methods that modify order state acquire a write lock, preventing concurrent modifications.

Read operations remain lock-free: Query methods like getEligibleShippingMethods and getEligiblePaymentMethods correctly don't use locks, maintaining good read performance.

Addresses the root cause: This implementation directly solves the "lost update" race condition described in issue #3398, where parallel requests could overwrite each other's changes.

This is a solid solution to the concurrency problem.

Also applies to: 522-524, 612-612, 759-759, 887-887, 930-930, 963-963, 983-983, 1005-1005, 1033-1033, 1084-1084, 1108-1108, 1134-1134, 1158-1158, 1219-1219, 1244-1244, 1330-1332, 1376-1376, 1445-1447, 1696-1698, 1790-1790, 1816-1816


214-219: No changes required; Vendure's transaction architecture properly handles pessimistic locks.

Pessimistic locks in this codebase are correctly used within database transactions through Vendure's architecture. The @Transaction() decorator is applied at the API resolver layer (e.g., in order.resolver.ts), and the RequestContext passed to service methods carries transaction state. The TransactionalConnection.getRepository(ctx, ...) call respects ongoing transactions from the context, so service methods do not require their own @Transaction() decorators. The e2e test validates that concurrent order modifications are handled correctly without lost updates.

Likely an incorrect or invalid review comment.


252-259: This is not a breaking change. The innerJoin on order.channels is the correct and intentional pattern for Vendure's multi-tenant channel-scoped access control. Orders are inherently created within a channel context, and the innerJoin with the channel filter ensures proper tenant isolation. This pattern is used consistently throughout the codebase in similar queries (e.g., getActiveOrderForUser, product variant lookups, and the generic list query builder). No change from leftJoin appears to have been made.

Likely an incorrect or invalid review comment.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 156-210: The step "Wait for services to be ready" references env
vars ($E2E_REDIS_PORT, $E2E_MARIADB_PORT, $E2E_MYSQL_PORT, $E2E_POSTGRES_PORT,
$E2E_ELASTIC_PORT) that are only defined in the later "e2e tests" step, causing
them to be undefined; fix by moving the env: block that defines those E2E_*
variables to the job level (so all steps including "Wait for services to be
ready" and "Configure database permissions" can see them) or alternatively
duplicate that env: block into both the "Wait for services to be ready" and
"Configure database permissions" steps so each step has the needed environment
variables.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c8b57d and c620161.

📒 Files selected for processing (3)
  • .github/workflows/build_and_test.yml
  • docker-compose.yml
  • e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • e2e-common/test-config.ts
🔇 Additional comments (9)
docker-compose.yml (2)

48-49: Past password inconsistency has been resolved.

The previous review flagged that postgres_16 and postgres_12 had different passwords. Both containers now correctly use POSTGRES_PASSWORD: password, resolving the inconsistency.

Also applies to: 62-62


14-14: LGTM: InnoDB configuration enables pessimistic locking.

Explicitly setting InnoDB as the default storage engine and enabling per-table files is essential for the pessimistic locking (SELECT ... FOR UPDATE) introduced in this PR. InnoDB supports row-level locking required by the race condition fix, whereas MyISAM does not.
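The lost-update anomaly that row-level locking prevents can be illustrated in plain TypeScript. Here an in-process mutex stands in for the database row lock taken by SELECT ... FOR UPDATE; everything below is a self-contained sketch, not Vendure code.

```typescript
// Simulates the lost-update anomaly and the effect of serializing writers.
// `Mutex` is a stand-in for a row lock (SELECT ... FOR UPDATE), not a Vendure API.
class Mutex {
    private tail: Promise<void> = Promise.resolve();
    lock<T>(fn: () => Promise<T>): Promise<T> {
        const run = this.tail.then(fn);
        this.tail = run.then(() => undefined, () => undefined);
        return run;
    }
}

const delay = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

async function demo(useLock: boolean): Promise<number> {
    let quantity = 0; // stands in for the persisted order line quantity
    const mutex = new Mutex();
    const addItem = async () => {
        const current = quantity; // read
        await delay(5);           // gap in which concurrent writers interleave
        quantity = current + 1;   // write (may clobber a concurrent write)
    };
    const op = () => (useLock ? mutex.lock(addItem) : addItem());
    await Promise.all([op(), op(), op()]);
    return quantity;
}

demo(false).then(q => console.log('unlocked:', q)); // → unlocked: 1 (two updates lost)
demo(true).then(q => console.log('locked:', q));    // → locked: 3
```

Without the lock, all three writers read the initial value before any of them commits, so two increments are lost; serializing them restores the expected total, which is exactly what the pessimistic lock does at the database level.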

.github/workflows/build_and_test.yml (7)

87-95: LGTM: MariaDB configuration aligns with pessimistic locking requirements.

The added credentials (vendure/password) and InnoDB-specific startup options (--innodb_file_per_table=1 --default-storage-engine=InnoDB) ensure the test environment supports row-level locking needed for the race condition fix. The performance tuning parameters (innodb_flush_log_at_trx_commit=2, innodb_log_buffer_size=32M) are appropriate for test environments.


100-105: LGTM: MySQL configuration with appropriate startup resilience.

Credentials are consistent with other services, and the extended health check timeout (20s) and retries (10) appropriately account for MySQL 8's longer startup time.


113-113: LGTM: PostgreSQL health check updated for vendure user.

The health check correctly references the vendure user and increased retries improve startup reliability.


118-119: Verify Elasticsearch tests with reduced memory allocation.

The heap size is reduced to 256MB (from typically 1GB+) and memory locking is disabled. While this optimizes CI resource usage, ensure Elasticsearch-dependent tests don't fail due to memory constraints or increased GC pressure.


129-129: LGTM: Extended health check timeout compensates for reduced memory.

The increased timeout (10s) and retries (5) appropriately account for slower Elasticsearch startup with reduced heap allocation.


151-155: LGTM: System tuning supports concurrent test scenarios.

The vm.overcommit_memory=1 setting allows more aggressive memory allocation for multiple database containers, and the increased connection queue size supports the high-concurrency race condition tests introduced in this PR.


211-231: Critical: Same environment variable scoping issue.

This step has the same problem as the "Wait for services to be ready" step: the variables $E2E_MARIADB_PORT, $E2E_MYSQL_PORT, and $E2E_POSTGRES_PORT are undefined here. Apply the same fix by adding the env: block to this step.

Likely an incorrect or invalid review comment.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/workflows/build_and_test.yml (1)

156-217: Previous critical issue resolved: Environment variables are now properly defined.

The environment variables (E2E_REDIS_PORT, E2E_MARIADB_PORT, etc.) are now correctly defined in the step-level env: block (lines 157-163), resolving the critical issue from the previous review. The readiness checks use appropriate connection tests rather than simple port checks, which is good.

Optional: Consider explicit timeout failures.

The polling loops continue execution even if services don't become ready within 120 seconds. While this may provide more context through subsequent test failures, explicitly failing with a clear error message when timeouts are reached could speed up debugging in CI.

♻️ Example: Add explicit timeout failure
   for i in {1..60}; do
     if nc -z localhost $E2E_REDIS_PORT && redis-cli -p $E2E_REDIS_PORT ping &>/dev/null; then 
       echo "✓ Redis is ready"
       break
     fi
     echo "Waiting for Redis... ($i/60)"
     sleep 2
+    if [ $i -eq 60 ]; then
+      echo "❌ Redis failed to become ready within 120 seconds"
+      exit 1
+    fi
   done

Apply similar logic to the other service wait loops.
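The suggested fail-fast polling can be captured once in a generic helper instead of repeating it per service. This is a sketch in TypeScript; the helper name and signature are hypothetical, not part of the workflow.

```typescript
// Generic poll-with-timeout helper mirroring the shell wait loops above.
// Throws once the retry budget is exhausted instead of silently continuing.
async function waitFor(
    name: string,
    check: () => Promise<boolean>,
    retries = 60,
    delayMs = 2000,
): Promise<void> {
    for (let i = 1; i <= retries; i++) {
        if (await check()) {
            console.log(`✓ ${name} is ready`);
            return;
        }
        console.log(`Waiting for ${name}... (${i}/${retries})`);
        await new Promise(r => setTimeout(r, delayMs));
    }
    throw new Error(`${name} failed to become ready within ${(retries * delayMs) / 1000}s`);
}

// Example: a check that succeeds on the third attempt.
let attempts = 0;
waitFor('example-service', async () => ++attempts >= 3, 5, 10)
    .then(() => console.log(`ready after ${attempts} attempts`)); // → ready after 3 attempts
```

A caller would wrap each service's probe (redis-cli ping, mysqladmin ping, pg_isready) in the `check` callback, so every wait loop fails fast with a named error.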

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c620161 and aa7c1c4.

📒 Files selected for processing (1)
  • .github/workflows/build_and_test.yml
🔇 Additional comments (3)
.github/workflows/build_and_test.yml (3)

82-95: LGTM! InnoDB configuration supports pessimistic locking.

The MariaDB service configuration correctly adds InnoDB-specific startup options (--default-storage-engine=InnoDB, --innodb_file_per_table=1) which are essential for the pessimistic locking implementation (SELECT ... FOR UPDATE) introduced in this PR. The explicit database and user setup, along with the health check using the vendure user, aligns well with the E2E test requirements.


151-155: LGTM! System tuning appropriate for concurrent services.

The sysctl configurations (vm.overcommit_memory=1 and net.core.somaxconn=65535) optimize the CI environment for running multiple database services and handling concurrent connections, which is particularly relevant for testing race condition scenarios.


218-243: LGTM! Database permissions properly configured for E2E tests.

The permission grants are appropriate for the E2E test environment:

  • MariaDB/MySQL: Grants all privileges on wildcard databases and e2e_% pattern databases
  • PostgreSQL: Grants CREATEDB privilege to allow test database creation
  • Error handling with || echo prevents failures if permissions already exist

The environment variables are properly defined in this step's env: block, consistent with the previous step.

@michaelbromley
Copy link
Member

Hi!
Thank you for looking into this. I notice a lot of extra workflow-related changes in the PR, which seem to be causing the workflows to fail.

Can you review that part and make sure this PR contains only those changes that are absolutely necessary to address the issue at hand?

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 244-247: Remove the "Free up ports" step that runs sudo fuser -k
3000/tcp; specifically delete the GitHub Actions job step with name "Free up
ports" (the block that echoes "Ensuring port 3000 is free..." and calls sudo
fuser -k 3000/tcp || true). If port conflicts are observed afterward, open a
separate issue to investigate why test processes launched by the workflow (or
tests themselves) are not terminating cleanly rather than reintroducing a
teardown workaround.
- Around line 151-155: The CI step named "Configure system memory for Docker"
that runs sudo sysctl -w vm.overcommit_memory=1 and sudo sysctl -w
net.core.somaxconn=65535 should be removed from
.github/workflows/build_and_test.yml unless you can prove it's required for the
race condition test; either delete that entire step block or move it into a
separate infrastructure-focused PR/workflow and reference its necessity in the
race-condition test PR. Ensure the workflow no longer contains the run commands
echo "Configuring system memory settings..." / sudo sysctl -w
vm.overcommit_memory=1 / sudo sysctl -w net.core.somaxconn=65535 if you opt to
remove them.
- Around line 218-243: The "Configure database permissions" CI step grants
overly broad privileges and appears unnecessary; remove this entire step block
(the job step named "Configure database permissions" that uses env vars
E2E_MYSQL_PORT/E2E_MARIADB_PORT/E2E_POSTGRES_PORT and DB) unless the specific
race-condition test requires dynamic DB creation; if dynamic creation is
required, replace the broad GRANT/ALTER commands with minimal, database-scoped
grants (or move them into the service startup/env config) and add a comment
explaining why elevated privileges are needed for the race-condition test.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa7c1c4 and 40f5ed1.

📒 Files selected for processing (1)
  • .github/workflows/build_and_test.yml
🔇 Additional comments (2)
.github/workflows/build_and_test.yml (2)

90-95: Clarify which MariaDB configuration flags are required for pessimistic locking.

The added InnoDB-specific flags include:

  • --default-storage-engine=InnoDB: Likely necessary for pessimistic locking support (SELECT ... FOR UPDATE requires InnoDB)
  • --innodb_file_per_table=1: Probably unnecessary for locking
  • --innodb_flush_log_at_trx_commit=2: Performance tuning, unrelated to locking
  • --innodb_log_buffer_size=32M: Performance tuning, unrelated to locking

The comment mentions v11.6.2+ default changes causing test failures, but doesn't explain which specific default is problematic.

Please clarify:

  1. Is --default-storage-engine=InnoDB the only flag actually required for the race condition test?
  2. What specific v11.6.2+ default change causes failures, and which flag addresses it?
  3. Can the performance tuning flags be removed?

If only the storage engine flag is needed, simplify to just that flag and document why it's required.


87-247: The workflow file .github/workflows/build_and_test.yml has not been modified in this PR. All the infrastructure components mentioned—database service configuration with InnoDB flags, system memory configuration, service readiness checks, database permission granting, and port cleanup—already exist in the repository from previous commits. The only new addition in this PR is the race condition test file (packages/core/e2e/race-condition.e2e-spec.ts), which properly leverages the existing workflow infrastructure designed to support pessimistic locking. No workflow scope issues apply to this PR.

Likely an incorrect or invalid review comment.

Comment on lines 151 to 155
- name: Configure system memory for Docker
  run: |
      echo "Configuring system memory settings..."
      sudo sysctl -w vm.overcommit_memory=1
      sudo sysctl -w net.core.somaxconn=65535
Contributor

🛠️ Refactor suggestion | 🟠 Major

System memory configuration appears unrelated to the race condition fix.

Configuring vm.overcommit_memory and net.core.somaxconn are OS-level kernel parameters that don't directly relate to testing database pessimistic locking for race conditions. If these settings are addressing memory pressure or connection issues in CI, they should be handled in a separate infrastructure-focused PR.

Remove this step unless you can demonstrate it's specifically required for the race condition test to pass.

🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 151 - 155, The CI step
named "Configure system memory for Docker" that runs sudo sysctl -w
vm.overcommit_memory=1 and sudo sysctl -w net.core.somaxconn=65535 should be
removed from .github/workflows/build_and_test.yml unless you can prove it's
required for the race condition test; either delete that entire step block or
move it into a separate infrastructure-focused PR/workflow and reference its
necessity in the race-condition test PR. Ensure the workflow no longer contains
the run commands echo "Configuring system memory settings..." / sudo sysctl -w
vm.overcommit_memory=1 / sudo sysctl -w net.core.somaxconn=65535 if you opt to
remove them.

Comment on lines 156 to 181
- name: Wait for services to be ready
  env:
      E2E_MYSQL_PORT: ${{ job.services.mysql.ports['3306'] }}
      E2E_MARIADB_PORT: ${{ job.services.mariadb.ports['3306'] }}
      E2E_POSTGRES_PORT: ${{ job.services.postgres.ports['5432'] }}
      E2E_ELASTIC_PORT: ${{ job.services.elastic.ports['9200'] }}
      E2E_REDIS_PORT: ${{ job.services.redis.ports['6379'] }}
      DB: ${{ matrix.db }}
  run: |
      echo "Waiting for required services..."
      # Wait for redis with connection test
      for i in {1..60}; do
        if nc -z localhost $E2E_REDIS_PORT && redis-cli -p $E2E_REDIS_PORT ping &>/dev/null; then
          echo "✓ Redis is ready"
          break
        fi
        echo "Waiting for Redis... ($i/60)"
        sleep 2
      done
      # Wait for DB with actual connection test
      if [ "$DB" = "mariadb" ]; then
        for i in {1..60}; do
          if mysqladmin ping -h 127.0.0.1 -P $E2E_MARIADB_PORT -u vendure -ppassword --silent 2>/dev/null; then
            echo "✓ MariaDB is ready and accepting connections"
            break
          fi
          echo "Waiting for MariaDB... ($i/60)"
          sleep 2
        done
      elif [ "$DB" = "mysql" ]; then
        for i in {1..60}; do
          if mysqladmin ping -h 127.0.0.1 -P $E2E_MYSQL_PORT -u vendure -ppassword --silent 2>/dev/null; then
            echo "✓ MySQL is ready and accepting connections"
            break
          fi
          echo "Waiting for MySQL... ($i/60)"
          sleep 2
        done
      elif [ "$DB" = "postgres" ]; then
        for i in {1..60}; do
          if pg_isready -h 127.0.0.1 -p $E2E_POSTGRES_PORT -U vendure &>/dev/null; then
            echo "✓ PostgreSQL is ready and accepting connections"
            break
          fi
          echo "Waiting for PostgreSQL... ($i/60)"
          sleep 2
        done
      fi
      # Wait for Elasticsearch
      for i in {1..60}; do
        if curl --silent --fail http://localhost:$E2E_ELASTIC_PORT/_cluster/health >/dev/null 2>&1; then
          echo "✓ Elasticsearch is ready"
          break
        fi
        echo "Waiting for Elasticsearch... ($i/60)"
        sleep 2
      done
      echo "Services appear reachable."
Contributor

🛠️ Refactor suggestion | 🟠 Major

Manual service readiness polling duplicates existing health checks.

GitHub Actions already performs health checks for each service (defined in the options parameter of each service). These health checks ensure services are ready before job steps execute. Adding manual polling with custom retry logic:

  1. Duplicates functionality that should already work
  2. Adds unnecessary complexity and execution time
  3. Masks potential issues with the service health check definitions

If the existing health checks are insufficient, fix them in the service definitions (lines 95, 105, 113, 129) rather than adding this workaround step.

Remove this step and ensure the service-level health checks are properly configured. If services are genuinely not ready when health checks pass, that indicates a problem with the health check commands themselves that should be addressed.

Comment on lines 218 to 243
- name: Configure database permissions
  env:
      E2E_MYSQL_PORT: ${{ job.services.mysql.ports['3306'] }}
      E2E_MARIADB_PORT: ${{ job.services.mariadb.ports['3306'] }}
      E2E_POSTGRES_PORT: ${{ job.services.postgres.ports['5432'] }}
      DB: ${{ matrix.db }}
  run: |
      echo "Granting database permissions to vendure user..."
      # Grant permissions for MariaDB
      if [ "$DB" = "mariadb" ]; then
        mysql -h 127.0.0.1 -P $E2E_MARIADB_PORT -u root -ppassword -e "GRANT ALL PRIVILEGES ON \`%\`.* TO 'vendure'@'%'; GRANT ALL PRIVILEGES ON \`e2e_%\`.* TO 'vendure'@'%'; FLUSH PRIVILEGES;" 2>/dev/null || echo "MariaDB permissions already configured"
        echo "✓ MariaDB permissions granted to vendure user"
      fi
      # Grant permissions for MySQL
      if [ "$DB" = "mysql" ]; then
        mysql -h 127.0.0.1 -P $E2E_MYSQL_PORT -u root -ppassword -e "GRANT ALL PRIVILEGES ON \`%\`.* TO 'vendure'@'%'; GRANT ALL PRIVILEGES ON \`e2e_%\`.* TO 'vendure'@'%'; FLUSH PRIVILEGES;" 2>/dev/null || echo "MySQL permissions already configured"
        echo "✓ MySQL permissions granted to vendure user"
      fi
      # Grant permissions for PostgreSQL
      if [ "$DB" = "postgres" ]; then
        PGPASSWORD=password psql -h 127.0.0.1 -p $E2E_POSTGRES_PORT -U vendure -d postgres -c "ALTER USER vendure CREATEDB;" 2>/dev/null || echo "PostgreSQL permissions already configured"
        echo "✓ PostgreSQL permissions granted to vendure user"
      fi
Contributor

🛠️ Refactor suggestion | 🟠 Major

Database permission configuration step appears unnecessary.

This step grants broad privileges to the vendure user, but:

  1. Existing e2e tests already work: The workflow has been running e2e tests against these databases, so the current permission setup should be sufficient.

  2. Overly broad grants: GRANT ALL PRIVILEGES ON `%`.* TO 'vendure'@'%' grants access to all databases, which is excessive even for testing.

  3. Should be handled in service initialization: If specific privileges are needed, configure them in the service env or startup commands, not in a separate step.

Remove this step unless the race condition test specifically requires dynamic database creation or privileges beyond what existing e2e tests need. If it does, document why in a comment.

🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 218 - 243, The "Configure
database permissions" CI step grants overly broad privileges and appears
unnecessary; remove this entire step block (the job step named "Configure
database permissions" that uses env vars
E2E_MYSQL_PORT/E2E_MARIADB_PORT/E2E_POSTGRES_PORT and DB) unless the specific
race-condition test requires dynamic DB creation; if dynamic creation is
required, replace the broad GRANT/ALTER commands with minimal, database-scoped
grants (or move them into the service startup/env config) and add a comment
explaining why elevated privileges are needed for the race-condition test.

Comment on lines 244 to 247
- name: Free up ports
run: |
echo "Ensuring port 3000 is free..."
sudo fuser -k 3000/tcp || true
Contributor

🛠️ Refactor suggestion | 🟠 Major

Port cleanup step suggests underlying process cleanup issues.

Explicitly freeing port 3000 before tests indicates either:

  1. Previous test runs aren't properly terminating processes
  2. Concurrent jobs are conflicting (shouldn't happen in GitHub Actions)
  3. The CI environment has stale processes

This is an infrastructure/test cleanup issue, not a requirement for testing race conditions. If port conflicts occur, investigate why test processes aren't cleaning up properly rather than working around it.

Remove this step. If port conflicts genuinely occur, file a separate issue to investigate why test processes don't clean up correctly.

🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 244 - 247, Remove the
"Free up ports" step that runs sudo fuser -k 3000/tcp; specifically delete the
GitHub Actions job step with name "Free up ports" (the block that echoes
"Ensuring port 3000 is free..." and calls sudo fuser -k 3000/tcp || true). If
port conflicts are observed afterward, open a separate issue to investigate why
test processes launched by the workflow (or tests themselves) are not
terminating cleanly rather than reintroducing a teardown workaround.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/workflows/build_and_test.yml (1)

81-96: Critical: MariaDB service options uses invalid --command=... (likely breaks container startup).

Docker doesn’t support a --command flag; if Actions passes this through to docker create/run, the MariaDB service may fail to start (which matches the “workflow failures” note in the PR objectives).

Proposed fix (remove invalid flag; rely on defaults + initSql in tests)
-                options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3 --command="mysqld --innodb_file_per_table=1 --default-storage-engine=InnoDB --innodb_flush_log_at_trx_commit=2 --innodb_log_buffer_size=32M"
+                options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3
e2e-common/test-config.ts (1)

92-110: Change connectTimeout to connectTimeoutMS for TypeORM PostgreSQL compatibility.

The option connectTimeout is not recognized by TypeORM's PostgreSQL connection options. TypeORM expects connectTimeoutMS instead. The current code will fail type-checking against TypeORM's DataSourceOptions interface and the timeout will be silently ignored at runtime. The connectionTimeoutMillis in the extra object is correct for configuring the underlying node-postgres driver pool.
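The shape the comment describes can be sketched as a plain options object (no typeorm import here; the interface below is illustrative, not TypeORM's actual DataSourceOptions type):

```typescript
// Sketch of the corrected Postgres options shape.
// `connectTimeoutMS` is the TypeORM-level option (not `connectTimeout`);
// `connectionTimeoutMillis` is passed through to the node-postgres pool via `extra`.
interface PgOptionsSketch {
    type: 'postgres';
    connectTimeoutMS: number;
    extra: { connectionTimeoutMillis: number };
}

const options: PgOptionsSketch = {
    type: 'postgres',
    connectTimeoutMS: 10_000,
    extra: { connectionTimeoutMillis: 10_000 },
};

console.log(Object.keys(options)); // → [ 'type', 'connectTimeoutMS', 'extra' ]
```

With the real TypeORM types in place, a misspelled `connectTimeout` would fail type-checking at this assignment instead of being silently ignored at runtime.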

🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 151-189: The wait loops for Redis, DB (postgres/mariadb/mysql) and
Elasticsearch currently only break on success and never fail on timeout; update
each loop (or replace with a shared helper like wait_for_port) to detect when
the loop finishes without success and then echo a clear error and exit 1 so the
job fails fast; specifically ensure checks using E2E_REDIS_PORT,
E2E_POSTGRES_PORT, E2E_MARIADB_PORT, E2E_MYSQL_PORT and the Elasticsearch curl
health check cause a non-zero exit when their respective wait loops time out.
🧹 Nitpick comments (2)
.github/workflows/build_and_test.yml (1)

134-139: Scope creep risk: expanded e2e DB matrix may be better as a separate PR.

Given the PR objective/comment about extra workflow changes causing failures, consider splitting “CI hardening + matrix expansion” into a dedicated PR, keeping this one focused on the OrderService locking fix + minimal test coverage.

e2e-common/test-config.ts (1)

111-154: Make MySQL/MariaDB credentials environment-configurable (avoid local e2e breakage).

Hard-coded username: 'vendure' and password: 'password' can make local e2e runs fail unless developers use identical credentials. Consider reading E2E_DB_USER and E2E_DB_PASSWORD from environment variables (defaulting to current values) to improve portability across developer setups.

The driver-specific options (initSql under extra, acquireTimeout at top-level) are already correctly placed per TypeORM's MySQL/MariaDB DataSourceOptions structure.
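The suggested fallback pattern might look like this; E2E_DB_USER and E2E_DB_PASSWORD are the variable names proposed above, and the defaults are the values currently hard-coded in the config.

```typescript
// Sketch: read e2e DB credentials from the environment, falling back to the
// current hard-coded defaults so existing setups keep working unchanged.
const dbUser = process.env.E2E_DB_USER ?? 'vendure';
const dbPassword = process.env.E2E_DB_PASSWORD ?? 'password';

console.log(`connecting as ${dbUser}`);
```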

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40f5ed1 and 57db869.

📒 Files selected for processing (3)
  • .github/workflows/build_and_test.yml
  • docker-compose.yml
  • e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • docker-compose.yml

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 87-95: The workflow adds redundant MariaDB InnoDB configuration
beyond the required version pin; remove the extra environment variables
MARIADB_MYSQL_LOCALHOST_USER and MARIADB_INITDB_SKIP_TZINFO and drop the
non-essential mysqld options (--innodb_flush_log_at_trx_commit and
--innodb_log_buffer_size) from the options string, keeping only the explicit
version pin to 11.5 and any minimal command flags required for tests (e.g.,
--innodb_file_per_table=1 and --default-storage-engine=InnoDB if you decide to
keep them), since InnoDB is already configured in docker-compose.yml and
e2e-common/test-config.ts.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57db869 and ce5d68a.

📒 Files selected for processing (3)
  • .github/workflows/build_and_test.yml
  • docker-compose.yml
  • e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • docker-compose.yml
  • e2e-common/test-config.ts
🔇 Additional comments (3)
.github/workflows/build_and_test.yml (3)

100-105: Verify whether MySQL configuration changes are necessary for the race-condition fix.

Similar to the MariaDB changes, these MySQL service configuration updates appear to be infrastructure changes that may not be directly related to implementing pessimistic locking. The maintainer specifically requested restricting the PR to only changes necessary for the race-condition fix.

Based on learnings from the maintainer's comment about extra workflow changes causing CI failures, please confirm whether these MySQL environment variables and health check modifications are essential for the pessimistic locking feature to function correctly in the e2e tests.


118-119: The review comment is incorrect—this is a newly created workflow file, not a modification to existing configuration.

The commit ce5d68ae2 created .github/workflows/build_and_test.yml as a new file (indicated by new file mode 100644 in the git diff). The Elasticsearch settings at lines 118-119 (bootstrap.memory_lock: "false" and ES_JAVA_OPTS: -Xms256m -Xmx256m) are the initial configuration for the new CI setup, not changes from a prior state of 512m or true values. There is no previous version to compare against.

These settings are part of the complete e2e testing infrastructure being established in this commit. The commit message appropriately identifies the focus as "Fix race condition with InnoDB and postgres password," which addresses database-level locking concerns through InnoDB configuration changes (visible in the MariaDB service options).

Likely an incorrect or invalid review comment.


151-194: The wait step addresses a gap in health checks but could be implemented more cleanly.

The review comment's claim about existing health checks is incorrect: MariaDB, MySQL, Postgres, and Elasticsearch have --health-cmd options configured (lines 95, 105, 113, 129), but Redis has no health check defined (lines 130-133). This means the wait step is not purely redundant as stated—the Redis readiness check fills a genuine gap.

However, instead of adding a separate 44-line wait step, the proper solution would be to add a health check to the Redis service configuration itself (e.g., --health-cmd="redis-cli ping"). This would:

  • Leverage the built-in health check mechanism consistently across all services
  • Eliminate the need for a separate wait loop
  • Reduce workflow complexity

If this wait step was added as a workaround rather than addressing the root cause (missing Redis health check), consider reverting it and properly configuring Redis health checks instead.

Likely an incorrect or invalid review comment.

Comment on lines 87 to 95
MARIADB_DATABASE: vendure
MARIADB_USER: vendure
MARIADB_PASSWORD: password
# Ensure InnoDB is used for locking support
MARIADB_MYSQL_LOCALHOST_USER: 1
MARIADB_INITDB_SKIP_TZINFO: 1
ports:
- 3306
options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3
options: --health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=5s --health-retries=3 --command="mysqld --innodb_file_per_table=1 --default-storage-engine=InnoDB --innodb_flush_log_at_trx_commit=2 --innodb_log_buffer_size=32M"
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, get a general overview of the repository structure
git ls-files | head -30

Repository: vendurehq/vendure

Length of output: 1389


🏁 Script executed:

# Look for docker-compose and test configuration files
fd -e yml -e yaml | grep -E "(docker|compose|test)" | head -20

Repository: vendurehq/vendure

Length of output: 79


🏁 Script executed:

# Search for existing MariaDB or database configuration
fd -e yml -e yaml -o -e js -o -e ts | xargs rg -l -i "mariadb|innodb" 2>/dev/null | head -10

Repository: vendurehq/vendure

Length of output: 700


🏁 Script executed:

# Read the docker-compose.yml to see existing database configuration
cat docker-compose.yml

Repository: vendurehq/vendure

Length of output: 4762


🏁 Script executed:

# Read the test configuration file
cat e2e-common/test-config.ts

Repository: vendurehq/vendure

Length of output: 5265


🏁 Script executed:

# Read the workflow file around the MariaDB service definition (lines 70-100)
sed -n '70,110p' .github/workflows/build_and_test.yml

Repository: vendurehq/vendure

Length of output: 1999


🏁 Script executed:

# Search for OrderService changes and pessimistic locking implementation
fd -e ts | xargs rg -l "OrderService|SELECT.*FOR UPDATE|pessimistic" | head -10

Repository: vendurehq/vendure

Length of output: 678


🏁 Script executed:

# Check git diff or PR context for the actual code changes
git log --oneline --all --grep="race" | head -10

Repository: vendurehq/vendure

Length of output: 132


🏁 Script executed:

# Get the full commit to see all changes
git show ce5d68ae2 --stat

Repository: vendurehq/vendure

Length of output: 50374


🏁 Script executed:

# Check the full diff for that commit (focusing on summary)
git show ce5d68ae2 | head -200

Repository: vendurehq/vendure

Length of output: 5806


🏁 Script executed:

# Search for postgres password changes in the workflow
sed -n '105,125p' .github/workflows/build_and_test.yml

Repository: vendurehq/vendure

Length of output: 1408


The MariaDB version pinning to 11.5 is necessary, but the InnoDB configuration changes appear redundant.

The workflow comments correctly identify that MariaDB v11.6.2+ introduced a breaking change requiring the version pin. However, InnoDB support is already configured elsewhere:

  • The local docker-compose.yml has identical InnoDB options in the MariaDB command
  • The e2e-common/test-config.ts already sets default_storage_engine=InnoDB via initSql in the TypeORM connection config

The additional environment variables (MARIADB_MYSQL_LOCALHOST_USER, MARIADB_INITDB_SKIP_TZINFO) and extra mysqld options (innodb_flush_log_at_trx_commit, innodb_log_buffer_size) appear to be stability improvements rather than requirements for pessimistic locking support. Consider whether these changes are necessary for the race-condition fix or if they exceed the scope identified by the maintainer.

🤖 Prompt for AI Agents
In @.github/workflows/build_and_test.yml around lines 87 - 95, The workflow adds
redundant MariaDB InnoDB configuration beyond the required version pin; remove
the extra environment variables MARIADB_MYSQL_LOCALHOST_USER and
MARIADB_INITDB_SKIP_TZINFO and drop the non-essential mysqld options
(--innodb_flush_log_at_trx_commit and --innodb_log_buffer_size) from the options
string, keeping only the explicit version pin to 11.5 and any minimal command
flags required for tests (e.g., --innodb_file_per_table=1 and
--default-storage-engine=InnoDB if you decide to keep them), since InnoDB is
already configured in docker-compose.yml and e2e-common/test-config.ts.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/build_and_test.yml (1)

138-138: Harden the wait step to eliminate nc dependency and improve service readiness checks.

The nc -z checks on lines 163, 171, 177, 183 are problematic: netcat is not guaranteed on ubuntu-latest runners (it's a virtual package requiring explicit installation), and port-open checks don't guarantee the service is actually ready to accept connections. The Elasticsearch check (line 191) using curl is more robust and should be the model.

Replace nc with bash's built-in /dev/tcp to eliminate the dependency, and improve database checks to verify actual readiness:

Improved wait step
             - name: Wait for services to be ready
               env:
                   E2E_MYSQL_PORT: ${{ job.services.mysql.ports['3306'] }}
                   E2E_MARIADB_PORT: ${{ job.services.mariadb.ports['3306'] }}
                   E2E_POSTGRES_PORT: ${{ job.services.postgres.ports['5432'] }}
                   E2E_ELASTIC_PORT: ${{ job.services.elastic.ports['9200'] }}
                   E2E_REDIS_PORT: ${{ job.services.redis.ports['6379'] }}
                   DB: ${{ matrix.db }}
               run: |
+                  set -euo pipefail
                   echo "Waiting for required services..."
                   # Wait for redis
                   for i in {1..30}; do
-                    if nc -z localhost $E2E_REDIS_PORT; then echo "Redis Ready"; break; fi
+                    if (echo > /dev/tcp/localhost/$E2E_REDIS_PORT) >/dev/null 2>&1; then echo "Redis Ready"; break; fi
                     if [ $i -eq 30 ]; then echo "Redis Timeout"; exit 1; fi
                     sleep 1
                   done

                   # Wait for DB ports depending on matrix value
                   if [ "$DB" = "postgres" ]; then
                     for i in {1..30}; do
-                      if nc -z localhost $E2E_POSTGRES_PORT; then echo "Postgres Ready"; break; fi
+                      if (echo > /dev/tcp/localhost/$E2E_POSTGRES_PORT) >/dev/null 2>&1; then echo "Postgres Ready"; break; fi
                       if [ $i -eq 30 ]; then echo "Postgres Timeout"; exit 1; fi
                       sleep 1
                     done
                   elif [ "$DB" = "mariadb" ]; then
                     for i in {1..30}; do
-                      if nc -z localhost $E2E_MARIADB_PORT; then echo "MariaDB Ready"; break; fi
+                      if (echo > /dev/tcp/localhost/$E2E_MARIADB_PORT) >/dev/null 2>&1; then echo "MariaDB Ready"; break; fi
                       if [ $i -eq 30 ]; then echo "MariaDB Timeout"; exit 1; fi
                       sleep 1
                     done
                   elif [ "$DB" = "mysql" ]; then
                     for i in {1..30}; do
-                      if nc -z localhost $E2E_MYSQL_PORT; then echo "MySQL Ready"; break; fi
+                      if (echo > /dev/tcp/localhost/$E2E_MYSQL_PORT) >/dev/null 2>&1; then echo "MySQL Ready"; break; fi
                       if [ $i -eq 30 ]; then echo "MySQL Timeout"; exit 1; fi
                       sleep 1
                     done
                   fi

Regarding the DB matrix expansion at line 138: this adds legitimate test coverage. However, combined with this new wait step, the PR introduces meaningful CI infrastructure changes. If the maintainer workflow concerns mentioned in the PR are significant, consider whether these should be bundled together or separated.
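The `/dev/tcp` substitution above can be factored into a single helper so every service shares one loop. A minimal sketch, assuming the runner's shell is bash (`/dev/tcp/<host>/<port>` is a bash feature, not POSIX sh), with the `wait_for_port` name invented for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the suggested readiness loop.
# /dev/tcp/<host>/<port> is resolved by bash itself, so no netcat is needed.
wait_for_port() {
    local name=$1 host=$2 port=$3 retries=${4:-30}
    for ((i = 1; i <= retries; i++)); do
        # Open (and immediately discard) fd 3 on the target; succeeds only if
        # something is listening on host:port.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            echo "${name} Ready"
            return 0
        fi
        sleep 1
    done
    echo "${name} Timeout" >&2
    return 1
}
```

With this in place each branch of the matrix reduces to a single call, e.g. `wait_for_port "Redis" localhost "$E2E_REDIS_PORT"`.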

🤖 Fix all issues with AI agents
In @.github/workflows/build_and_test.yml:
- Around line 118-120: The current readiness probe hitting "/_cluster/health"
can return HTTP 200 while cluster status is RED; update the check to wait for at
least yellow by calling the health API with wait_for_status=yellow (e.g. GET
/_cluster/health?wait_for_status=yellow&timeout=60s) or by parsing the returned
JSON and asserting status !== "red"; modify the workflow step that currently
hits "/_cluster/health" to use the wait_for_status=yellow query (and add a
reasonable timeout) so tests only proceed when the cluster is yellow or green.
- Line 95: The services.mariadb.options block contains an invalid Docker flag
`--command="mysqld ..."` which GitHub Actions services do not support; remove
the `--command` entry from the options and instead rely on the image's default
CMD or supply initialization via a proper workaround (e.g., use a custom image,
entrypoint script, or a dedicated job step) and tighten the healthcheck by
keeping `--health-cmd="mariadb-admin ping -h localhost -u vendure -ppassword"`
and its interval/timeout/retries; update the `services.mariadb.options` to drop
`--command` and ensure the MariaDB container starts with the intended
configuration via a supported method.

In @e2e-common/test-config.ts:
- Around line 119-126: Replace the unrecognized option name initSql with the
correct init_command in the configuration objects inside the extra block (the
object that currently contains initSql, connectionLimit, waitForConnections,
queueLimit); update both occurrences used for MariaDB and MySQL so the storage
engine is actually set (change initSql: "SET default_storage_engine=InnoDB;" to
init_command: "SET default_storage_engine=INNODB").
🧹 Nitpick comments (1)
.github/workflows/build_and_test.yml (1)

100-105: DB service healthchecks: consider consistency + avoid relying on non-root users for ping.

For MySQL, the healthcheck pings as vendure (Line 105). That can be brittle if user creation is delayed; using root tends to be more stable for readiness signaling. For Postgres (Line 113), pg_isready is fine.

Optional tweak (MySQL healthcheck)
-                options: --health-cmd="mysqladmin ping -h localhost -u vendure -ppassword" --health-interval=10s --health-timeout=20s --health-retries=10
+                options: --health-cmd="mysqladmin ping -h localhost -u root -ppassword" --health-interval=10s --health-timeout=20s --health-retries=10

Also applies to: 113-113

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce5d68a and 307283e.

📒 Files selected for processing (3)
  • .github/workflows/build_and_test.yml
  • docker-compose.yml
  • e2e-common/test-config.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • docker-compose.yml
🔇 Additional comments (4)
e2e-common/test-config.ts (3)

100-109: LGTM! Connection pool settings are appropriate for concurrent e2e tests.

The extended timeouts (60s), connection pool configuration (max 10 connections), and keepConnectionAlive setting are well-suited for testing concurrent order modifications under the new pessimistic locking implementation.


119-131: Connection pool and timeout settings look reasonable for concurrent e2e tests.

The configuration includes:

  • Connection pooling (10 connections) to handle concurrent test operations
  • Extended timeouts (60s) to accommodate slower test environments
  • queueLimit: 0 allows unlimited queuing, which is acceptable for tests but note this could cause memory issues in production if many connections wait simultaneously
  • keepConnectionAlive: true helps maintain stable connections during concurrent operations

These settings appropriately support the new race-condition e2e tests that exercise concurrent order modifications under pessimistic locking.

Also applies to: 141-153


117-117: No action needed. The username change to vendure is properly coordinated with docker-compose configuration. Both the MariaDB and MySQL services in docker-compose.yml are configured with MARIADB_USER: vendure and MYSQL_USER: vendure respectively, which matches the credentials used in test-config.ts. The official Docker images automatically create and configure these users during container initialization.

.github/workflows/build_and_test.yml (1)

87-92: No action required — MariaDB environment variables are supported.

MARIADB_MYSQL_LOCALHOST_USER and MARIADB_INITDB_SKIP_TZINFO are documented, supported environment variables in the official mariadb:11.5 Docker image. Both are applied only during initial database initialization; the image will initialize correctly with these settings. The healthcheck using the vendure user credentials (matching MARIADB_USER and MARIADB_PASSWORD) is appropriate and will function as expected once the database initializes.

Likely an incorrect or invalid review comment.

Comment on lines +119 to +126
extra: {
// Ensure tables use InnoDB for locking support
initSql: "SET default_storage_engine=InnoDB;",
// Connection pool settings
connectionLimit: 10,
waitForConnections: true,
queueLimit: 0,
},
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check if initSql is used anywhere in the codebase
rg -i "initSql" --type ts --type js

Repository: vendurehq/vendure

Length of output: 246


🏁 Script executed:

# Check the surrounding code context for mysql2 configuration patterns
cd e2e-common && head -150 test-config.ts | tail -50

Repository: vendurehq/vendure

Length of output: 2087


🌐 Web query:

mysql2 connection options initSql documentation

💡 Result:

Short answer: mysql2 has no built‑in connection option named initSql / initSQL. To run initialization SQL when a connection is created, acquire the connection and run the statements (or listen for pool events and run them on new connections). See mysql2 docs for createPool/createConnection and examples, and the classic node-mysql pool "connection" event pattern for running session init statements. [1][2]

References

  • mysql2 docs (createConnection / createPool examples). [1]
  • node-mysql (pool events) — use pool.on('connection', conn => conn.query('SET ...')) as the pattern to run init SQL on new pooled connections. [2]

🌐 Web query:

TypeORM mysql2 driver initialization commands SET storage engine

💡 Result:

Use the TypeORM DataSource option extra to pass the driver's initial SQL command (the underlying mysql/mysql2 client will run it on connect). Example — set default storage engine to InnoDB:

TypeScript:
const AppDataSource = new DataSource({
type: "mysql",
host: "localhost",
port: 3306,
username: "user",
password: "pw",
database: "db",
extra: {
init_command: "SET default_storage_engine=INNODB"
}
});

Notes:

  • TypeORM forwards extra to the mysql/mysql2 client. [1][2]
  • The driver/client executes init_command on each new connection (so remove it after migrations if you don't want it run for every session). [2][3]

Sources:
[1] TypeORM — MySQL driver docs (data-source / connection options).
[2] TypeORM — DataSource / DataSourceOptions (extra passes options to underlying driver).
[3] mysql client docs (init_command / initial command executed on connect).


Replace initSql with init_command in both MariaDB and MySQL configurations.

The option initSql is not recognized by mysql2 or TypeORM. The correct option name is init_command. Without this correction, the storage engine setting is silently ignored, and tables may default to MyISAM on some MySQL configurations, which does not support row-level locking and will break the pessimistic locking (SELECT ... FOR UPDATE) implementation.

Change both occurrences from:

initSql: "SET default_storage_engine=InnoDB;",

to:

init_command: "SET default_storage_engine=INNODB",

This applies to lines 119-126 (MariaDB) and 141-148 (MySQL).

🤖 Prompt for AI Agents
In @e2e-common/test-config.ts around lines 119 - 126, Replace the unrecognized
option name initSql with the correct init_command in the configuration objects
inside the extra block (the object that currently contains initSql,
connectionLimit, waitForConnections, queueLimit); update both occurrences used
for MariaDB and MySQL so the storage engine is actually set (change initSql:
"SET default_storage_engine=InnoDB;" to init_command: "SET
default_storage_engine=INNODB").

@RubenDarioGuerreroNeira RubenDarioGuerreroNeira force-pushed the fix/race-condition-order-actions branch from 307283e to 5491b0f Compare January 9, 2026 01:49
@sonarqubecloud

sonarqubecloud bot commented Jan 9, 2026

@teresebressette3-alt

Screenshot_20251227-084811.png



Development

Successfully merging this pull request may close these issues.

Race condition when performing parallel order actions

3 participants