fix: add transaction support for dgraph migration#3186
fix: add transaction support for dgraph migration#3186krisp619 wants to merge 3 commits intogofr-dev:developmentfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds transactional semantics to the Dgraph migration “commit” step so the migration metadata insert is performed via an explicit Dgraph transaction (mutate + commit + discard), improving consistency of recording applied migrations.
Changes:
- Introduced a small Dgraph transaction interface (
Mutate/Commit/Discard) and guarded type assertion forNewTxn(). - Updated Dgraph migration commit flow to use
NewTxn()withdefer Discard()and explicitCommit()on success. - Updated Dgraph migration tests to validate commit/discard behavior across success and failure cases.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/gofr/migration/dgraph.go | Wraps the migration record insert in a Dgraph transaction (NewTxn → Mutate → Commit, with deferred Discard). |
| pkg/gofr/migration/dgraph_test.go | Reworks commit tests to use a mock transaction and assert commit/discard behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| tx, ok := c.DGraph.NewTxn().(dgraphTxn) | ||
| if !ok { | ||
| return errInvalidDgraphTxn | ||
| } |
There was a problem hiding this comment.
commitMigration has a new failure path when c.DGraph.NewTxn() doesn't implement the expected transaction interface, but the tests only cover mutate/commit errors. Add a test case where NewTxn() returns a non-transaction object (or nil) and assert errInvalidDgraphTxn is returned to prevent regressions in this guard logic.
There was a problem hiding this comment.
Thanks for picking this up, @krisp619. The transaction wrapping pattern is correct and the test structure is solid. However, there are a few issues that need to be addressed before merge.
Blocking: Loss of observability on mutation
File: pkg/gofr/migration/dgraph.go:184
The original code called c.DGraph.Mutate() which goes through Client.Mutate() (dgraph.go:205-237). That method:
- Creates an OpenTelemetry span via
addTrace(ctx, "mutate") - Logs mutation failures via
d.logger.Error - Records histogram metrics (
dgraph_mutate_duration) viasendOperationStats - Pretty-prints the query log
The new tx.Mutate() calls txnImpl.Mutate() directly — a raw pass-through to the dgo client with no tracing, no logging, no metrics. Migration mutations become invisible to the observability stack.
Options:
- (a) Add tracing/logging/metrics to the transaction-based mutation path to match existing behaviour
- (b) At minimum, log mutation duration and errors at the migration layer
- (c) Document this as a known trade-off if observability on migration mutations is intentionally deprioritized
Discard error silently dropped
File: pkg/gofr/migration/dgraph.go:182
defer tx.Discard(ctx)Discard returns an error that is silently dropped. While this matches the idiomatic dgo pattern, the framework logs errors from other datastores. Consider:
defer func() {
if dErr := tx.Discard(ctx); dErr != nil {
c.Errorf("dgraph: transaction discard failed: %v", dErr)
}
}()Missing test coverage for edge cases
File: pkg/gofr/migration/dgraph_test.go
The following edge cases are not covered:
| # | Edge Case | Why It Matters |
|---|---|---|
| 1 | NewTxn() returns a type that doesn't implement dgraphTxn |
errInvalidDgraphTxn path is never exercised. Test with: mockDGraph.EXPECT().NewTxn().Return("not-a-txn") |
| 2 | NewTxn() returns nil |
nil type assertion should yield ok=false, verify it returns errInvalidDgraphTxn and doesn't panic. Test with: mockDGraph.EXPECT().NewTxn().Return(nil) |
| 3 | Discard returns an error |
Verify no panic and the commit flow still returns the correct error (mutate or commit error, not discard error) |
| 4 | Discard called after successful Commit |
Verify dgo no-op behaviour — Discard after Commit should not corrupt data |
| 5 | json.Marshal fails before NewTxn() |
Verify no transaction is leaked (txn never created). Pre-existing path but good to confirm. |
Minor style nits
-
var ( errInvalidDgraphTxn = ... )— Single var doesn't need parentheses. Use:var errInvalidDgraphTxn = errors.New("...") -
commitDonefield inmockDgraphTxn— This is set totrueeven when commit returns an error. The name implies success. Consider renaming tocommitCalledfor clarity. -
Error message —
"invalid dgraph transaction type"gives no debugging info. Consider:return fmt.Errorf("dgraph: NewTxn() returned %T, expected dgraphTxn", c.DGraph.NewTxn())
Please add screenshots/logs showing
The PR description should include evidence of correct transaction behaviour. Specifically:
1. Successful migration run
- Start a GoFr application with Dgraph configured and a pending migration
- Show logs confirming: transaction created → mutate succeeded → commit succeeded
- Query Dgraph to verify the migration record exists:
{ migrations(func: type(Migration), orderdesc: migrations.version, first: 1) { migrations.version migrations.method migrations.start_time migrations.duration } } - Screenshot of the query result showing the migration was recorded
2. Failed mutation (rollback verification)
- Simulate a mutation failure (e.g., malformed JSON payload, network error, or Dgraph down)
- Show logs confirming: transaction created → mutate failed → discard called → migration rolled back
- Query Dgraph to verify no partial migration record was committed
3. Failed commit (post-mutate failure)
- Simulate a commit failure (e.g., transaction conflict / network error after mutate)
- Show logs confirming: transaction created → mutate succeeded → commit failed → discard called
- Query Dgraph to verify the mutation was not persisted (transaction was discarded)
4. Idempotency check
- Run the same migration twice
- Show that the second run detects the migration is already applied (via
getLastMigrationquery) and skips it - This confirms the commit path writes the correct version number
5. Before/after comparison (optional but helpful)
- Run a migration on the
developmentbranch (without this PR) and show Dgraph server logs - Run the same migration on this branch and show the difference — explicit
BeginTxn→Commitsequence visible in logs
Edge cases to test against (manual or integration)
Beyond the unit tests listed above, these should be verified against a real Dgraph instance:
| # | Scenario | What to check |
|---|---|---|
| 1 | Dgraph server goes down between Mutate and Commit |
Error is returned, no partial data, app doesn't panic |
| 2 | Context cancelled during Mutate |
Transaction is discarded, migration is marked as failed |
| 3 | Large migration payload (many predicates/fields) | Mutation + commit complete within timeout, no OOM |
| 4 | Multiple GoFr instances running migrations concurrently | Transaction conflicts are handled gracefully (dgo returns ErrAborted) |
| 5 | Dgraph returns ErrAborted (transaction conflict) |
Error is propagated, migration can be retried |
| 6 | Schema migration (ApplySchema) succeeds but metadata commit fails |
Schema change is applied but migration is not recorded — next run will retry. Document this as expected behaviour. |
Summary
The core transaction logic is correct — NewTxn → Mutate → Commit with defer Discard is the right pattern. The blocking issue is the loss of tracing/logging/metrics on the mutation path. Once that's addressed and the missing test cases are added, this is good to merge.
|
Hey @krisp619 ! |
|
@Umang01-hash @coolwednesday Thanks for the detailed review. I’ve been going through the suggested changes and verifying the current implementation, I’ll continue working on this and push the updates soon. |
Fixes #2892
Summary:
This change adds transaction support to Dgraph migration commit flow to ensure consistency during migration execution.
Changes: