Native gorm mongodb String to ObjectId conversion #15583

Open
codeconsole wants to merge 8 commits into apache:7.1.x from codeconsole:7.1.x-mongo-id-fix

@codeconsole
Contributor

GORM MongoDB: storedAs Identifier Coercion

The Problem

Before this change, GORM MongoDB had a silent asymmetry in how it handled identifier types:

| Code path | String id declared + _id: ObjectId on disk |
|---|---|
| Scan read (.list()) | ✅ Decoder falls back via Spring ConversionService |
| Point lookup (get(hex)) | ❌ Returns null: query sends {_id: "<hex>"} as BSON String |
| Update (save()) | ❌ Throws misleading OptimisticLockingException |
| Batch (getAll([hex]), findAllByIdInList, in('id', [...])) | ❌ Returns empty: $in list sent as BSON Strings |

The reader was forgiving, but the query and write paths weren't. This made "declare String id on a domain that previously used ObjectId id" a silent landmine — scans looked fine, everything else quietly broke.
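
The asymmetry can be reproduced in a toy model. The sketch below is not GORM code: a typed-key map stands in for a collection, and the hypothetical `OidKey` class stands in for a BSON ObjectId. It only illustrates why a scan can succeed while a String-typed filter matches nothing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model only; every name here is hypothetical and not part of GORM.
final class IdAsymmetryDemo {

    // Stand-in for a BSON ObjectId: equal only to another OidKey, never to a String.
    static final class OidKey {
        final String hex;
        OidKey(String hex) { this.hex = hex; }
        @Override public boolean equals(Object o) {
            return o instanceof OidKey && ((OidKey) o).hex.equals(hex);
        }
        @Override public int hashCode() { return Objects.hash(hex); }
    }

    // Stand-in for a collection keyed by _id.
    private final Map<Object, String> documents = new LinkedHashMap<>();

    void insertWithObjectId(String hex, String name) {
        documents.put(new OidKey(hex), name);
    }

    // Scan path: every stored key is converted to String on read,
    // so the caller never notices the storage type.
    List<String> listIds() {
        List<String> ids = new ArrayList<>();
        for (Object key : documents.keySet()) {
            ids.add(key instanceof OidKey ? ((OidKey) key).hex : String.valueOf(key));
        }
        return ids;
    }

    // Point lookup without coercion: a String filter never equals an OidKey.
    String getUncoerced(String id) {
        return documents.get(id);
    }

    // Point lookup with the filter coerced to the storage type first.
    String getCoerced(String id) {
        return documents.get(new OidKey(id));
    }

    public static void main(String[] args) {
        IdAsymmetryDemo demo = new IdAsymmetryDemo();
        String hex = "507f1f77bcf86cd799439011";
        demo.insertWithObjectId(hex, "Ada");
        System.out.println(demo.listIds());          // [507f1f77bcf86cd799439011]
        System.out.println(demo.getUncoerced(hex));  // null
        System.out.println(demo.getCoerced(hex));    // Ada
    }
}
```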

The ergonomic consequence: teams were forced to either keep ObjectId id (which serializes to {"timestamp":..., "date":...} in JSON and requires new ObjectId(...) at every call site) or do a full data migration of every _id value in storage.

The Fix

Two new knobs that let a domain keep String id ergonomics while persisting _id as a BSON ObjectId — no data migration required.

Per-domain mapping option

```groovy
import org.bson.types.ObjectId

class Person {
    String id

    static mapping = {
        id storedAs: ObjectId
    }
}
```

Global config default

```yaml
grails:
  mongodb:
    stringIdsDefaultStoredAs: objectid   # or 'string' (default, current behavior)
```

Per-domain storedAs always wins over the global default. Natural-key domains opt out with id storedAs: String.
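
The precedence rule can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the code in MongoMappingContext: the class, enum, and method names are hypothetical, and case-insensitive matching of the config value is an assumption made for the example.

```java
// Hypothetical sketch of the precedence described above; not the PR's implementation.
final class StoredAsResolver {

    enum StorageType { UNSET, STRING, OBJECT_ID }

    /**
     * @param declaredIdType the id type declared on the domain class
     * @param perDomain      explicit per-domain storedAs, or null when not mapped
     * @param globalDefault  raw global config value, or null when not configured
     */
    static StorageType resolve(Class<?> declaredIdType, StorageType perDomain, String globalDefault) {
        // 1. An explicit per-domain mapping always wins.
        if (perDomain != null && perDomain != StorageType.UNSET) {
            return perDomain;
        }
        // 2. The global default only applies to domains that declare a String id.
        if (declaredIdType != String.class) {
            return StorageType.UNSET;
        }
        // 3. No global value means the existing behavior: store as String.
        if (globalDefault == null) {
            return StorageType.STRING;
        }
        String value = globalDefault.trim();
        if (value.equalsIgnoreCase("objectid")) {   // case-insensitivity is an assumption
            return StorageType.OBJECT_ID;
        }
        if (!value.equalsIgnoreCase("string")) {
            // Unrecognized values fall back safely and emit a warning.
            System.err.println("Unrecognized default storedAs value: " + globalDefault);
        }
        return StorageType.STRING;
    }

    public static void main(String[] args) {
        System.out.println(resolve(String.class, null, "objectid"));               // OBJECT_ID
        System.out.println(resolve(String.class, StorageType.STRING, "objectid")); // STRING
    }
}
```

Keeping the resolution in one function makes the opt-out case (a natural-key domain mapped with id storedAs: String under a global objectid default) easy to reason about.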

What Changed

10 production files across three modules, all in service of coercing between the declared type and the storage type:

| Layer | File | Change |
|---|---|---|
| Core interface | IdentityMapping.java | New default Class<?> getStoredAs() { return null; } |
| Core property | Property.groovy | New Class<?> storedAs field |
| Core factory | MappingFactory.java | Both createDefaultIdentityMapping overloads expose storedAs dynamically (composite-key-safe) |
| Mongo config | MongoSettings.groovy | New SETTING_STRING_IDS_DEFAULT_STORED_AS |
| Mongo settings | AbstractMongoConnectionSourceSettings.groovy | New String stringIdsDefaultStoredAs field |
| Mongo context | MongoMappingContext.java | Reads global default; applies to String-id domains without explicit storedAs; warns on unrecognized values |
| Mongo encoder | IdentityEncoder.groovy | String → ObjectId on write (with ObjectId.isValid guard for natural keys) |
| Mongo query | MongoQuery.java | IdEquals + In handlers coerce to storage type (fixes get(hex), findAllByIdInList, criteria in('id', ...)) |
| Mongo persister | MongoCodecEntityPersister.groovy | Coerces keys in retrieveEntity and retrieveAllEntities (fixes get and getAll) |
| Mongo session | MongoCodecSession.groovy | Coerces nativeKey in update/delete filters (fixes saves and deletes) |

Why It's Better

  • Ergonomics + performance, not either/or. Domain code sees String id (clean JSON, no new ObjectId(...) dance, no [object Object] bugs at HTTP boundaries). Storage keeps 12-byte BSON ObjectId _id (half the index size of hex strings, embedded creation timestamp, native Mongo sort order).
  • No data migration required. Legacy documents with _id: ObjectId(...) continue to load via the existing decoder fallback; new writes produce BSON ObjectId too — everything stays consistent on disk.
  • The silent failure modes are gone. get(hex), findAllByIdInList, saves, deletes, and criteria in-lists all now match stored ObjectIds when the domain opts in. The misleading OptimisticLockingException on save (which previously pointed developers at concurrency rather than the real id-type mismatch) no longer fires.
  • Fully backward compatible. Default config is null/string — nothing changes for existing apps unless they explicitly opt in.
  • Natural-key safety. Combining generator: 'assigned' with storedAs: ObjectId and a non-hex value (e.g. a slug) falls back to BSON String rather than throwing IllegalArgumentException deep in the BSON pipeline.
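
The size and timestamp claims follow from the ObjectId layout: 12 bytes, of which the first 4 are a big-endian count of seconds since the Unix epoch. The sketch below uses only the JDK and hypothetical helper names to show both properties for a sample hex id; it compares raw value sizes only and ignores BSON framing overhead.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;

// Illustration of the ObjectId layout using plain JDK calls; helper names are hypothetical.
final class ObjectIdLayoutDemo {

    // Decodes a 24-character hex id into its 12 raw bytes.
    static byte[] toBytes(String hex) {
        if (hex.length() != 24) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        byte[] out = new byte[12];
        for (int i = 0; i < 12; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // The first 4 bytes (8 hex characters) hold the creation time in epoch seconds.
    static Instant creationTime(String hex) {
        long seconds = Long.parseLong(hex.substring(0, 8), 16);
        return Instant.ofEpochSecond(seconds);
    }

    public static void main(String[] args) {
        String hex = "507f1f77bcf86cd799439011";
        System.out.println(toBytes(hex).length);                            // 12
        System.out.println(hex.getBytes(StandardCharsets.UTF_8).length);    // 24
        System.out.println(creationTime(hex));                              // 2012-10-17T21:13:27Z
    }
}
```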

Testing

23 tests across 2 new specs, all against a real MongoDB testcontainer or real MongoMappingContext init:

  • StringIdWithObjectIdStorageSpec (18 tests) — documents the original bugs, then proves each code path (read, point lookup, save, batch getAll, findAllByIdInList, criteria in('id', ...), delete, update of legacy ObjectId-_id docs, non-hex natural-key fallback, JSON serialization demo).
  • StringIdDefaultStoredAsConfigSpec (5 tests) — global config plumbing: default off, global on, per-domain override, declared-type filter (ObjectId-id domains ignore the String-id global), unrecognized-value safe fallback.

Full :grails-data-mongodb-core:test + :grails-data-hibernate5-core:test green. Checkstyle clean on all three modified modules. Three rounds of automated code review completed; each round surfaced real bugs, all of which are fixed.

Documentation

  • objectMapping/idGeneration.adoc — new section "Decoupling the Declared Type from the Storage Type" + "Global Default" + "Caveats".
  • gettingStarted/advancedConfig.adoc — one-line pointer to the new setting.

`ObjectId.class.isAssignableFrom(...)` and `String.class.isAssignableFrom(...)`
are flagged by the CodeNarc UnnecessaryDotClass rule. Drop the redundant
`.class` so the linter passes.
@bito-code-review

Yes, to maintain consistency between save and query paths, the query conversion should also fall back to the original String when ObjectId conversion fails for invalid hex. Currently, the IdentityEncoder falls back gracefully, but IdEquals in MongoQuery directly calls convert() without try-catch, which could throw an exception. Wrapping it in try-catch like the coerce methods would align the behavior.
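
The suggested alignment can be sketched as follows. This is not the MongoQuery code: the hypothetical `Oid` class stands in for org.bson.types.ObjectId and, like the real constructor, rejects non-hex input with IllegalArgumentException. The point is that single-value and in-list filters both keep the original String when conversion fails, matching what the encoder wrote.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of query-side coercion with a String fallback.
final class IdCoercionSketch {

    // Stand-in for a BSON ObjectId; rejects anything that is not 24 hex characters.
    static final class Oid {
        final String hex;
        Oid(String hex) {
            if (hex == null || !hex.matches("[0-9a-fA-F]{24}")) {
                throw new IllegalArgumentException("invalid hex id: " + hex);
            }
            this.hex = hex.toLowerCase();
        }
        @Override public boolean equals(Object o) {
            return o instanceof Oid && ((Oid) o).hex.equals(hex);
        }
        @Override public int hashCode() { return hex.hashCode(); }
    }

    // Coerces a single id for an equality filter; non-hex Strings pass through unchanged.
    static Object coerce(Object id) {
        if (!(id instanceof String)) {
            return id;
        }
        try {
            return new Oid((String) id);
        } catch (IllegalArgumentException e) {
            // Keep the original so the filter matches a natural key stored as a String.
            return id;
        }
    }

    // Coerces every element of an in-list with the same fallback.
    static List<Object> coerceAll(List<?> ids) {
        List<Object> out = new ArrayList<>(ids.size());
        for (Object id : ids) {
            out.add(coerce(id));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(coerce("507f1f77bcf86cd799439011") instanceof Oid); // true
        System.out.println(coerce("my-slug"));                                  // my-slug
    }
}
```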

…; keep original so the filter matches the BSON String the encoder wrote.
"3.2.0" | true
"3.1.0" | true
"3.3.0" | true
"7.1.0" | false
Contributor

We should fix this with an arbitrarily large value instead of just setting it to true.

Contributor

I've updated this in the 7.1 branch after seeing it fail on multiple reviews. You should be able to update your base branch now.

Contributor

@jdaugherty jdaugherty left a comment

@borinquenkid I know you've recently worked on mongo. Would you mind taking a look at this PR?

@copilot can you please review this PR too?

Contributor

Copilot AI left a comment

Pull request overview

Adds first-class support in GORM MongoDB for persisting String id domains with _id stored as BSON ObjectId, plus a global default to opt into this behavior without per-domain mapping.

Changes:

  • Introduces storedAs on identifier mapping (IdentityMapping#getStoredAs, Property.storedAs) and wires it through MappingFactory.
  • Applies storedAs coercion across MongoDB encoder/query/persister/session paths and adds a global default (grails.mongodb.stringIdsDefaultStoredAs).
  • Adds/updates docs and adds new integration/unit specs covering the previously asymmetric read/query/write behavior.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.

| File | Description |
|---|---|
| grails-datastore-core/src/main/groovy/org/grails/datastore/mapping/model/MappingFactory.java | Exposes storedAs dynamically via identity mapping, with composite-key null-safety. |
| grails-datastore-core/src/main/groovy/org/grails/datastore/mapping/model/IdentityMapping.java | Adds default getStoredAs() hook for backends to honor storage-type coercion. |
| grails-datastore-core/src/main/groovy/org/grails/datastore/mapping/config/Property.groovy | Adds storedAs to mapped form so backends can persist using a different native type. |
| grails-datamapping-support/src/test/groovy/org/grails/datastore/mapping/core/grailsversion/GrailsVersionSpec.groovy | Updates version expectation for 7.1.x baseline. |
| grails-data-mongodb/core/src/main/groovy/org/grails/datastore/mapping/mongo/config/MongoSettings.groovy | Defines new global setting key for default String-id storage type. |
| grails-data-mongodb/core/src/main/groovy/org/grails/datastore/mapping/mongo/connections/AbstractMongoConnectionSourceSettings.groovy | Adds settings field for stringIdsDefaultStoredAs. |
| grails-data-mongodb/core/src/main/groovy/org/grails/datastore/mapping/mongo/config/MongoMappingContext.java | Parses/applies global default storedAs for String-id domains during identity creation. |
| grails-data-mongodb/bson/src/main/groovy/org/grails/datastore/bson/codecs/encoders/IdentityEncoder.groovy | Encodes ids as BSON ObjectId (or String) based on storedAs, with non-hex fallback. |
| grails-data-mongodb/core/src/main/groovy/org/grails/datastore/mapping/mongo/query/MongoQuery.java | Coerces id values for IdEquals and identity In queries to match stored BSON type. |
| grails-data-mongodb/core/src/main/groovy/org/grails/datastore/mapping/mongo/engine/MongoCodecEntityPersister.groovy | Coerces ids for point lookup and $in batch retrieval (get, getAll). |
| grails-data-mongodb/core/src/main/groovy/org/grails/datastore/mapping/mongo/MongoCodecSession.groovy | Coerces update/delete filters' native keys to match stored BSON type. |
| grails-data-mongodb/core/src/test/groovy/org/grails/datastore/gorm/mongo/bugs/StringIdWithObjectIdStorageSpec.groovy | Adds regression/integration coverage for scan vs point-lookup vs write asymmetry and fixes. |
| grails-data-mongodb/core/src/test/groovy/org/grails/datastore/gorm/mongo/bugs/StringIdDefaultStoredAsConfigSpec.groovy | Adds tests validating global default + per-domain override semantics. |
| grails-data-mongodb/docs/src/docs/asciidoc/objectMapping/idGeneration.adoc | Documents storedAs and global default, plus caveats. |
| grails-data-mongodb/docs/src/docs/asciidoc/gettingStarted/advancedConfig.adoc | Adds pointer to the new global config setting. |

Member

@borinquenkid borinquenkid left a comment

Please increase the test coverage on the noted parts. If the coverage was missed in the review, respond inline.

- Rename config key to nested form: grails.mongodb.stringIds.defaultStoredAs
  (Spring-style hierarchical namespace). Introduce StringIdSettings nested
  under AbstractMongoConnectionSourceSettings so ConfigurationBuilder binds
  at the nested path.
- Add javadoc/comment cross-references from each id-coercion site
  (MongoCodecEntityPersister, MongoCodecSession, MongoQuery.IdEquals,
  MongoQuery.In) to the specific StringIdWithObjectIdStorageSpec feature
  methods that exercise them end-to-end.
- Update idGeneration.adoc and advancedConfig.adoc to the nested key form.
PersistentEntityCodec.encodeUpdate silently dropped the recursive encode
of non-DirtyCheckable embedded values (there was a "TODO: Support
non-dirty checkable objects?" in the else branch). Result: when a caller
assigned a new embedded instance to a previously-null property and then
save(flush:true)'d, the parent's $set never included the embedded
sub-document, so nothing landed in Mongo even though save() returned a
non-null persisted instance.

When we are encoding an embedded update and the value lacks dirty
tracking, fall back to encoding every persistent property; the caller's
encodeEmbeddedUpdate then $set's them onto "<assocName>.<prop>" paths
exactly as the DirtyCheckable path does.

Adds SingleEmbeddedAssignNullToNonNullSpec covering three variants:
POGO embedded (the regression), @Entity embedded, and a top-level scalar
save control on the parent.
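
The fallback described in this commit message can be sketched roughly as below. It is a simplified model, not PersistentEntityCodec: plain maps stand in for the embedded instance and the $set document, and a null dirty-name set is a hypothetical way to represent a value that lacks dirty tracking.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of building the $set entries for an embedded value.
final class EmbeddedUpdateSketch {

    /**
     * @param assocName  name of the embedded association on the parent
     * @param properties persistent property values of the embedded instance
     * @param dirtyNames names of changed properties, or null when the value has no dirty tracking
     * @return entries keyed by "<assocName>.<prop>" paths
     */
    static Map<String, Object> embeddedSet(String assocName,
                                           Map<String, Object> properties,
                                           Set<String> dirtyNames) {
        Map<String, Object> set = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            // Without dirty tracking, fall back to encoding every persistent property.
            boolean include = dirtyNames == null || dirtyNames.contains(entry.getKey());
            if (include) {
                set.put(assocName + "." + entry.getKey(), entry.getValue());
            }
        }
        return set;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Paris");
        address.put("zip", "75001");
        System.out.println(embeddedSet("address", address, null));          // {address.city=Paris, address.zip=75001}
        System.out.println(embeddedSet("address", address, Set.of("zip"))); // {address.zip=75001}
    }
}
```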
@testlens-app

testlens-app Bot commented Apr 27, 2026

✅ All tests passed ✅

🏷️ Commit: 69b8add
▶️ Tests: 25324 executed
⚪️ Checks: 36/36 completed


5 participants