-
Notifications
You must be signed in to change notification settings - Fork 3
DGI9-645: Filter destination properties based on available source properties. #173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughAdds a new TrackingGet migrate process plugin that wraps the existing Get plugin to track source→destination property mappings during migrations; conditionally enables it via env var and integrates filtering into DgiRevisionedEntity. Documentation and unit tests accompany the feature. Changes
Sequence DiagramsequenceDiagram
autonumber
participant Migrator as Migration process
participant TG as TrackingGet (wrapper)
participant Get as Original Get plugin
participant Row as Row
participant Entity as DgiRevisionedEntity
Migrator->>TG: transform(value, migrate_executable, row, dest_prop)
activate TG
TG->>Row: read current destination properties & tracking map
alt mapping resolution
TG->>Get: delegate transform(value,...)
activate Get
Get-->>TG: transformed value
deactivate Get
TG->>Row: update tracking map for dest_prop
end
TG-->>Migrator: return transformed value
deactivate TG
Migrator->>Entity: import(row)
activate Entity
alt tracking enabled
Entity->>TG: TrackingGet::filterRow(row)
activate TG
TG->>Row: clone & remove/mark missing destination props per tracker
TG-->>Entity: filtered row
deactivate TG
Entity->>Entity: swap filtered destination props back to original row
end
Entity->>Entity: getEntity(filtered/original row)
deactivate Entity
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro Disabled knowledge base sources:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
Warning Review ran into problems🔥 ProblemsErrors were encountered while retrieving linked issues. Errors (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…able source properties.
Was an artifact of earlier implementation, where the wrapped/overridden plugin was being instantiated closer to when it was used.
Given our testing environment installs the polyfills, we could get into a weird place where tests pass, but runtime environments (without the "dev" requirements) would fail.
7364164 to
5511278
Compare
…g the `get` plugin. For example, should a property's definition just do a `dgi_migrate.process.entity_query` using `conditions`, which reads from the row but does _not_ make use of the `get` plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/src/Unit/TrackingGetTest.php (1)
51-54: Consider adding a test for multiple source properties.The
TrackingGetplugin supports array sources (see line 72 in TrackingGet.php:$properties = is_string($source) ? [$source] : $source). A test with['source' => ['a', 'b']]where one exists and one doesn't would validate theany()logic returns true if at least one source exists.src/Plugin/migrate/process/TrackingGet.php (1)
71-72: Consider adding validation for thesourceconfiguration.If
sourceis neither a string nor an array (e.g., null or missing), theis_string($source) ? [$source] : $sourceexpression would pass a non-array toany(), potentially causing issues. While the wrappedGetplugin likely validates this, defensive validation here could provide clearer error messages.$source = $this->configuration['source']; + if (!isset($source)) { + throw new MigrateException('The "source" configuration is required.'); + } $properties = is_string($source) ? [$source] : $source;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
README.md(1 hunks)dgi_migrate.module(2 hunks)src/Plugin/migrate/destination/DgiRevisionedEntity.php(2 hunks)src/Plugin/migrate/process/TrackingGet.php(1 hunks)tests/src/Unit/TrackingGetTest.php(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/src/Unit/TrackingGetTest.php (1)
src/Plugin/migrate/process/TrackingGet.php (3)
TrackingGet(21-160)setWrappedPlugin(58-61)filterRow(134-158)
src/Plugin/migrate/destination/DgiRevisionedEntity.php (1)
src/Plugin/migrate/process/TrackingGet.php (2)
TrackingGet(21-160)filterRow(134-158)
src/Plugin/migrate/process/TrackingGet.php (2)
src/Plugin/migrate/destination/DgiRevisionedEntity.php (1)
create(47-53)src/MigrateBatchExecutable.php (1)
getIdMap(545-547)
🔇 Additional comments (10)
README.md (1)
82-91: Clear and comprehensive documentation for the new entity update process.The documentation effectively explains:
- The purpose of partial source handling to avoid erasing existing fields
- How the tracking works via the
getplugin- The environment variable toggle for disabling
- The known limitation about plugins that bypass
getdgi_migrate.module (1)
37-46: Well-structured plugin wrapping pattern.The implementation correctly:
- Preserves the original
getplugin definition asdgi_migrate_original_get(whichTrackingGet::create()references)- Conditionally wraps based on the environment variable
- Follows the same pattern established for
LockingMigrationLookupsrc/Plugin/migrate/destination/DgiRevisionedEntity.php (3)
68-86: Solid implementation of the tracking-aware entity retrieval.The method correctly:
- Bypasses tracking when disabled via environment variable
- Filters the row through
TrackingGet::filterRow()to apply tracking logic- Syncs destination properties back to the original row to preserve the reference
- Updates the ID map from the filtered row
The property sync pattern (remove all, then copy from filtered) ensures a clean state.
51-51: Good defensive coding with null-safe fallback.Using
$migration?->id() ?? '(unknown; not provided)'handles edge cases where migration may be null.
129-131: Clean helper for revision messaging.The
generateRevisionMessage()method provides a consistent, traceable revision log entry.tests/src/Unit/TrackingGetTest.php (2)
59-147: Comprehensive test coverage for core tracking scenarios.Tests appropriately cover:
- Present source → tracked as true, not marked empty after filtering
- Present empty source with skip → tracked as true, marked empty after filtering
- Present empty source with pass-through → tracked as true, value preserved
- Absent source → tracked as false, filtered out entirely
These align well with the PR objectives for handling present vs. absent vs. empty source properties.
152-201: Good coverage of transitive property resolution.The transitive tests verify that when a destination property references another destination property (via
@prefix), the tracking correctly propagates:
- If the referenced property's source existed → transitive is tracked as true
- If the referenced property's source was absent → transitive is tracked as false
src/Plugin/migrate/process/TrackingGet.php (3)
66-95: Well-designed tracking logic with proper source/destination handling.The implementation correctly:
- Retrieves or initializes the tracker from the destination property
- Handles both string and array source configurations
- Parses the
@prefix using the same logic as Drupal core'sRowclass (with proper attribution)- For source properties: checks
$row->hasSourceProperty()- For destination properties: checks the tracker first, then falls back to
$row->hasDestinationProperty()The
any()semantics correctly implement "at least one source property exists."
110-123: Good forward-compatible polyfill forarray_any().The implementation:
- Uses native
array_any()when available (PHP 8.4+)- Falls back to a manual implementation with proper attribution to Symfony's polyfill
- Correctly passes both value and key to the callback
134-158: Clean implementation of row filtering.The
filterRow()method correctly:
- Returns early if no tracker exists (no-op for non-tracked rows)
- Clones the row to avoid mutating the original
- Copies all destination properties to the clone
- Marks properties as empty only when the tracker indicates source was present but no destination value exists
- Removes the tracking property from the output
The logic at lines 148-152 ensures that if a source column was present (
array_filter($tracker)returns truthy entries), but the destination property wasn't set, it marks it as explicitly empty—correctly implementing the "empty source clears field" behavior.
When performing a migration with rows that update existing entities, leaving properties out of the source for could lead to having the corresponding destination fields on entities cleared of their values.
Let's more directly track the properties based on whether they might have had a value available, such that:
More specifically, this should be useful in processes such as
islandora_spreadsheet_ingest, should a sheet bearing a subset of the fields be provided while also mapping entity IDs (such that it's possible to try to "update" existing entities in the first place).Summary by CodeRabbit
Documentation
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.