Skip to content

Migrate legacy topical events [WHIT-3270]#11530

Draft
ChrisBAshton wants to merge 62 commits into
mainfrom
migrate-legacy-topical-events
Draft

Migrate legacy topical events [WHIT-3270]#11530
ChrisBAshton wants to merge 62 commits into
mainfrom
migrate-legacy-topical-events

Conversation

@ChrisBAshton

@ChrisBAshton ChrisBAshton commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

What

This PR:

  • Rewrites StandardEditionMigrator to simplify its API and make it easier to work with
  • Adds a TopicalEventRecipe ready for migrating (childless) Topical Events
  • Support for Topical Event About pages can come in a future PR, iterating this recipe

New StandardEditionMigrator API

Given a legacy record and a recipe, we can 'preview' the migration, seeing the before-and-after payloads as well as a normalised diff summary.

StandardEditionMigrator.preview_migration(
  TopicalEvent.find_by(slug: "g8-2013"),
  StandardEditionMigrator::TopicalEventRecipe,
)

If you're happy with the preview, you can 'perform' the migration by one of two methods - migrate_existing_document and create_new_document` - which take the same arguments.

  • migrate_existing_document overwrites an existing Document and all of its Editions. It is intended to be called for legacy edition types such as Publication, Speech, etc. It is not compatible with other content types.
  • create_new_document creates a new Document instance and one 'published' Edition. It is intended to be called for legacy content types that don't follow the Document/Edition model, e.g. TopicalEvent, Organisation etc. It could in theory support legacy edition types too, so that we have an alternative to overwriting-in-place (could instead duplicate a given Document and then delete the old one) but that's outside of the scope of this PR, and in any case is 'lossy' vs the overwrite approach above.
StandardEditionMigrator.create_new_document(
  TopicalEvent.find_by(slug: "g8-2013"),
  StandardEditionMigrator::TopicalEventRecipe,
)

There's also an optional raise_if_payloads_differ keyword arg, which defaults to false - it raises an exception if there is any diff between the normalised payloads.

StandardEditionMigrator.create_new_document(
  TopicalEvent.find_by(slug: "g8-2013"),
  StandardEditionMigrator::TopicalEventRecipe,
  raise_if_payloads_differ: true,
)

Finally, you can also enqueue the migration of an array of elements (creating a StandardEditionMigratorJob for each):

StandardEditionMigrator.enqueue_bulk_migration(
  TopicalEvent.reject { |te| te.topical_event_about_page },
  StandardEditionMigrator::TopicalEventRecipe,
  migration_method: "create_new_document",
  raise_if_payloads_differ: true,
)

Why

We need to migrate legacy topical events to StandardEdition. In trying to write the recipe, I found the existing StandardEditionMigrator too cumbersome to work with - so have rebuilt it from the ground up.

We've removed the responsibility for translations management from StandardEditionMigratorJob and made it the responsibility of the Recipe. We've also removed the options for auto-publishing to Publishing API post migration, as this adds complexity and ultimately wasn't used for the NewsArticle migration due to the race conditions it introduced.

Jira: https://gov-uk.atlassian.net/browse/WHIT-3270


⚠️ This repo is Continuously Deployed: make sure you follow the guidance ⚠️

This application is owned by the Whitehall Experience team. Please let us know in #govuk-whitehall-experience-tech when you raise any PRs.

Follow these steps if you are doing a Rails upgrade.

@ChrisBAshton ChrisBAshton force-pushed the migrate-legacy-topical-events branch 6 times, most recently from c7ab17d to 09736c4 Compare June 10, 2026 09:06
…tion

Following the existing pattern for computing which worldwide organisations are required, add two methods to identify if lead organisations and supporting organisation are required in the config.

We currently use the `required` field in the json to render the `(required)` flag on the form, but we do not actually implement organisation validations in accordance with this. Future commits will implement this.
…config

This commit ensures that the `Edition::Organisations` concern's validations do not run when organisations are simply enabled, but also specifically required. We already have the logic for this, that traverses the associations blocks and looks for a required field, which was extended to organisations in the previous commit. This ensures consistent experience for the user, and unblocks the migration of topical events to the config driven approach.

We can additionally validate for supporting organisations individually. This is unlikely to be needed for our edition types, but it is in line with keeping the validation config driven. No additional logic exists in the concern that is not directly derived from the config.

Any mention of the "required" flag in any of the associations blocks, now means:
- the flag appears on the form
- the validation only runs when the required flag is true
- a missing flag is interpreted as 'no validation' required

For legacy models that still use the concern, the logic has remained: always validate lead organisations for presence, never validate supporting organisations. This is now explicit in the new methods added.

Previously:
- We always validated lead organisations for presence on all editions, if organisations were enabled. Organisations were always enabled on all non-config-driven models that included the `Edition::Organisations` concern, due to the `organisation_association_enabled?` method being `true` in the concern. For standard editions (configurable), whether orgs are enabled is based on the json schema - `StandardEdition` overrides the method. The only non-config driven model that also overrides the method is `Speech`.
- Validations always ran for lead organisations presence, even if the required field was not present in the config, because most editions apply this validation, and it was not needed to have specific required logic when the org associations were written. The only outlier is Topical Event.
- Because the `topical_event.json` schema does not mention a `required` lead organisation, the edition form did not render the `(required)` flag, but attempting to save without a lead organisation errored with the "at least one required" error. This was confusing for the user and also prevents us from migrating legacy topical events to the config-driven approach, as old topical events without organisations would be invalid.
This fixes some issues introduced in `6e46574e`, causing the organisations fields to disappear from the form when a value for the person override is provided (the "Speaker does not have a profile on GOV.UK" field is populated). With these changes, the validations for organisations are separated from the logic that checks if organisations are enabled, ensuring we continue to render the organisation fields.

What that commit did:
- replace `can_be_related_to_organisations?` with  `organisation_association_enabled?`
- but also replace `skip_organisation_validation?` with `organisation_association_enabled?`

Speeches used to override `skip_organisation_validation?` but they would always keep `can_be_related_to_organisations?` true from the `Edition::Organisations` concern value. This got changed in commit `6e46574e``:
- the method `organisation_association_enabled?` started conditionally rendering fields in the `_organisation_fields.html.erb` partial, but also conditionally validating the lead organisations for presence
- speeches with a `person_override` would not require lead orgs, but they'd also render without the organisations fields at all, because the logic for enabled and required organisations got merged into a single method.

The cases affected in that commit were news article, corporate information pages, speeches and standard editions. News articles are now standard editions and work as expected due to the implicit separation of types and declarations of associations in the json schemas. Corporate information pages never render org fields, nor do they include the `Organisations` concern, so their `organisation_association_enabled?` should always be false, which currently comes from the value in `base_permission_methods.rb`. Speeches were the only remaining problematic case.
This was only required for corporate information pages - and that is better expressed in the model. When we migrate all models to config driven, the method will only have the one config-based definition, on `StandardEdition`. For now, it's preferable to make it visible where there are differences in behaviour across models.
Not needed. The values of the methods in the `Edition::Organisations` and `Edition::WorldLocations` concerns, and the `StandardEdition` model overrides, are enough.
@ChrisBAshton ChrisBAshton force-pushed the migrate-legacy-topical-events branch from 5419385 to dbda8c9 Compare June 11, 2026 06:23
(So that we don't have to recreate it every time we migrate a new
format.)
As introduced in draft PR #11431. Sometimes we just want a generic
Whitehall error, or a way of catching multiple Whitehall errors,
without accidentally swallowing unexpected errors like NoMethodError
etc.
It can be difficult to know where to slot in a new recipe, because
we keep deleting recipes and their 'slotting in' whenever we've
completed migration of a legacy document type. So the intention
is to leave these comments in.
This was implemented but not used for the original NewsArticle
migration, as it led to FriendlyId related race conditions where
certain slugs would be claimed by some news articles before the
news article that _originally_ had that slug gets published, and
thus Publishing API would reject the save.

https://gov-uk.atlassian.net/browse/WHIT-2487

It's not clear whether the republish option was used on the
CaseStudy migration.

In any case, migrating in one batch and then publishing in another
seems a sensible way to de-risk the migration, especially now that
we're having to have the migrator deal with both Document models
and non-Document models.
This was added in 4ae0df1 - or
rather, it was added quite a bit before that, and then rebased to
the point where it lost all meaning!

I seem to remember the intention was for a much more useful preview
service, but what we've ended up with is essentially a sense check
that the scope you've passed is what you thought you passed -
something that can be easily checked in the rails console. The
preview method also wasn't offering any sort of guardrail.

Removing it simplifies the codebase, which is currently growing
to support Document vs non-Document models for migration.
We have some great payload comparison logic tucked away inside
the StandardEditionMigratorJob, but you can only exercise it by
running an actual migration, and passing the `compare_payloads`
parameter.

What we want instead, ideally, is a way of previewing it up front.
It should be trivial to pass a record and a recipe to
StandardEditionMigrator, and see how the legacy and new presenter
outputs compare.

This new method duplicates a lot of what is in
`ensure_payloads_remain_identical` (and the new `build_edition`
method repeats logic in the `migrate_non_editionable!` method,
etc. Over the next few commits we expect to de-duplicate this
and ultimately delete the old methods.
Did consider raising an exception if the payloads differ, but not
needed for now.

This duplicates a lot of the content in the compare_presenter_payloads
method, which again we'll refactor out in the next few commits.
The method wasn't usable as-is due to raising an exception on
mismatch, and no way of splitting out the content / links diff.
It makes far more sense in here.

TODO:
- move most of the 'compare_payloads' test logic into the
  recipe_test.rb test
- make 'compare_payloads' really dumb - just give it an old record
  and a migrated edition and a recipe, and let it call the relevant
  methods for payload comparison
Takes the pain out of initializing the recipe twice etc.
Had to avoid making DB calls in the images payload builder, since
these images are kept in memory only until we're ready to make the
migration.

Legacy Topical Event logos don't have the same variants/dimensions
as the new ones, but do have the same aspect ratio, so we can map
them as equivalent.
Turns out there are some cases where we want to also get the
UNUSABLE images that map to the given image_usage
We've split out the test recipes into two.
I've ensured the
`reload!; puts StandardEditionMigrator.preview_migration(TopicalEvent.find_by(slug: "g8-2013"), StandardEditionMigrator::TopicalEventRecipe)`
example continues to work.
I've also firmed up the compare_payloads method to ensure we
cache the payload for the legacy content item / presenter, otherwise
we risked updating it in-place and then comparing something with
itself, and then our payload comparitor would be useless.
…ument

In theory it would be possible to have `create_new_document`
support both Editionable and non-Editionable legacy records,
but in the interests of simplicity for now we'll just raise
an exception if an Editionable record is passed.
`migrate_existing_document` would only ever be applicable
to Editionable legacy records.
Splitting the methods makes it clearer to the developer
the consequences of what they're doing, and is a natural
point for us to split the code.
Doesn't make sense for it to be in the preview method
Much cleaner output - stops diffs like this, which wrongly imply
`emphasised_organisations` is the culprit when actually it's
just that there's a missing property above it and that moves the
curly brace:

```
:details=>
- {:body=>"<div class=\"govspeak\">\n" + "</div>",
- :emphasised_organisations=>
+ {:emphasised_organisations=> ["06056197-bc69-4147-aa28-070bca132178", "d994e55c-48c9-4795-b872-58d8ec98af12"],
```
@ChrisBAshton ChrisBAshton force-pushed the migrate-legacy-topical-events branch from b1e480b to 74fd732 Compare June 11, 2026 08:56
The migration completes and the DB rows are created but the Edition
says:
> This edition is invalid
> Images image data file not provided. Images can be JPG, JPEG, GIF, PNG or SVG files.

Stemming from `images[n].image_data.file.file => nil` instead of
a CarrierWave value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants