Skip to content

Commit 8492702

Browse files
ChrisBAshtonpatrickpatrickpatrick
authored andcommitted
Check if existing base path is associated with same content ID
WHAT This commit adds logic at the point of saving a draft to Publishing API. The logic checks whether the base path being saved to already has a live document at the end of it, and if so, if that document's content ID matches what we're sending. If it doesn't match, then that indicates that a different document (in Whitehall, or originating from another app) already owns the route and we've ended up in the not great situation of having allocated a slug that already exists. We raise an exception in this case, which is better than proceeding and letting the document silently overwrite an existing one. The lookup only checks for _published_ documents, as we haven't passed the `with_drafts: true` keyword arg. This is deliberate as we wouldn't want a situation of a publisher creating a draft document Foo, deleting it, then going and creating another draft document Foo and seeing an error on save. We only care about base paths that are already live. This does mean there could be an edge case where two draft documents have the same slug and we only get the exception when trying to save the second one after publishing the first one, but that seems a very low likelihood edge case. Note also that this logic change depends on Whitehall always saving drafts via the `Whitehall::PublishingApi` helper module. There are a few cases where Whitehall updates Publishing API outside of this module (by making calls to `Services.publishing_api.save_draft` instead), but these all seem to be for index pages and the like, where the content ID is always fixed, so refactoring the Publishing API calls to all go through the helper module is outside of the scope of this change. WHY It has long been possible for Whitehall to create a draft which has the same base path (`/government/foo`) as an already live page that was published by another publishing app. The draft has been previewable, and it's even been possible to schedule the draft for publication. Only at the point of Whitehall making a 'publish' call to Publishing API - either manually via the UI, or via the scheduled publishing worker - would Publishing API then raise an exception and cause Whitehall to raise a `Whitehall::UnpublishableInstanceError`. Historically this hasn't caused too many issues, except for occasional news article clashes with the now retired Content Publisher application. Most publishing apps publish different content types with different base path prefixes, thus there is never usually a conflict. _Internal_ base path conflicts - where Whitehall alone is responsible for producing two different documents under the same base path - have so far been avoided by a uniqueness constraint on the Document table, ensuring that every slug is unique amongst other slugs of the same document type: https://github.com/alphagov/whitehall/blob/e024e2eea25ca9cb48a8a98d2298aaea1df13b92/db/schema.rb#L241 This uniqueness constraint only works on the assumption that every document type has a distinct base path prefix (e.g. `/government/news` for NewsArticle, `/government/consultations` for Consultation, and so on). That assumption no longer holds true as we are gradually migrating content types to use the same shared StandardEdition model (with a dynamic base path prefix based on the `configurable_document_type` of the Edition). The consequence of this is that one may have a legacy NewsArticle with generated base path `/government/news/foo` and a StandardEdition document with generated base path `/government/news/foo`, and the uniqueness constraint passes because both are of different document types. There are several problems to solve here: 1. A newer StandardEdition document might silently overwrite an older legacy Edition which has the same base path. 2. Our slug enumeration logic (that automatically appends `--1` in the event of a clash) does not apply across document types. 3. Even if the slug enumeration logic DID apply, that wouldn't be desirable. It should be allowable to have a stanard-edition flavoured news article of path `/government/news/foo` alongside a standard-edition flavoured consultation of path `/government/consultations/foo` without one of them being relegated the slug `foo--1`. The first and most important problem is the silent overwriting. These cases are (currently) very few and far between, and the best approach is to capture it as an exception where we can help the user out as a support task. Further down the line we can capture the exception and give the user an actionable error message instead (i.e. to change their title), or potentially explore enumerating the slug to avoid conflicts automatically. We'll need to think about whether the uniqueness constraint index is still valuable or whether this dynamic lookup of Publishing API content items is a sufficient replacement. Jira: https://gov-uk.atlassian.net/browse/WHIT-1842
1 parent bc083e1 commit 8492702

4 files changed

Lines changed: 44 additions & 1 deletion

File tree

features/support/publishing_api.rb

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,8 @@
1919
.to_return(body: { links: {} }.to_json)
2020
stub_request(:get, %r{\A#{Plek.find('publishing-api')}/v2/expanded-links})
2121
.to_return(body: { expanded_links: {} }.to_json)
22+
# Prevent publishing API base path checks from interfering with tests
23+
Whitehall::PublishingApi.stubs(:ensure_base_path_is_associated_with_this_content_id!).returns(nil)
2224
end
2325

2426
World(GdsApi::TestHelpers::PublishingApi)

lib/whitehall/publishing_api.rb

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -69,6 +69,8 @@ def self.save_draft_translation(
6969

7070
content.merge!(bulk_publishing: true) if bulk_publishing
7171

72+
ensure_base_path_is_associated_with_this_content_id!(content[:base_path], model_instance.content_id)
73+
7274
Services.publishing_api.put_content(presenter.content_id, content)
7375
end
7476
rescue GdsApi::HTTPUnprocessableEntity => e
@@ -178,5 +180,12 @@ def self.assert_public_edition!(instance)
178180
raise UnpublishableInstanceError, "#{instance.class} with ID #{instance.id} is not publishable"
179181
end
180182
end
183+
184+
def self.ensure_base_path_is_associated_with_this_content_id!(base_path, content_id)
185+
existing_content_id = Services.publishing_api.lookup_content_id(base_path:)
186+
return if existing_content_id.nil? || existing_content_id == content_id
187+
188+
raise UnpublishableInstanceError, "Cannot save draft (content_id #{content_id}). There is existing content at the '#{base_path}' route, under a different content_id (#{existing_content_id}). Try changing your title to resolve the conflict."
189+
end
181190
end
182191
end

test/test_helper.rb

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -76,6 +76,8 @@ class ActiveSupport::TestCase
7676
Services.publishing_api.stubs(:get_events_for_content_id).returns([])
7777
Services.stubs(:asset_manager).returns(stub_everything("asset-manager"))
7878
TaxonValidator.any_instance.stubs(:validate)
79+
# Prevent publishing API base path checks from interfering with tests
80+
Whitehall::PublishingApi.stubs(:ensure_base_path_is_associated_with_this_content_id!).returns(nil)
7981
end
8082

8183
teardown do

test/unit/lib/whitehall/publishing_api_test.rb

Lines changed: 31 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -274,16 +274,46 @@ class Whitehall::PublishingApiTest < ActiveSupport::TestCase
274274
assert_equal english_path, PublishingApiUnscheduleWorker.jobs[1]["args"].first
275275
end
276276

277-
test ".save_draft publishes a draft edition" do
277+
test ".save_draft publishes a draft edition if no content exists at the route yet" do
278278
draft_edition = create(:draft_case_study)
279279
payload = PublishingApiPresenters.presenter_for(draft_edition)
280280
request = stub_publishing_api_put_content(payload.content_id, payload.content)
281+
Whitehall::PublishingApi.unstub(:ensure_base_path_is_associated_with_this_content_id!)
282+
Services.publishing_api.expects(:lookup_content_id).with(base_path: draft_edition.public_path).returns(nil)
281283

282284
Whitehall::PublishingApi.save_draft(draft_edition)
283285

284286
assert_requested request
285287
end
286288

289+
test ".save_draft publishes a draft edition if there is a live content item with the same base path and same content ID" do
290+
draft_edition = create(:draft_case_study)
291+
payload = PublishingApiPresenters.presenter_for(draft_edition)
292+
request = stub_publishing_api_put_content(payload.content_id, payload.content)
293+
Whitehall::PublishingApi.unstub(:ensure_base_path_is_associated_with_this_content_id!)
294+
Services.publishing_api.expects(:lookup_content_id).with(base_path: draft_edition.public_path).returns(payload.content_id)
295+
296+
Whitehall::PublishingApi.save_draft(draft_edition)
297+
298+
assert_requested request
299+
end
300+
301+
test ".save_draft raises exception if there is a live content item with the same base path but different content ID" do
302+
ConfigurableDocumentType.setup_test_types(build_configurable_document_type("test_type"))
303+
edition = create(:draft_standard_edition)
304+
Whitehall::PublishingApi.unstub(:ensure_base_path_is_associated_with_this_content_id!)
305+
Services.publishing_api.expects(:lookup_content_id).with(base_path: edition.public_path).returns("different-content-id")
306+
307+
error = assert_raises Whitehall::UnpublishableInstanceError do
308+
Whitehall::PublishingApi.save_draft(edition)
309+
end
310+
311+
assert_equal(
312+
"Cannot save draft (content_id #{edition.content_id}). There is existing content at the '#{edition.public_path}' route, under a different content_id (different-content-id). Try changing your title to resolve the conflict.",
313+
error.message,
314+
)
315+
end
316+
287317
test ".publish_redirect_async publishes a redirect to the Publishing API" do
288318
document = create(:document)
289319
destination = "/government/people/milli-vanilli"

0 commit comments

Comments
 (0)