[FIX] sync curation extraction#1248

Merged: jdkent merged 45 commits into master from fix/sync_curation_extraction on Dec 13, 2025

jdkent (Member) commented Dec 6, 2025

closes #1247

honestly a long-standing issue that nick has brought up before. this makes the alignment between stubs and studies within studysets direct, so people do not lose their annotations (and sometimes analyses) when they update their meta-analyses. (updating meta-analyses is something we want to support if we want to do "living meta-analyses".)

@jdkent jdkent requested a review from Copilot December 6, 2025 23:16
Copilot AI (Contributor) left a comment

Pull request overview

This PR adds support for tracking curation stub UUIDs on studyset-study associations, enabling better synchronization between curation and extraction workflows. The changes allow the system to maintain a stable mapping between curated stubs and their corresponding study versions, even when studies are updated or re-ingested.

Key changes:

  • Added curation_stub_uuid column to the studyset_studies association table with appropriate indexing and uniqueness constraints
  • Updated schema serialization/deserialization to handle stub UUID mappings in studyset payloads
  • Modified frontend helper functions to preserve stub-to-study mappings when syncing curation data to extraction studysets
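
To make the first key change concrete, here is a minimal sketch of an association table carrying a stub UUID, using only Python's standard `sqlite3` module. The column types, the index name `uq_studyset_stub`, the `link_stub` helper, and the assumption that uniqueness is scoped to `(studyset_id, curation_stub_uuid)` are all illustrative; the real migration may define the constraint differently.

```python
import sqlite3

# Hypothetical, simplified schema: only the table name `studyset_studies` and the
# column name `curation_stub_uuid` come from the PR summary. Everything else is assumed.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE studyset_studies (
        studyset_id TEXT NOT NULL,
        study_id TEXT NOT NULL,
        curation_stub_uuid TEXT,
        PRIMARY KEY (studyset_id, study_id)
    );
    -- Assumed scope: one stub maps to at most one study per studyset.
    CREATE UNIQUE INDEX uq_studyset_stub
        ON studyset_studies (studyset_id, curation_stub_uuid);
    """
)


def link_stub(studyset_id, study_id, stub_uuid):
    """Record which study a curation stub maps to inside a studyset."""
    conn.execute(
        "INSERT INTO studyset_studies VALUES (?, ?, ?)",
        (studyset_id, study_id, stub_uuid),
    )


link_stub("ss1", "study-a", "stub-1")

# A second study claiming the same stub in the same studyset is rejected.
try:
    link_stub("ss1", "study-b", "stub-1")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Under this assumed constraint the same stub UUID can still appear in a different studyset, which keeps the mapping local to each studyset.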

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| store/backend/neurostore/models/data.py | Adds curation_stub_uuid column to StudysetStudy model with relationship configurations |
| store/backend/migrations/versions/8e3f3d8a9b5b_add_curation_stub_uuid.py | Database migration to add the new column, index, and unique constraint |
| store/backend/neurostore/schemas/data.py | Updates schema serialization to capture and normalize stub UUID mappings |
| store/backend/neurostore/resources/data.py | Implements logic to update stub mappings when creating/updating studysets |
| store/backend/neurostore/resources/base.py | Refines permission checks for nested record updates |
| store/backend/neurostore/tests/api/test_studysets.py | Adds test coverage for stub UUID capture and persistence |
| store/scripts/backfill_curation_stub_uuid.py | Provides a backfill script for existing data |
| compose/neurosynth-frontend/src/helpers/Extraction.helpers.tsx | Adds mapStubsToStudysetPayload function to maintain stub-to-study mappings |
| compose/neurosynth-frontend/src/helpers/Extraction.helpers.spec.ts | Unit tests for the new helper function |
| compose/neurosynth-frontend/src/pages/Project/components/MoveToExtractionDialog.tsx | Updates to use new stub mapping helper |
| compose/neurosynth-frontend/src/pages/Extraction/components/ExtractionOutOfSync.tsx | Updates sync logic to preserve stub mappings |
| cypress/e2e/workflows/ingestion/Ingestion.cy.tsx | Minor test fixture path corrections |

Comment thread store/backend/neurostore/resources/data.py Outdated
Comment thread store/backend/neurostore/schemas/data.py Outdated
Comment thread store/backend/neurostore/resources/base.py Outdated
Comment thread compose/neurosynth-frontend/src/helpers/Extraction.helpers.tsx Outdated
@jdkent jdkent requested review from Copilot and nicoalee December 7, 2025 05:26
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Comment thread store/backend/neurostore/resources/data.py Outdated
Comment thread store/backend/neurostore/schemas/data.py Outdated
Comment thread store/backend/migrations/versions/8e3f3d8a9b5b_add_curation_stub_uuid.py Outdated
nicoalee (Collaborator) commented

dependent on:
neurostuff/neurostore-spec#98

nicoalee (Collaborator) left a comment

Pushed minor changes removing an unused function argument.
Overall it looks good and it works on my end.

We just need to merge and rebuild the TypeScript SDKs to add the curation_stub_map property to the studyset update object.

I tested:

  1. creating and then ingesting two stubs without doi/pmid/pmcid creates two separate studies/base studies. Previous annotations are retained.
  2. creating and then ingesting two stubs with the same doi/pmid/pmcid creates two separate studies/base studies. Previous annotations are retained.
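
The two test cases above can be sketched as follows. This is not the PR's implementation: `sync_stubs`, the `uuid` and `doi` keys on each stub, and the generated study IDs are hypothetical, and the DOI is a placeholder. The sketch only shows the idea of matching on the stub UUID instead of on doi/pmid/pmcid, so a known stub keeps its study (and its annotations) while an unknown stub gets its own study even if its identifiers collide with another stub.

```python
import uuid


def sync_stubs(stubs, stub_to_study):
    """Return an updated stub UUID -> study ID mapping (hypothetical helper).

    Stubs that already have a study keep it; stubs seen for the first time
    get a fresh study ID, regardless of whether their doi/pmid/pmcid match
    another stub or are missing entirely.
    """
    updated = {}
    for stub in stubs:
        stub_uuid = stub["uuid"]
        if stub_uuid in stub_to_study:
            # Known stub: reuse its study so existing annotations stay attached.
            updated[stub_uuid] = stub_to_study[stub_uuid]
        else:
            # New stub: stand-in for creating a new study on the backend.
            updated[stub_uuid] = f"study-{uuid.uuid4().hex[:8]}"
    return updated


existing = {"stub-1": "study-a"}
stubs = [
    {"uuid": "stub-1", "doi": "10.1000/example"},  # already ingested
    {"uuid": "stub-2", "doi": "10.1000/example"},  # same DOI, different stub
    {"uuid": "stub-3", "doi": None},               # no identifiers at all
]
result = sync_stubs(stubs, existing)
```

Because the lookup key is the stub UUID, the second and third stubs never overwrite `study-a`, which is the failure mode described in the linked issue.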

jdkent (Member, Author) commented Dec 11, 2025

looking at the implementation, I do not want curation_stub_map to be part of the public API; it just adds more opportunities for having two sources of truth. curation_stub_uuid should be passed alongside id within each study object in the studies parameter. Sorry for putting you on a path to try to make that schema work.
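
A minimal sketch of the payload shape described in this comment, assuming a studyset update body whose `studies` entries are objects carrying both `id` and `curation_stub_uuid`. The helper name `build_studyset_update` and the example IDs are hypothetical; only the three field names come from the comment.

```python
def build_studyset_update(stub_to_study):
    """Build a studyset update body with the stub UUID nested in each study.

    Hypothetical helper: each study entry carries its own curation_stub_uuid,
    so there is no separate top-level map that could drift out of sync.
    """
    return {
        "studies": [
            {"id": study_id, "curation_stub_uuid": stub_uuid}
            for stub_uuid, study_id in stub_to_study.items()
        ]
    }


payload = build_studyset_update({"stub-1": "study-a", "stub-2": "study-b"})
```

Keeping the UUID on the study entry means the study list itself is the single record of which stub produced which study.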

@jdkent jdkent requested a review from Copilot December 11, 2025 16:45
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

Comment thread store/backend/migrations/versions/8e3f3d8a9b5b_add_curation_stub_uuid.py Outdated
Comment thread store/backend/neurostore/schemas/data.py
      const queryClient = useQueryClient();
      return useMutation<AxiosResponse<StudysetReturn>, AxiosError, StudysetRequest, unknown>(
    -     (studyset) => API.NeurostoreServices.StudySetsService.studysetsPost(studyset),
    +     (studyset) => API.NeurostoreServices.StudySetsService.studysetsPost(undefined, undefined, undefined, studyset),

Copilot AI commented Dec 11, 2025

Three undefined parameters suggest the API signature may have changed. Consider using named parameters or an options object to make the call site more maintainable and less error-prone.

Suggested change:

    - (studyset) => API.NeurostoreServices.StudySetsService.studysetsPost(undefined, undefined, undefined, studyset),
    + (studyset) => API.NeurostoreServices.StudySetsService.studysetsPost({ studyset }),

jdkent (Member, Author) commented Dec 12, 2025

I'll merge this by EOD!

nicoalee (Collaborator) commented

> looking at the implementation, I do not want curation_stub_map to be part of the public API; it just adds more opportunities for having two sources of truth. curation_stub_uuid should be passed alongside id within each study object in the studies parameter. Sorry for putting you on a path to try to make that schema work.

no worries! i was actually wondering why we had the two sources of truth (curation_stub_map and the studyset.studyset_studies) so this makes a lot more sense

@jdkent jdkent force-pushed the fix/sync_curation_extraction branch from e9a5cdc to a4a063c Compare December 12, 2025 23:19
@jdkent jdkent merged commit c64151e into master Dec 13, 2025
18 checks passed
@jdkent jdkent deleted the fix/sync_curation_extraction branch December 13, 2025 02:45
@jdkent jdkent added this to Planning Dec 19, 2025
@jdkent jdkent moved this to Done in Planning Dec 19, 2025

Development

Successfully merging this pull request may close these issues.

Study with no IDs in curation always overwrite existing study in extraction

3 participants