Skip to content

[SYNPY-1799, SYNPY-1804] Add manifest generation to sync_from_synapse#1354

Open
danlu1 wants to merge 18 commits intodevelopfrom
SYNPY-1799-add-manifest-to-sync_from_synapse
Open

[SYNPY-1799, SYNPY-1804] Add manifest generation to sync_from_synapse#1354
danlu1 wants to merge 18 commits intodevelopfrom
SYNPY-1799-add-manifest-to-sync_from_synapse

Conversation

@danlu1
Copy link
Copy Markdown
Contributor

@danlu1 danlu1 commented Apr 8, 2026

Problem:

StorableContainer.sync_from_synapse() does not generate a manifest file after downloading content from Synapse. We want to add manifest generation similar to synapseutils.syncFromSynapse() so users can use that manifest for uploading data via sync_to_synapse.

Solution:

  1. Added manifest CSV generation to StorableContainer.sync_from_synapse()
  2. The manifest is written in CSV format using parentId and ID column names, consistent with the Synapse UI download cart and synapse get-download-list CLI output. Annotations and provenance are included as additional columns.
  3. The implementation lives in three new helpers in storable_container.py:
  • _extract_entity_metadata_for_manifest_csv() — extracts per-file rows and discovers annotation column keys dynamically

  • _write_manifest_data_csv() — writes the CSV using DictWriter

  • generate_manifest_csv() — public entry point, skips gracefully if path or files are absent

  1. The protocol definition in storable_container_protocol.py was updated to keep the sync and async signatures in alignment.

Testing:

Unit tests and integration tests were added.

@danlu1 danlu1 requested a review from a team as a code owner April 8, 2026 22:13
Copilot AI review requested due to automatic review settings April 8, 2026 22:13
@danlu1 danlu1 marked this pull request as draft April 8, 2026 22:14
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds manifest CSV generation to StorableContainer.sync_from_synapse() so downloaded content can be re-uploaded via sync_to_synapse using an interoperable manifest.csv format (matching Synapse UI download cart / CLI column naming).

Changes:

  • Added manifest CSV helpers and integrated manifest generation into StorableContainer.sync_from_synapse_async() with manifest={all,root,suppress}.
  • Deprecated legacy synapseutils.syncFromSynapse() to steer users to StorableContainer.sync_from_synapse.
  • Added/updated unit tests around manifest behavior and the deprecation warning.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
synapseclient/models/mixins/storable_container.py Adds CSV manifest generation helpers and hooks manifest generation into sync flow.
synapseclient/models/protocols/storable_container_protocol.py Updates protocol signature/docs to include manifest parameter.
synapseutils/sync.py Marks syncFromSynapse as deprecated with guidance to use StorableContainer.sync_from_synapse.
tests/unit/synapseclient/models/async/unit_test_folder_async.py Adds tests for manifest behavior and CSV helper functions.
tests/unit/synapseutils/unit_test_synapseutils_sync.py Adds a test asserting the deprecation warning is emitted.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread synapseclient/models/mixins/storable_container.py Outdated
Comment thread synapseclient/models/mixins/storable_container.py Outdated
Comment thread tests/unit/synapseclient/models/async/unit_test_folder_async.py Outdated
Comment thread tests/unit/synapseclient/models/async/unit_test_folder_async.py Outdated
@danlu1 danlu1 changed the title reformatting [SYNPY-1799] Add manifest generation to sync_from_synapse Apr 9, 2026
Comment thread synapseclient/models/mixins/storable_container.py Outdated
Comment thread synapseclient/models/mixins/storable_container.py
Comment thread synapseclient/models/services/manifest.py Outdated
Comment thread synapseutils/sync.py Outdated
Comment thread tests/unit/synapseclient/models/async/unit_test_folder_async.py Outdated
Comment thread synapseclient/models/services/manifest.py Outdated
Comment thread synapseclient/models/services/manifest.py Outdated
Comment thread synapseclient/models/services/manifest.py Outdated
Comment thread synapseclient/models/services/manifest.py Outdated
Comment thread synapseclient/models/services/manifest.py Outdated
@andrewelamb
Copy link
Copy Markdown
Contributor

Thanks @danlu1! I left some comments. You'll also need to update the docs folder:

  • docs/tutorials/python/download_data_in_bulk.md: Use the new method
  • docs/explanations/manifest_tsv.md: Don't make any changes here, this will be done in a different ticket

@danlu1 danlu1 marked this pull request as ready for review April 24, 2026 19:29
@danlu1 danlu1 changed the title [SYNPY-1799] Add manifest generation to sync_from_synapse [SYNPY-1799, SYNPY-1804] Add manifest generation to sync_from_synapse Apr 27, 2026
@danlu1 danlu1 requested a review from andrewelamb April 27, 2026 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants