-
Notifications
You must be signed in to change notification settings - Fork 175
feat: UI support for zip content previewing #3266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: UI support for zip content previewing #3266
Conversation
* New entrypoint for zip enhanced previewer * Two new routes for preview and download container items * Zip enhanced previewer * Decorator to pass container item to views * view function to preview specific container item and download it * JS and HTML files
b674bf6 to
8d1cdf2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds UI support for previewing and downloading files contained within ZIP archives in InvenioRDM. It is part of a multi-repository effort to enable container file support, working alongside changes in invenio-records-resources, invenio-previewer, and invenio-rdm-records.
Key Changes:
- Adds a new ZIP previewer that displays the contents of ZIP files as an interactive tree structure
- Implements container item preview and download functionality with new routes and view functions
- Introduces JavaScript-based communication between ZIP listing and file preview using BroadcastChannel API
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 31 comments.
Show a summary per file
| File | Description |
|---|---|
| setup.cfg | Registers the new previewable_zip previewer entrypoint |
| invenio_app_rdm/theme/webpack.py | Adds webpack entry for the previewable-zip JavaScript bundle |
| invenio_app_rdm/theme/templates/semantic-ui/invenio_previewer/previewable_zip.html | Provides HTML template for rendering ZIP file contents as a tree with preview/download buttons |
| invenio_app_rdm/theme/assets/semantic-ui/less/invenio_app_rdm/theme/globals/site.overrides | Imports the new ZIP previewer CSS styles |
| invenio_app_rdm/theme/assets/semantic-ui/less/invenio_app_rdm/landing_page/files-previewer-zip.less | Defines CSS styles for the ZIP previewer UI components |
| invenio_app_rdm/theme/assets/semantic-ui/js/invenio_app_rdm/previewer/previewable_zip.js | Implements JavaScript to send preview requests via BroadcastChannel |
| invenio_app_rdm/theme/assets/semantic-ui/js/invenio_app_rdm/landing_page/theme.js | Adds BroadcastChannel listener to update preview iframe when ZIP items are selected |
| invenio_app_rdm/records_ui/views/records.py | Adds ContainerItemPreview class, preview/download view functions, and helper for finding items in container |
| invenio_app_rdm/records_ui/views/decorators.py | Implements pass_container_item decorator to extract and pass container items to views |
| invenio_app_rdm/records_ui/views/init.py | Registers new container item preview and download routes in blueprint |
| invenio_app_rdm/records_ui/previewer/previewable_zip.py | Implements ZIP previewer logic including tree conversion and preview link generation |
| invenio_app_rdm/config.py | Defines URL routes for container item preview and download endpoints |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| record=None, | ||
| url=None, | ||
| ): | ||
| """Create a new PreviewFile.""" |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring says "Create a new PreviewFile" but this is creating a ContainerItemPreview, not a PreviewFile. The docstring should be updated to accurately reflect the class being instantiated.
| """Create a new PreviewFile.""" | |
| """Create a new ContainerItemPreview.""" |
| tree_raw = current_rdm_records_service.files.list_container( | ||
| system_identity, file.record["id"], file.filename | ||
| ).to_dict() |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using system_identity for listing container contents bypasses user permissions. This could allow users to see the structure of files they don't have permission to access. Consider using the current user's identity (from g.identity) instead to respect access controls.
| .ui.accordion { | ||
| .title:not(.ui).panel-heading { | ||
| .ui.breadcrumb { | ||
| color: initial; | ||
|
|
||
| a.preview-link:hover { | ||
| text-decoration: underline; | ||
| } | ||
| } | ||
| } | ||
| } |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LESS file is missing a copyright header. Based on the pattern in other files in this PR and the project, it should include a copyright notice for CESNET i.a.l.e. and the MIT License reference.
| def record_container_item_download( | ||
| pid_value, container_item=None, is_preview=False, **kwargs | ||
| ): | ||
| """Download a file from a record.""" |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring says "Download a file from a record" which is the same as record_file_download and doesn't clarify this is for container items. It should be updated to something like "Download a file extracted from a container (e.g., zip archive) within a record."
| """Download a file from a record.""" | |
| """Download a file extracted from a container (e.g., zip archive) within a record.""" |
| $("#preview-file-title").html(` | ||
| <div class="ui breadcrumb"> | ||
| <a class="section preview-link" href="${e.data.containerPreviewUrl}" target="preview-iframe" data-file-key="${e.data.containerFileKey}">${e.data.containerFileKey}</a> | ||
| <i class="divider">/</i> | ||
| <div class="active section">${e.data.fileKey}</div> | ||
| </div> | ||
| `); |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using .html() to inject user-controlled data (e.g., file keys and URLs from the broadcast message) creates a potential XSS vulnerability. The data should be sanitized or use .text() for plain text content, or properly escape HTML when constructing the breadcrumb. Consider using jQuery's .text() method or DOM manipulation methods that don't interpret HTML.
| path=path, | ||
| ) | ||
|
|
||
| listing_file = file_metadata._record.media_files.get(f"{filename}.listing") |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code accesses file_metadata._record.media_files without proper permission checks. While pass_file_metadata checks permissions for the file itself, there's no verification that the user has permission to access media_files or that the listing file exists. This could potentially expose listing information for files the user shouldn't access. Consider adding explicit permission checks before accessing media_files.
| listing_file = file_metadata._record.media_files.get(f"{filename}.listing") | |
| listing_filename = f"{filename}.listing" | |
| media_files = getattr(file_metadata._record, "media_files", None) | |
| if not media_files or listing_filename not in media_files: | |
| abort(404) | |
| # Optional: Add permission check for media file access if available | |
| # Example (pseudo-code, adjust as needed): | |
| # if not current_rdm_records.records_service.check_permission(g.identity, "read_files", record=file_metadata._record): | |
| # abort(403) | |
| listing_file = media_files[listing_filename] |
| @pass_record_or_draft(expand=False) | ||
| @pass_file_metadata | ||
| def record_container_item_preview( | ||
| pid_value, | ||
| record=None, | ||
| file_metadata=None, | ||
| **kwargs, | ||
| ): | ||
| """Render a preview of the specified item in a container.""" | ||
| filename = kwargs.get("filename") | ||
| path = kwargs.get("path") | ||
|
|
||
| url = url_for( | ||
| "invenio_app_rdm_records.record_container_item_download", | ||
| pid_value=pid_value, | ||
| filename=filename, | ||
| path=path, | ||
| ) | ||
|
|
||
| listing_file = file_metadata._record.media_files.get(f"{filename}.listing") | ||
|
|
||
| with listing_file.file.storage().open("rb") as f: | ||
| listing = json.load(f) | ||
|
|
||
| parts = list(PurePosixPath(path).parts) | ||
| entry = find_container_item(listing.get("children", {}).values(), parts) | ||
| if entry is None: | ||
| abort(404) | ||
|
|
||
| extracted_file_size = entry.get("size", 0) | ||
| # Find a suitable previewer | ||
| fileobj = ContainerItemPreview( | ||
| file_metadata, pid_value, path, extracted_file_size, record, url | ||
| ) | ||
| # Try to see if specific previewer preference is set for the file | ||
| file_previewer = (file_metadata.data.get("metadata") or {}).get("previewer") | ||
| if file_previewer: | ||
| previewer = current_previewer.previewers.get(file_previewer) | ||
| if previewer and previewer.can_preview(fileobj): | ||
| return previewer.preview(fileobj) | ||
|
|
||
| # Go through all previewers to find the first one that can preview the file | ||
| for plugin in current_previewer.iter_container_item_previewers(): | ||
| if plugin.can_preview(fileobj): | ||
| return plugin.preview(fileobj) | ||
|
|
||
| return default_previewer.preview(fileobj) |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new record_container_item_preview function lacks test coverage. Given that test coverage exists for similar functions like record_file_preview in tests/ui/test_file_download.py, consider adding tests for this new functionality to verify correct behavior, error handling, and permission checks.
| def find_container_item(container_item_metadata, path_parts): | ||
| """Recursively find entry in TOC based on path parts.""" | ||
| if not path_parts: | ||
| return None | ||
|
|
||
| key = path_parts[0] | ||
| for sub_item in container_item_metadata: | ||
| if sub_item["key"] == key: | ||
| if len(path_parts) == 1: | ||
| return sub_item | ||
| elif sub_item.get("children"): | ||
| return find_container_item( | ||
| sub_item["children"].values(), path_parts[1:] | ||
| ) | ||
| return None |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The find_container_item helper function lacks test coverage. This function performs critical path parsing and navigation logic. Consider adding unit tests to verify correct behavior with various path structures, nested folders, missing items, and edge cases like empty paths or invalid keys.
| <div class="no-padding right aligned column stretched"> | ||
| <span> | ||
| {% if file_url_preview %} | ||
| <button class="ui compact mini button preview-link" id="{{t.id}}" data-container-file-key="{{file.filename}}" data-file-key="{{t.name}}" data-preview-url="{{file_url_preview}}"> |
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The preview button should have an accessible label. Consider adding an aria-label attribute to describe the action, such as 'Preview {file name}' for screen reader users.
| <button class="ui compact mini button preview-link" id="{{t.id}}" data-container-file-key="{{file.filename}}" data-file-key="{{t.name}}" data-preview-url="{{file_url_preview}}"> | |
| <button class="ui compact mini button preview-link" id="{{t.id}}" data-container-file-key="{{file.filename}}" data-file-key="{{t.name}}" data-preview-url="{{file_url_preview}}" aria-label="{{ _('Preview %(filename)s', filename=t.name) }}"> |
| @@ -0,0 +1,21 @@ | |||
| /* | |||
| * This file is part of Invenio. | |||
| * Copyright (C) 2025 CESNET i.a.l.e.. | |||
Copilot
AI
Dec 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The copyright line contains a typo with an extra period after "i.a.l.e."
| * Copyright (C) 2025 CESNET i.a.l.e.. | |
| * Copyright (C) 2025 CESNET i.a.l.e. |
| from invenio_previewer.proxies import current_previewer | ||
| from invenio_previewer.views import is_container_item_previewable | ||
|
|
||
| from invenio_app_rdm.records_ui.views.records import ContainerItemPreview |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please use everywhere the form. form ..views.records import ContainerItemPreview
Description
This PR is one of the 4 PRs enabling invenio to preview files inside zip archives and other container files in general.
The other parts are:
This PR adds:
Additions provide new functionality but do not change existing API nor implementation.
All these features are opt-in, see RFC for configuration details.
Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:
Frontend
Reminder
By using GitHub, you have already agreed to the GitHub’s Terms of Service including that: