Skip to content

Conversation

@xsenyaaax
Copy link

@xsenyaaax xsenyaaax commented Dec 12, 2025

Description

This PR is one of the 4 PRs enabling invenio to preview files inside zip archives and other container files in general.

The other parts are:

This PR adds:

  • New entrypoint for zip enhanced previewer
  • Zip enhanced previewer
  • Decorator to pass container item to views
  • view function to preview specific container item and download it
  • Two new routes for preview and download container items
  • JS and HTML files

Additions provide new functionality but do not change existing API nor implementation.

All these features are opt-in, see RFC for configuration details.

Checklist

Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:

Frontend

Reminder

By using GitHub, you have already agreed to the GitHub’s Terms of Service including that:

  1. You license your contribution under the same terms as the current repository’s license.
  2. You agree that you have the right to license your contribution under the current repository’s license.

* New entrypoint for zip enhanced previewer
* Two new routes for preview and download container items
* Zip enhanced previewer
* Decorator to pass container item to views
* view function to preview specific container item and download it
* JS and HTML files
@xsenyaaax xsenyaaax force-pushed the contribution-zip-previewer-ui branch from b674bf6 to 8d1cdf2 Compare December 12, 2025 10:45
@xsenyaaax xsenyaaax marked this pull request as ready for review December 12, 2025 11:05
Copilot AI review requested due to automatic review settings December 12, 2025 11:05
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds UI support for previewing and downloading files contained within ZIP archives in InvenioRDM. It is part of a multi-repository effort to enable container file support, working alongside changes in invenio-records-resources, invenio-previewer, and invenio-rdm-records.

Key Changes:

  • Adds a new ZIP previewer that displays the contents of ZIP files as an interactive tree structure
  • Implements container item preview and download functionality with new routes and view functions
  • Introduces JavaScript-based communication between ZIP listing and file preview using BroadcastChannel API

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 31 comments.

Show a summary per file
File Description
setup.cfg Registers the new previewable_zip previewer entrypoint
invenio_app_rdm/theme/webpack.py Adds webpack entry for the previewable-zip JavaScript bundle
invenio_app_rdm/theme/templates/semantic-ui/invenio_previewer/previewable_zip.html Provides HTML template for rendering ZIP file contents as a tree with preview/download buttons
invenio_app_rdm/theme/assets/semantic-ui/less/invenio_app_rdm/theme/globals/site.overrides Imports the new ZIP previewer CSS styles
invenio_app_rdm/theme/assets/semantic-ui/less/invenio_app_rdm/landing_page/files-previewer-zip.less Defines CSS styles for the ZIP previewer UI components
invenio_app_rdm/theme/assets/semantic-ui/js/invenio_app_rdm/previewer/previewable_zip.js Implements JavaScript to send preview requests via BroadcastChannel
invenio_app_rdm/theme/assets/semantic-ui/js/invenio_app_rdm/landing_page/theme.js Adds BroadcastChannel listener to update preview iframe when ZIP items are selected
invenio_app_rdm/records_ui/views/records.py Adds ContainerItemPreview class, preview/download view functions, and helper for finding items in container
invenio_app_rdm/records_ui/views/decorators.py Implements pass_container_item decorator to extract and pass container items to views
invenio_app_rdm/records_ui/views/init.py Registers new container item preview and download routes in blueprint
invenio_app_rdm/records_ui/previewer/previewable_zip.py Implements ZIP previewer logic including tree conversion and preview link generation
invenio_app_rdm/config.py Defines URL routes for container item preview and download endpoints

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

record=None,
url=None,
):
"""Create a new PreviewFile."""
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says "Create a new PreviewFile" but this is creating a ContainerItemPreview, not a PreviewFile. The docstring should be updated to accurately reflect the class being instantiated.

Suggested change
"""Create a new PreviewFile."""
"""Create a new ContainerItemPreview."""

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +120
tree_raw = current_rdm_records_service.files.list_container(
system_identity, file.record["id"], file.filename
).to_dict()
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using system_identity for listing container contents bypasses user permissions. This could allow users to see the structure of files they don't have permission to access. Consider using the current user's identity (from g.identity) instead to respect access controls.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +11
.ui.accordion {
.title:not(.ui).panel-heading {
.ui.breadcrumb {
color: initial;

a.preview-link:hover {
text-decoration: underline;
}
}
}
}
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The LESS file is missing a copyright header. Based on the pattern in other files in this PR and the project, it should include a copyright notice for CESNET i.a.l.e. and the MIT License reference.

Copilot uses AI. Check for mistakes.
def record_container_item_download(
pid_value, container_item=None, is_preview=False, **kwargs
):
"""Download a file from a record."""
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says "Download a file from a record" which is the same as record_file_download and doesn't clarify this is for container items. It should be updated to something like "Download a file extracted from a container (e.g., zip archive) within a record."

Suggested change
"""Download a file from a record."""
"""Download a file extracted from a container (e.g., zip archive) within a record."""

Copilot uses AI. Check for mistakes.
Comment on lines +124 to +130
$("#preview-file-title").html(`
<div class="ui breadcrumb">
<a class="section preview-link" href="${e.data.containerPreviewUrl}" target="preview-iframe" data-file-key="${e.data.containerFileKey}">${e.data.containerFileKey}</a>
<i class="divider">/</i>
<div class="active section">${e.data.fileKey}</div>
</div>
`);
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using .html() to inject user-controlled data (e.g., file keys and URLs from the broadcast message) creates a potential XSS vulnerability. The data should be sanitized or use .text() for plain text content, or properly escape HTML when constructing the breadcrumb. Consider using jQuery's .text() method or DOM manipulation methods that don't interpret HTML.

Copilot uses AI. Check for mistakes.
path=path,
)

listing_file = file_metadata._record.media_files.get(f"{filename}.listing")
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code accesses file_metadata._record.media_files without proper permission checks. While pass_file_metadata checks permissions for the file itself, there's no verification that the user has permission to access media_files or that the listing file exists. This could potentially expose listing information for files the user shouldn't access. Consider adding explicit permission checks before accessing media_files.

Suggested change
listing_file = file_metadata._record.media_files.get(f"{filename}.listing")
listing_filename = f"{filename}.listing"
media_files = getattr(file_metadata._record, "media_files", None)
if not media_files or listing_filename not in media_files:
abort(404)
# Optional: Add permission check for media file access if available
# Example (pseudo-code, adjust as needed):
# if not current_rdm_records.records_service.check_permission(g.identity, "read_files", record=file_metadata._record):
# abort(403)
listing_file = media_files[listing_filename]

Copilot uses AI. Check for mistakes.
Comment on lines +438 to +484
@pass_record_or_draft(expand=False)
@pass_file_metadata
def record_container_item_preview(
pid_value,
record=None,
file_metadata=None,
**kwargs,
):
"""Render a preview of the specified item in a container."""
filename = kwargs.get("filename")
path = kwargs.get("path")

url = url_for(
"invenio_app_rdm_records.record_container_item_download",
pid_value=pid_value,
filename=filename,
path=path,
)

listing_file = file_metadata._record.media_files.get(f"{filename}.listing")

with listing_file.file.storage().open("rb") as f:
listing = json.load(f)

parts = list(PurePosixPath(path).parts)
entry = find_container_item(listing.get("children", {}).values(), parts)
if entry is None:
abort(404)

extracted_file_size = entry.get("size", 0)
# Find a suitable previewer
fileobj = ContainerItemPreview(
file_metadata, pid_value, path, extracted_file_size, record, url
)
# Try to see if specific previewer preference is set for the file
file_previewer = (file_metadata.data.get("metadata") or {}).get("previewer")
if file_previewer:
previewer = current_previewer.previewers.get(file_previewer)
if previewer and previewer.can_preview(fileobj):
return previewer.preview(fileobj)

# Go through all previewers to find the first one that can preview the file
for plugin in current_previewer.iter_container_item_previewers():
if plugin.can_preview(fileobj):
return plugin.preview(fileobj)

return default_previewer.preview(fileobj)
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new record_container_item_preview function lacks test coverage. Given that test coverage exists for similar functions like record_file_preview in tests/ui/test_file_download.py, consider adding tests for this new functionality to verify correct behavior, error handling, and permission checks.

Copilot uses AI. Check for mistakes.
Comment on lines +487 to +501
def find_container_item(container_item_metadata, path_parts):
"""Recursively find entry in TOC based on path parts."""
if not path_parts:
return None

key = path_parts[0]
for sub_item in container_item_metadata:
if sub_item["key"] == key:
if len(path_parts) == 1:
return sub_item
elif sub_item.get("children"):
return find_container_item(
sub_item["children"].values(), path_parts[1:]
)
return None
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The find_container_item helper function lacks test coverage. This function performs critical path parsing and navigation logic. Consider adding unit tests to verify correct behavior with various path structures, nested folders, missing items, and edge cases like empty paths or invalid keys.

Copilot uses AI. Check for mistakes.
<div class="no-padding right aligned column stretched">
<span>
{% if file_url_preview %}
<button class="ui compact mini button preview-link" id="{{t.id}}" data-container-file-key="{{file.filename}}" data-file-key="{{t.name}}" data-preview-url="{{file_url_preview}}">
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preview button should have an accessible label. Consider adding an aria-label attribute to describe the action, such as 'Preview {file name}' for screen reader users.

Suggested change
<button class="ui compact mini button preview-link" id="{{t.id}}" data-container-file-key="{{file.filename}}" data-file-key="{{t.name}}" data-preview-url="{{file_url_preview}}">
<button class="ui compact mini button preview-link" id="{{t.id}}" data-container-file-key="{{file.filename}}" data-file-key="{{t.name}}" data-preview-url="{{file_url_preview}}" aria-label="{{ _('Preview %(filename)s', filename=t.name) }}">

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,21 @@
/*
* This file is part of Invenio.
* Copyright (C) 2025 CESNET i.a.l.e..
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The copyright line contains a typo with an extra period after "i.a.l.e."

Suggested change
* Copyright (C) 2025 CESNET i.a.l.e..
* Copyright (C) 2025 CESNET i.a.l.e.

Copilot uses AI. Check for mistakes.
from invenio_previewer.proxies import current_previewer
from invenio_previewer.views import is_container_item_previewable

from invenio_app_rdm.records_ui.views.records import ContainerItemPreview
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please use everywhere the form. form ..views.records import ContainerItemPreview

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants