Fix #288 (#307)

Merged
fabnemEPFL merged 13 commits into swiss-ai:master from fabnemEPFL:fix/288
May 19, 2026
Conversation

@fabnemEPFL
Collaborator

This pull request refactors the handling of uploaded file metadata in the indexing API to ensure that the original filename is preserved and associated with each document, and improves the test coverage for this behavior. The main changes include extracting metadata processing into a helper function, updating the upload and update endpoints to use this helper, and adding new and updated tests to verify correct metadata handling.

Metadata handling improvements:

  • Introduced the _apply_uploaded_file_metadata helper function in run_index_api.py to consistently bind processed chunks to the API file ID and persist the original filename in document metadata.
  • Updated both the upload_file and update_file endpoints to use _apply_uploaded_file_metadata, ensuring consistent metadata assignment and filename preservation for uploaded documents.
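
The behavior described above can be illustrated with a minimal sketch. This is not the code from run_index_api.py: the `Chunk` class, the `"+"` separator between the document ID and the chunk suffix, and the `"filename"` metadata key are all assumptions made for illustration, and the function name drops the leading underscore to mark it as a stand-in.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    """Hypothetical stand-in for a processed document chunk."""

    id: str
    text: str
    metadata: dict[str, Any] = field(default_factory=dict)


def apply_uploaded_file_metadata(
    chunks: list[Chunk], file_id: str, filename: str, sep: str = "+"
) -> list[Chunk]:
    """Rebind chunks to the API file ID and record the original filename.

    Assumes chunk IDs look like "<source_id><sep><suffix>"; the suffix is kept
    so that chunks of the same file stay distinguishable.
    """
    for chunk in chunks:
        # rpartition returns ("", "", id) when the separator is absent.
        _, found_sep, suffix = chunk.id.rpartition(sep)
        chunk.id = f"{file_id}{found_sep}{suffix}" if found_sep else file_id
        # Persist the uploaded filename so a listing endpoint can return it.
        chunk.metadata["filename"] = filename
    return chunks
```

Calling the same function from both the upload and the update path is what keeps the two endpoints from drifting apart in how they assign IDs and filenames.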

Testing enhancements:

  • Added the test_apply_uploaded_file_metadata_preserves_chunk_suffix unit test to verify that the helper function correctly updates document IDs and stores the filename.
  • Added the test_uploaded_file_has_filename_in_list_files integration test to ensure that the /list_files API returns the correct filename for uploaded files.
  • Updated existing tests to use the DocumentMetadata class for document metadata, improving type safety and clarity.

Imports and code cleanup:

  • Updated imports in test_live_retriever_api.py to reflect the new helper function and DocumentMetadata usage.

@fabnemEPFL fabnemEPFL requested review from JCHAVEROT and Copilot May 19, 2026 10:03
Contributor

Copilot AI left a comment

Pull request overview

This pull request addresses #288 by refactoring the indexing API’s upload/update flows to consistently attach the original uploaded filename to indexed documents (so /list_files can return it), and by adding/adjusting tests to cover the behavior.

Changes:

  • Added _apply_uploaded_file_metadata helper to bind processed chunks to the API file ID and store the uploaded filename in metadata.
  • Updated /v1/files (upload) and /v1/files/{fileId} (update) to use the helper for consistent ID/metadata handling.
  • Enhanced test coverage by adding unit/integration tests and switching result-metadata fixtures to use DocumentMetadata.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/mmore/run_index_api.py | Introduces the metadata helper and applies it in upload/update endpoints to persist filename and preserve chunk suffixes. |
| tests/test_live_retriever_api.py | Adds tests validating filename persistence in /list_files and updates metadata fixtures to use DocumentMetadata. |

Comment thread src/mmore/run_index_api.py Outdated
Collaborator

@JCHAVEROT JCHAVEROT left a comment

Hi @fabnemEPFL, looks good to me!

I tested by sending two HTTP POST requests, the first using the API from the master branch and the second using the one from your branch (both APIs pointing to the same DB). The results below clearly show that your code fixed the issue!

curl -s 'http://localhost:8000/list_files?collection_name=my_docs' | jq
[
  {
    "id": "mytest1",
    "filename": "Unknown"
  },
  {
    "id": "mytest2",
    "filename": "mmore.pdf"
  }
]
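
The difference between the two entries above can be reproduced with a small sketch of how a listing might fall back to "Unknown" when no filename was stored. This is an assumption about the mechanism, not the actual /list_files implementation: the record layout and the `file_id` and `filename` keys are hypothetical.

```python
from typing import Any


def list_files(chunk_records: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Collapse chunk-level records into one entry per file ID.

    Hypothetical layout: each record has a "file_id" key and an optional
    "filename" key. Files without a stored filename are reported as "Unknown".
    """
    files: dict[str, str] = {}
    for record in chunk_records:
        file_id = record["file_id"]
        name = record.get("filename") or "Unknown"
        # Keep a real filename if any chunk of the file carries one.
        if files.get(file_id, "Unknown") == "Unknown":
            files[file_id] = name
    return [{"id": fid, "filename": name} for fid, name in files.items()]


# One file indexed without a filename, one indexed with it.
records = [
    {"file_id": "mytest1"},
    {"file_id": "mytest2", "filename": "mmore.pdf"},
    {"file_id": "mytest2", "filename": "mmore.pdf"},
]
listing = list_files(records)
```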

It's also great that you reused DocumentMetadata, introduced in a recent PR, in some of the tests 👍

Comment thread src/mmore/run_index_api.py Outdated
@fabnemEPFL fabnemEPFL merged commit 768d225 into swiss-ai:master May 19, 2026
3 checks passed
@fabnemEPFL fabnemEPFL deleted the fix/288 branch May 19, 2026 15:53
3 participants