Add Feature: Metadata Extraction from Files #2311

Open
wants to merge 12 commits into
base: main
from

marcobelligoli

This PR adds metadata extraction from files. The feature supports multiple file types and retrieves key metadata to improve file management and analysis.
It adds a default metadata extractor and an implementation backed by the "Alfresco Transform Service".
Custom extractor components can also be created.
A new service centralizes metadata extraction across one or more extractors.
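
To illustrate how one service can centralize extraction across several extractors, here is a minimal sketch. It is not the code in this PR: the method signatures on `MetadataExtractor`, and the `SimpleMetadataExtractionService`, `SizeExtractor`, and `LineCountExtractor` classes, are hypothetical and chosen only for illustration. The merge policy (the first extractor to supply a key wins) is also an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shape; the interface in the PR may declare different methods.
interface MetadataExtractor {
    boolean supports(String mimeType);
    Map<String, Object> extract(byte[] content, String mimeType);
}

// Hypothetical service: asks every extractor that supports the MIME type
// and merges the results into one map.
final class SimpleMetadataExtractionService {
    private final List<MetadataExtractor> extractors;

    SimpleMetadataExtractionService(List<MetadataExtractor> extractors) {
        this.extractors = List.copyOf(extractors);
    }

    Map<String, Object> extract(byte[] content, String mimeType) {
        Map<String, Object> merged = new LinkedHashMap<>();
        for (MetadataExtractor extractor : extractors) {
            if (extractor.supports(mimeType)) {
                // Assumed policy: earlier extractors win on key conflicts.
                extractor.extract(content, mimeType).forEach(merged::putIfAbsent);
            }
        }
        return merged;
    }
}

// Stand-in for a "default" extractor: works for any content type.
final class SizeExtractor implements MetadataExtractor {
    public boolean supports(String mimeType) {
        return true;
    }

    public Map<String, Object> extract(byte[] content, String mimeType) {
        return Map.of("size", content.length, "mimeType", mimeType);
    }
}

// Stand-in for a custom component: only handles text content.
final class LineCountExtractor implements MetadataExtractor {
    public boolean supports(String mimeType) {
        return mimeType.startsWith("text/");
    }

    public Map<String, Object> extract(byte[] content, String mimeType) {
        long lines = new String(content, StandardCharsets.UTF_8).lines().count();
        return Map.of("lineCount", lines);
    }
}

class MetadataExtractionSketch {
    public static void main(String[] args) {
        SimpleMetadataExtractionService service = new SimpleMetadataExtractionService(
                List.of(new SizeExtractor(), new LineCountExtractor()));
        byte[] content = "first\nsecond\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(service.extract(content, "text/plain"));
    }
}
```

In a Spring Boot application the extractor list would more likely be collected by the auto-configuration than built by hand, but that wiring is outside what this sketch assumes.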

Changes:

  • Added components to process metadata extraction.
  • Added metadata extraction auto-configuration.
  • Included unit tests to ensure reliability of the feature.
  • Updated documentation with usage details.

Introduced metadata extraction functionality, including MetadataExtractor and MetadataExtractionService interfaces, along with a default extractor implementation.

Added associated autoconfiguration to integrate with Spring Boot and unit tests to verify correctness.
Split metadata extraction from renditions into a dedicated module, `spring-content-metadata-extraction`, with its own configurations, documentation, and Maven artifacts. Updated package structure, dependencies, and imports accordingly.
Introduced `AlfrescoTransformCoreMetadataExtractor` for metadata extraction using the Alfresco Transform Core API. Added utilities for handling file conversions and test cases to verify the functionality. Updated dependencies in `pom.xml` to support the new extractor.
Replaced dependency injection of `RestTemplate` and `ObjectMapper` with direct instantiation. Simplified the constructor and improved readability by removing unnecessary parameters. Adjusted the test code to align with these changes.
Updated documentation to describe integration with Alfresco Transform Core for metadata extraction. Introduced `MetadataExtractionService` example, detailing its use for centralizing metadata retrieval from various extractors.
@CLAassistant

CLAassistant commented Mar 25, 2025

CLA assistant check
All committers have signed the CLA.

Labels
None yet
Projects
None yet
2 participants