Feature/fao open knowledge #94

lpi-tn · 2026-02-03T15:19:34Z

This pull request introduces comprehensive support for the FAO Open Knowledge data source, including new data models, a URL collector, and extensive unit tests. It also adds new utility functions for serializing dataclass instances and integrates this serialization into the document collection workflow. These changes improve the system's ability to ingest, validate, and process FAO Open Knowledge documents, while ensuring robust error handling and test coverage.

FAO Open Knowledge integration:

Added new data models in fao_open_knowledge.py to represent FAO Open Knowledge items, bundles, bitstreams, and related metadata, enabling structured parsing and validation of API responses.
Implemented FAOOpenKnowledgeURLCollector in fao_open_knowledge_collector.py to fetch and construct WeLearnDocument objects from the FAO Open Knowledge API, supporting automated document discovery and ingestion.
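As a rough illustration of the structured parsing and validation these models provide, here is a minimal sketch using only the standard library. The PR's own models are reported to be Pydantic-based. Every name below (`Item`, `Bitstream`, `parse_item`, `first_pdf`) and every JSON key is a hypothetical stand-in, not the PR's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Bitstream:
    # Hypothetical fields; the real model in the PR may differ.
    name: str
    mime_type: str
    size_bytes: int


@dataclass
class Item:
    # Hypothetical fields; the real model in the PR may differ.
    uuid: str
    name: str
    withdrawn: bool = False
    bitstreams: list[Bitstream] = field(default_factory=list)


def parse_item(payload: dict[str, Any]) -> Item:
    """Validate a raw API payload (assumed shape) and build an Item."""
    missing = [key for key in ("uuid", "name") if key not in payload]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    bitstreams = [
        Bitstream(
            name=raw["name"],
            mime_type=raw.get("mimeType", "application/octet-stream"),
            size_bytes=int(raw.get("sizeBytes", 0)),
        )
        for raw in payload.get("bitstreams", [])
    ]
    return Item(
        uuid=payload["uuid"],
        name=payload["name"],
        withdrawn=bool(payload.get("withdrawn", False)),
        bitstreams=bitstreams,
    )


def first_pdf(item: Item) -> Optional[Bitstream]:
    """Return the first PDF bitstream, or None for withdrawn items."""
    if item.withdrawn:
        return None
    return next(
        (b for b in item.bitstreams if b.mime_type == "application/pdf"), None
    )
```

Parsing into typed objects up front means a malformed response fails at the boundary with a clear error, before any document is built from it.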

Testing and validation:

Added extensive unit tests for both the FAO Open Knowledge plugin and its data models, covering scenarios such as embargoed, withdrawn, unauthorized, and error cases to ensure reliability and correct handling of edge cases.
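Tests of this kind can be sketched with `unittest.mock`, with no network access. The function `fetch_item`, the exception `DocumentUnavailable`, and the `withdrawn` key are hypothetical and only illustrate the pattern; they are not the plugin's real API.

```python
import unittest
from unittest.mock import Mock


class DocumentUnavailable(Exception):
    """Hypothetical error raised when a document cannot be collected."""


def fetch_item(http_get, url):
    """Hypothetical fetch step: call http_get and reject unusable responses."""
    response = http_get(url)
    if response.status_code in (401, 403):
        raise DocumentUnavailable("unauthorized")
    if response.status_code != 200:
        raise DocumentUnavailable(f"HTTP {response.status_code}")
    payload = response.json()
    if payload.get("withdrawn"):
        raise DocumentUnavailable("withdrawn")
    return payload


class FetchItemTest(unittest.TestCase):
    URL = "https://example.org/item"  # placeholder URL, never contacted

    def _fake_get(self, status, payload=None):
        # Build a stand-in for an HTTP client call returning a canned response.
        response = Mock(status_code=status)
        response.json.return_value = payload or {}
        return Mock(return_value=response)

    def test_unauthorized(self):
        with self.assertRaises(DocumentUnavailable):
            fetch_item(self._fake_get(401), self.URL)

    def test_http_error(self):
        with self.assertRaises(DocumentUnavailable):
            fetch_item(self._fake_get(500), self.URL)

    def test_withdrawn(self):
        with self.assertRaises(DocumentUnavailable):
            fetch_item(self._fake_get(200, {"withdrawn": True}), self.URL)

    def test_ok(self):
        get = self._fake_get(200, {"name": "Report"})
        self.assertEqual(fetch_item(get, self.URL), {"name": "Report"})
        get.assert_called_once_with(self.URL)
```

Saved as a module, the cases can be run with `python -m unittest`.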

Dataclass serialization utilities:

Introduced utility functions is_dataclass_instance, _inner_serialize_dataclass, and serialize_dataclass_instance in computed_metadata.py to recursively serialize dataclass instances, improving compatibility with downstream processing and storage.
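For readers unfamiliar with the technique, recursive dataclass serialization might look roughly like the sketch below. The three function names come from this PR, but the bodies are assumptions written for illustration and may not match computed_metadata.py. The `Checksum` and `Bitstream` classes are hypothetical examples.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import Any


def is_dataclass_instance(obj: Any) -> bool:
    # dataclasses.is_dataclass is true for classes too, so exclude types.
    return is_dataclass(obj) and not isinstance(obj, type)


def _inner_serialize_dataclass(obj: Any) -> Any:
    # Assumed behavior: walk nested dataclasses and containers recursively.
    if is_dataclass_instance(obj):
        return {
            f.name: _inner_serialize_dataclass(getattr(obj, f.name))
            for f in fields(obj)
        }
    if isinstance(obj, dict):
        return {k: _inner_serialize_dataclass(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [_inner_serialize_dataclass(v) for v in obj]
    return obj


def serialize_dataclass_instance(obj: Any) -> dict:
    # Assumed behavior: reject anything that is not a dataclass instance.
    if not is_dataclass_instance(obj):
        raise TypeError(f"expected a dataclass instance, got {type(obj).__name__}")
    return _inner_serialize_dataclass(obj)


@dataclass
class Checksum:  # hypothetical example class
    algorithm: str
    value: str


@dataclass
class Bitstream:  # hypothetical example class
    name: str
    checksum: Checksum
    tags: list


bitstream = Bitstream("report.pdf", Checksum("MD5", "abc123"), ["fao"])
# The result holds only dicts, lists, and scalars, so json.dumps accepts it.
print(json.dumps(serialize_dataclass_instance(bitstream)))
```

The standard library's `dataclasses.asdict` covers the simple case. A hand-written walker like this one is useful mainly when the traversal needs custom handling for particular value types.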

Workflow integration:

Integrated the new dataclass serialization utility into the main document collection workflow in document_collector.py, ensuring that all document details are properly serialized before database insertion.

Copilot

Pull request overview

This PR integrates the FAO Open Knowledge data source into the system, enabling automated document discovery, validation, and ingestion from the FAO Open Knowledge repository.

Changes:

Added comprehensive FAO Open Knowledge integration including data models, URL collector, and document collector plugin
Implemented dataclass serialization utilities to properly handle structured metadata before database storage
Added extensive unit tests covering various edge cases (embargoed, withdrawn, unauthorized documents, HTTP errors)

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| welearn_datastack/plugins/rest_requesters/fao_open_knowledge.py | New collector plugin that fetches and processes FAO Open Knowledge documents |
| welearn_datastack/data/source_models/fao_open_knowledge.py | Pydantic models for FAO Open Knowledge API responses |
| welearn_datastack/collectors/fao_open_knowledge_collector.py | URL collector for discovering FAO Open Knowledge documents |
| welearn_datastack/modules/computed_metadata.py | Added dataclass serialization utilities |
| welearn_datastack/nodes_workflow/DocumentHubCollector/document_collector.py | Integrated dataclass serialization into workflow |
| welearn_datastack/plugins/rest_requesters/`__init__.py` | Registered FAO collector plugin |
| welearn_datastack/nodes_workflow/URLCollectors/node_fao_open_knowledge_collect.py | Workflow node for FAO URL collection |
| tests/source_models/test_fao_open_knownledge.py | Tests for FAO data models |
| tests/document_collector_hub/plugins_test/test_fao_open_knowledge.py | Tests for FAO collector plugin |
| welearn_datastack/plugins/rest_requesters/open_alex.py | Removed blank line |

lpi-tn added 12 commits January 29, 2026 15:11

typo (795610a)

feat(fao-open-knowledge): implement FAO Open Knowledge data models and URL collector (9c5f6e4)

feat(fao-open-knowledge): refactor data models and implement FAO Open Knowledge collector (161a049)

feat(fao-open-knowledge): add MetadataEntry model and enhance embargo status extraction (d96d455)

feat(fao-open-knowledge): implement detail extraction for FAO documents and enhance SDG processing (c5e03e7)

feat(fao-open-knowledge): enhance error handling and add unit tests for FAO Open Knowledge collector (8caade9)

feat(fao-open-knowledge): improve metadata parsing and update test cases for consistency (85acbe9)

feat(fao-open-knowledge): add unit tests for FAO Open Knowledge models and metadata validation (ef2fdbc)

Merge branch 'main' into Feature/FAO-open-knowledge (772ea9e)

feat(fao_open_knowledge): add Bitstream and Checksum models, enhance PDF extraction logic (0a6d5a5)

feat(fao_open_knowledge): implement serialization for dataclass instances and enhance document processing (a89e554)

feat(fao_open_knowledge): enhance test coverage for document processing and error handling (3e0898a)

lpi-tn requested review from Copilot and sandragjacinto February 3, 2026 15:19

Copilot AI reviewed Feb 3, 2026

lpi-tn and others added 2 commits February 3, 2026 17:11

Update welearn_datastack/plugins/rest_requesters/fao_open_knowledge.py (4f275bd, co-authored by Copilot)

Update welearn_datastack/plugins/rest_requesters/fao_open_knowledge.py (b55ca25, co-authored by Copilot)
