@satyaog satyaog commented Jan 23, 2026

Add PaperoniV2 discoverer and validation command

This PR adds support for reading from the PaperoniV2 JSON database format:

  • New discoverer: PaperoniV2 class that parses JSON database files and converts them to the paperoni model format, including authors, institutions, topics, links, and validation flags
  • New validation command: Coll.Validate command that marks papers in the collection as "validated" based on validation status from the v2 database
  • Configuration: Added v2 discovery source to basic.yaml config
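
To illustrate the validation step described above, here is a minimal sketch of how v2 validation status might be carried over to a v3 collection. It is not the code from this PR: the `Paper` class, the `validated` and `title` record keys, and title-based matching are all assumptions made for illustration, and the real `Coll.Validate` command may match papers differently.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical stand-in for a paper in the v3 collection."""
    title: str
    flags: set[str] = field(default_factory=set)


def normalize(title: str) -> str:
    """Build a case- and whitespace-insensitive key for matching titles."""
    return " ".join(title.lower().split())


def mark_validated(collection: list[Paper], v2_records: list[dict]):
    """Flag papers whose v2 record is validated.

    Returns (validated, not_found): titles that were flagged, and titles of
    validated v2 records with no match in the collection.
    """
    by_title = {normalize(p.title): p for p in collection}
    validated, not_found = [], []
    for record in v2_records:
        # Assumed record layout: {"title": str, "validated": bool}
        if not record.get("validated"):
            continue  # only validated v2 records are carried over
        paper = by_title.get(normalize(record["title"]))
        if paper is None:
            not_found.append(record["title"])
        else:
            paper.flags.add("validated")
            validated.append(paper.title)
    return validated, not_found
```

Matching on a normalized title keeps the sketch short; a real migration would more likely prefer stable identifiers (for example links or DOIs) when both databases have them.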

@satyaog satyaog marked this pull request as ready for review January 26, 2026 18:38
@satyaog satyaog force-pushed the migratev2 branch 3 times, most recently from eadb627 to b1d34fa January 27, 2026 18:50
Comment on lines +719 to +714
validated = []
not_found = []

These lists are currently only used to count the papers, but we could also save them, which would let us analyse which papers were not found in the v3 database.
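
One possible way to keep the lists instead of only counting them is sketched below. The function name `save_validation_report`, the report layout, and the idea that the lists hold plain strings (for example titles) are assumptions for illustration, not part of this PR.

```python
import json
from pathlib import Path


def save_validation_report(validated, not_found, path):
    """Write both lists and their counts to a JSON file and return the report.

    `validated` and `not_found` are assumed to be lists of strings.
    """
    report = {
        "validated_count": len(validated),
        "not_found_count": len(not_found),
        # Sorting keeps the file stable between runs, which makes diffs readable.
        "validated": sorted(validated),
        "not_found": sorted(not_found),
    }
    Path(path).write_text(json.dumps(report, indent=2), encoding="utf-8")
    return report
```

The counts stay available for the existing summary output, while the `not_found` entries can be inspected afterwards to see which v2 papers are missing from the v3 database.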

@satyaog satyaog force-pushed the migratev2 branch 2 times, most recently from 860d43b to 15f3cce January 28, 2026 17:07
satyaog commented Jan 28, 2026

Superseded by #131

@satyaog satyaog closed this Jan 28, 2026