feat: use author identifiers in import API #10110
Merged: cdrini merged 47 commits into internetarchive:master from pidgezero-one:9448/feat/use-known-author-identifiers-in-import on Mar 27, 2025
Changes from 45 commits
19a1566 author identifiers in import (pidgezero-one)
5912f75 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
8305af5 this wasnt supposed to be here (pidgezero-one)
280d715 this was supposed to be here (pidgezero-one)
a3b8412 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
810dc33 notes (pidgezero-one)
9d54ff7 no more try/catch (pidgezero-one)
0be86b7 precommits (pidgezero-one)
f57f997 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
82b198c ? (pidgezero-one)
87d9f25 re: slack convo, go ahead and import this and use get_author_config o… (pidgezero-one)
f9a43c5 scripts (pidgezero-one)
047211c merge (pidgezero-one)
18541e8 books are being imported, but author page does not list them (pidgezero-one)
e525c23 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
6473fbb fix failing test (pidgezero-one)
c8e43b8 add 1900 exemption for wikisource, move script requirements into thei… (pidgezero-one)
419016a [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
00c3dff Update openlibrary/catalog/add_book/load_book.py (pidgezero-one)
a2af74a Update openlibrary/catalog/add_book/load_book.py (pidgezero-one)
309daa9 Update openlibrary/catalog/add_book/load_book.py (pidgezero-one)
e47f5cb Update openlibrary/catalog/add_book/load_book.py (pidgezero-one)
a8cd195 Update openlibrary/catalog/add_book/load_book.py (pidgezero-one)
01c2576 Update openlibrary/catalog/add_book/load_book.py (pidgezero-one)
c730dd0 Update openlibrary/core/models.py (pidgezero-one)
bab3f89 Update openlibrary/core/models.py (pidgezero-one)
8596a3b Update openlibrary/core/models.py (pidgezero-one)
a3d9a24 Update scripts/providers/import_wikisource.py (pidgezero-one)
9bbf023 Update scripts/providers/import_wikisource.py (pidgezero-one)
ec13685 Update scripts/providers/import_wikisource.py (pidgezero-one)
a1251ad Update scripts/providers/import_wikisource.py (pidgezero-one)
9c9cc48 Update scripts/providers/import_wikisource.py (pidgezero-one)
a615f26 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
253ed32 imports (pidgezero-one)
3f42068 Merge branch '9448/feat/use-known-author-identifiers-in-import' of ht… (pidgezero-one)
a9456f8 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
9563baf commtents (pidgezero-one)
6faf88d Merge branch '9448/feat/use-known-author-identifiers-in-import' of ht… (pidgezero-one)
6facf5b bracket fixes (pidgezero-one)
cd051af [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
43b3aaa update script instructions (pidgezero-one)
b9ab864 :( (pidgezero-one)
d904ec1 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
7c9f483 Merge branch 'master' into 9448/feat/use-known-author-identifiers-in-… (pidgezero-one)
37f173e ? (pidgezero-one)
6fa2c1a Update import API to use key/remote_ids instead of ol_id/identifiers … (cdrini)
4895023 Have Author.merge_remote_ids error on conflict for now (cdrini)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -30,6 +30,7 @@ | |
| from openlibrary.core.ratings import Ratings | ||
| from openlibrary.core.vendors import get_amazon_metadata | ||
| from openlibrary.core.wikidata import WikidataEntity, get_wikidata_entity | ||
| from openlibrary.plugins.upstream.utils import get_identifier_config | ||
| from openlibrary.utils import extract_numeric_id_from_olid | ||
| from openlibrary.utils.isbn import canonical, isbn_13_to_isbn_10, to_isbn_13 | ||
|
|
||
|
|
@@ -802,6 +803,33 @@ def get_edition_count(self): | |
| def get_lists(self, limit=50, offset=0, sort=True): | ||
| return self._get_lists(limit=limit, offset=offset, sort=sort) | ||
|
|
||
| def merge_remote_ids( | ||
| self, incoming_ids: dict[str, str] | ||
| ) -> tuple[dict[str, str], int]: | ||
| """Returns the author's remote IDs merged with a given remote IDs object, as well as a count for how many IDs had conflicts. | ||
| If incoming_ids is empty, or if there are more conflicts than matches, no merge will be attempted, and the output will be (author.remote_ids, -1). | ||
| """ | ||
| output = {**self.remote_ids} | ||
pidgezero-one (Contributor, Author) commented; conversation marked as resolved:
Ended up having to revert to this dict unpacking. `self.remote_ids` is being treated as a Thing and not a dict, for some reason (despite every other operation on it in this codebase suggesting it should be a dict), so deepcopy fails. I'm stumped on why that's happening.
||
| if not incoming_ids: | ||
| return output, -1 | ||
| # Count | ||
| matches = 0 | ||
| conflicts = 0 | ||
| config = get_identifier_config("author") | ||
| for id in config["identifiers"]: | ||
| identifier: str = id.name | ||
| if identifier in output and identifier in incoming_ids: | ||
| if output[identifier] != incoming_ids[identifier]: | ||
| conflicts = conflicts + 1 | ||
| else: | ||
| matches = matches + 1 | ||
| # Decide at a later date if we want to change the logic to bail only when conflicts > matches, instead of conflicts > 0. | ||
| # if conflicts > matches: | ||
| if conflicts > 0: | ||
| # TODO: Raise this to librarians, somehow. | ||
| return self.remote_ids, -1 | ||
| return output, matches | ||
|
|
||
|
|
||
| class User(Thing): | ||
| def get_default_preferences(self): | ||
|
|
||
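The conflict check in `merge_remote_ids` above can be sketched as a standalone function. This is a minimal illustration, not the code that was merged: it takes plain dicts plus a list of identifier names in place of `self.remote_ids` and `get_identifier_config("author")`, and the final step that copies new incoming IDs into the result is an assumption drawn from the docstring's "merged with" wording, since the diff as shown only counts matches and conflicts. Like the diff, it returns the number of matches (not conflicts) and bails out on any conflict.

```python
def merge_remote_ids_sketch(
    existing: dict[str, str],
    incoming: dict[str, str],
    known_identifiers: list[str],
) -> tuple[dict[str, str], int]:
    """Return (merged_ids, match_count), or (copy of existing, -1) when
    there is nothing to merge or any identifier conflicts."""
    # Shallow copy; the values are plain strings, so deepcopy is not needed.
    merged = dict(existing)
    if not incoming:
        return merged, -1

    matches = 0
    for name in known_identifiers:
        if name in merged and name in incoming:
            if merged[name] != incoming[name]:
                # Any conflict aborts the merge, mirroring `conflicts > 0`.
                return dict(existing), -1
            matches += 1

    # Assumption: identifiers present only in `incoming` are added.
    for name in known_identifiers:
        if name in incoming and name not in merged:
            merged[name] = incoming[name]
    return merged, matches
```

Passing the identifier names explicitly keeps the sketch independent of Open Library's configuration; in the real method those names come from the author identifier config.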
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| # Temporary requirements for running standalone scripts that are not necessary for OL to function. | ||
| # Run like this: | ||
| # python -m pip install -r requirements_scripts.txt && PYTHONPATH=. python ./path/to/script.py optional_args... && python -m pip uninstall -y -r requirements_scripts.txt | ||
|
|
||
| mwparserfromhell==0.6.6 | ||
| nameparser==1.1.3 | ||
| wikitextparser==0.56.1 |
We might want to name this one `key` to be consistent with our book/thing records. Having the import endpoint mirror the shape of our core book records is convenient.
@Freso do you have any strong opinions on this? ^
(see also Drini's comment above about remote_ids vs identifiers!)