Add LMFDB concepts #8
Merged
6 commits
7a931c7 Some Wikipedia entries need more than 200 chars, increase to 300; add… (katjabercic)
2588ea6 Add a slurper for the LMFDB (katjabercic)
cf237ef Add slurper runs to ensure we don't run certain slurpers too often (katjabercic)
82082c4 Add guard for clearing LMFDB, messages in rebuild_db (katjabercic)
4ae9873 -minor reformatting fix (Stanoja)
c680015 -added Makefile command (Stanoja)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| # Generated by Django 4.2.28 on 2026-04-22 07:46 | ||
|
|
||
| from django.db import migrations, models | ||
|
|
||
|
|
||
| class Migration(migrations.Migration): | ||
|
|
||
| dependencies = [ | ||
| ('concepts', '0019_remove_concept_unique_lower_name_concept_normal_name_and_more'), | ||
| ] | ||
|
|
||
| operations = [ | ||
| migrations.AlterField( | ||
| model_name='item', | ||
| name='source', | ||
| field=models.CharField(choices=[('Wd', 'Wikidata'), ('nL', 'nLab'), ('MW', 'MathWorld'), ('PW', 'ProofWiki'), ('EoM', 'Encyclopedia of Mathematics'), ('WpEN', 'Wikipedia (English)'), ('AUm', 'Agda Unimath'), ('LMF', 'The L-functions and modular forms database')], max_length=4), | ||
| ), | ||
| ] |
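
The migration above adds the `LMF` code to the `source` choices while keeping `max_length=4`. A minimal sketch of a sanity check that every code still fits the column follows; the list is copied from the migration, while the names `SOURCE_CHOICES`, `MAX_LENGTH`, `too_long`, and `longest` are illustrative and not part of the PR.

```python
# Choices copied from the AlterField operation above.
SOURCE_CHOICES = [
    ("Wd", "Wikidata"),
    ("nL", "nLab"),
    ("MW", "MathWorld"),
    ("PW", "ProofWiki"),
    ("EoM", "Encyclopedia of Mathematics"),
    ("WpEN", "Wikipedia (English)"),
    ("AUm", "Agda Unimath"),
    ("LMF", "The L-functions and modular forms database"),
]
MAX_LENGTH = 4  # max_length of Item.source in the migration

# Codes that would not fit into the CharField (expected to be empty).
too_long = [code for code, _label in SOURCE_CHOICES if len(code) > MAX_LENGTH]

# Length of the longest stored code.
longest = max(len(code) for code, _label in SOURCE_CHOICES)
```

The same values are later stored in `SlurperRun.source`, which allows up to 8 characters, so any code that fits here also fits there.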
web/concepts/migrations/0021_alter_item_identifier_alter_item_name_alter_item_url.py
28 changes: 28 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # Generated by Django 4.2.28 on 2026-04-24 10:08 | ||
|
|
||
| from django.db import migrations, models | ||
|
|
||
|
|
||
| class Migration(migrations.Migration): | ||
|
|
||
| dependencies = [ | ||
| ('concepts', '0020_alter_item_source'), | ||
| ] | ||
|
|
||
| operations = [ | ||
| migrations.AlterField( | ||
| model_name='item', | ||
| name='identifier', | ||
| field=models.CharField(max_length=300), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name='item', | ||
| name='name', | ||
| field=models.CharField(max_length=300, null=True), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name='item', | ||
| name='url', | ||
| field=models.URLField(max_length=300), | ||
| ), | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| import logging | ||
| import sys | ||
| from datetime import timedelta | ||
|
|
||
| from concepts.models import Item | ||
| from django.core.management.base import BaseCommand | ||
| from slurper.models import SlurperRun | ||
|
|
||
| MIN_INTERVAL = timedelta(days=7) | ||
|
|
||
|
|
||
| class Command(BaseCommand): | ||
| help = ( | ||
| "Delete all LMFDB items. Guarded by a 7-day throttle; use --force to override." | ||
| ) | ||
|
|
||
| def add_arguments(self, parser): | ||
| parser.add_argument( | ||
| "--force", | ||
| action="store_true", | ||
| help="Clear even if the LMFDB slurper ran within the last 7 days.", | ||
| ) | ||
|
|
||
| def handle(self, *args, force=False, **options): | ||
| source = Item.Source.LMFDB | ||
| if not force and not SlurperRun.can_run(source, MIN_INTERVAL): | ||
| if sys.stdin.isatty(): | ||
| answer = ( | ||
| input( | ||
| f"LMFDB slurper ran within the last {MIN_INTERVAL.days} days. " | ||
| f"Clear anyway? [y/N] " | ||
| ) | ||
| .strip() | ||
| .lower() | ||
| ) | ||
| if answer not in ("y", "yes"): | ||
| logging.info(f"[{source.label}] clear cancelled.") | ||
| return | ||
| else: | ||
| logging.info( | ||
| f"[{source.label}] clear skipped: ran less than " | ||
| f"{MIN_INTERVAL.days} days ago (use --force to override)." | ||
| ) | ||
| return | ||
| deleted, _ = Item.objects.filter(source=source).delete() | ||
| logging.info(f"[{source.label}] cleared {deleted} items.") |
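
The guard in `handle` above has three outcomes: clear, cancel after a prompt, or skip silently when there is no terminal. A minimal sketch of that branching as a pure function may make it easier to review and test without Django or stdin; `Decision` and `decide_clear` are hypothetical names, not part of the PR.

```python
from enum import Enum


class Decision(Enum):
    CLEAR = "clear"          # proceed to delete the LMFDB items
    CANCELLED = "cancelled"  # interactive user declined
    SKIPPED = "skipped"      # non-interactive run inside the throttle window


def decide_clear(force: bool, can_run: bool, is_tty: bool, answer: str = "") -> Decision:
    """Mirror the branching of the clear command's handle() method."""
    if force or can_run:
        # Either --force was given or the throttle window has passed.
        return Decision.CLEAR
    if not is_tty:
        # No terminal to prompt on: skip instead of blocking on input().
        return Decision.SKIPPED
    if answer.strip().lower() in ("y", "yes"):
        return Decision.CLEAR
    # Anything else, including an empty answer, counts as "no".
    return Decision.CANCELLED
```

In this sketch the caller would supply `sys.stdin.isatty()` and the result of `input()`, which keeps the decision logic free of I/O.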
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| from django.core.management.base import BaseCommand | ||
| from slurper import source_lmfdb | ||
|
|
||
|
|
||
| class Command(BaseCommand): | ||
| def add_arguments(self, parser): | ||
| parser.add_argument( | ||
| "--force", | ||
| action="store_true", | ||
| help="Bypass the 7-day throttle and run anyway.", | ||
| ) | ||
|
|
||
| def handle(self, *args, force=False, **options): | ||
| source_lmfdb.LMFDB_SLURPER.save_items(force=force) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # Generated by Django 4.2.28 on 2026-04-24 15:38 | ||
|
|
||
| from django.db import migrations, models | ||
|
|
||
|
|
||
| class Migration(migrations.Migration): | ||
|
|
||
| initial = True | ||
|
|
||
| dependencies = [ | ||
| ] | ||
|
|
||
| operations = [ | ||
| migrations.CreateModel( | ||
| name='SlurperRun', | ||
| fields=[ | ||
| ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), | ||
| ('source', models.CharField(max_length=8, unique=True)), | ||
| ('last_succeeded_at', models.DateTimeField()), | ||
| ], | ||
| ), | ||
| ] |
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| from datetime import timedelta | ||
|
|
||
| from django.db import models | ||
| from django.utils import timezone | ||
|
|
||
|
|
||
| class SlurperRun(models.Model): | ||
| """Tracks the last successful run of a named slurper for throttling.""" | ||
|
|
||
| source = models.CharField(max_length=8, unique=True) | ||
| last_succeeded_at = models.DateTimeField() | ||
|
|
||
| @classmethod | ||
| def can_run(cls, source: str, min_interval: timedelta) -> bool: | ||
| try: | ||
| last = cls.objects.get(source=source).last_succeeded_at | ||
| except cls.DoesNotExist: | ||
| return True | ||
| return timezone.now() - last >= min_interval | ||
|
|
||
| @classmethod | ||
| def mark_ran(cls, source: str) -> None: | ||
| cls.objects.update_or_create( | ||
| source=source, | ||
| defaults={"last_succeeded_at": timezone.now()}, | ||
| ) |
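
The throttle semantics of `SlurperRun` can be illustrated without a database. The following is a rough sketch only: `InMemoryRunLog` is a hypothetical stand-in that keeps one timestamp per source in a dict, and the explicit `now` parameter is an assumption added here so the example is deterministic.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


class InMemoryRunLog:
    """Hypothetical stand-in for the SlurperRun table: one timestamp per source."""

    def __init__(self) -> None:
        self._last_succeeded_at: Dict[str, datetime] = {}

    def can_run(self, source: str, min_interval: timedelta,
                now: Optional[datetime] = None) -> bool:
        now = now or datetime.now(timezone.utc)
        last = self._last_succeeded_at.get(source)
        if last is None:
            # No recorded run: corresponds to the DoesNotExist branch in the model.
            return True
        return now - last >= min_interval

    def mark_ran(self, source: str, now: Optional[datetime] = None) -> None:
        # Upsert keyed on source, like update_or_create on the unique field.
        self._last_succeeded_at[source] = now or datetime.now(timezone.utc)


log = InMemoryRunLog()
t0 = datetime(2026, 4, 24, 12, 0, tzinfo=timezone.utc)
week = timedelta(days=7)

first = log.can_run("LMF", week, now=t0)                         # never ran before
log.mark_ran("LMF", now=t0)
too_soon = log.can_run("LMF", week, now=t0 + timedelta(days=6))  # inside the window
boundary = log.can_run("LMF", week, now=t0 + week)               # exactly 7 days later
```

Because the comparison is `>=`, a run exactly `min_interval` after the last success is allowed; only strictly shorter gaps are throttled.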
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,57 @@ | ||
| import logging | ||
| from datetime import timedelta | ||
|
|
||
| from concepts.models import Item | ||
| from django.db.utils import IntegrityError | ||
| from psycopg2.sql import SQL | ||
| from slurper.models import SlurperRun | ||
|
|
||
|
|
||
| class LmfdbSlurper: | ||
| KNOWL_URL_PREFIX = "https://www.lmfdb.org/knowledge/show/" | ||
| MIN_INTERVAL = timedelta(days=7) | ||
|
|
||
| def __init__(self): | ||
| self.source = Item.Source.LMFDB | ||
|
|
||
| def fetch_rows(self): | ||
| from lmf import db | ||
|
Collaborator: I added an import for this; it seems that the correct library is imported: https://github.com/roed314/lmfdb-lite |
||
|
|
||
| cur = db._execute(SQL("SELECT id, title, content FROM kwl_knowls")) | ||
| columns = [desc[0] for desc in cur.description] | ||
| for row in cur: | ||
| yield dict(zip(columns, row)) | ||
|
|
||
| def row_to_item(self, row) -> Item: | ||
| return Item( | ||
| source=self.source, | ||
| identifier=row["id"], | ||
| url=self.KNOWL_URL_PREFIX + row["id"], | ||
| name=row["title"], | ||
| description=row["content"], | ||
| ) | ||
|
|
||
| def save_items(self, force: bool = False): | ||
| if not force and not SlurperRun.can_run(self.source, self.MIN_INTERVAL): | ||
| logging.info( | ||
| f"[{self.source.label}] skipped: ran less than " | ||
| f"{self.MIN_INTERVAL.days} days ago (use --force to override)." | ||
| ) | ||
| return | ||
| total_saved = 0 | ||
| for row in self.fetch_rows(): | ||
| item = self.row_to_item(row) | ||
| try: | ||
| item.save() | ||
| total_saved += 1 | ||
| except IntegrityError: | ||
| logging.info( | ||
| f"Item {item.source} {item.identifier} is already in the database." | ||
| ) | ||
| SlurperRun.mark_ran(self.source) | ||
| logging.info( | ||
| f"[{self.source.label}] save_items finished: {total_saved} items saved." | ||
| ) | ||
|
|
||
|
|
||
| LMFDB_SLURPER = LmfdbSlurper() | ||
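
The `fetch_rows` and `row_to_item` steps above can be tried without an LMFDB connection. This is a sketch under stated assumptions: `FakeCursor` is a hypothetical stand-in for a DB-API cursor, the sample row is invented for illustration, and `row_to_fields` returns a plain dict rather than an `Item`.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

KNOWL_URL_PREFIX = "https://www.lmfdb.org/knowledge/show/"


class FakeCursor:
    """Hypothetical DB-API-like cursor: iterable rows plus a .description attribute."""

    def __init__(self, columns: List[str], rows: Iterable[Tuple[Any, ...]]) -> None:
        # DB-API describes each column with a 7-item sequence; the name comes first.
        self.description = [(name, None, None, None, None, None, None) for name in columns]
        self._rows = list(rows)

    def __iter__(self) -> Iterator[Tuple[Any, ...]]:
        return iter(self._rows)


def rows_as_dicts(cur: Any) -> Iterator[Dict[str, Any]]:
    """Pair column names with row values, as fetch_rows does."""
    columns = [desc[0] for desc in cur.description]
    for row in cur:
        yield dict(zip(columns, row))


def row_to_fields(row: Dict[str, Any]) -> Dict[str, Any]:
    """Same field mapping as row_to_item, without constructing a Django model."""
    return {
        "identifier": row["id"],
        "url": KNOWL_URL_PREFIX + row["id"],
        "name": row["title"],
        "description": row["content"],
    }


# Invented sample row, for illustration only.
cur = FakeCursor(
    ["id", "title", "content"],
    [("example.knowl", "Example title", "Example content")],
)
fields = [row_to_fields(row) for row in rows_as_dicts(cur)]
```

Keeping the mapping separate from the query makes it possible to test the URL construction on its own, independent of the private `db._execute` call.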
☝️ ...this dependency