Import data from MathAlgoDB into the MaRDI Portal.
The script parses an RDF/XML export of MathAlgoDB and creates/updates items (algorithms, problems, software, benchmarks) on the MaRDI portal, including their properties and inverse relations.
- Python >= 3.11
- uv (handles dependencies automatically)
No manual installation of dependencies is needed. Running the script with `uv run` will automatically install `rdflib` and `mardiclient`.

    importer/
    ├── import_mathalgodb.py    # Main script
    ├── pyproject.toml
    ├── README.md
    └── mappings/
        ├── staging/
        │   ├── config.json     # Staging environment configuration
        │   └── mardi.json      # MathAlgoDB ID -> QID mapping (staging)
        └── production/
            ├── config.json     # Production environment configuration
            └── mardi.json      # MathAlgoDB ID -> QID mapping (production)

Each environment has a config.json in mappings/{environment}/ with:
| Key | Description |
|---|---|
| `wikibase_host` | Portal hostname (e.g. `staging.mardi4nfdi.org`). API endpoints are derived from this. |
| `instance_mapping` | Maps entity types (problem, software, algorithm, benchmark) to their "instance of" QIDs. |
| `profile_mapping` | Maps entity types to their MaRDI profile type QIDs. |
| `property_mapping` | Maps MathAlgoDB relation names (e.g. `solvedBy`, `subclassOf`) and identifier types (e.g. DOI, arxiv, swmath) to Wikibase property IDs. |
| `qualifier_mapping` | Maps relation names that are stored as qualifier-based claims (e.g. `documentedIn`, `analyzedIn`) to the QID used as the qualifier value. |
| `community_item` | QID of the MathAlgoDB community item. |
| `instance_of_property` | Property ID for "instance of" claims. |
| `profile_type_property` | Property ID for MaRDI profile type claims. |
| `community_property` | Property ID for community claims. |
| `mathalgodb_identifier_property` | Property ID for the MathAlgoDB identifier string claim. |
| `object_has_role_property` | Property ID used as the qualifier property for qualifier-based claims. |
| `documented_in_property` | Property ID under which qualifier-based claims are stored (the main property of the qualified statement). |
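
As a rough illustration of how such a file might be consumed, the sketch below loads `mappings/{environment}/config.json` and checks that every key from the table is present. The names `load_config` and `REQUIRED_KEYS` are hypothetical and not taken from the script, and failing on a missing key is an assumption about sensible behaviour rather than a description of what the importer does.

```python
import json
from pathlib import Path

# Keys listed in the configuration table; the constant name is illustrative only.
REQUIRED_KEYS = {
    "wikibase_host",
    "instance_mapping",
    "profile_mapping",
    "property_mapping",
    "qualifier_mapping",
    "community_item",
    "instance_of_property",
    "profile_type_property",
    "community_property",
    "mathalgodb_identifier_property",
    "object_has_role_property",
    "documented_in_property",
}


def load_config(environment: str, base_dir: Path = Path("mappings")) -> dict:
    """Load config.json for one environment and fail early on missing keys."""
    path = base_dir / environment / "config.json"
    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"{path}: missing keys {sorted(missing)}")
    return config
```

Checking the keys up front means a typo in a hand-edited `config.json` surfaces before any item is touched.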
Each environment directory contains a `mardi.json` file that maps MathAlgoDB individual IDs to Wikibase QIDs:

    {
      "al:BICGSTABl": "Q6825304",
      "al:ClassDirectDense": "Q6825303",
      "pr:TangentFromPointToCircle": "Q6825307",
      "sw:polymake": "Q6825333",
      "pb:FerN83": "Q4745688",
      ...
    }

This file is read at startup to determine which items already exist. When new items are created, the file is updated automatically.
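
The read-at-startup, update-on-create cycle could look roughly like the sketch below. The function names `load_mapping` and `record_new_item` are hypothetical, and both treating a missing file as an empty mapping and rewriting the file after every new item are assumptions, not confirmed behaviour of the script.

```python
import json
from pathlib import Path


def load_mapping(path: Path) -> dict[str, str]:
    """Return the MathAlgoDB ID -> QID mapping, or an empty dict if the file is absent."""
    if not path.exists():
        return {}
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


def record_new_item(mapping: dict[str, str], path: Path, mathalgodb_id: str, qid: str) -> None:
    """Remember a newly created item and write the mapping back to disk."""
    mapping[mathalgodb_id] = qid
    with path.open("w", encoding="utf-8") as fh:
        json.dump(mapping, fh, indent=2, sort_keys=True)
```

With the mapping loaded, deciding whether an individual still needs to be created reduces to a membership test such as `"sw:polymake" in mapping`.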
| Variable | Required | Description |
|---|---|---|
| `WIKIBASE_USER` | Yes (unless `--dry-run`) | Wikibase bot username |
| `WIKIBASE_PASSWORD` | Yes (unless `--dry-run`) | Wikibase bot password |

These can be provided via a `.env` file in the project directory. Copy the template and fill in your values:

    cp .env.example .env

The resulting `.env` then contains, for example:

    WIKIBASE_USER=MyBot
    WIKIBASE_PASSWORD=secret

Downloads the latest XML and shows what would be created/updated without writing anything:

    uv run import_mathalgodb.py --dry-run

Import into staging:

    uv run import_mathalgodb.py -e staging

Import into production:

    uv run import_mathalgodb.py -e production

Import from a local RDF/XML file (skips the download):

    uv run import_mathalgodb.py --xml-file path/to/mathalgodb.xml

The script runs three steps sequentially:

1. Create items -- For each non-publication individual in the XML, checks `mardi.json` to see if it already exists. If not, creates a new Wikibase item with label, aliases, description, instance-of, profile type, community, and MathAlgoDB identifier claims. Updates `mardi.json` with the new QID.
2. Add properties -- Resolves relations between individuals to QIDs using `mardi.json` and the environment's `property_mapping`, then writes them as claims. Relations listed in `qualifier_mapping` (e.g. `documentedIn`, `analyzedIn`) are stored as qualified statements: a claim under `documented_in_property` with the target item as value and the role QID as a qualifier on `object_has_role_property`. Identifier strings (DOI, arXiv, swMath, etc.) are written as literal claims using the property IDs also defined in `property_mapping`.
3. Add inverse relations -- For relations that have an inverse (e.g. `solves` → `solvedBy`, `documents` → `documentedIn`), adds the corresponding claims to the target items. Inverse relations that map to `qualifier_mapping` entries are likewise stored as qualified statements on the target item.

Command-line usage:

    usage: import_mathalgodb.py [-h] [-e {staging,production}] [--xml-file XML_FILE] [--dry-run]

    options:
      -e, --environment  Target environment (default: 'staging')
      --xml-file         Path to a local RDF/XML file (skips download)
      --dry-run          Parse and prepare data without writing to Wikibase
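
For readers who want to see how this interface maps onto `argparse`, here is a minimal sketch that reproduces the options listed above. The function name `build_parser` is hypothetical, and the actual script may define its parser differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented command-line options."""
    parser = argparse.ArgumentParser(prog="import_mathalgodb.py")
    parser.add_argument(
        "-e",
        "--environment",
        choices=["staging", "production"],
        default="staging",
        help="Target environment (default: 'staging')",
    )
    parser.add_argument(
        "--xml-file",
        help="Path to a local RDF/XML file (skips download)",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Parse and prepare data without writing to Wikibase",
    )
    return parser
```

Calling `build_parser().parse_args(["--dry-run"])` gives a namespace with `environment` set to `"staging"`, `xml_file` set to `None` and `dry_run` set to `True`.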