Commit 65846f8

Update README for current data model and site generator
1 parent 53971c9 commit 65846f8

1 file changed: +105 -45 lines changed
README.md

Lines changed: 105 additions & 45 deletions
@@ -1,45 +1,105 @@
-# Rosetta stone DB (prototype)
-This is a prototype of the Rosetta stone database for mathematical objects. Its
-purpose is to store mathematical objects and their descriptions in different
-mathematical software, to enable developers to deserialize such data in the
-future. Furthermore it can be used to facilitate interoperability between
-different mathematical softwares. As an added benefit, this database can be
-used to showcase the serialization capabilities of mathematical software.
-
-
-## Structure
-At the top level there are folders
-- **description** Containing a description of the examples
-- **PROGRAM** Containing the examples for the given `PROGRAM`. Note that not
-every example will be available in every program.
-
-An `example` from the folder `description` will then correspond to a folder
-`PROGRAM/example` where you will find
-- Code for producing this example in `PROGRAM` in `PROGRAM/example/generate.*`
-- The example serialized by `PROGRAM` in `PROGRAM/example/data.*`
-- Code for verifying the example by `PROGRAM` in `PROGRAM/example/check.*`
-- A link for a MaPS runtime in `PROGRAM/example/maps` that contains the version
-of `PROGRAM` necessary to read the data or run the scripts. At the same time
-this script showcases how to read the data using `PROGRAM`.
-
-Note that not all files will be available for all examples. Not all software
-provides (de-)serialization. Proprietary software is not available in MaPS
-runtimes. And not all data makes sense for all software, for example, almost
-all mathematical software will have matrices implemented, but not every
-mathematical software will have groups or number fields.
-
-In case this structure is chosen differently for any reason, the corresponding
-folder will come with a `README.md` file containing a detailed explanation.
-
-
-## Guidelines for suitable entries
-### Choose unique data entries
-To uniquely map entries to each other between different data types, the entries
-themselves should be unique and large enough, such that automated searching for
-these becomes easy. For example, one digit numbers will often appear multiple
-times, even in the metadata.
-
-### Break symmetries
-Take for example matrices. The worst example would be a quadratic zero matrix,
-since in the data one would be unable to tell rows from columns and the entries
-from each other. Instead choose a non-zero non-quadratic matrix.
+# Rosetta Stone DB (prototype)
+
+This repository is a prototype "Rosetta stone" for serialization of mathematical
+objects across computer algebra systems.
+
+It stores:
+- a human-readable description per example
+- code to generate the object in a given system
+- the serialized data emitted by that system
+
+It also generates a browsable static site with:
+- an index table grouped by category (and optional subcategory)
+- per-example pages with code and serialized data for each available system
+- Markdown and HTML output
+
+## Repository layout
+
+### Input data
+
+All source data lives under `data/`:
+
+`data/<category>/<example-slug>/description.md`
+`data/<category>/<example-slug>/systems/<SystemName>/generate.*`
+`data/<category>/<example-slug>/systems/<SystemName>/data.*`
+
+Example:
+
+`data/polyhedral/complete-graph/description.md`
+`data/polyhedral/complete-graph/systems/Oscar.jl/generate.jl`
+`data/polyhedral/complete-graph/systems/Oscar.jl/data.json`
+
+### Site generator
+
+- Script: `webpage/generate_page.py`
+- Input: `data/`
+- Output directory: `_site/` (generated files, ignored by git)
+
+Generated output includes:
+- `_site/index.md`
+- `_site/index.html`
+- one `.md` and one `.html` page per example in category subdirectories, e.g.
+`_site/groups/free-group.md` and `_site/groups/free-group.html`
+
+## Metadata in `description.md`
+
+Each example description starts with YAML frontmatter:
+
+```yaml
+---
+title: Complete graph
+category: polyhedral
+subcategory: combinatorics
+---
+```
+
+Required:
+- `title`
+- `category`
+
+Optional:
+- `subcategory` (used for sub-grouping and sorting in the index)
+
+`category` and `subcategory` are internal keys (slug-like). Display names and
+ordering are configured in `webpage/generate_page.py`.
+
+## Local development
+
+Install dependencies:
+
+```bash
+pip install -r requirements.txt
+```
+
+Generate the site:
+
+```bash
+python3 webpage/generate_page.py
+```
+
+Notes:
+- Markdown is converted to HTML using `marko` (with GFM extension).
+- Math rendering uses MathJax in generated HTML.
+- Code blocks use highlight.js and include a copy button.
+- JSON `data.*` is rendered with compact pretty-printing on pages.
+
+## GitHub Pages
+
+The repository contains a workflow at
+`.github/workflows/publish-pages.yml` that:
+- installs Python dependencies
+- runs `python3 webpage/generate_page.py`
+- publishes `_site/` via GitHub Pages
+
+Dependabot config for GitHub Actions updates is in:
+- `.github/dependabot.yml`
+
+## Guidelines for good examples
+
+### Prefer distinctive values
+Use values that are easy to identify in serialized output. Tiny or repetitive
+values are harder to match across systems.
+
+### Avoid overly symmetric objects
+Prefer examples that make structure visible in serialized form (for example,
+nontrivial matrices instead of highly symmetric zero matrices).
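
For readers of this commit, the `data/` layout and the frontmatter rules described in the new README could be checked with something like the sketch below. It is not taken from `webpage/generate_page.py`, whose parsing code is not shown in this diff. The helper names `parse_frontmatter`, `missing_keys`, and `find_examples` are hypothetical, and the parser only handles flat `key: value` lines like the README's example, whereas the real generator may well use a full YAML library.

```python
from pathlib import Path

# Keys the new README lists as required in description.md frontmatter.
REQUIRED_KEYS = ("title", "category")


def parse_frontmatter(text):
    """Return the flat key/value pairs of a leading '---' delimited block.

    Hypothetical helper: supports only simple `key: value` lines, not full YAML.
    Returns an empty dict when the block is absent or never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}


def missing_keys(meta):
    """List the required keys that are absent or empty in the metadata."""
    return [key for key in REQUIRED_KEYS if not meta.get(key)]


def find_examples(root):
    """Yield (description_path, metadata, system_names) per example.

    Assumes the layout data/<category>/<example-slug>/description.md with
    an optional systems/<SystemName>/ directory next to it.
    """
    root = Path(root)
    if not root.is_dir():
        return
    for description in sorted(root.glob("*/*/description.md")):
        meta = parse_frontmatter(description.read_text(encoding="utf-8"))
        systems_dir = description.parent / "systems"
        systems = []
        if systems_dir.is_dir():
            systems = sorted(p.name for p in systems_dir.iterdir() if p.is_dir())
        yield description, meta, systems
```

Calling `find_examples("data")` from the repository root and passing each metadata dict to `missing_keys` would flag descriptions that lack `title` or `category` before the site is generated.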
