
Commit c9103e9

mpasternak and claude committed
Rewrite README: convert RST to Markdown, improve structure and content
Replace the unwelcoming "This is a fork" opening with a proper project title, description, and IPL Web sponsorship section. Modernize all commands to use uv, fix typos, and format configuration as a table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent e14cc69 commit c9103e9

2 files changed

Lines changed: 179 additions & 185 deletions


README.md

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
# MOAI: Open Access Server Platform for Institutional Repositories

[![Tests](https://github.com/iplweb/moai-iplweb/actions/workflows/test.yml/badge.svg)](https://github.com/iplweb/moai-iplweb/actions/workflows/test.yml)

MOAI is a platform for aggregating content from different sources and publishing it through the [Open Archives Initiative Protocol for Metadata Harvesting](http://www.openarchives.org/pmh/) (OAI-PMH). It can harvest data from various sources (OAI feeds, SQL databases, XML files, Fedora Commons, EPrints, DSpace) and serve multiple OAI feeds from a single server, each with independent configuration.

<p align="center">
<b>Support graciously provided by</b><br><br>
<a href="https://www.iplweb.pl"><img src="https://www.iplweb.pl/images/ipl-logo-large.png" width="150" alt="IPL Web"></a>
</p>

## About this fork

This is a maintained fork of [MOAI by Infrae](https://github.com/infrae/moai/), adding Python 3 support, modern packaging (`pyproject.toml`, `uv`), and GitHub Actions CI. Changes were offered upstream via [PR #5](https://github.com/infrae/moai/pull/5).

> **Note:** Other than modernizing the tooling, there are no major functional changes. Some parts of the documentation below may be outdated. Patches welcome.

## Installation

MOAI is a normal Python package. It is tested with Python 3.9, 3.10, 3.11, 3.12, and 3.13.
We recommend using [uv](https://docs.astral.sh/uv/) for dependency management.

The instructions below are for Unix, but MOAI should also work on Windows.

Install MOAI using uv:

```bash
cd moai
uv sync
```

To run the tests:

```bash
uv sync --extra test
uv run pytest
```

## Running in development mode

The development server is convenient for testing and development, but it should never be used in production.

```bash
cd moai
uv run paster serve settings.ini
```

This will print something like:

```
Starting server in PID 7306.
Starting HTTP server on http://127.0.0.1:8080
```

You can now visit `http://localhost:8080/oai` to view the MOAI OAI-PMH feed.

## Configuring MOAI

Configuration is done in the `settings.ini` file. The default settings file uses the `Paste#urlmap` application to map WSGI applications to URLs.

In the `[composite:main]` section there is a line:

```
/oai = moai_example
```

This line maps the `/oai` URL to the MOAI instance configured in `[app:moai_example]`. This makes it easy to run many MOAI instances in one server, each with its own configuration.

The `[app:moai_example]` configuration lets you specify the following options:

| Option | Description |
|--------|-------------|
| `name` | The name of the OAI feed (returned by the `Identify` verb) |
| `url` | The URL of the OAI feed (returned in the OAI-PMH XML output) |
| `admin_email` | The email address of the admin (returned by the `Identify` verb) |
| `formats` | Available metadata formats |
| `disallow_sets` | List of setspecs that are not allowed in the output of this feed |
| `allow_sets` | If used, only sets listed here will be returned |
| `database` | SQLAlchemy URI that identifies the database used for storage |
| `provider` | Identifier of the provider that MOAI retrieves content from |
| `content` | Class that maps metadata from the provider format to the MOAI format |

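As a rough illustration of how these options fit together, the sketch below parses a sample configuration with Python's standard `configparser` and follows the `/oai` mapping to its application section. The feed name, URL, and email address are made-up placeholders, and the two `use = ...` lines are assumptions about a typical Paste Deploy setup rather than values taken from this README. MOAI itself loads the file through Paste, not through this code.

```python
import configparser

# Sample configuration; values marked "assumption" are not from the README.
SAMPLE_SETTINGS = """
[composite:main]
# Assumption: Paste Deploy syntax for selecting the urlmap application.
use = egg:Paste#urlmap
/oai = moai_example

[app:moai_example]
# Assumption: application factory reference; check your own settings.ini.
use = egg:moai
name = Example Feed
url = http://localhost:8080/oai
admin_email = admin@example.org
formats = oai_dc
database = sqlite:///moai-example.db
provider = file://moai/example-*.xml
content = moai_example
"""


def load_feed_options(ini_text, mount_point="/oai"):
    """Return the app name mounted at `mount_point` and its options."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    app_name = parser["composite:main"][mount_point]
    return app_name, dict(parser[f"app:{app_name}"])


app_name, options = load_feed_options(SAMPLE_SETTINGS)
print(app_name)
for key in ("name", "url", "provider", "content", "database"):
    print(f"{key} = {options[key]}")
```

Running a second feed would mean adding another mapping line (for example a hypothetical `/theses = moai_theses`) plus a matching `[app:...]` section.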
## Adding content

MOAI is designed to periodically fetch content from a *provider* and convert it to MOAI's internal format, which can then be translated to the different metadata formats for the OAI-PMH feed.

MOAI comes with an example that shows this principle.

In the `moai/moai` directory there are two XML files. Let's pretend these files come from a remote system and we want to publish them with MOAI.

In the `settings.ini` file, the following option is specified:

```
provider = file://moai/example-*.xml
```

This tells MOAI to use a file provider, with the files located at `moai/example-*.xml`.

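To make the shape of the provider value concrete, here is a minimal sketch that splits it into a scheme and a glob pattern and then matches the pattern against a list of file names. This is only an illustration of the idea, not MOAI's actual parsing code. The helper names are hypothetical, and `example-1234.xml` is an invented file name.

```python
from fnmatch import fnmatch


def split_provider_uri(uri):
    """Split 'scheme://rest' into (scheme, rest)."""
    scheme, separator, rest = uri.partition("://")
    if not separator:
        raise ValueError(f"not a provider URI: {uri!r}")
    return scheme, rest


def matching_files(pattern, candidates):
    """Return the candidate paths that match the glob pattern."""
    return [name for name in candidates if fnmatch(name, pattern)]


scheme, pattern = split_provider_uri("file://moai/example-*.xml")

# Hypothetical directory listing used only for this demonstration.
candidates = [
    "moai/example-1234.xml",
    "moai/example-2345.xml",
    "moai/settings.ini",
]
print(scheme, pattern)
print(matching_files(pattern, candidates))
```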
The following option points to the class that converts the example XML content to MOAI's internal format:

```
content = moai_example
```

The last option tells MOAI where to store its data; this is usually a SQLite database:

```
database = sqlite:///moai-example.db
```

Now let's try to add these two XML files. First visit the OAI-PMH feed to make sure nothing is being served yet:

```
http://localhost:8080/oai?verb=ListRecords&metadataPrefix=oai_dc
```

This should return a `noRecordsMatch` error.

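The same check can be scripted. The sketch below builds the `ListRecords` URL and extracts error codes from an OAI-PMH response. It parses a hand-written sample response instead of contacting the server, so the sample XML (including its date) is an assumption about what an empty feed returns, based on the OAI-PMH 2.0 response format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the given feed."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def oai_error_codes(response_xml):
    """Return the code attribute of every <error> element in a response."""
    root = ET.fromstring(response_xml)
    return [error.get("code") for error in root.iter(f"{OAI_NS}error")]


# Hand-written sample of an empty-feed response (assumption, not server output).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://localhost:8080/oai</request>
  <error code="noRecordsMatch">No records match the request</error>
</OAI-PMH>"""

print(list_records_url("http://localhost:8080/oai"))
print(oai_error_codes(SAMPLE_RESPONSE))
```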
To add the content, run the `update_moai` script with the section name from `settings.ini` as its argument:

```bash
uv run update_moai moai_example
```

This will produce the following output:

```
/ Updating content provider: example-2345.xml
Content provider returned 2 new/modified objects

100.0%[====================================================================>] 2
Updating database with 2 objects took 0 seconds
```

When you visit the OAI-PMH feed again, you should see the two records:

```
http://localhost:8080/oai?verb=ListRecords&metadataPrefix=oai_dc
```

When you run the `update_moai` script again, it will create a new database with all the records. You can also specify a date with the `--date` switch; in that case, only records that were modified after this date will be added. The `update_moai` script can be run from a daily or hourly cron job to keep the database up to date.

## Adding your own Provider / Content and Metadata classes

It is possible, and most of the time necessary, to extend MOAI for your use cases. The Provider and Content classes from the example might be a good starting point. All your customizations should be registered with MOAI through `entry_points`. Have a look at MOAI's `pyproject.toml` for more information.

The best approach is to create your own Python package with its own `pyproject.toml` and install it in the same environment as MOAI. This lets MOAI find your customizations. Note that when you change something in your package metadata, you have to reinstall the package for MOAI to pick up the changes.

The `moai.interfaces` module contains documentation about the different classes that you can implement.

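To check which plugins are visible in the current environment, the following sketch lists the entry point names registered under a given group, using the standard `importlib.metadata` module. The group name `moai.provider` is an assumption; the entry point table in MOAI's `pyproject.toml` shows the group names actually in use.

```python
from importlib.metadata import entry_points


def entry_point_names(group):
    """Return the sorted names of all installed entry points in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later: selectable entry point collection.
        selected = eps.select(group=group)
    else:
        # Python 3.9: plain dict mapping group names to entry points.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


# Assumed group name. Prints an empty list if MOAI is not installed
# or if the group is named differently.
print(entry_point_names("moai.provider"))
```

If a name from your own package is missing from the output, reinstalling the package is the first thing to try, since entry points are read from installed package metadata.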
## Adding your own database

Instead of writing your own provider/content classes, you can also register your own custom database. Implementing a replacement for `moai.database.SQLDatabase` can be more complicated than writing a provider/content class, but it has the advantage that MOAI is always up to date and you don't need a second SQLite database.

Have a look at the `pyproject.toml` file, which registers several databases. You can use the same mechanism to register your own database from your own Python package.

In the `settings.ini` configuration you can then reference your database (`mydb://some+config+variables`).

For the database, have a look at the generic database provider in `database.py`. The only methods that you need to implement are `oai_sets`, `oai_earliest_datestamp`, and `oai_query`.

The `oai_query` method returns dictionaries with record data. The keys of these dictionaries are defined in the metadata files (for example `metadata.py`); have a look at the source.

For `oai_dc` there are the following names:

`title`, `creator`, `subject`, `description`, `publisher`, `contributor`, `type`, `format`, `identifier`, `source`, `language`, `date`, `relation`, `coverage`, `rights`

So a return value would look like:

```python
{'id': '<oai record id>',
 'deleted': '<bool>',
 'modified': '<utc datetime>',
 'sets': ['<list of setspecs>'],
 'metadata': {
     'title': ['<list with publication title>'],
     'creator': ['<list of creator names>'],
     # ... remaining oai_dc fields
 }
}
```
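As a starting point, here is a minimal in-memory sketch of a class with the three methods named above. The class name, the constructor, the method parameters, and the shape of the set dictionaries are all assumptions made for illustration. The real signatures are defined in `moai.interfaces` and `database.py` and may take additional arguments (for example, for set filtering).

```python
from datetime import datetime, timezone

UTC = timezone.utc


class InMemoryDatabase:
    """Hypothetical read-only database serving records from a list."""

    def __init__(self, records, sets=()):
        # Keep records ordered by modification time.
        self._records = sorted(records, key=lambda r: r["modified"])
        self._sets = list(sets)

    def oai_sets(self, offset=0, batch_size=20):
        # Assumed shape: one dict per set.
        return self._sets[offset:offset + batch_size]

    def oai_earliest_datestamp(self):
        # The first record is the oldest because of the sort above.
        return self._records[0]["modified"] if self._records else None

    def oai_query(self, offset=0, batch_size=20, from_date=None,
                  until_date=None, identifier=None):
        # Parameter names are assumptions; filter, then paginate.
        matches = []
        for record in self._records:
            if identifier is not None and record["id"] != identifier:
                continue
            if from_date is not None and record["modified"] < from_date:
                continue
            if until_date is not None and record["modified"] > until_date:
                continue
            matches.append(record)
        return matches[offset:offset + batch_size]


# Invented sample records following the dictionary layout shown above.
RECORDS = [
    {"id": "oai:example:1", "deleted": False,
     "modified": datetime(2024, 1, 1, tzinfo=UTC), "sets": ["public"],
     "metadata": {"title": ["First title"], "creator": ["A. Author"]}},
    {"id": "oai:example:2", "deleted": False,
     "modified": datetime(2024, 3, 1, tzinfo=UTC), "sets": ["public"],
     "metadata": {"title": ["Second title"], "creator": ["B. Author"]}},
]

db = InMemoryDatabase(RECORDS, sets=[{"id": "public", "name": "Public records"}])
print(db.oai_earliest_datestamp().isoformat())
print([r["id"] for r in db.oai_query(from_date=datetime(2024, 2, 1, tzinfo=UTC))])
```

A real implementation would query your existing data store inside `oai_query` instead of filtering a list, which is what removes the need for a second SQLite database.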

README.rst

Lines changed: 0 additions & 185 deletions
This file was deleted.
