@gouttegd
Contributor

@gouttegd gouttegd commented Dec 7, 2025

⚠️ Do not merge until a decision has been made about where ODK-Core should live (see first comment)!

This PR implements the idea proposed in #877. It replaces the odk/odk.py script with the ODK-Core project.

For now, it:

  • replaces all the Python packages installed in ODK-Lite with a single odk-core package (which brings with it all the Python packages required to run the ODK standard workflows);
  • removes all files that are now installed as part of the odk-core package (a handful of helper scripts and most importantly all the template files);
  • changes the layout for the programs and resources installed in the ODK image.

The new layout is as follows:

  • /odk: base directory where all the ODK stuff is installed
  • /odk/bin: all the executable files (some of them may be mere links to executables that are located elsewhere) -- this is the only directory that needs to be added to the PATH;
  • /odk/tools: additional files required by some tools (typically this is where the Jar archives for the Java programs will be);
  • /odk/resources: additional resources that may be required from within an ODK workflow (this directory is exposed to the standard Makefile via the ODK_RESOURCES_DIR environment variable);
  • /odk/resources/robot/plugins: all bundled ROBOT plugins (odk.jar, sssom.jar, kgcl.jar).
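
The layout above can be summarised in a few lines of code. This is only a minimal sketch, not something taken from the PR: the helper name `odk_layout` is hypothetical, and only the paths themselves come from the list above.

```python
from pathlib import PurePosixPath


def odk_layout(base: str = "/odk") -> dict[str, PurePosixPath]:
    """Return the directories of the layout described above (hypothetical helper)."""
    root = PurePosixPath(base)
    resources = root / "resources"
    return {
        "base": root,
        "bin": root / "bin",        # the only directory that must be in PATH
        "tools": root / "tools",    # e.g. Jar archives of the Java programs
        "resources": resources,     # exposed as ODK_RESOURCES_DIR
        "robot_plugins": resources / "robot" / "plugins",
    }


if __name__ == "__main__":
    for name, path in odk_layout().items():
        print(f"{name:14} {path}")
```

Keeping everything under a single base directory means that relocating the whole installation only requires changing that one argument.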

All the Python packages that are required to run ODK workflows are
dependencies of the odk-core package (with the `workflows` extra). This
means that, when building ODKLite, the only Python package we need to
explicitly install is odk-core.

We can thus remove the `requirements.txt.lite` file that is no longer
needed (we explicitly invoke `pip install odk-core[workflows]` instead),
and consequently rename `requirements.txt.full` to `requirements.txt`.

ODK-Core expects a certain layout for an "ODK environment". At the very
least, it expects a "resources" directory (which should be pointed to by
an "ODK_RESOURCES_DIR" variable), with the following contents:

* ODK_RESOURCES_DIR/obo.epm.json (the OBO extended prefix map);
* ODK_RESOURCES_DIR/robot/profile.txt (current version of the default
  ROBOT profile);
* ODK_RESOURCES_DIR/robot/plugins/... (ROBOT plugins).
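
As an illustration of that expectation, a resources directory could be checked as follows. This is a sketch under assumptions: the function name `missing_resources` is hypothetical, and whether ODK-Core itself performs such a check is not stated here. Only the three relative paths come from the list above.

```python
import os
from pathlib import Path

# Relative paths taken from the list above; the check itself is illustrative.
EXPECTED = ("obo.epm.json", "robot/profile.txt", "robot/plugins")


def missing_resources(resources_dir=None):
    """List the expected entries that are absent from the resources directory."""
    if resources_dir is None:
        # Raises KeyError if ODK_RESOURCES_DIR is not set in the environment.
        resources_dir = os.environ["ODK_RESOURCES_DIR"]
    root = Path(resources_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

An empty list means the directory has the minimal contents described above.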

This commit changes the way the ODK tools are installed so that all the
ODK stuff is cleanly organised in a single /odk directory as follows:

* /odk/bin: executable programs (may be links to somewhere else);
* /odk/tools: additional files that a tool may need to run (typically,
  the Jar archives for a Java tool);
* /odk/resources: the ODK_RESOURCES_DIR as explained above.

We install all programs as usual within this layout, then we run the
`odk install` command; the command will only install a tool if the tool
is not already available, so it will _not_ reinstall any tool that has
already been "manually" installed previously in the Dockerfile -- but
that will ensure that, if ODKCore starts needing a new tool, that tool
will always be included in the ODKLite image even if we do not update
the Dockerfile. This principle also allows us to override the versions
of the tools set forth by ODKCore.
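
The "install only what is missing" principle can be sketched as below. This is not the actual implementation of `odk install`, which is not shown here; the function name `tools_to_install` is hypothetical, and the sketch assumes a POSIX system where "available" means "found as an executable on the search path".

```python
import shutil


def tools_to_install(required, search_path=None):
    """Return the tools from `required` that are not found on the search path.

    `search_path` uses the same format as the PATH environment variable;
    when it is None, the current PATH is used.
    """
    return [tool for tool in required if shutil.which(tool, path=search_path) is None]
```

With such a check, a tool already placed in /odk/bin by an earlier Dockerfile step is found and skipped, which is what lets the Dockerfile pin its own version of that tool.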

We remove the context2csv.py and check-rdfxml.sh scripts, which are
replaced by subcommands of ODKCore's odk-helper command.

@gouttegd gouttegd self-assigned this Dec 7, 2025

There is no longer a /tools/odk.py script. That script now lives as
/odk/bin/odk, and can be called as simply `odk` (the /odk/bin directory
is in the PATH).

The script that generates the documentation for the project schema is
currently broken, and will in fact require some profound changes to be
adapted to the new ODK-Core system. So for now, we simply give up on
generating the documentation automatically.

The launchers for both ROBOT and Owltools' ontology-release-runner were
faulty because of improperly escaped `$` characters (and a misplaced
`-jar` option, in the case of the ROBOT launcher).
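
To illustrate the two pitfalls named in this commit message, here is a sketch of how a launcher for a Java tool could be generated. It is not the launcher code from this PR, which is not shown here: the function name `java_launcher`, the Jar path and the JVM option in the usage note are all assumptions.

```python
import shlex


def java_launcher(jar_path, jvm_options=()):
    """Build the text of a POSIX shell launcher for a Java tool (illustrative)."""
    options = " ".join(shlex.quote(opt) for opt in jvm_options)
    # `-jar` must come after any JVM options and immediately before the Jar
    # path; everything after the Jar path is passed to the program itself.
    parts = ["exec", "java", options, "-jar", shlex.quote(jar_path), '"$@"']
    command = " ".join(part for part in parts if part)
    # `$@` is written literally, so the shell expands it when the launcher
    # runs, not when the launcher file is generated.
    return f"#!/bin/sh\n{command}\n"
```

For example, `java_launcher("/odk/tools/robot.jar", ["-Xmx8G"])` produces a two-line script whose command is `exec java -Xmx8G -jar /odk/tools/robot.jar "$@"`. When a launcher is written from a shell command or a Dockerfile instead, an unescaped `$@` may be expanded too early, at generation time.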

Jinjanator had been mistakenly removed from both ODKLite and ODKFull; it
does not belong to ODKLite but we expect its presence in ODKFull, so we
restore it there.

We also move the test for the presence of both Jinjanator and Owltools'
ontology-release-runner from the `test_odklite_programs` target to the
`test_odkfull_programs` target, since both those tools belong only to
ODKFull.

@gouttegd gouttegd marked this pull request as ready for review December 14, 2025 23:28

@gouttegd
Contributor Author

This is “ready” for merging, modulo a few things:

(A) We must decide where ODK-Core should live.

For now, ODK-Core is kept in a separate repository (https://github.com/gouttegd/odkcore). This allows for a very clear separation between (a) the ODK seeding system and its workflows (in the ODK-Core repository) and (b) the Docker images (this repository). However this comes with some drawbacks that we need to be aware of:

  • When building the images, we get ODK-Core from the Python Package Index (PyPI). This means that whenever we want to build a new ODK image that incorporates some recent changes in ODK-Core, we must first publish an ODK-Core release. (For local builds it is possible to work around that requirement and use a locally available copy of the ODK-Core repository instead.)
  • For now the tests live in this repository; if we keep ODK-Core in a separate repository, they should be moved to that repository instead (otherwise the core cannot be tested until we build an image here, which, as said in the previous point, requires an ODK-Core release – this means we would need to publish ODK-Core before we can test it, which is absurd).
  • Having one repository for the Core and one for the images is likely to be a bit confusing for users, who might not know where to report issues.

We could instead decide to have ODK-Core live within this very repository. Building the images would then systematically use the current state of ODK-Core, which would not even need to be released (though we could and should still make releases of it, for people who want to use “native” ODK environments). All tests would be in the same repository. And there would still be only one place to report issues.

Nonetheless, for now I am inclined to keep ODK-Core in a completely separate repository, as I believe it is “cleaner”. But I can be convinced otherwise.

(B) Developers’ documentation will need updating.

Regardless of the decision made for (A), the adoption of ODK-Core would mean that several aspects of the existing docs (e.g. about how to add a new program to the ODK, or even the general design of the ODK) will no longer be accurate.

@gouttegd
Contributor Author

> When building the images, we get ODK-Core from the Python Package Index (PyPI). This means that whenever we want to build a new ODK image that incorporates some recent changes in ODK-Core, we must first publish an ODK-Core release.

It just occurred to me that this particular drawback could be very easily dealt with by importing the ODK-Core repository into this repository as a Git submodule…

@gouttegd gouttegd changed the title from "Allow using ODK-Core" to "Use ODK-Core" Dec 15, 2025

The templates are now maintained as part of ODK-Core, so we can remove
them from here.

@gouttegd gouttegd added this to the 1.7 milestone Dec 15, 2025
Contributor

@matentzn matentzn left a comment

Looks great to me! I went through it line by line but did not test it. Just some minor comments, which are probably not actionable.

Contributor

Will there be a relatively straightforward way to add dependencies to [odk-core], similar to what was happening here so far?

Contributor Author

You mean Python dependencies? Yes of course, just add them to the appropriate section of the pyproject.toml file in ODK-Core.

Contributor Author

(That’s the kind of thing for which the documentation will need updating…)

mkdocs
mkdocs-material
mkdocs-mermaid2-plugin
mkdocs-table-reader-plugin
Contributor

All the mkdocs ones have to be in core? (documentation workflow)

Contributor Author

Then they should be added as dependencies to odk-core[workflows].

Currently, they are not, because the dependencies of odk-core[workflows] are simply “all the Python packages that were installed in ODK-Lite”. The mkdocs packages are not currently in ODK-Lite (which is a bug, since ODK-Lite is supposed to provide everything that is required by the standard workflows), so they are not currently in ODK-Core.

@gouttegd gouttegd mentioned this pull request Dec 19, 2025