The arc-validate-package-registry (avpr) repository contains:
- a staging area for authoring official validation packages intended for use with arc-validate.
- a web API for serving validation packages. This API is consumed by arc-validate to install and sync validation packages.
- a website for browsing validation packages.
- some domain types and utilities relevant for consuming libraries in the AVPRIndex library
- a .NET client library for consuming the web API in the AVPRClient library
Read more at avpr.nfdi4plants.org/about
This repo runs an extensive CI/CD pipeline on every commit and PR on the main branch. The pipeline includes:
- tests and pre-publish checks for every package in the staging area.
- a release pipeline for validation packages:
  - publishing stable packages to the production instance of the web API at avpr.nfdi4plants.org
- tests and release pipelines for the AVPRIndex and AVPRClient libraries, as well as a docker container for the PackageRegistryService web API.
flowchart TD
setup("<b>setup</b> <br> (determines subsequent jobs <br>based on changed files)")
batp("<b>Build and test projects</b><br>any of [AVPRIndex, AVPRClient, API]")
tsa("<b>Test staging area</b><br>test all packages in the staging area")
sapc("<b>Staging area pre-publish checks</b><br>hash verification, prevent double publication etc")
nr("<b>Release (nuget)</b><br>any of [AVPRIndex, AVPRClient]")
dr("<b>Release (docker image)</b><br>API")
ppp("<b>Publish pending packages</b><br>Publish packages to production DB")
setup --when relevant project<br> files change--> batp
setup --changes in the<br> staging area--> tsa
batp --when tests pass and<br> release notes change--> nr
batp --when tests pass--> dr
tsa --when tests pass--> sapc
sapc --when checks pass<br> and any new packages<br> are pending--> ppp
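The branching in the chart above can be read as two small decisions: which first-stage jobs the setup step schedules, and which jobs follow once a job has passed. The following Python sketch only illustrates that logic. The job identifiers and the path prefixes `src/` and `StagingArea/` are assumptions made for this example; the actual workflow files may use different names and filters.

```python
def jobs_after_setup(changed_files):
    """Return the first-stage jobs the setup step would schedule (illustrative only)."""
    jobs = []
    # Assumption: project sources live under src/, packages under StagingArea/.
    if any(path.startswith("src/") for path in changed_files):
        jobs.append("build-and-test-projects")
    if any(path.startswith("StagingArea/") for path in changed_files):
        jobs.append("test-staging-area")
    return jobs


def downstream_jobs(job, passed, release_notes_changed=False, packages_pending=False):
    """Return the jobs that follow `job`, mirroring the edges of the chart."""
    if not passed:
        return []
    if job == "build-and-test-projects":
        # The docker release follows passing tests; the nuget release
        # additionally requires changed release notes.
        return ["release-docker"] + (["release-nuget"] if release_notes_changed else [])
    if job == "test-staging-area":
        return ["staging-area-pre-publish-checks"]
    if job == "staging-area-pre-publish-checks":
        # Publishing only happens when new packages are pending.
        return ["publish-pending-packages"] if packages_pending else []
    return []
```

For example, a commit that only touches a file below `StagingArea/` would schedule the staging area tests but none of the project builds.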
The package staging area is intended for development and testing of validation packages.
Files in this folder must follow the naming convention <package-name>@<major>.<minor>.<patch>.* and contain a yml frontmatter at the start of the file. These files must additionally be inside a subfolder exactly named as the package name. This leads to a folder structure like this:
StagingArea
│
├── some-package
│ ├── [email protected]
│ ├── [email protected]
│ └── [email protected]
│
├── some-python-package
│ ├── [email protected]
│ ├── [email protected]
│ └── [email protected]
│
└── some-other-package
├── [email protected]
├── [email protected]
└── [email protected]
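As a rough illustration of the naming convention and folder rule described above, the following Python sketch checks a single path. The function name `check_staged_path` and the regular expression are hypothetical and written for this example; they are not taken from the repository's test suite.

```python
import re
from pathlib import PurePosixPath

# <package-name>@<major>.<minor>.<patch>.<extension>
FILE_NAME = re.compile(
    r"^(?P<name>[^@/]+)@(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)\.(?P<ext>.+)$"
)


def check_staged_path(path):
    """Return (name, (major, minor, patch)) if the path follows the convention, else None."""
    p = PurePosixPath(path)
    match = FILE_NAME.match(p.name)
    if match is None:
        return None
    # The file must sit in a folder named exactly like the package.
    if p.parent.name != match["name"]:
        return None
    version = tuple(int(match[part]) for part in ("major", "minor", "patch"))
    return match["name"], version
```

A file in the wrong folder, or a file name without a three-part version, yields `None` in this sketch.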
Validation packages must be self-contained, single-file scripts.
The following programming languages can be used to create validation packages:
- F# (.fsx)
  - software package management in F# scripts MUST use #r nuget ... directives to reference any external dependencies.
  - F# scripts are executed via dotnet fsi.
- Python (.py)
  - software package management in Python scripts MUST use uv inline script dependencies to reference any external dependencies.
  - Python scripts are executed via uv run.
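For Python packages, uv reads dependencies from an inline metadata comment block at the top of the script (the `# /// script` format standardized in PEP 723). The sketch below shows what such a block could look like and how its content can be read back with the standard library. The script body and the `pyyaml` dependency are placeholders chosen for illustration, and `inline_metadata` is a hypothetical helper, not part of uv or of this repository.

```python
import re

# A hypothetical package script; the "pyyaml" dependency is only an example.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "pyyaml",
# ]
# ///
print("running validation ...")
'''

# Pattern for inline metadata blocks, following the PEP 723 reference implementation.
BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def inline_metadata(script_text):
    """Return the TOML text of the 'script' metadata block, or None if there is none."""
    for match in BLOCK.finditer(script_text):
        if match.group("type") == "script":
            lines = match.group("content").splitlines(keepends=True)
            # Drop the leading "# " (or a bare "#") from every line.
            return "".join(
                line[2:] if line.startswith("# ") else line[1:] for line in lines
            )
    return None
```

Running a file with such a header via `uv run` lets uv resolve the declared dependencies before executing the script.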
Any change to a package in the staging area triggers the tests located at /StagingAreaTests, which are run on every package (see also CI/CD pipeline). Publishing packages to the production registry is only possible if all tests pass.
In principle, packages can be published via 2 channels:
Packages committed to the staging area will be published to the production registry database (https://avpr.nfdi4plants.org) if they pass all tests and pre-publish checks.
Publishing a package to the registry is a multi-step process:
Suppose you want to develop version 1.0.0 of a package called my-package.
- Fork this repo.
- Add a new blank package file named my-package@1.0.0 with the script extension of your language (.fsx or .py) to the staging area in the folder my-package.
- Develop the package, using a work-in-progress pull request to let this repository's CI perform automated integrity tests on it.
- Once the package is ready for production use, add publish: true to the yml frontmatter of the package file. This will trigger the CI to build and push the package to the registry once the PR is reviewed and merged.
- Once a package is published, it cannot be unpublished or changed. To update a package, create a new script with the same name and a higher version number.
| stage | availability | mutability |
|---|---|---|
| staging: development in this repo | version of current HEAD commit in this repo via GitHub API-based execution in arc-validate CLI | any changes are allowed |
| published: available in the registry | version of the published package via the registry API | no changes are allowed |
Packages SHOULD be versioned according to the semantic versioning standard. This means that the version number of a package should be incremented according to the following rules:
- Major version: incremented when you make changes incompatible with previous versions
- Minor version: incremented when you add functionality in a backwards-compatible manner
- Patch version: incremented when you make backwards-compatible bug fixes
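The three rules above can be expressed as a small helper. This is a minimal Python sketch; the function name `bump` and the change labels `"breaking"`, `"feature"` and `"fix"` are made up for this example. Resetting the lower components to zero follows the semantic versioning specification.

```python
def bump(version, change):
    """Return the next (major, minor, patch) tuple for a given kind of change."""
    major, minor, patch = version
    if change == "breaking":
        # Incompatible change: increment major, reset minor and patch.
        return (major + 1, 0, 0)
    if change == "feature":
        # Backwards-compatible functionality: increment minor, reset patch.
        return (major, minor + 1, 0)
    if change == "fix":
        # Backwards-compatible bug fix: increment patch only.
        return (major, minor, patch + 1)
    raise ValueError(f"unknown kind of change: {change!r}")
```

Since published packages cannot be changed, every such bump corresponds to a new file in the staging area rather than an edit of an existing one.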
Package metadata is extracted from yml frontmatter at the start of the .fsx file, indicated by a multiline comment ((* ... *)) containing the frontmatter fenced by --- at its start and end:
(*
---
<yaml frontmatter here>
---
*)
You can additionally bind the YAML frontmatter as a string inside your package. This is recommended because it lets you re-use the metadata in your package code.
The binding must be placed at the start of the file, be named PACKAGE_METADATA, and carry a [<Literal>] attribute, exactly like this:
let [<Literal>] PACKAGE_METADATA = """(*
---
<yaml frontmatter here>
---
*)"""
Further down in your package code, you can now extract and use this metadata. This, for example, saves you from having to repeat the package name in your package code.
#r "nuget: ARCExpect"
#r "nuget: AVPRIndex"
let metadata = ValidationPackageMetadata.extractFromString PACKAGE_METADATA
let validationCases = ...
validationCases
|> Execute.ValidationPipeline(
    metadata = metadata // use metadata to determine output paths and names instead of doing it manually
)
Package metadata is extracted from yml frontmatter at the start of the .py file, guarded by triple quotes (""") containing the frontmatter fenced by --- at its start and end:
"""
---
<yaml frontmatter here>
---
"""
You can additionally bind the YAML frontmatter as a string inside your package. This is recommended because it lets you re-use the metadata in your package code.
The binding must be placed at the start of the file and be named PACKAGE_METADATA, exactly like this:
PACKAGE_METADATA = """
---
<yaml frontmatter here>
---
"""
Further down in your package code, you can now extract and use this metadata. This, for example, saves you from having to repeat the package name in your package code.
metadata = extract_yaml(PACKAGE_METADATA)  # extract yaml object from string
| Field | Type | Description |
|---|---|---|
| Name | string | the name of the package |
| MajorVersion | int | the major version of the package |
| MinorVersion | int | the minor version of the package |
| PatchVersion | int | the patch version of the package |
| Summary | string | a single sentence description (<=50 words) of the package |
| Description | string | an unconstrained free text description of the package |
Example: only mandatory fields
(*
---
Name: my-package
MajorVersion: 1
MinorVersion: 0
PatchVersion: 0
Summary: My package does the thing.
Description: |
My package does the thing.
It does it very good, it does it very well.
It does it very fast, it does it very swell.
---
*)
let doSomeValidation () = ()
doSomeValidation ()
| Field | Type | Description |
|---|---|---|
| Publish | bool | a boolean value indicating whether the package should be published to the registry. If set to true, the package will be built and pushed to the registry. If set to false (or not present), the package will be ignored. |
| Authors | author[] | the authors of the package. For more information about mandatory and optional fields in this object, see Objects > Author |
| Tags | string[] | a list of tags with optional ontology annotations that describe the package. For more information about mandatory and optional fields in this object, see Objects > Tag |
| ReleaseNotes | string[] | a list of release notes for the package indicating changes from previous versions |
| CQCHookEndpoint | string | an optional URL to a CQC Hook endpoint that can be used for continuous quality control (CQC) integration. If provided, this endpoint will be called with validation results after each package execution. |
Example: all fields
(*
---
Name: my-package
MajorVersion: 1
MinorVersion: 0
PatchVersion: 0
Summary: My package does the thing.
Description: |
My package does the thing.
It does it very good, it does it very well.
It does it very fast, it does it very swell.
Publish: true
Authors:
- FullName: John Doe
Email: [email protected]
Affiliation: University of Nowhere
AffiliationLink: https://nowhere.edu
- FullName: Jane Doe
Email: [email protected]
Affiliation: University of Somewhere
AffiliationLink: https://somewhere.edu
Tags:
- Name: validation
- Name: my-tag
TermSourceREF: my-ontology
TermAccessionNumber: MO:12345
ReleaseNotes: |
- initial release
- does the thing
- does it well
CQCHookEndpoint: https://some-url.xd
---
*)
let doSomeValidation () = ()
doSomeValidation ()
Author metadata about the people that create and maintain the package. Note that the
| Field | Type | Description | Mandatory |
|---|---|---|---|
| FullName | string | the full name of the author | yes |
| Email | string | the email address of the author | no |
| Affiliation | string | the affiliation (e.g. institution) of the author | no |
| AffiliationLink | string | a link to the affiliation of the author | no |
Tags can be any string with an optional ontology annotation from a controlled vocabulary:
| Field | Type | Description | Mandatory |
|---|---|---|---|
| Name | string | the name of the tag | yes |
| TermSourceREF | string | Reference to a controlled vocabulary source | no |
| TermAccessionNumber | string | Accession in the referenced controlled vocabulary source | no |
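The two object tables above map naturally onto small record types. The following Python sketch models them with dataclasses, using the field names from the tables and from the frontmatter example; the class layout and the helper `tag_from_mapping` are assumptions for illustration and do not mirror the actual AVPRIndex types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Author:
    FullName: str                            # mandatory
    Email: Optional[str] = None              # optional
    Affiliation: Optional[str] = None        # optional
    AffiliationLink: Optional[str] = None    # optional


@dataclass
class Tag:
    Name: str                                    # mandatory
    TermSourceREF: Optional[str] = None          # optional
    TermAccessionNumber: Optional[str] = None    # optional


def tag_from_mapping(raw):
    """Build a Tag from a parsed frontmatter entry, rejecting entries without a Name."""
    if not raw.get("Name"):
        raise ValueError("a tag needs a Name")
    return Tag(
        Name=raw["Name"],
        TermSourceREF=raw.get("TermSourceREF"),
        TermAccessionNumber=raw.get("TermAccessionNumber"),
    )
```

A plain tag such as `validation` only sets `Name`, while an ontology-annotated tag also carries the two term fields.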
Prerequisites:
- .NET 10 SDK
- Docker
- Docker Compose
Advanced local dev functionality has only been tested on Windows with Visual Studio. For that, install the ASP.NET core workload including container features, which will enable running the Docker Compose project in Debug mode.
The AVPRIndex and AVPRClient libraries are located in /src and are intended for use in consuming applications.
To build them, just run dotnet build in the respective project folders or build the arc-validate-package-registry solution.
- Bump the version in the respective csproj or fsproj file
- Update the respective RELEASE_NOTES.md file
- CI will automatically publish the package to the nuget feed
The PackageRegistryService project located in /src is a simple ASP.NET Core (8) web API that serves validation packages and/or associated metadata via a few endpoints.
It is developed specifically for containerization and use in a docker environment.
To run the PackageRegistryService locally, ideally use Visual Studio and run the Docker Compose project in Debug mode. This will launch the stack defined at docker-compose.yml, which includes:
- the containerized PackageRegistryService application
- a postgres database seeded with the latest indexed packages
- an adminer instance for database management (will maybe be replaced by pgAdmin in the future)
In other IDEs, you can run the PackageRegistryService project directly or adjust the stack, but you will need to either set up a local postgres database and configure the connection string in appsettings.json accordingly, or fine-tune the existing docker-compose file.
Changes in e.g. ValidationPackage metadata need to be reflected at several points in the code:
- PackageRegistryService/Models/ValidationPackage.cs: the main database model
- AVPRIndex/Domain.fs: the client-side model, the respective type here would be ValidationPackageMetadata
- EntityFramework migration files in PackageRegistryService/Migrations: these are auto-generated via the dotnet ef migrations add <MigrationName> command after making changes to the data model in PackageRegistryService/Models, but might need manual adjustment (e.g. when a field is renamed rather than added/removed)
- Database seeding code in PackageRegistryService/Data/DataInitializer.cs: when adding new fields, make sure to update the seeding code accordingly.
- do not forget to trigger client lib auto generation
Currently, any change in src/PackageRegistryService will trigger a release to the production registry. This is done by the CI/CD pipeline, which builds and pushes a docker image to the registry on every relevant commit to the main branch.
This will move to a versioned release process in the future.
The PackageRegistryService has a built-in Swagger UI endpoint for API documentation. It is served at /swagger/index.html.
There are 2 solutions that contain test projects:
- arc-validate-package-registry.sln contains the test projects for the AVPRIndex and AVPRClient libraries, as well as future API and integration tests located in /tests.
- PackageStagingArea.sln contains the tests and sanity checks for all packages in the staging area.
Run the tests with dotnet test in the respective test project folders or on the respective solution.
For now, this is a manual process.
If you are an authorized user with an API key, packages can be pushed to prod with the AVPRCI CLI tool in this repo:
In the repo root, run:
dotnet run --project .\src\AVPRCI\AVPRCI.fsproj -- publish --api-key yourKeyHere --dry-run
to see what would be published, and remove the --dry-run flag to actually publish the packages.