Description
Nodestream projects are currently complex to maintain and limiting to use. Duplication and complexity in the project model make it hard to develop new features. The core parts of the model were never deliberately designed; they grew reactively as the project evolved. For 1.0, we want a project model that leaves the door open to new features down the road, increases maintainability, and removes some limitations that users run into. This issue catalogs the current problems and proposes a solution.
Current Issues
Reusing Plugin Pipelines
Currently plugins work by adding a scope to the project. This means that the pipelines cannot be used more than once with a specified configuration. See #240
Plugin Development Difficulties
Currently, if you are developing a plugin, you have to jump through hoops to run your pipelines inside of a project, which makes plugin development quite difficult.
Using the Same Pipeline More Than Once
Similar to plugins, you cannot provide a base configuration for a pipeline and use the same definition more than once.
Orchestrating Pipelines in a Particular Order
There is no way to run a "group" of pipelines in a particular order without listing each pipeline explicitly in a run command.
Current Class Inventory
Let's examine the internal object model of the project:
PipelineConfiguration
Contains targets, annotations, and other data that pertains to how pipelines are treated and initialized.
PluginConfiguration
Largely duplicates a scope definition. Used to "merge" configuration with the scope defined in the project.
PipelineScope
Defines a group of pipelines and shared configuration of those pipelines.
PipelineDefinition
Stores the file path to the definition as well as configuration specific to that definition.
Project
Contains all scopes and plugins.
Proposed Solution
The proposed solution would change an expanded nodestream.yaml file to look like this:
plugins:
  - type: plugin
    name: nodestream_plugin_sbom # as an example
pipelines:
  - name: foo
    path: foo/bar/baz.yaml
    config: # optional; same as today.
      foo: bar
    annotations: # optional; same as today.
      foo: bar
scopes:
  - name: crons
    pipelines:
      - !pipeline self/foo # refer to the prototype pipeline described above.
      - path: !pipeline self/foo # as written, same as above, but can specify overriding config, annotations, etc.
      - foo/bar/baz.yaml # can inline a pipeline same as before.
      - path: foo/bar/baz.yaml # same as above.
      - !pipeline nodestream_plugin_sbom/default/github # import a pipeline from a plugin.
Since you can still define a pipeline in a child scope, there are no breaking changes for simple projects that define their pipelines and scopes directly. This does introduce a change for projects that use plugins, but it is minimal.
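To make the shorthand and expanded spellings above concrete, here is a minimal sketch of how the entries in a scope's pipelines list could be normalized into one shape after the YAML is loaded. The names PipelineRef, ScopeEntry, and normalize_entry are hypothetical and not part of nodestream; PipelineRef simply stands in for a value tagged with !pipeline, and the "override" value is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Union


@dataclass(frozen=True)
class PipelineRef:
    """Stand-in for a value tagged `!pipeline` in the YAML (hypothetical name)."""
    target: str  # e.g. "self/foo" or "nodestream_plugin_sbom/default/github"


@dataclass
class ScopeEntry:
    """Normalized form of one item in a scope's `pipelines` list (hypothetical)."""
    source: Union[PipelineRef, str]  # a reference to a prototype, or a file path
    config: Dict[str, Any] = field(default_factory=dict)
    annotations: Dict[str, Any] = field(default_factory=dict)


def normalize_entry(raw: Union[PipelineRef, str, Dict[str, Any]]) -> ScopeEntry:
    """Collapse the shorthand and expanded spellings into one shape."""
    if isinstance(raw, (PipelineRef, str)):
        # Shorthand: a bare reference or a bare file path.
        return ScopeEntry(source=raw)
    if isinstance(raw, dict) and "path" in raw:
        # Expanded: `path` plus optional overrides.
        return ScopeEntry(
            source=raw["path"],
            config=dict(raw.get("config", {})),
            annotations=dict(raw.get("annotations", {})),
        )
    raise ValueError(f"Unsupported pipeline entry: {raw!r}")


# The five entries of the `crons` scope, as they might look once loaded.
crons = [
    PipelineRef("self/foo"),
    {"path": PipelineRef("self/foo"), "config": {"foo": "override"}},
    "foo/bar/baz.yaml",
    {"path": "foo/bar/baz.yaml"},
    PipelineRef("nodestream_plugin_sbom/default/github"),
]
entries = [normalize_entry(item) for item in crons]
```

With this shape, the inline forms and the reference forms differ only in the type of source, so the same prototype can appear several times in one scope with different overrides.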
PipelineConfiguration
This class would stay essentially unchanged compared to the current class.
PipelineDefinition -> PipelinePrototype
This issue proposes renaming PipelineDefinition to PipelinePrototype. It would contain the following data:
name: str
source: PipelineSource (representing either a file or another prototype)
configuration: The effective configuration chain for this prototype.
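One possible reading of these three fields is sketched below. It is an illustration, not the proposed implementation: PipelineSource is modeled as a plain Union alias, the overrides field and the file_path helper are hypothetical additions, and the "limit" key is invented. The configuration chain is built with the standard library's ChainMap so that a prototype's own values shadow those of the prototype it derives from.

```python
from collections import ChainMap
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Mapping, Union

# Assumption: a source is either a pipeline file or another prototype.
PipelineSource = Union[Path, "PipelinePrototype"]


@dataclass
class PipelinePrototype:
    name: str
    source: PipelineSource
    # Values declared directly on this prototype (hypothetical field).
    overrides: Dict[str, Any] = field(default_factory=dict)

    @property
    def configuration(self) -> Mapping[str, Any]:
        """Effective configuration: own overrides first, then the parent's chain."""
        if isinstance(self.source, PipelinePrototype):
            return ChainMap(self.overrides, self.source.configuration)
        return ChainMap(self.overrides)

    @property
    def file_path(self) -> Path:
        """Follow prototype-to-prototype links until a file is reached."""
        source = self.source
        while isinstance(source, PipelinePrototype):
            source = source.source
        return source


# A file-backed prototype and a second prototype derived from it.
base = PipelinePrototype("foo", Path("foo/bar/baz.yaml"), {"foo": "bar", "limit": 10})
nightly = PipelinePrototype("foo_nightly", base, {"limit": 100})
```

Here nightly reuses the definition file of base, inherits foo, and overrides limit, which is the "same pipeline, different configuration" case described under Current Issues.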
Scope and Project
Essentially, the Project class is duplicative with the Scope class.
With the changes to the internal data model, a dedicated project class is not required.
The data model essentially becomes a graph of scopes, and thus the Project class is just a Scope node in that graph.
Therefore, changing the APIs to combine the best of both worlds should be all we need.
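As a rough sketch of that idea, the snippet below models the project as nothing more than the root Scope of a tree. The fields, the add_child and find_pipeline helpers, and the github.yaml path are all hypothetical; references are resolved relative to the node they are called on, and the self prefix from the YAML example is not handled here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Scope:
    name: str
    # Pipeline name -> definition file path (simplified for the sketch).
    pipelines: Dict[str, str] = field(default_factory=dict)
    children: Dict[str, "Scope"] = field(default_factory=dict)

    def add_child(self, child: "Scope") -> "Scope":
        """Attach a nested scope and return it for chaining."""
        self.children[child.name] = child
        return child

    def find_pipeline(self, reference: str) -> Optional[str]:
        """Resolve a slash-separated reference such as 'crons/foo' from this node."""
        *scope_names, pipeline_name = reference.split("/")
        node = self
        for scope_name in scope_names:
            node = node.children.get(scope_name)
            if node is None:
                return None
        return node.pipelines.get(pipeline_name)


# The "project" is simply the root scope; no dedicated Project class is needed.
project = Scope("self", pipelines={"foo": "foo/bar/baz.yaml"})
crons = project.add_child(Scope("crons", pipelines={"foo": "foo/bar/baz.yaml"}))
plugin = project.add_child(Scope("nodestream_plugin_sbom"))
plugin.add_child(Scope("default", pipelines={"github": "github.yaml"}))
```

In this shape a plugin is just another subtree, so a reference like nodestream_plugin_sbom/default/github is resolved with the same traversal as a reference into a local scope.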