@Synicix
Contributor

@Synicix Synicix commented Jul 1, 2025

Summary:
In short, I implemented from_dot to fit the graph + kernel_lut design, where kernel_lut contains only the unique kernels to prevent duplicates. This provides a minimal Pipeline so that I can resume work on the pipeline runner. It also removes the need to implement pipeline_builder in order to create test cases.

Added:

  • MVP pipeline
  • from_dot function that requires the graph and a kernel_to_node_name HashMap (the reason is that I still want kernel_lut to hold only one copy of each kernel)
  • find_missing_keys, for PipelineJob input checking
  • Uniffi bindings
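
A minimal sketch of the "one copy of each kernel" idea behind kernel_lut, for reference. Everything here is an assumption for illustration: the `Kernel` enum, `kernel_key`, and `build_kernel_lut` are hypothetical stand-ins and not the types or signatures in this PR. Nodes keep only a key, and the lookup table stores each distinct kernel once.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real kernel type (assumption, not the PR's definition).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Kernel {
    Pod(String),
    Mapper(String),
    Joiner,
}

/// Hypothetical key function: kernels that are equal map to the same key.
fn kernel_key(kernel: &Kernel) -> String {
    match kernel {
        Kernel::Pod(id) => format!("pod:{id}"),
        Kernel::Mapper(id) => format!("mapper:{id}"),
        Kernel::Joiner => "joiner".to_owned(),
    }
}

/// Returns (node name -> kernel key) and a lookup table with one entry per distinct kernel.
fn build_kernel_lut(
    nodes: &[(&str, Kernel)],
) -> (HashMap<String, String>, HashMap<String, Kernel>) {
    let mut node_to_key = HashMap::new();
    let mut kernel_lut = HashMap::new();
    for (node_name, kernel) in nodes {
        let key = kernel_key(kernel);
        node_to_key.insert((*node_name).to_owned(), key.clone());
        // Only the first occurrence of a kernel is stored; later nodes reuse it by key.
        kernel_lut.entry(key).or_insert_with(|| kernel.clone());
    }
    (node_to_key, kernel_lut)
}
```

With this layout, two nodes that run the same kernel share a single table entry, which is roughly the property the summary describes.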

Updated:

  • utils functions to more generic implementations

Deferred:

  • to_dot (I was spending too much time on error handling based on @guzman-raphael's reference code, and it is not needed for the runner)
  • to_svg (not needed for the runner, so deferring it)

For reference, this is where these changes come from:

image

@Synicix Synicix added the enhancement New feature or request label Jul 1, 2025
@Synicix Synicix requested a review from guzman-raphael July 1, 2025 07:37
@Synicix
Contributor Author

Synicix commented Jul 1, 2025

@guzman-raphael The code base is ready for a rough review. It is still missing tests and the list of TODOs above, but I plan to get them done by Wednesday night, hopefully.

@codecov

codecov bot commented Jul 1, 2025

Codecov Report

❌ Patch coverage is 88.52459% with 14 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/uniffi/model/pipeline.rs | 82.35% | 9 Missing ⚠️ |
| src/uniffi/model/mod.rs | 0.00% | 5 Missing ⚠️ |


Cargo.toml Outdated
indexmap = { version = "2.9.0", features = ["serde"] }
# random name generator
names = "0.14.0"
petgraph = { version = "0.8.2", features = ["serde-1"] }
Contributor

Before introducing petgraph serialization (assuming for storage), we should ensure it is deterministic.

e.g.:

graph TD;
    A-->B;
    A-->C;

and

graph TD;
    A-->C;
    A-->B;

should serialize the same in YAML. If not, we should defer until we can guarantee this.
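
One way to get that property, sketched with the standard library only: canonicalize the edge list before writing it out. This is an illustration under assumptions, not petgraph's serde behavior and not code from this PR; `canonical_edges` is a hypothetical helper, and the string format is made up for the example.

```rust
/// Hypothetical helper: renders edges in a canonical, insertion-order-independent form.
/// Sorting the (from, to) pairs first means the two graphs above produce the same output.
fn canonical_edges(edges: &[(&str, &str)]) -> String {
    let mut sorted: Vec<(&str, &str)> = edges.to_vec();
    // Lexicographic sort on (from, to); parallel edges are kept, only the order changes.
    sorted.sort_unstable();
    sorted
        .iter()
        .map(|(from, to)| format!("{from}->{to}"))
        .collect::<Vec<_>>()
        .join(";")
}
```

The same idea would apply to a real serializer: sort nodes and edges by a stable key before emitting them, so any hash computed over the output does not depend on construction order.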

Contributor Author

File an issue to fix this properly. At the moment it is nondeterministic.

  • TODO: Disable hash for now.

src/core/mod.rs Outdated
pub mod model;
pub(crate) mod orchestrator;
/// Components relating to pipelines
pub mod pipeline;
Contributor

We should not expose this publicly here. If it is meant to be public, it should be moved into orcapod::uniffi::* with appropriate FFI exposure.

Contributor Author

Add a TODO to move it to the uniffi side of the library, and add the derives to the functions and structs that we want to export.

Contributor Author

Added

/// # Errors
///
/// Will return `Err` if there is an issue initializing a `Blob` instance.
pub const fn new(kind: BlobKind, location: URI) -> Self {
Contributor

Why was this needed? Why not just: Blob {kind, location, ..Blob::default()}.
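
For readers unfamiliar with the struct update syntax being suggested, here is a small sketch. The `Blob` and `BlobKind` definitions below are assumptions for illustration (the `checksum` field and the `String` location are hypothetical stand-ins, and `make_blob` is not a function from this PR). Note that this form needs `Blob: Default`, and as far as I know `Default::default()` cannot be called from a `const fn` on stable Rust, so it would not be a drop-in body for the `const fn new` shown above.

```rust
/// Hypothetical stand-in for the real enum (assumption).
#[derive(Debug, Clone, Default, PartialEq)]
enum BlobKind {
    #[default]
    File,
    Directory,
}

/// Hypothetical stand-in for the real struct; `checksum` is an assumed extra field.
#[derive(Debug, Clone, Default, PartialEq)]
struct Blob {
    kind: BlobKind,
    location: String, // stand-in for the URI type
    checksum: String,
}

/// Sets the two fields the caller cares about and fills the rest from `Default`.
fn make_blob(kind: BlobKind, location: String) -> Blob {
    Blob {
        kind,
        location,
        ..Blob::default()
    }
}
```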

Contributor Author

Actually, very fair point, let me fix this.

@Synicix Synicix reopened this Jul 7, 2025
@Synicix Synicix changed the title from "Pipeline and PipelineBuilder Implementation" to "MVP Pipeline Implementation with Utils update" Jul 7, 2025

Copilot AI left a comment

Pull Request Overview

This PR delivers an MVP implementation of pipeline construction from DOT notation, refactors utility helpers for generic use, and adds test fixtures to validate the new behavior.

  • Introduce Pipeline::from_dot and a unique kernel_lut to prevent duplicate kernels
  • Generalize get and add find_missing_keys in core/util for input validation
  • Add end-to-end tests and fixtures for Pipeline and PipelineJob
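
As a rough illustration of the input-validation helper mentioned above, here is a sketch of what a generic find_missing_keys could look like. The signature is an assumption inferred from the call site in this PR (`find_missing_keys(&input_packet, mapper.mapping.keys())`); the actual implementation in src/core/util.rs is not shown in this conversation and may differ.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Assumed signature: returns every required key that is absent from `map`,
/// in the order the required keys are iterated.
fn find_missing_keys<'a, K, V>(
    map: &HashMap<K, V>,
    required: impl IntoIterator<Item = &'a K>,
) -> Vec<K>
where
    K: Eq + Hash + Clone + 'a,
{
    required
        .into_iter()
        .filter(|key| !map.contains_key(*key))
        .cloned()
        .collect()
}
```

An empty result means the packet satisfies all required keys, so a caller can report the returned keys directly in an error message.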

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/pipeline.rs | Add basic tests for pipeline creation and specs |
| tests/fixture/mod.rs | Provide pipeline and pipeline_job fixtures |
| src/core/util.rs | Refactor get to generic key type and add find_missing_keys |
| src/uniffi/pipeline.rs | Implement Pipeline::from_dot, node hashing, and PipelineJob::new |
| src/uniffi/orchestrator/docker.rs | Minor update to JSON label lookup invocation |
Comments suppressed due to low confidence (3)

tests/fixture/mod.rs:226

  • [nitpick] The fixture function name pipeline shadows the Pipeline type and can be confusing. Consider renaming it to create_pipeline or similar.
pub fn pipeline() -> Result<Pipeline> {

tests/fixture/mod.rs:280

  • [nitpick] This fixture pipeline_job could be renamed to create_pipeline_job to avoid ambiguity with the PipelineJob type.
pub fn pipeline_job() -> Result<PipelineJob> {

src/core/util.rs:41

  • [nitpick] The variable name temp is generic; renaming it to value or entry would improve readability.
    let temp = map.get(key).context(selector::KeyMissing {

//! process completes successfully and outputs the expected results.
pub mod fixture;
use std::{collections::HashMap, path::PathBuf, vec};

Copilot AI Jul 7, 2025

The import of vec isn’t used anywhere in this test. Consider removing it to keep imports clean.

Suggested change
- use std::{collections::HashMap, path::PathBuf, vec};
+ use std::{collections::HashMap, path::PathBuf};

  .map(|(assigned_name, run_info)| {
      let pod_job: PodJob =
-         serde_json::from_str(get(&run_info.labels, "org.orcapod.pod_job")?)?;
+         serde_json::from_str(get(&run_info.labels, &"org.orcapod.pod_job".to_owned())?)?;

Copilot AI Jul 7, 2025

[nitpick] Allocating a String inline for the label key is unnecessary. Consider defining a const POD_JOB_LABEL: &str = "org.orcapod.pod_job" and using &POD_JOB_LABEL.to_owned() or similar to avoid repeated to_owned() calls.

Suggested change
- serde_json::from_str(get(&run_info.labels, &"org.orcapod.pod_job".to_owned())?)?;
+ serde_json::from_str(get(&run_info.labels, &POD_JOB_LABEL.to_owned())?)?;
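
Note that `&POD_JOB_LABEL.to_owned()` still allocates a String on every call. A possible alternative, sketched below under assumptions, is to make the lookup helper generic over borrowed keys via `std::borrow::Borrow`, so a `&str` can be passed directly. The `get` here is a hypothetical simplified version with a plain `String` error; the real helper in src/core/util.rs has its own signature and error type.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::fmt::Display;
use std::hash::Hash;

/// Label key from the review comment above.
const POD_JOB_LABEL: &str = "org.orcapod.pod_job";

/// Hypothetical lookup helper: accepts any borrowed form of the key type,
/// so a `HashMap<String, V>` can be queried with a `&str` without allocating.
fn get<'a, K, V, Q>(map: &'a HashMap<K, V>, key: &Q) -> Result<&'a V, String>
where
    K: Borrow<Q> + Eq + Hash,
    Q: Eq + Hash + Display + ?Sized,
{
    map.get(key).ok_or_else(|| format!("missing key: {key}"))
}
```

With a helper shaped like this, the call site could stay `get(&run_info.labels, POD_JOB_LABEL)`; whether that fits the existing generic `get` is something to check against the actual code.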

Kernel::Mapper(mapper) => {
    Ok(find_missing_keys(&input_packet, mapper.mapping.keys()))
}
Kernel::Joiner => Ok(Vec::<String>::new()), // Should probably error out because joiner should not be a root node

Copilot AI Jul 7, 2025

Joiner kernels should not be treated as root nodes. Either filter out Joiner before this check or return an error here to prevent silent omissions.
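
A sketch of the "return an error" option, with assumed types throughout: `Kernel`, `RootCheckError`, and `missing_root_inputs` are hypothetical names for illustration and do not match the PR's types or its OrcaError machinery.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified kernel type (assumption): a mapper carries its key mapping.
#[derive(Debug)]
enum Kernel {
    Mapper(HashMap<String, String>),
    Joiner,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum RootCheckError {
    JoinerAsRoot,
}

/// Returns the input keys a root node still needs, or an error if the root is a joiner.
fn missing_root_inputs(
    kernel: &Kernel,
    input_packet: &HashMap<String, String>,
) -> Result<Vec<String>, RootCheckError> {
    match kernel {
        Kernel::Mapper(mapping) => {
            let mut missing: Vec<String> = mapping
                .keys()
                .filter(|key| !input_packet.contains_key(*key))
                .cloned()
                .collect();
            // HashMap iteration order is unspecified, so sort for stable output.
            missing.sort();
            Ok(missing)
        }
        // A joiner has no external inputs of its own, so surface it instead of returning an empty list.
        Kernel::Joiner => Err(RootCheckError::JoinerAsRoot),
    }
}
```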

#[expect(clippy::string_slice, reason = "Should never fail as we are in")]
pub fn get_kernel(&self, kernel_key: &str) -> Result<&Kernel> {
let char_to_cut_at = '_';


Copilot AI Jul 7, 2025

[nitpick] The logic to strip after the last _ relies on naming conventions and may be confusing. Document the intention or restrict this slicing to node-based lookup rather than kernel hashes.

Suggested change
// Document the slicing logic and validate the format of `kernel_key`
// The `kernel_key` is expected to follow the format: "<prefix>_<suffix>"
// where `<prefix>` is the meaningful part used for lookup.
if !kernel_key.contains(char_to_cut_at) {
return Err(OrcaError::new(Kind::InvalidInput, format!("Invalid kernel_key format: {}", kernel_key)));
}
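
Another option, sketched here as an assumption rather than as the PR's implementation, is `str::rsplit_once`, which splits at the last occurrence of the delimiter and returns `None` when it is absent. That folds the format check and the cut into one call and avoids byte-index slicing. `strip_node_suffix` is a hypothetical helper name, and the `String` error stands in for the project's own error type.

```rust
/// Hypothetical helper: returns the part of `kernel_key` before the last '_',
/// or an error if the key contains no '_' at all.
fn strip_node_suffix(kernel_key: &str) -> Result<&str, String> {
    kernel_key
        .rsplit_once('_')
        .map(|(prefix, _suffix)| prefix)
        .ok_or_else(|| format!("Invalid kernel_key format: {kernel_key}"))
}
```

Since no manual slicing is involved, a lint expectation for `clippy::string_slice` would likely become unnecessary with this shape.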

Contributor

@guzman-raphael guzman-raphael left a comment

@Synicix I have completed my review of your PR and summarized my feedback into this PR to your fork. Let me know if you have any questions when you go over it.

I'd recommend merging it through GH to prevent merge conflicts between us and so my contributions show up in the commit history. If you'd like to make some changes yourself, it is probably best to do it after merging my PR to your fork.

@guzman-raphael guzman-raphael merged commit 0835db2 into nauticalab:dev Jul 29, 2025
4 checks passed
@Synicix Synicix deleted the pipeline branch October 4, 2025 05:25