Add Pipeline feature #96
@guzman-raphael Is this supposed to be your version of the pipeline implementation?
@Synicix Since the pipeline feature is a big one, I've been getting a head start, mainly to help me review your PRs. I'm not sure yet if I'll make this PR "ready for review", but I at least wanted to make my brainstorming visible in case it helps with discussions.
…b, add map operator, modify pod command to Vec<String> as opposed to String to allow more use cases, apply node shape based on kernel, convert enum variant from unnamed to named fields to improve python UX, remove unused metadata from make_graph, make core crypto/model functions private by default, apply clippy implicit_hasher suggestion.
…e flexible, add stopgap hash for pipeline job (unique each time), add placeholder for pipeline job scheduler, add to crate diagram, simplify store regex, capture more in available event metadata, and remove unused event info + classifier to simplify.
…le to pipeline scheduler, and increase clippy error size threshold.
…pose `PipelineRun` to keep track of pipeline job state, add `get_pipeline_result` to agent, update pipeline demo, create a packet tuple struct, rename `orchestrator::Status` to `PodStatus` to make more clear, and expose optional customization to generated graphs.
… DOT to petgraph, use jinja for DOT templating, remove SVG support in favor of graphviz CLI, remove layout dependency, and make get util more generic.
…flexible, allow pipeline.make_dot(..) to include/exclude style, and move function optional variables to end to be compatible with uniffi defaults.
…-contained groups, reduce error message size, optimize iteration workflow with llvm-cov, and increase resolution of pipeline execution visualization in demo.
…d to PipelineRun/PipelineResult, improve pipeline execution animation in demo, and add monitor note to pipeline demo.
…the hard-coded sleeps.
…l utility to verify pipelines in tests, and stream prints during tests invoked using VSCode in-place GUI.
…in pipeline execution demo.
Close this draft to clear up the PR list (saving its PR logs for reference when needed).
Depends on #91, #94
Features
- `Pipeline` model with DOT support.
  - `graph_dot` input is used to build a `petgraph::DiGraph`.
  - `metadata` is a lookup table of the node type, i.e. `Kernel`.
  - `input_spec` is a map that defines the keys required to feed an input `Packet` into the `Pipeline`. Each key can be associated with one or more node(s)/key(s).
  - `output_spec` is a map that defines the keys used to create an output `Packet`. Each key is linked to exactly one node:key.
  - `input_spec`/`output_spec` is explicit and flexible but, most importantly, equivalent in structure to `Pod`, which facilitates composing pipelines of mixed pods and other pipelines.
  - `pipeline.make_dot()` is also available to make it easier to visualize the compute nodes.
- `PipelineJob` model.
  - `pipeline` points to the reference pipeline.
  - `input_packet` is a map of packet keys to path sets. Notice that each key can have a collection of path sets. This allows batching several inputs in one go. When inputs are batched (length > 1) for keys, the cartesian product is applied if they correspond to the same input node.
  - `output_dir` is the root output directory for all produced computations. A pod job's `output_dir` is mounted to the following tree structure: `{pipeline_job_output_dir}/{node_name}/{input_packet_hash}/`. Currently the packet is not hashed and a simple random hash is used.
- `PipelineResult` model.
  - `status` captures the final state of a pipeline run.
- `JoinOperator`, which performs a cartesian product on parent streams (the `itertools` crate makes this very straightforward).
- `MapOperator`, which allows renaming packet keys.
- `PipelineJob` execution algorithm.
  - …`agent_client` (facilitated by the `zenoh` crate). This lends itself to allowing even the operator logic to benefit from distributed, coordinated compute (in the future). Currently, there is no coordination between agent nodes.
  - …`group/{group}/status/pipeline_job/{pipeline_hash}/{input|output}/{node_name}`. Appropriate payloads (e.g. `Payload::Stream(Packet)`) are published to these topics. Nodes referenced in `pipeline.input_spec` listen on `input` while all others listen on `output`.
  - To signal successful completion of a stream, `Payload::End` is sent.
  - …`Payload::End`. This will signal that a node has reached a state of `NodeState::Completed`.
  - …`Payload` as `Cancelled` or `Failed(..)`), the entire node fails immediately and any descendants are cancelled immediately. The node is marked `Failed(..)` along with its error message.
  - …`pod_result.state` that is not `Completed` for any packet), the pipeline does not fail immediately. It continues on a best-effort basis since there is value in evaluating unrelated nodes, e.g. for memoization.
  - …`NodeState::Completed`, then the pipeline run was successful. Otherwise, it failed.
  - `JoinOperator` needs to remember prior packets.
- `PipelineRun`.
  - …`pod_run`, this provides a way to interact with a pipeline while it is running, which can be useful to poll for state.
  - `pipeline_run.attach()` provides a way to listen for updates from the agent network to track state.
  - `pipeline_run.summarize_dot()` provides a minimal way to visualize the state at any point in time as a DOT. When combined with a loop and a DOT->SVG tool (like graphviz's `dot` CLI), this can provide a live animation of the pipeline run progress.
- `agent_client.start_pipeline_run(..)`. Similar to its `orchestrator` counterpart, it returns immediately since the run is processed as detached.
- `agent_client.get_pipeline_result(..)`. Similar to its `orchestrator` counterpart, it waits to respond until the pipeline run has concluded.

Small features and fixes
- `petgraph` crate helps in traversing the graph and generating a DOT.
- `layout` crate helps parsing a DOT to create a `petgraph::DiGraph`.
- `output_packet` in `PodResult`, which evaluates the checksum on all expected, generated output. Was needed to convert output packets to input packets in a pipeline run. If it fails, then partial output is allowed. If it succeeds, partial output is not allowed. Fix #89 (Add output_packet to Pod Result).
- `pod.command` from `&str` -> `Vec<String>` to allow more flexibility. Having this made it simpler to create pipeline test cases. Fix #95 (Change command from string to a vec of string).
- …`hex` and `rand` crates). Used for creating non-colliding pod job output directories for processing packets (temporary until we hash input packets) and a pipeline job hash (temporary until we can hash pipeline job consistently).
- `RE_AGENT_KEY_EXPR` regex to allow capturing of more metadata. Use it as the primary source for loading metadata.

Housekeeping
- `pipeline_test.ipynb` as a DEMO that illustrates how to use the pipeline feature.
- `test` cargo feature to allow exposing more to integration tests while still keeping the `default` API private. Features `default` and `test` cannot be combined. This allows us to finally make all of `core` submodules private by default.
- `orchestrator::Status` -> `orchestrator::PodStatus` to make it more distinct.
- `RE_MODEL_METADATA` regex.
- `agent_client.submit_pod_jobs(..)` -> `agent_client.start_pod_jobs(..)` to be more consistent with `orchestrator` API design.
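
For readers skimming the description, here is a rough, std-only sketch of the two operator behaviors listed under Features: a cartesian-product join of parent streams and a key-renaming map. It is not the code in this PR. The `Packet` alias (key -> single path string), the function names `join`, `map_keys`, and `packet`, and the collision and pass-through rules are all assumptions made for illustration; the actual implementation uses path sets and the `itertools` crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a packet: key -> path.
/// (The PR describes packets as maps of keys to path sets.)
type Packet = BTreeMap<String, String>;

/// Cartesian product of two parent streams: every left packet is merged
/// with every right packet. On a key collision the right value wins
/// (an assumption, not taken from the PR).
fn join(left: &[Packet], right: &[Packet]) -> Vec<Packet> {
    let mut joined = Vec::with_capacity(left.len() * right.len());
    for l in left {
        for r in right {
            let mut merged = l.clone();
            merged.extend(r.iter().map(|(k, v)| (k.clone(), v.clone())));
            joined.push(merged);
        }
    }
    joined
}

/// Rename packet keys according to `renames` (old key -> new key).
/// Keys without an entry are passed through unchanged (an assumption).
fn map_keys(packet: &Packet, renames: &BTreeMap<String, String>) -> Packet {
    packet
        .iter()
        .map(|(k, v)| (renames.get(k).unwrap_or(k).clone(), v.clone()))
        .collect()
}

/// Convenience helper to build a packet from string pairs.
fn packet(pairs: &[(&str, &str)]) -> Packet {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

With two packets on the left stream and three on the right, `join` produces six merged packets, which matches the batching behavior described for `input_packet`. Since a join has to pair each newly arriving packet with everything seen so far on the other stream, a streaming version would need to keep prior packets around, as noted in the execution algorithm.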