This issue has been converted to a project: Reintegrate PipelineBuilder and pipeline-relevant features
This issue tracks reintroducing the already-implemented features into the dev branch. All of these features were removed in #99 because of review time constraints.
Below is the old summary of what was already implemented and what still needs to be worked on.
Added:
- Kernel Enum
- Mapper Class (designed to map one set of input_packet keys to another)
- Joiner Class (designed to represent a Cartesian product over all incoming packets. The actual implementation is currently handled by the runner; once I get to that, I will probably move it into impl Joiner as an associated function)
- Pipeline Class (updated the previous design to include kernel and labels; tests still need to be added)
- Verify method to check whether the pipeline is valid
- PipelineBuilder: makes building a pipeline in tests far easier than typing it out by hand (could possibly be mapped to Python later on)
- Reintegrate the tests from the previous PR
- Expand the PipelineBuilder so it can add edges between any two nodes and update the hash accordingly
- Test to check labels
- add_edge function that auto-injects a joiner if there is more than one parent node
- verify function that walks the pipeline and verifies that every input_spec matches up with the mappings and joiners
- Pipeline hash is computed when Pipeline::new() is called
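To make the Mapper and Joiner descriptions above concrete, here is a minimal sketch of the two ideas. It is not the code removed in #99: the `Packet` alias, the `Mapper::apply` and `Joiner::join` signatures, and the `packet` helper are all assumptions made for illustration, and packets are modeled as plain string maps.

```rust
use std::collections::HashMap;

/// Assumption: a packet is modeled as a map from key to value.
type Packet = HashMap<String, String>;

/// Hypothetical helper for building a packet from string pairs.
fn packet(pairs: &[(&str, &str)]) -> Packet {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Hypothetical Mapper: renames packet keys according to `mapping`.
struct Mapper {
    mapping: HashMap<String, String>,
}

impl Mapper {
    /// Keys without an entry in `mapping` are passed through unchanged.
    /// If two keys end up with the same name, one of them overwrites the other.
    fn apply(&self, input: &Packet) -> Packet {
        input
            .iter()
            .map(|(k, v)| {
                let new_key = self.mapping.get(k).cloned().unwrap_or_else(|| k.clone());
                (new_key, v.clone())
            })
            .collect()
    }
}

/// Hypothetical Joiner holding the Cartesian product as an associated function.
struct Joiner;

impl Joiner {
    /// Produces one merged packet per combination of packets, taking one
    /// packet from each incoming stream. An empty stream yields no output;
    /// an empty list of streams yields a single empty packet.
    fn join(streams: &[Vec<Packet>]) -> Vec<Packet> {
        let mut acc: Vec<Packet> = vec![Packet::new()];
        for stream in streams {
            let mut next = Vec::with_capacity(acc.len() * stream.len());
            for partial in &acc {
                for p in stream {
                    let mut merged = partial.clone();
                    merged.extend(p.iter().map(|(k, v)| (k.clone(), v.clone())));
                    next.push(merged);
                }
            }
            acc = next;
        }
        acc
    }
}
```

With two streams of 2 and 3 packets, `Joiner::join` returns 6 merged packets. Chained joiners would compose the same way, since the output of one join is itself a stream.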
Changed:
- Utils get function is now more generic, though one problem with the new implementation is that it cannot handle &str keys
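One possible way around the &str limitation, assuming the utility wraps a `HashMap` lookup, is to bound the key type with `Borrow` the same way `HashMap::get` does. The function name, signature, and error type below are hypothetical and may not match the actual utils function.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;

/// Hypothetical generic lookup: returns the value for `key`, or an error
/// message if the key is missing. `Q: ?Sized` allows unsized key types
/// such as `str`, so a `HashMap<String, V>` can be queried with `&str`.
fn get<'a, K, V, Q>(map: &'a HashMap<K, V>, key: &Q) -> Result<&'a V, String>
where
    K: Borrow<Q> + Eq + Hash,
    Q: Hash + Eq + Debug + ?Sized,
{
    map.get(key)
        .ok_or_else(|| format!("key {:?} not found", key))
}
```

With this bound, both `get(&map, "a")` and `get(&map, &"a".to_string())` compile against a `HashMap<String, V>`.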
TODO:
- Add a test for the case of chained joiners
- Split the pipeline and pipeline builder into two separate modules and test them
- Fix the joiner node injection bug so that another joiner is not injected if the to_node is already a joiner
- File an issue to add options for saving the pipeline to localfilestore
- File an issue about the case where a joiner node is also an input node, since the current input_spec verify breaks in this case (I am assuming we would want to support this)
- Ask Raphael to create a pull request to move serialize_hashmap, seralizer_hashset, etc. to utils.
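For the joiner injection item in the list above, the intended add_edge behavior could look roughly like the sketch below. It is a simplified stand-in and not the real PipelineBuilder: `GraphSketch`, the `Kernel` variants, and the parent-list representation are assumptions, and hashing is left out entirely.

```rust
use std::collections::HashMap;

/// Hypothetical kernel kinds; the variants of the real Kernel enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kernel {
    Pod,
    Joiner,
}

/// Simplified graph: node id = index into `kernels`, edges stored as parent lists.
#[derive(Default)]
struct GraphSketch {
    kernels: Vec<Kernel>,
    parents: HashMap<usize, Vec<usize>>,
}

impl GraphSketch {
    fn add_node(&mut self, kernel: Kernel) -> usize {
        self.kernels.push(kernel);
        self.kernels.len() - 1
    }

    fn parents_of(&self, node: usize) -> &[usize] {
        self.parents.get(&node).map(Vec::as_slice).unwrap_or(&[])
    }

    /// Adds an edge `from -> to`, injecting a joiner only when needed.
    fn add_edge(&mut self, from: usize, to: usize) {
        // A joiner accepts any number of parents, so never inject in front of it.
        if self.kernels[to] == Kernel::Joiner {
            self.parents.entry(to).or_default().push(from);
            return;
        }
        let existing = self.parents_of(to).to_vec();
        match existing.as_slice() {
            // First parent: plain edge.
            [] => self.parents.entry(to).or_default().push(from),
            // The sole parent is already a joiner: reuse it instead of stacking
            // a second one. (Simplification: any joiner parent is reused,
            // whether it was injected or added explicitly.)
            [only] if self.kernels[*only] == Kernel::Joiner => {
                self.parents.entry(*only).or_default().push(from);
            }
            // Second parent for a non-joiner node: inject a joiner in between.
            [only] => {
                let joiner = self.add_node(Kernel::Joiner);
                self.parents.insert(joiner, vec![*only, from]);
                self.parents.insert(to, vec![joiner]);
            }
            _ => unreachable!("non-joiner nodes keep at most one parent"),
        }
    }
}
```

In this sketch, adding three parents to the same node creates exactly one joiner that ends up with all three as its parents.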
PR Design Question:
Should we keep the pipeline builder, which will later be mapped to Python, or pull it out and drop in a simpler one for this stride? Its logic is a little complicated for reviewers to follow.