Description
My configuration:
- Ubuntu 22 LTS
- Dataiku On Premise v13.1.2
Example
The get_graph method in the DSSProjectFlow class does not support flows that contain loops, causing a RecursionError when trying to get the flow graph for such projects.
Function reference: https://developer.dataiku.com/latest/api-reference/python/flow.html#dataikuapi.dss.flow.DSSProjectFlow.get_graph
Steps to Reproduce
- Create a project in Dataiku in which a dataset that is the output of a recipe (the last stage of the flow) is also used as the input of another recipe upstream. The dataset is then both an input and an output, which forms a loop.
- Use the API to get the flow graph:
import dataiku

client = dataiku.api_client()
project = client.get_project("YOUR_PROJECT_KEY")
flow = project.get_flow()
graph = flow.get_graph()  # This line causes the RecursionError below
DSSProjectFlowGraph.get_items_in_traversal_order.<locals>.add_from(graph_node)
776 predecessor_node = self.nodes[predecessor_ref]
777 if not in_set(predecessor_node):
--> 778 add_from(predecessor_node)
780 # Then add ourselves
781 if not in_set(graph_node):
DSSProjectFlowGraph.get_items_in_traversal_order.<locals>.add_from(graph_node)
775 for predecessor_ref in graph_node["predecessors"]:
776 predecessor_node = self.nodes[predecessor_ref]
--> 777 if not in_set(predecessor_node):
778 add_from(predecessor_node)
780 # Then add ourselves
DSSProjectFlowGraph.get_items_in_traversal_order.<locals>.in_set(obj)
767 def in_set(obj):
768 for candidate in ret:
--> 769 if candidate["type"] == obj["type"] and candidate["ref"] == obj["ref"]:
770 return True
771 return False
RecursionError: maximum recursion depth exceeded in comparison
As a result, flows that write data back to the same location are problematic only in certain cases, for example:
output dataset > used as an intermediate dataset of a recipe upstream of this output dataset.
They are not problematic in other cases, such as:
input folder > used as output folder at the end of the flow.
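Judging from the traceback, add_from only adds a node to the result after it has recursed into all of its predecessors, so a node that sits on a loop is never seen as "already added" and the recursion does not terminate. The following is only a sketch of one possible loop-tolerant traversal, not the dataikuapi implementation. It works on plain dictionaries that mimic the "ref", "type" and "predecessors" keys visible in the traceback; the function name traversal_order and the sample node names are hypothetical.

```python
def traversal_order(nodes):
    """Return node refs with predecessors first where possible.

    `nodes` maps ref -> {"ref": ..., "type": ..., "predecessors": [refs]}.
    This shape is an assumption modeled on the keys seen in the traceback.
    Loops are tolerated: a ref is marked as soon as it is first reached,
    so it is never entered a second time.
    """
    order = []
    visited = set()  # refs already reached, even if not yet added to `order`

    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        # Explicit stack instead of recursion: (ref, iterator over predecessors)
        stack = [(start, iter(nodes[start]["predecessors"]))]
        while stack:
            ref, preds = stack[-1]
            for pred in preds:
                if pred not in visited:
                    visited.add(pred)
                    stack.append((pred, iter(nodes[pred]["predecessors"])))
                    break
            else:
                # Every predecessor is handled or skipped as part of a loop
                order.append(ref)
                stack.pop()
    return order


# Hypothetical flow: recipe_b writes ds_out, which recipe_a reads upstream.
nodes = {
    "ds_in": {"ref": "ds_in", "type": "DATASET", "predecessors": []},
    "recipe_a": {"ref": "recipe_a", "type": "RECIPE",
                 "predecessors": ["ds_in", "ds_out"]},
    "ds_mid": {"ref": "ds_mid", "type": "DATASET", "predecessors": ["recipe_a"]},
    "recipe_b": {"ref": "recipe_b", "type": "RECIPE", "predecessors": ["ds_mid"]},
    "ds_out": {"ref": "ds_out", "type": "DATASET", "predecessors": ["recipe_b"]},
}

order = traversal_order(nodes)
print(order)  # each ref appears exactly once, even though the flow has a loop
```

With a loop there is no true topological order, so the position of the nodes that belong to the loop depends on where the traversal starts. The point of the sketch is only that the traversal terminates and returns each node once.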