[BUG] Prevent infinite loops in AptaTransPipeline and MCTS candidate generation #290
direkkakkar319-ops wants to merge 4 commits into
|
Ran the checks locally. All actual code-quality steps passed. The Post python environment step fails with exit code '127' (node not found); this is the act teardown bug, not your code. Jobs run: detect-notebooks-change, run-jupyter-notebooks, run-tests-no-extras. |
|
Hi, this PR has a big diff and lots of unnecessary changes. Please either revert those commits or open a new PR, and make sure pre-commit is installed before you make new changes. |
|
I am converting this to a draft PR for now. |
28c80d5 to f38e970
|
Hi @siddharth7113, I have reverted the unnecessary changes. All checks are now passing, and the PR should be in good shape for review. Thanks. |
|
Sorry to barge in, but is this really an issue? The probability of MCTS returning duplicate sequences so often that we run into an infinite loop seems very low to me, especially given reasonable parameters (e.g., sequence length, number of candidates, etc.). Could you showcase an example where the current implementation gets stuck in an infinite loop? Otherwise, I am not sure whether this change is needed. Also, adding a guard like this introduces an additional issue: the algorithm may return fewer candidates than the number specified by the user. |
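To make that trade-off concrete, here is a minimal sketch of a bounded-retry guard. The names `generate_unique`, `sample`, and `max_attempts` are hypothetical and chosen for illustration; this is not the PR's code, and the sampler merely stands in for an MCTS run.

```python
from itertools import cycle
from typing import Callable, Hashable, List


def generate_unique(
    sample: Callable[[], Hashable], n_candidates: int, max_attempts: int
) -> List[Hashable]:
    """Collect up to n_candidates distinct samples, stopping after max_attempts draws."""
    seen = {}  # dict used as an insertion-ordered set
    for _ in range(max_attempts):
        if len(seen) >= n_candidates:
            break
        seen.setdefault(sample(), None)  # repeats leave the dict unchanged
    return list(seen)


# Hypothetical sampler that can only ever produce two distinct sequences.
pool = cycle(["ACGU", "ACGU", "GGCA"])
result = generate_unique(lambda: next(pool), n_candidates=3, max_attempts=10)
print(result)
```

With this sampler the guard stops the loop, but the caller asked for three candidates and receives only two, which is the behavior change described above.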
|
Hi @NennoMP, you are right, the infinite loop can't actually happen. The score in the tuple is a torch.Tensor, which hashes by object identity rather than by value:
>>> import torch
>>> t1 = torch.tensor(0.5); t2 = torch.tensor(0.5)
>>> hash(t1) == hash(t2)
False # different objects → different hashes
>>> s = set(); s.add(("A", "B", t1)); s.add(("A", "B", t2))
>>> len(s)
2 # both added despite identical content |
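The same effect can be reproduced without PyTorch. The sketch below uses a hypothetical `FakeTensor` class that keeps Python's default identity-based hash; it is only a stand-in for a tensor, not a model of `torch.Tensor` in any other respect. It also shows that converting the score to a plain float restores value-based deduplication.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor: keeps the default identity-based hash."""

    def __init__(self, value: float) -> None:
        self.value = value

    def item(self) -> float:
        # Return the wrapped value as a plain Python float.
        return self.value


t1, t2 = FakeTensor(0.5), FakeTensor(0.5)

# Tuples that hold distinct objects are distinct set members.
by_object = {("A", "B", t1), ("A", "B", t2)}

# With plain floats, equal tuples collapse into a single entry.
by_value = {("A", "B", t1.item()), ("A", "B", t2.item())}

print(len(by_object), len(by_value))
```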
|
However, this also means the current code doesn't truly deduplicate by sequence content: two identical sequences are kept as separate entries because their tensor objects differ. Happy to rework the PR framing or close it if you prefer. |
|
@direkkakkar319-ops You are right that we are not deduplicating correctly, since we are also hashing a tensor for the score. If it weren't for that, it would probably work, since the neural network running in inference mode likely returns the same score for identical sequences. I would suggest starting fresh with a new issue and a new PR that fixes the missing-deduplication bug, with minimal changes and without modifying other parts of the pipeline or MCTS unless strictly needed. The fix is relatively straightforward: either make sure the set checks for duplicates based only on the candidate sequence, or use a dictionary with the candidate sequence as key. Some example code would look like this, but feel free to do it differently.
candidates = {}
while len(candidates) < n_candidates:
    result = mcts.run(verbose=verbose)
    candidate, sequence, score = tuple(result.values())
    # Deduplicate on the candidate only, not on the tensor score.
    if candidate not in candidates:
        # Store the score as a plain float so the tuples hash by value.
        candidates[candidate] = (candidate, sequence, score.item())
if verbose:
    for candidate, sequence, score in candidates.values():
        print(
            f"Candidate: {candidate}, "
            f"Sequence: {sequence}, "
            f"Score: {score:.4f}"
        )
return set(candidates.values()) |
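For anyone who wants to try the dictionary-keyed approach in isolation, here is a self-contained sketch. `StubMCTS`, `StubScore`, `collect_candidates`, and the result keys `candidate`, `sequence`, and `score` are all assumptions made for illustration; the real MCTS class and its return format may differ.

```python
from itertools import cycle


class StubScore:
    """Hypothetical tensor-like score that exposes only .item()."""

    def __init__(self, value: float) -> None:
        self._value = value

    def item(self) -> float:
        return self._value


class StubMCTS:
    """Hypothetical stand-in for the MCTS object; replays a fixed list of results."""

    def __init__(self, results) -> None:
        self._results = cycle(results)

    def run(self, verbose: bool = False) -> dict:
        return next(self._results)


def collect_candidates(mcts, n_candidates: int, verbose: bool = False) -> set:
    """Gather n_candidates results, deduplicated by candidate string."""
    candidates = {}
    while len(candidates) < n_candidates:
        result = mcts.run(verbose=verbose)
        candidate, sequence, score = tuple(result.values())
        if candidate not in candidates:
            # Plain floats make the stored tuples hash by value.
            candidates[candidate] = (candidate, sequence, score.item())
    return set(candidates.values())


# The first two runs return the same candidate with distinct score objects.
runs = [
    {"candidate": "AC_", "sequence": "ACGU", "score": StubScore(0.5)},
    {"candidate": "AC_", "sequence": "ACGU", "score": StubScore(0.5)},
    {"candidate": "GG_", "sequence": "GGCA", "score": StubScore(0.7)},
]
found = collect_candidates(StubMCTS(runs), n_candidates=2)
print(sorted(found))
```

Note that this loop still has no upper bound on the number of runs: it ends only once enough distinct candidates have been seen, which matches the behavior of the suggested snippet.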





What does this implement/fix? Explain your changes.
This PR addresses two significant stability issues that could lead to production hangs and unbounded CPU usage:
Infinite Loop in _pipeline.py :
Infinite Loop in _algorithm.py :
What should a reviewer concentrate their feedback on?
Did you add any tests for the change?
Yes, several updates were made to the test suite:
Any other comments?
PR checklist
ran the precommits
ran the tests locally; they are passing
