Add SGD diagnostic tools and comprehensive test suite
- Added sgd_diagnostics binary for analyzing node displacement in SGD layouts
- Added measure_layout_quality binary for graph quality metrics
- Added test_sgd_perfect_sort.rs with 6 comprehensive SGD tests
- Added test_sgd_real_b3106_pattern.rs with real-world B-3106.fa test cases
- Added documentation of SGD investigation (sgd_reverse_handle_bug.md)
- Updated bidirected_gfa_writer.rs to support SGD parameter calculation
- Updated lib.rs to export path_sgd and path_sgd_exact modules
These tools were used to identify the reverse-complement (RC) handle bug and to verify its fix.
# SGD Path Position Calculation Bug - Reverse Handles
## Problem
Path-guided SGD produces suboptimal layouts with "bubbles": nodes that should be adjacent in paths are placed far apart. A topological sort would eliminate these, which indicates that SGD is not calculating correct positions.
**The Problem**: This finds the FIRST occurrence of `handle_b` in the path. If a node appears multiple times (valid for structural variation), or if we're looking for the wrong rank, we calculate the wrong position.
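
To make the repeated-node concern concrete, here is a minimal sketch of how a first-occurrence lookup and a rank-based lookup can disagree on the same path. The `Step` type, the function names, and the node lengths are all hypothetical and are not taken from `path_sgd`; the sketch only shows that the two lookups differ, not which one the SGD code should use.

```rust
/// One oriented visit to a node along a path (hypothetical type for illustration).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Step {
    node_id: u64,
    is_reverse: bool,
    len: usize, // node sequence length in bp
}

/// Start offset in bp of every step, indexed by step rank.
fn step_offsets(path: &[Step]) -> Vec<usize> {
    let mut offsets = Vec::with_capacity(path.len());
    let mut pos = 0usize;
    for step in path {
        offsets.push(pos);
        pos += step.len;
    }
    offsets
}

/// Offset of the FIRST step that visits the given handle (node id + orientation).
fn position_by_first_occurrence(path: &[Step], node_id: u64, is_reverse: bool) -> Option<usize> {
    let offsets = step_offsets(path);
    path.iter()
        .position(|s| s.node_id == node_id && s.is_reverse == is_reverse)
        .map(|rank| offsets[rank])
}

/// Offset of the step at a specific rank, regardless of which handle it visits.
fn position_by_rank(path: &[Step], rank: usize) -> Option<usize> {
    step_offsets(path).get(rank).copied()
}

/// The repeated-node path `1+ -> 2+ -> 1+`, with made-up lengths of 10 and 5 bp.
fn repeated_node_path() -> Vec<Step> {
    vec![
        Step { node_id: 1, is_reverse: false, len: 10 },
        Step { node_id: 2, is_reverse: false, len: 5 },
        Step { node_id: 1, is_reverse: false, len: 10 },
    ]
}
```

On this path, looking up `1+` by first occurrence returns offset 0, while indexing the second visit by rank (rank 2) returns offset 15.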
**However**, my attempted fix (using `b_rank` instead) made things WORSE, suggesting the current implementation is closer to correct.
## Alternative Hypothesis
Path positions may in fact be **calculated correctly**, and the real issue may be that SGD is not converging because:
1. Initial positions (seeded by node ID order) create massive separations
2. Not enough iterations to overcome poor initialization
3. Conflicting constraints from paths with mixed orientations
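
The convergence hypothesis can be explored on a toy problem. The sketch below is a simplified 1D stand-in, not the repository's `path_sgd` code: it assumes a single path in which each node is visited once, sweeps all pairs deterministically instead of sampling them, and takes one learning rate per iteration from the caller. The names `sgd_update`, `run_sgd`, and `stress` are hypothetical.

```rust
/// One pairwise update in 1D. `target` is the path distance between the two
/// nodes; `mu` in (0, 1] is the step size (1.0 satisfies the pair exactly).
fn sgd_update(x: &mut [f64], i: usize, j: usize, target: f64, mu: f64) {
    let dx = x[i] - x[j];
    // Guard against division by zero; coincident nodes stay in place here.
    let mag = dx.abs().max(1e-9);
    let delta = mu * (mag - target) / 2.0;
    let r = delta / mag;
    x[i] -= r * dx;
    x[j] += r * dx;
}

/// Runs one full sweep over all pairs per entry in `etas`.
/// `x[k]` is the layout coordinate of the k-th path step and `path_pos[k]`
/// is its offset along the path (assumption: same length, one visit per node).
fn run_sgd(x: &mut [f64], path_pos: &[f64], etas: &[f64]) {
    assert_eq!(x.len(), path_pos.len());
    for &eta in etas {
        for i in 0..x.len() {
            for j in (i + 1)..x.len() {
                let target = (path_pos[i] - path_pos[j]).abs();
                if target == 0.0 {
                    continue;
                }
                // Weight 1/d^2, with the step size capped at 1.
                let mu = (eta / (target * target)).min(1.0);
                sgd_update(x, i, j, target, mu);
            }
        }
    }
}

/// Sum of squared relative errors between layout and path distances.
fn stress(x: &[f64], path_pos: &[f64]) -> f64 {
    assert_eq!(x.len(), path_pos.len());
    let mut total = 0.0;
    for i in 0..x.len() {
        for j in (i + 1)..x.len() {
            let target = (path_pos[i] - path_pos[j]).abs();
            if target == 0.0 {
                continue;
            }
            let diff = (x[i] - x[j]).abs() - target;
            total += diff * diff / (target * target);
        }
    }
    total
}
```

Seeding `x` in node ID order and comparing `stress` after different numbers of sweeps gives a rough way to check whether initialization or iteration count is the limiting factor on a small case.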
## Diagnostic Evidence
Created `sgd_diagnostics` tool showing:
- Adjacent nodes in paths (1-40 bp apart) are placed 100x to 3976x farther apart in the final layout than in the path
- Example: Nodes 1 and 205 are 1 bp apart in the path but 845 positions apart in the layout
- ALL paths show this problem (not just RC paths)
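
The kind of comparison reported above can be sketched as follows. This is an illustration, not the actual `sgd_diagnostics` code: it assumes the path is given as `(node id, start offset in bp)` pairs in path order, that the layout is a map from node id to a 1D coordinate, and that path distance is measured between start offsets. `Displacement` and `adjacent_displacements` are hypothetical names.

```rust
use std::collections::HashMap;

/// Path distance versus layout distance for one pair of consecutive path steps.
#[derive(Debug, PartialEq)]
struct Displacement {
    from: u64,
    to: u64,
    path_dist: f64,
    layout_dist: f64,
    ratio: f64, // layout_dist / path_dist
}

/// Compares every pair of consecutive path steps against the layout.
/// Pairs with zero path distance or a node missing from the layout are skipped.
fn adjacent_displacements(
    path: &[(u64, f64)],
    layout: &HashMap<u64, f64>,
) -> Vec<Displacement> {
    path.windows(2)
        .filter_map(|pair| {
            let (from, from_pos) = pair[0];
            let (to, to_pos) = pair[1];
            let path_dist = (to_pos - from_pos).abs();
            if path_dist == 0.0 {
                return None; // ratio is undefined for zero path distance
            }
            let layout_dist = (layout.get(&to)? - layout.get(&from)?).abs();
            Some(Displacement {
                from,
                to,
                path_dist,
                layout_dist,
                ratio: layout_dist / path_dist,
            })
        })
        .collect()
}
```

With the example from the list (nodes 1 and 205, 1 bp apart in the path and 845 positions apart in the layout), the ratio for that pair is 845.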
## Next Steps
1. Create focused tests for simple cases (chains with RC edges)
2. Verify SGD can achieve perfect layout on trivial graphs
3. If the tests fail, the bug is confirmed to be in either position calculation or SGD convergence
4. Fix and verify
## Test Cases Needed
1. **Simple chain forward**: `1+->2+->3+` should place nodes sequentially
2. **Simple chain reverse**: `1-->2-->3-` should place nodes sequentially
3. **Mixed orientation chain**: `1+->2-->3+` should place nodes sequentially
4. **Simple bubble with RC**:
   ```
   Path A: 1+->2+->3+
   Path B: 1+->2-->3+ (traverses node 2 in reverse)
   ```
5. **Repeated node**: `1+->2+->1+` should handle revisiting correctly

Each test should verify SGD produces perfect ordering (no displacement).
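
A "no displacement" check could be written as a small helper that the chain tests share. The sketch below assumes a 1D layout stored as a map from node id to coordinate and an expected node order supplied by the test author; `layout_order` and `displaced_ranks` are hypothetical names, not functions from the existing test files. For the repeated-node case the expected order has to be chosen by hand, since a node visited twice still has only one layout position.

```rust
use std::collections::HashMap;

/// Node ids sorted by layout coordinate (ties broken by node id).
fn layout_order(layout: &HashMap<u64, f64>) -> Vec<u64> {
    let mut nodes: Vec<(u64, f64)> = layout.iter().map(|(&id, &x)| (id, x)).collect();
    nodes.sort_by(|a, b| a.1.total_cmp(&b.1).then(a.0.cmp(&b.0)));
    nodes.into_iter().map(|(id, _)| id).collect()
}

/// Number of ranks at which the layout order differs from `expected`.
/// Zero means the layout reproduces the expected order exactly.
fn displaced_ranks(expected: &[u64], layout: &HashMap<u64, f64>) -> usize {
    let actual = layout_order(layout);
    let mismatched = expected
        .iter()
        .zip(&actual)
        .filter(|(e, a)| e != a)
        .count();
    // Count any length difference as displaced ranks too.
    mismatched + expected.len().abs_diff(actual.len())
}
```

A test for the simple forward chain would then assert `displaced_ranks(&[1, 2, 3], &layout) == 0` on the layout produced by SGD.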