COMPD Design Notes with Examples - UPDATED v1.3.3

This document provides concrete examples of edge cases, design decisions, and expected behaviors.

Generated by Perplexity AI, verified by Ramiz Gindullin


1. Quadrant Distribution with Uneven Splits

Scenario: 5×5 grid (25 wells), 4 materials with quantities [10, 6, 4, 5]

Quadrant sizes (the top half takes ⌊5/2⌋ = 2 rows, the left half takes ⌈5/2⌉ = 3 columns):

  • Q-I: rows 1-2, cols 1-3 = ⌊5/2⌋ × ⌈5/2⌉ = 2 × 3 = 6 wells
  • Q-II: rows 3-5, cols 1-3 = ⌈5/2⌉ × ⌈5/2⌉ = 3 × 3 = 9 wells
  • Q-III: rows 1-2, cols 4-5 = ⌊5/2⌋ × ⌊5/2⌋ = 2 × 2 = 4 wells
  • Q-IV: rows 3-5, cols 4-5 = ⌈5/2⌉ × ⌊5/2⌋ = 3 × 2 = 6 wells
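
The row and column ranges above follow from a floor/ceiling split of each dimension. A minimal Python sketch of that arithmetic, assuming the top half takes ⌊rows/2⌋ rows and the left half takes ⌈cols/2⌉ columns as in the list above (the name `quadrant_sizes` is illustrative and not part of the model):

```python
def quadrant_sizes(num_rows: int, num_cols: int) -> dict:
    """Well count of each quadrant under a floor/ceiling split."""
    top = num_rows // 2          # rows 1..floor(rows/2)
    bottom = num_rows - top      # remaining rows
    left = (num_cols + 1) // 2   # cols 1..ceil(cols/2)
    right = num_cols - left      # remaining cols
    return {
        "Q-I": top * left,       # top-left
        "Q-II": bottom * left,   # bottom-left
        "Q-III": top * right,    # top-right
        "Q-IV": bottom * right,  # bottom-right
    }

print(quadrant_sizes(5, 5))  # {'Q-I': 6, 'Q-II': 9, 'Q-III': 4, 'Q-IV': 6}
```

The four sizes always sum to the grid size, which is a quick sanity check for any other split convention.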

Result after round-robin + bin packing:

Q-I (6 wells):   Material 1 ×3, Material 2 ×1, Material 3 ×1, Material 4 ×1
Q-II (9 wells):  Material 1 ×5, Material 2 ×1, Material 3 ×1, Material 4 ×2
Q-III (4 wells): Material 1 ×0, Material 2 ×2, Material 3 ×1, Material 4 ×1
Q-IV (6 wells):  Material 1 ×2, Material 2 ×2, Material 3 ×1, Material 4 ×1

Note: Material 1 gets 0 wells in Q-III, the smallest quadrant, despite having 10 total replicates. This is correct behavior: the round-robin ensures materials appear in multiple quadrants but doesn't guarantee all 4.


2. Criteria Set Splitting Example

Scenario: 96-well plate (8 rows × 12 cols, no edges), 84 wells available

Materials:

  • 10 empty wells
  • Positive controls: 15 replicates
  • Negative controls: 15 replicates
  • Drug A: 3 concentrations × 4 replicates = 12 total
  • Drug B: 3 concentrations × 10 replicates = 30 total

Threshold: 50% of wells = 42

Criteria set construction:

  1. Set 1: 10 empty wells (always separate)
  2. Controls: 15 + 15 = 30 total < 42 → ONE set for all controls
  3. Drug A: 12 total < 42 → ONE set for all 3 concentrations
  4. Drug B: 30 total < 42 → ONE set for all 3 concentrations

If Drug B had 50 replicates:

  • Drug B: 50 total > 42 → THREE separate sets, one per concentration

Result: 4 criteria sets (or 6 if Drug B > threshold)
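
The splitting rule above can be sketched in a few lines of Python. This is only an illustration of the counting, under two assumptions the notes do not settle: a group is split only when its total strictly exceeds the threshold (equality is left unspecified above), and the two control types are treated as one group. The names `build_criteria_sets` and `groups` are hypothetical.

```python
def build_criteria_sets(empty_wells, groups, threshold):
    """Return a list of (label, well_count) criteria sets.

    groups maps a group name to its per-concentration replicate counts.
    Assumption: split only when the group total is strictly above threshold.
    """
    sets = []
    if empty_wells > 0:
        sets.append(("empty", empty_wells))  # empty wells are always separate
    for name, counts in groups.items():
        if sum(counts) > threshold:
            # Too large for one set: one set per concentration.
            sets.extend((f"{name}[{i}]", n) for i, n in enumerate(counts, 1))
        else:
            sets.append((name, sum(counts)))
    return sets

threshold = 84 // 2  # 50% of the 84 available wells
groups = {"controls": [15, 15], "Drug A": [4, 4, 4], "Drug B": [10, 10, 10]}
print(len(build_criteria_sets(10, groups, threshold)))  # 4
```

Raising Drug B to a total of 50 (for instance `[17, 17, 16]`) makes the same call return 6 sets.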


3. Optional Constraints: alldifferent vs. GCC

Scenario: 96-well plate, top half has 4 rows

Case A: 3 replicates of concentration C1

  • Wells in top half: 3
  • Rows in top half: 4
  • 3 ≤ 4 → Apply alldifferent(row_coordinates)
  • Result: Each replicate in a different row (one of the four rows holds no C1 replicate)

Case B: 6 replicates of concentration C2

  • Wells in top half: 6
  • Rows in top half: 4
  • 6 > 4 → Apply GCC(row_coordinates, [1,2,3,4], [1,1,1,1], [12,12,12,12])
  • Result: At least 1 replicate per row (the 2 remaining replicates share rows with others)

Why GCC upper bound = 12?

  • Full plate has 12 columns
  • Upper bound = num_cols_line = 12
  • Allows up to 12 wells in one row (physically possible)
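
The choice between the two constraints in Cases A and B comes down to one comparison. A small Python sketch of that decision, with the constraint returned as a plain tuple for illustration (`row_constraint` is a hypothetical helper; the model posts the real constraints in MiniZinc):

```python
def row_constraint(num_replicates, rows, num_cols_line):
    """Pick the optional row constraint for one concentration in one half."""
    if num_replicates <= len(rows):
        # Enough rows for every replicate to get its own.
        return ("alldifferent",)
    lower = [1] * len(rows)              # every row must be covered
    upper = [num_cols_line] * len(rows)  # at most one full row of wells
    return ("gcc", rows, lower, upper)

print(row_constraint(3, [1, 2, 3, 4], 12))  # ('alldifferent',)
print(row_constraint(6, [1, 2, 3, 4], 12))
# ('gcc', [1, 2, 3, 4], [1, 1, 1, 1], [12, 12, 12, 12])
```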

4. All-Plates GCC Looseness

Scenario: 5 plates, concentration C1 with 20 replicates total, top half

Constraint:

GCC(all_C1_rows_across_plates, [1,2,3,4], [1,1,1,1], [60,60,60,60])
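  • Lower bound: 1 per row → Each of rows 1-4 holds at least one C1 well somewhere across the 5 plates
  • Upper bound: 60 (12 cols × 5 plates) → The number of wells one row offers across all plates; with only 20 replicates and the three other rows needing at least 1 each, no row can actually hold more than 17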

Why is this OK?

  • The 20 replicates are already distributed via quadrant allocation
  • Distance optimization spreads them further
  • The lower bound ensures coverage (the critical requirement)
  • A tighter upper bound can over-constrain the model: 4 per row, for example, caps the total at 4 × 4 = 16 < 20 and makes it infeasible

Actual outcome: Distribution will be ~5 replicates per row due to other constraints, not because of the GCC upper bound.
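
The effect of the bounds can be checked with simple counting. The sketch below assumes every C1 row coordinate takes one of the four listed row values, so row counts satisfying the bounds exist exactly when the sum of lower bounds is at most the number of replicates and the sum of upper bounds is at least that number. Both function names are illustrative.

```python
def gcc_counts_feasible(n, lower, upper):
    """Row counts within [lower, upper] that sum to n exist iff this holds."""
    return sum(lower) <= n <= sum(upper)

def max_in_one_row(n, lower, upper, row=0):
    """Most replicates `row` can hold once the other rows meet their lower bounds."""
    others = sum(lower) - lower[row]
    return min(upper[row], n - others)

rows = 4
print(gcc_counts_feasible(20, [1] * rows, [60] * rows))  # True
print(max_in_one_row(20, [1] * rows, [60] * rows))       # 17
print(gcc_counts_feasible(20, [1] * rows, [4] * rows))   # False: 4 rows x 4 = 16 < 20
```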


5. Fake Edge Wells Example

Scenario: 8×12 grid, Drug X with 3 replicates

Without fake edge wells:

Optimal placement (all in corners):
[1,1], [1,12], [8,1]
Pairwise Manhattan distances: 11, 7, 18 (min 7, sum 36)

Wells cluster in corners (undesirable for plate effects).

With fake edge wells at:

[0,0], [9,0], [0,13], [9,13], ...

New optimal placement (closer to center):

[2,4], [4,8], [6,12]
Pairwise Manhattan distances: 6, 12, 6 (min 6, sum 24), smaller than in the corner layout; the placement wins once fake edge penalties are counted

Fake edges push wells toward the center.

Threshold: use_fake_edge_wells[p,s] = true if:

  • Criteria set has ≤ 6 wells
  • Not controls (controls exempt)
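
The distances for the two placements in this example can be verified with a short Python check, and the threshold rule is a one-line predicate. The function names and the `max_wells` parameter are illustrative; only the value 6 and the control exemption come from the notes.

```python
from itertools import combinations

def pairwise_manhattan(wells):
    """Manhattan distance between every pair of (row, col) wells."""
    return [abs(r1 - r2) + abs(c1 - c2)
            for (r1, c1), (r2, c2) in combinations(wells, 2)]

def use_fake_edge_wells(num_wells, is_control, max_wells=6):
    """Small, non-control criteria sets get fake edge wells."""
    return num_wells <= max_wells and not is_control

corners = pairwise_manhattan([(1, 1), (1, 12), (8, 1)])
inner = pairwise_manhattan([(2, 4), (4, 8), (6, 12)])
print(min(corners), sum(corners))  # 7 36
print(min(inner), sum(inner))      # 6 24
print(use_fake_edge_wells(3, is_control=False))  # True
```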

6. Symmetry Breaking: Randomized vs. Sequential Lex

Old Behavior (v1.3.2 and earlier)

lex_chain_greater([[C[1,1], C[1,2]], [C[2,1], C[2,2]], ..., [C[n,1], C[n,2]]])

Wells ordered by material index → Drug A, concentration 1 replicates come first.

Problem: First replicates tend to go to top-left of quadrants (row 1 preference).

New Behavior (v1.3.3)

array[int] of int: perm = random_permutation(n);
lex_chain_greater([[C[perm[1],1], C[perm[1],2]], ..., [C[perm[n],1], C[perm[n],2]]])

Example:

  • 12 replicates of Drug A (all same concentration)
  • perm = [7, 2, 11, 4, 9, 1, 5, 12, 3, 10, 6, 8]
  • Lex ordering: replicate 7 first, then 2, then 11, ...

Result: More even row/column distribution (no systematic row 1 bias).
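
To make the ordering concrete, here is a Python sketch of what the chained constraint demands of a finished layout: visiting the replicates in `perm` order, each (row, col) pair must be strictly lexicographically greater than the next. This reading is taken from the pseudocode above; the checker itself is hypothetical and only useful for inspecting solutions.

```python
def lex_chain_greater_holds(coords, perm):
    """True if the (row, col) pairs, visited in perm order, strictly decrease."""
    ordered = [coords[i - 1] for i in perm]  # perm is 1-based, as in the notes
    # Python compares tuples lexicographically: row first, then column.
    return all(a > b for a, b in zip(ordered, ordered[1:]))

coords = [(1, 2), (3, 1), (2, 5)]  # replicates 1..3
print(lex_chain_greater_holds(coords, [2, 3, 1]))  # True: (3,1) > (2,5) > (1,2)
print(lex_chain_greater_holds(coords, [1, 2, 3]))  # False: (1,2) is not > (3,1)
```

With the identity permutation the first replicate is always forced into the lexicographically largest position; a random permutation moves that role to a different replicate on each run.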


7. Empty Wells Lex Asymmetry Edge Case

Scenario: A 96-well plate with 30 empty wells and one compound with 66 replicates.

What Happens:

  1. Criteria Set Construction:

    • Set 1: 30 empty wells
    • Set 2: 66 compound replicates
  2. Distance Criteria Flags:

    • use_min_dist_criteria[p,1] = false (30 wells is below the 50% threshold of 48, but the set is still treated as too large)
    • use_min_dist_criteria[p,2] = false (66 wells exceeds the 50% threshold of 48)
  3. Lex Constraint Application:

    For Set 1 (empty wells):
      (j = 1 -> use_min_dist_criteria[p,1]) evaluates to (true -> false) = false
      → Empty wells EXCLUDED from lex
    
    For Set 2 (compound):
      No conditional check
      → Compound replicates INCLUDED in lex
    
  4. Result:

    • Empty wells: Get default coordinates (num_rows_line, num_cols_line) via set_default_empty_well_coords
    • Compound: Subject to lex ordering even though not in distance objective

Why This Design Is Correct:

  • The 30 empty wells being in one corner doesn't affect experimental outcomes
  • The 66 compound replicates having a consistent ordering helps the solver converge faster
  • Without lex on compounds, solver might spend time exploring symmetric permutations

Alternative Design (Not Used): Apply lex to both sets unconditionally → Would over-constrain the model, likely causing timeout on large instances.
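
The asymmetry comes entirely from the implication in step 3. A tiny Python rendering of that guard, assuming set index 1 denotes the empty wells as in this scenario (`included_in_lex` is an illustrative name):

```python
def included_in_lex(set_index, use_min_dist_criteria):
    """Mirror of (j = 1 -> use_min_dist_criteria[p,j]) for a single plate."""
    is_empty_set = set_index == 1
    # An implication A -> B is equivalent to (not A) or B.
    return (not is_empty_set) or use_min_dist_criteria[set_index]

flags = {1: False, 2: False}      # both sets excluded from the distance objective
print(included_in_lex(1, flags))  # False: empty wells skipped
print(included_in_lex(2, flags))  # True: compound still lex-ordered
```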


8. Randomization Non-Reproducibility

The Issue: As of v1.3.3, random_permutation() uses MiniZinc's cauchy(0,2) function, which has no seed parameter.

Concrete Example:

Data File: experiment_A.dzn

  • Compound "DrugX" with 3 concentrations: [1nM, 10nM, 100nM]
  • Each concentration: 8 replicates

Run 1 (Monday):

Concentration order after random_permutation: [10nM, 1nM, 100nM]
Quadrant assignment: 10nM mostly in Q-I, 1nM in Q-II, 100nM in Q-III/IV

Run 2 (Tuesday, identical data):

Concentration order after random_permutation: [100nM, 10nM, 1nM]  
Quadrant assignment: 100nM mostly in Q-I, 10nM in Q-II, 1nM in Q-III/IV

Impact:

Good: Prevents systematic bias (if 1nM were always in Q-I, any Q-I-specific effects would confound low-dose results)

Bad:

  • Cannot reproduce exact layout from git commit + data file
  • Debugging is harder ("Why did this fail on CI but not locally?")
  • Regression testing cannot compare exact well positions

Workarounds:

  1. For reproducibility: Save the .ozn (MiniZinc output) file after solving, which captures the actual permutation used
  2. For testing: Use fixed test data with concentrations designed to be distinct regardless of permutation
  3. Future enhancement: Add opt int: random_seed parameter and implement deterministic PRNG
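
As an illustration of what workaround 3 is after, the sketch below shows a seeded permutation in Python. It is not how the model works today: it assumes the shuffle were moved into a preprocessing step, and `seeded_permutation` and its `seed` argument are hypothetical.

```python
import random

def seeded_permutation(n, seed):
    """Permutation of 1..n that depends only on n and seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    perm = list(range(1, n + 1))
    rng.shuffle(perm)
    return perm

first = seeded_permutation(12, seed=2024)
second = seeded_permutation(12, seed=2024)
print(first == second)  # True: same seed, same permutation
```

Storing the seed next to the data file would then be enough to regenerate a layout, while changing the seed keeps the bias-avoiding shuffle.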

When Randomization Matters Most:

  • Replicates divisible by 4: all replicates of one concentration can end up in the same quadrant (see example 13); randomization does not split them, but it changes which quadrant they land in from run to run
  • Example: 12 replicates of 100nM → without shuffle, all 12 might go to Q-I on every run (bad for plate effect control)

9. Single Row/Column Plate Behavior

Scenario: 1×96 well plate (single row)

Automatic Adjustments:

use_quadrant_distribution = false  (because num_rows_line = 1)

Effects:

  1. All wells assigned to quadrant 1
  2. Row-based optional constraints disabled (can't have "different rows" with 1 row)
  3. Column-based constraints still active
  4. Simplified constraint posting (no quadrant iteration)

Output: Layout essentially becomes a 1D optimization problem (column distribution only).
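
A Python sketch of the fallback, under stated assumptions: the notes only give the single-row condition, so disabling quadrants for a single column as well is a guess, and the quadrant numbering 1-4 follows the Q-I to Q-IV geometry of example 1. Both function names are illustrative.

```python
def use_quadrant_distribution(num_rows_line, num_cols_line):
    """Quadrants need at least two rows (assumed: and at least two columns)."""
    return num_rows_line > 1 and num_cols_line > 1

def quadrant_of(row, col, num_rows_line, num_cols_line):
    """Quadrant index of a well; everything is quadrant 1 when disabled."""
    if not use_quadrant_distribution(num_rows_line, num_cols_line):
        return 1
    top = row <= num_rows_line // 2
    left = col <= (num_cols_line + 1) // 2
    if left:
        return 1 if top else 2  # Q-I, Q-II
    return 3 if top else 4      # Q-III, Q-IV

print(quadrant_of(1, 50, 1, 96))  # 1: single-row plate
print(quadrant_of(3, 4, 5, 5))    # 4: bottom-right of a 5x5 grid
```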


10. Criteria Set with All Singletons

Scenario: Plate with empty wells, 4 control replicates, and 4 compounds, each with 1 replicate (no concentrations)

Criteria Sets:

  • Set 1: Empty wells
  • Set 2: All 4 control replicates (if < 50% threshold)
  • Sets 3-6: Each compound separately (since each has only 1 concentration)

Distance Optimization:

  • Sets with num_criteria_set_line[p,s] = 1 are skipped (can't optimize distance for 1 well)
  • where num_criteria_set_line[p,s] > 1 filter in constraint

Result: Only empty wells and controls (if > 1 replicate) are optimized for distance.
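
The `where` filter amounts to dropping every set of size 1 before the distance terms are built. A minimal Python equivalent, using a hypothetical count of 8 empty wells since the scenario does not give one:

```python
def sets_optimized_for_distance(set_sizes):
    """Indices of criteria sets with more than one well."""
    return [s for s, n in sorted(set_sizes.items()) if n > 1]

# Set 1: empty wells (8 is a made-up count), set 2: controls, sets 3-6: singletons.
sizes = {1: 8, 2: 4, 3: 1, 4: 1, 5: 1, 6: 1}
print(sets_optimized_for_distance(sizes))  # [1, 2]
```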


11. replicates_on_same_plate Allocation Failure

Scenario: 3 plates, 3 compounds with [40, 40, 40] replicates each on a 50-well plate

Available space per plate after compounds: [10, 10, 10]

Attempting to place 30 total control replicates:

Current allocation strategy:

Plate 1: 10 controls
Plate 2: 10 controls
Plate 3: 10 controls

✓ Works fine (evenly distributed)

Now add 15 empty wells (total demand 30 + 15 = 45 wells, but only 30 are available):

Attempted allocation:

floor(15 × 10 / 30) = 5 empty wells per plate
Plate 1: 5 empty + 10 controls = 15 > 10 available → ERROR

Error Message:

Model ERROR! Material distribution failed with replicates_on_same_plate strategy.
The available space distribution is too imbalanced...
Diagnostic Information:
  Number of plates: 3
  Available space per plate: [10, 10, 10]
  Materials to distribute: (empty: 15, controls: 30)

Solutions:

  1. Use replicates_on_different_plates = true instead
  2. Reduce number of control replicates
  3. Use Python preprocessing script for optimal allocation
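
The arithmetic behind the failure can be replayed in Python. This sketch only mirrors the proportional formula shown in the example, floor(total × available / total available); the model's actual allocation code is not reproduced in these notes, and both function names are hypothetical.

```python
def proportional_share(total, available):
    """floor(total * available[p] / sum(available)) for each plate."""
    space = sum(available)
    return [total * a // space for a in available]

def check_allocation(materials, available):
    """Per-plate demand and the 1-based plates where it exceeds the space."""
    demand = [0] * len(available)
    for total in materials.values():
        for p, share in enumerate(proportional_share(total, available)):
            demand[p] += share
    overfull = [p + 1 for p, d in enumerate(demand) if d > available[p]]
    return demand, overfull

demand, overfull = check_allocation({"empty": 15, "controls": 30}, [10, 10, 10])
print(demand)    # [15, 15, 15]
print(overfull)  # [1, 2, 3]
```

Running the same check with the controls alone returns an empty `overfull` list, matching the case that works.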

12. Corner Wells with Small Plates

Scenario: 4×6 plate with size_corner_empty_wells = 2

Corner wells to exclude:

  • Top-left: [1,1], [1,2], [2,1], [2,2]
  • Top-right: [1,5], [1,6], [2,5], [2,6]
  • Bottom-left: [3,1], [3,2], [4,1], [4,2]
  • Bottom-right: [3,5], [3,6], [4,5], [4,6]

Total excluded: 16 wells

Remaining: 24 - 16 = 8 wells (only!)

Assertion check:

constraint assert(2 * num_corner_empty_wells <= max(num_rows_line, num_cols_line), ...);

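With the check as written, num_corner_empty_wells = 3 still passes on this plate (2 × 3 = 6 ≤ max(4, 6) = 6) although the corner blocks would overlap across the 4 rows; the assertion first fails at 4 (2 × 4 = 8 > 6).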
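
The excluded wells can be enumerated directly. The Python sketch below assumes each corner is a square block of the given size, as in the list above; collecting the wells in a set means overlapping corners are counted once (`corner_wells` is an illustrative name).

```python
def corner_wells(num_rows, num_cols, size):
    """Set of (row, col) wells covered by the four size x size corner blocks."""
    wells = set()
    for r in range(1, size + 1):
        for c in range(1, size + 1):
            wells.add((r, c))                                # top-left
            wells.add((r, num_cols - c + 1))                 # top-right
            wells.add((num_rows - r + 1, c))                 # bottom-left
            wells.add((num_rows - r + 1, num_cols - c + 1))  # bottom-right
    return wells

excluded = corner_wells(4, 6, 2)
print(len(excluded), 4 * 6 - len(excluded))  # 16 8
print(len(corner_wells(4, 6, 3)))            # 24: nothing is left
```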


13. Concentration Shuffling Edge Case (v1.3.3)

Scenario: Drug A with 4 concentrations [0.1, 1, 10, 100], each with 4 replicates

Old behavior (v1.3.2):

Concentration list: [0.1, 0.1, 0.1, 0.1, 1, 1, 1, 1, 10, 10, 10, 10, 100, 100, 100, 100]
Quadrant allocation:
  Q-I: [0.1, 0.1, 0.1, 0.1]
  Q-II: [1, 1, 1, 1]
  Q-III: [10, 10, 10, 10]
  Q-IV: [100, 100, 100, 100]

Each concentration in exactly one quadrant (undesirable).

New behavior (v1.3.3):

Random permutation: [3, 1, 4, 2]  (shuffle concentration order)
Concentration list: [10, 10, 10, 10, 0.1, 0.1, 0.1, 0.1, 100, 100, 100, 100, 1, 1, 1, 1]
Quadrant allocation:
  Q-I: [10, 10, 10, 10]
  Q-II: [0.1, 0.1, 0.1, 0.1]
  Q-III: [100, 100, 100, 100]
  Q-IV: [1, 1, 1, 1]

Still one concentration per quadrant, but which concentration goes where changes each run.

Why this helps: If there's a systematic Q-I bias (e.g., temperature), it doesn't always affect the lowest/highest concentration.
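
The new behavior can be reproduced with a few lines of Python. The sketch assumes, as the example shows, that the shuffled concentration list is cut into consecutive equal chunks, one per quadrant; `shuffled_quadrants` is a hypothetical helper.

```python
def shuffled_quadrants(concentrations, replicates, perm, num_quadrants=4):
    """Reorder concentrations by perm, expand replicates, chunk into quadrants."""
    ordered = [concentrations[i - 1] for i in perm]  # perm is 1-based
    flat = [c for c in ordered for _ in range(replicates)]
    size = len(flat) // num_quadrants
    return [flat[q * size:(q + 1) * size] for q in range(num_quadrants)]

for label, wells in zip(["Q-I", "Q-II", "Q-III", "Q-IV"],
                        shuffled_quadrants([0.1, 1, 10, 100], 4, [3, 1, 4, 2])):
    print(label, wells)
# Q-I [10, 10, 10, 10]
# Q-II [0.1, 0.1, 0.1, 0.1]
# Q-III [100, 100, 100, 100]
# Q-IV [1, 1, 1, 1]
```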


14. Empty Criteria Set Quadrants

Scenario: Drug X with 7 replicates on a 2×6 grid (12 wells, 4 quadrants of 3 wells each)

Quadrant allocation:

  • Q-I: 2 replicates
  • Q-II: 2 replicates
  • Q-III: 2 replicates
  • Q-IV: 1 replicate

Per-quadrant lex constraint:

where quadrant_criteria_counts[p,i,j] >= 2
  • Q-I, Q-II, Q-III: lex applied ✓
  • Q-IV: skipped (only 1 well, no symmetry to break)

Result: No error, constraint simply not posted for Q-IV.
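
The 2/2/2/1 split and the guard can be replayed in Python. The notes do not spell out how the 7 replicates are dealt out here, so plain round-robin over the quadrants is an assumption that happens to reproduce the counts; both function names are illustrative.

```python
def round_robin_counts(num_replicates, capacities):
    """Deal replicates to quadrants one at a time, skipping full quadrants."""
    if num_replicates > sum(capacities):
        raise ValueError("not enough wells for all replicates")
    counts = [0] * len(capacities)
    q = 0
    for _ in range(num_replicates):
        while counts[q] >= capacities[q]:  # skip quadrants that are full
            q = (q + 1) % len(capacities)
        counts[q] += 1
        q = (q + 1) % len(capacities)
    return counts

def quadrants_with_lex(counts):
    """1-based quadrants where a lex constraint is posted (count >= 2)."""
    return [q + 1 for q, n in enumerate(counts) if n >= 2]

counts = round_robin_counts(7, [3, 3, 3, 3])
print(counts)                      # [2, 2, 2, 1]
print(quadrants_with_lex(counts))  # [1, 2, 3]
```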


Summary of Design Philosophy

  1. Fail gracefully: Use assertions with detailed error messages
  2. Skip when inapplicable: Use where guards instead of reified constraints
  3. Prioritize coverage over uniformity: Lower bounds critical, upper bounds loose
  4. Balance performance vs. quality: Thresholds tuned for plate size
  5. Embrace controlled randomness: Prevents systematic bias (v1.3.3+)

Testing Recommendations

When modifying the model, test these edge cases:

  • ✅ Single row/column plates
  • ✅ Criteria sets with 1 replicate
  • ✅ Replicates divisible by 4 (concentration shuffling)
  • ✅ Empty wells exceeding 50% threshold
  • ✅ All-plates constraints with uneven distribution
  • ✅ Corner wells on small plates
  • ✅ replicates_on_same_plate with limited space