Skip to content

Latest commit

 

History

History
388 lines (277 loc) · 13.3 KB

File metadata and controls

388 lines (277 loc) · 13.3 KB

COMPD Model Understanding v1.3.3

Generated by Perplexity AI, verified by Ramiz Gindullin

Overview

This document provides a comprehensive technical breakdown of the COMPD v1.3.3 constraint optimization model for microplate layout design.

Coordinate System Architecture

Dual Coordinate System

COMPD uses two distinct coordinate systems:

  1. Inner Coordinates (used in all constraints)

    • Ranges: 1..num_rows_line × 1..num_cols_line
    • Excludes: outer edge wells (size_empty_edge) and corner wells (num_corner_empty_wells)
    • Purpose: ALL constraint logic operates in this space
  2. Full Plate Coordinates (used only in output)

    • Ranges: 1..num_rows × 1..num_cols
    • Includes: all wells (edges and corners)
    • Conversion: calculate_line_index() maps inner → full coordinates

Critical Rule: When modifying constraints, always work in inner coordinates. Only the output section uses full plate coordinates.


Quadrant System

Distribution Strategy

Wells are assigned to 4 quadrants (Q-I, Q-II, Q-III, Q-IV) using deterministic round-robin allocation followed by sequential bin packing.

Example (5×5 grid, 4 materials with quantities [10, 6, 4, 5]):

Step 1: List all replicates in order
1 1 1 1 1 1 1 1 1 1 | 2 2 2 2 2 2 | 3 3 3 3 | 4 4 4 4 4

Step 2: Apply mod-4 labeling
1 1 1 1 1 1 1 1 1 1 | 2 2 2 2 2 2 | 3 3 3 3 | 4 4 4 4 4
i ii iii iv i ii iii iv i ii | iii iv i ii iii iv | i ii iii iv | i ii iii iv i

Step 3: Group by label
i:   1 1 1 2 3 4 4
ii:  1 1 1 2 3 4
iii: 1 1 2 2 3 4
iv:  1 1 2 2 3 4

Step 4: Assign to quadrants (first 9 → Q-I, next 6 → Q-II, etc.)
Q-I:   Material 1 (5), Material 2 (1), Material 3 (1), Material 4 (2)
Q-II:  Material 1 (3), Material 2 (1), Material 3 (1), Material 4 (1)
Q-III: Material 1 (2), Material 2 (2), Material 3 (1), Material 4 (1)
Q-IV:  Material 1 (0), Material 2 (2), Material 3 (1), Material 4 (1)

Purpose: Ensures balanced distribution without reified constraints.

Quadrant Rotation Across Plates

array[1..num_plates_lines] of int: quadrant_shift_line = [p - 1 | p in 1..num_plates_lines];

Effect:

  • Plate 1: Q-I receives materials first
  • Plate 2: Q-II receives materials first
  • Plate 3: Q-III receives materials first
  • Plate 4: Q-IV receives materials first (then wraps)

Purpose: Prevents systematic bias where the same concentration always appears in the same quadrant across plates. This distributes quadrant-specific plate effects (temperature gradients, pipetting errors) more evenly across experimental conditions.


Criteria Sets

Purpose

Criteria sets group materials that should be spatially distributed (maximize min-distance). Each criteria set undergoes independent distance optimization.

Construction Logic

Per plate p:

  1. Set 1: Always empty wells (if present)

  2. Control wells:

    • If total controls ≤ 50% of wells → one set for all controls
    • If total controls > 50% of wells → separate sets per control type
  3. Compound wells:

    • Per compound, count total replicates across all concentrations
    • If compound total ≤ 50% of wells → one set for all concentrations
    • If compound total > 50% of wells → separate sets per concentration

Distance Optimization Flags

use_min_dist_criteria[p,s]: Apply distance constraints to criteria set s on plate p?

use_min_dist_criteria[p,s] = 
    (num_criteria_set_line[p,s] > 1) AND
    (num_criteria_set_line[p,s] <= min[num_wells_set_criteria, threshold × num_total_wells_line])

Rationale for upper threshold:

  • With many replicates (e.g., 66 on a 96-well plate), plate effects dominate spatial effects
  • Distance optimization becomes computationally expensive without adding scientific value
  • Such large sets are excluded from distance criteria but still subject to lex ordering

use_fake_edge_wells[p,s]: Add fake edge wells for distance calculation?

use_fake_edge_wells[p,s] = 
    (num_criteria_set_line[p,s] > 0) AND
    (category != 'ctrs') AND  
    (num_criteria_set_line[p,s] ≤ 6)

Purpose: For sparse replicates, fake edge wells prevent corner clustering by penalizing distance to plate edges.


Symmetry Breaking

Dual lex_chain_greater System

COMPD uses two separate lex constraints with different scopes:

1. Per-Quadrant Lex (Lines ~587-605)

constraint forall(p in 1..num_plates_lines, i in 1..4, j in 1..num_criteria_sets
                  where quadrant_criteria_counts[p,i,j] >= 2)
  (lex_chain_greater([...]));
  • Scope: Wells of criteria set j in quadrant i of plate p
  • Purpose: Breaks symmetry within each quadrant
  • New in v1.3.3: Randomized ordering via random_permutation() prevents concentration clustering in rows

2. Per-Concentration Lex (Lines ~610-622)

constraint forall(p in 1..num_plates_lines, i in 1..len_emptywells_controls_compounds
                  where use_optional_constraints[p,i])
  (forall(a in 1..4 where quadrant_on_different_rows_columns[p,i,a] >= 2)
    (lex_chain_greater([...])));
  • Scope: Wells of material i in quadrant a (when optional constraints apply)
  • Purpose: Breaks symmetry when alldifferent/GCC constraints are active
  • Stronger constraint: Applied per-concentration, not per criteria set

Why both?

  • First handles materials without optional constraints (broader groups)
  • Second handles materials with optional constraints (finer granularity)

Empty Wells Asymmetry in Lex Constraint

The Issue: The first lex_chain_greater constraint treats empty wells (j=1) differently from controls and compounds (j>1).

Implementation:

where (emptywells_controls_compounds_indices_ordered[p,k] != 1 \/ 
       (j = 1 -> use_min_dist_criteria[p,j]))

Behavior:

  • Empty wells (j=1): Included in lex only if use_min_dist_criteria[p,1] = true
  • Controls/compounds (j>1): Always included in lex, regardless of distance criteria flag

Design Rationale:

  1. Scientific Interchangeability: Empty wells are scientifically identical—their exact positions don't matter for experimental interpretation. Controls and compounds are NOT interchangeable.

  2. Computational Efficiency: With many empty wells (e.g., 20+ on a 96-well plate), applying lex ordering to all of them over-constrains the solver without adding scientific value.

  3. Solver Guidance: Controls and compounds benefit from lex symmetry breaking even when excluded from distance optimization, as it guides the solver toward more interpretable layouts.

When This Matters:

  • If a plate has 25 empty wells and use_min_dist_criteria[p,1] = false (too many to optimize distance), those empty wells are excluded from both distance optimization and lex ordering
  • If a compound has 25 replicates and is excluded from distance criteria, it's still subject to lex ordering

Result: Balances computational tractability with experimental interpretability.


Optional Constraints

Helper Function Architecture (v1.2.9+)

function bool: well_excluded_by_optional_constraints(int: quadrant, int: plate, int: material)

Returns true if well should be excluded from lex constraint due to optional constraints being active.

Logic:

(quadrant in {1,3} AND row constraints active) OR
(quadrant in {2,4} AND row constraints active) OR
(quadrant in {1,2} AND column constraints active) OR
(quadrant in {3,4} AND column constraints active)

alldifferent vs. GCC Split (v1.2+)

For each half (top/bottom/left/right):

  • If wells_in_half ≤ rows_or_cols_in_half: Apply alldifferent
  • If wells_in_half > rows_or_cols_in_half: Apply global_cardinality_constraint (GCC)

GCC constraints:

global_cardinality(coordinates, values, [1, 1, ..., 1], [upper_bound, ...])
  • Lower bound: 1 for all rows/columns (ensures coverage)
  • Upper bound: num_cols_line (single plate) or num_cols_line × num_plates (all-plates variant)

GCC Upper Bound Design for All-Plates Constraints

Single-Plate GCC:

upper_bound = num_cols_line  (for row constraints)
upper_bound = num_rows_line  (for column constraints)

Each row can have at most num_cols wells—a tight, physically meaningful bound.

All-Plates GCC:

upper_bound = num_cols_line * num_plates_lines

Why This Is Loose But Correct:

The upper bound allows, in theory, all wells to cluster on plates 1-2 with none on plates 3-10. However:

  1. The Lower Bound Does the Work: The critical constraint is lower_bound = 1, ensuring every row/column position is covered at least once across all plates.

  2. Practical Constraints Dominate: The plate distribution strategy, quadrant allocation, and distance optimization already prevent extreme clustering.

  3. Tighter Bounds Are Complex: A per-plate upper bound would require tracking which plates have which materials, adding significant model complexity.

  4. Scientific Validity: For multi-plate experiments, ensuring coverage is more important than enforcing uniform density across plates.

Conclusion: The loose upper bound is a deliberate simplification that doesn't compromise experimental validity.


Optimization Criteria

Objective Function (Equation 16)

maximize: M * min(D) + |D|

Where:

  • M = 1000 (large constant)
  • D = list of all pairwise distances within active criteria sets
  • |D| = cardinality (number of distances)

Two-stage lexicographic optimization:

  1. Primary: Maximize minimum distance across all criteria sets
  2. Tie-breaker: Maximize total number of distance pairs (favors larger active sets)

Distance Calculation (v1.3.0+)

Old approach (≤v1.2.9):

criteria_set_min_distance[p,s] = min([distances in set])

New approach (v1.3.0+):

forall(d in distances_in_set) (criteria_set_min_distance[p,s] <= d)

Benefits:

  • Reduces memory consumption by 5-10%
  • Improves constraint propagation
  • Slightly faster solve times in some cases

Randomization in v1.3.3

Non-Reproducibility Issue

Function:

function array[int] of int: random_permutation(int: n) = arg_sort([cauchy(0,2) | i in 1..n]);

Problem: MiniZinc's cauchy() has no seed parameter → same data produces different layouts each run.

Where used:

  1. Concentration ordering in emptywells_controls_compounds_names_concentrations
  2. Lex ordering in symmetry-breaking constraints (when not under optional constraints)

Impact:

  • Good: Prevents systematic bias (e.g., 1nM always in Q-I)
  • Bad: Cannot reproduce exact layout from git commit + data file

Workaround: Save the .ozn file after solving to capture the actual permutation used.


Version 1.3.3 Changes Summary

  1. Randomized lex ordering: Prevents row/column clustering when optional constraints inactive
  2. Shuffled concentration lists: Prevents quadrant clustering when replicates divisible by 4
  3. Both changes: No performance impact, but layouts are non-reproducible

Common Edge Cases

Single Row/Column Plates

bool: use_quadrant_distribution = (num_rows_line > 1) /\ (num_cols_line > 1);

When false:

  • All wells assigned to quadrant 1
  • Optional row/column constraints disabled
  • Simplified constraint posting

Empty Criteria Sets

Constraints automatically skip when:

where quadrant_criteria_counts[p,i,j] >= 2
where num_criteria_set_line[p,s] > 1

Prevents posting unnecessary constraints for empty/singleton sets.


Performance Tuning Parameters

int: num_wells_set_criteria = if num_total_wells_line < 300 then 90 else 120 endif;
float: num_threshold_min_dist_criteria = if num_total_wells_line < 300 then 0.35 else 0.25 endif;

Rationale:

  • Larger plates (≥300 wells): More permissive thresholds (fewer criteria sets in optimization)
  • Smaller plates (<300 wells): Stricter thresholds (more criteria sets optimized)
  • Balance: Solution quality vs. solve time

Model Architecture Summary

Input Data
   ↓
Plate Distribution (replicates_on_same_plate / different_plates / randomized)
   ↓
Quadrant Allocation (deterministic round-robin + bin packing)
   ↓
Criteria Set Construction (based on 50% threshold)
   ↓
Constraint Posting:
   ├─ LEXALLDIFFERENT (no well overlap)
   ├─ lex_chain_greater (symmetry breaking, dual system)
   ├─ alldifferent/GCC (optional constraints)
   └─ Domain restrictions (quadrant-based)
   ↓
Optimization (maximize min-distance, two-stage)
   ↓
Output (convert inner → full coordinates)

Key Takeaways for Model Modification

  1. Always use inner coordinates in constraints
  2. Understand the dual lex system before adding symmetry-breaking constraints
  3. Respect the quadrant allocation (deterministic, don't add reified constraints)
  4. Use helper functions (well_excluded_by_optional_constraints, etc.) for readability
  5. Test with single-row/column edge cases (use_quadrant_distribution = false)
  6. Be aware of non-reproducibility in v1.3.3+ due to random_permutation()

Further Reading

  • Original paper: See README.md for algorithm derivation and evaluation
  • Version history: See VERSION_HISTORY.md for detailed changelogs
  • Design notes: See COMPD_Design_Notes_with_Examples.md for edge case examples