Phase B performs physics validation checks to ensure model physics consistency and provides reasonable automatic corrections. This comprehensive guide covers all aspects of Phase B operation.
- Architecture and Design
- Technical Implementation
- Scientific Validation Categories
- CRU TS4.06 Climatological Integration
- Scientific Corrections and Adjustments
- Processing Modes and Behaviour
- Output Files Structure
- Error Handling and Edge Cases
- Integration with Other Phases
- Testing and Validation
- Best Practices
- Troubleshooting
Phase B implements a multi-layered scientific validation system that:
- Validates Physics Parameters: Ensures required model physics option parameters are present and non-empty
- Checks Model Dependencies: Validates internal consistency between physics options
- Validates Land Cover: Checks surface fraction totals and parameter consistency
- Validates Geographic Parameters: Ensures coordinates and location-dependent parameters are realistic
- Validates Irrigation Parameters: Checks irrigation timing for consistency and climatological appropriateness
- Applies CRU Integration: Uses CRU TS4.06 climatological data for temperature initialisation
- Makes Scientific Corrections: Automatic adjustments that improve model realism
- validate_phase_b_inputs(): Input file validation and loading
- extract_simulation_parameters(): Extract and validate simulation parameters with comprehensive error collection
- validate_physics_parameters(): Required physics parameter validation
- validate_model_option_dependencies(): Physics option consistency checking
- validate_land_cover_consistency(): Surface fraction and parameter validation
- validate_geographic_parameters(): Coordinate and location validation
- validate_irrigation_doy(): Irrigation timing validation with hemisphere and leap year awareness
- validate_irrigation_parameters(): Site-level irrigation parameter validation
- validate_dls_doy(): Daylight saving time parameter validation with leap year and hemisphere awareness
- get_mean_monthly_air_temperature(): CRU TS4.06 monthly climatological temperature lookup
- get_mean_annual_air_temperature(): CRU TS4.06 annual climatological temperature lookup (average of 12 months)
- run_scientific_adjustment_pipeline(): Intelligent automatic parameter adjustments
- run_science_check(): Main orchestration function for all validations
from dataclasses import dataclass
from typing import Any, Optional

from pydantic import BaseModel


@dataclass
class ValidationResult:
    """Structured result from scientific validation checks."""
    status: str  # 'PASS', 'WARNING', 'ERROR'
    category: str  # 'PHYSICS', 'GEOGRAPHY', 'SEASONAL', 'LAND_COVER', 'MODEL_OPTIONS'
    parameter: str
    site_index: Optional[int] = None
    message: str = ""
    suggested_value: Any = None
    applied_fix: bool = False


@dataclass
class ScientificAdjustment:
    """Record of automatic scientific adjustment applied."""
    parameter: str
    site_index: Optional[int] = None
    old_value: Any = None
    new_value: Any = None
    reason: str = ""


class DLSCheck(BaseModel):
    """Calculate daylight saving time transitions and timezone offset from coordinates."""
    lat: float
    lng: float
    year: int
    startdls: Optional[int] = None
    enddls: Optional[int] = None

Validates specific critical model physics parameters that are essential for model operation:
- Selected Required Parameters: Key physics options validated for presence and valid values
- Non-Empty Values: Critical parameters cannot be null, empty, or zero when required
- Dependency Validation: Validates interdependencies between physics options (e.g., rslmethod-stabilitymethod)
- Type Validation: Parameters must have correct data types
- Note: Currently focuses on essential parameters rather than comprehensive validation of all physics options
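As a rough illustration of the presence and non-empty checks listed above, the sketch below flags required options that are absent, null, or empty. The list of required names and the unwrap helper are hypothetical stand-ins, not the lists or helpers Phase B actually uses, and zero is treated as a legitimate option value here.

```python
from typing import Any, Dict, List

# Hypothetical subset of required options, for illustration only.
REQUIRED_PHYSICS = ["netradiationmethod", "emissionsmethod", "storageheatmethod"]


def unwrap(entry: Any) -> Any:
    """Return entry['value'] for {'value': x} wrappers, else the entry itself."""
    if isinstance(entry, dict) and "value" in entry:
        return entry["value"]
    return entry


def find_missing_physics(physics: Dict[str, Any],
                         required: List[str] = REQUIRED_PHYSICS) -> List[str]:
    """List required options that are absent, null, or empty strings."""
    missing = []
    for name in required:
        value = unwrap(physics.get(name))
        if value is None or value == "":
            missing.append(name)
    return missing


physics = {"netradiationmethod": {"value": 3}, "emissionsmethod": None}
print(find_missing_physics(physics))
```

Each name returned would map to one ERROR result in the PHYSICS category.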
Validates internal consistency between different physics options using actual implemented dependency rules:
def validate_model_option_dependencies(yaml_data: dict) -> List[ValidationResult]:
    """Validate internal consistency between model physics options."""
    results = []
    physics = yaml_data.get("model", {}).get("physics", {})

    # Check rslmethod-stabilitymethod constraints
    rslmethod = get_value_safe(physics, "rslmethod")
    stabilitymethod = get_value_safe(physics, "stabilitymethod")
    storageheatmethod = get_value_safe(physics, "storageheatmethod")
    ohmincqf = get_value_safe(physics, "ohmincqf")

    # Constraint: If rslmethod == 2, stabilitymethod must be 3
    if rslmethod == 2 and stabilitymethod != 3:
        results.append(ValidationResult(
            status="ERROR",
            category="MODEL_OPTIONS",
            parameter="rslmethod-stabilitymethod",
            message="If rslmethod == 2, stabilitymethod must be 3",
            suggested_value="Set stabilitymethod to 3"
        ))

    # Constraint: StorageHeatMethod=1 (OHM_WITHOUT_QF) requires OhmIncQf=0
    if storageheatmethod == 1 and ohmincqf != 0:
        results.append(ValidationResult(
            status="ERROR",
            category="MODEL_OPTIONS",
            parameter="storageheatmethod-ohmincqf",
            message=f"StorageHeatMethod is set to {storageheatmethod} and OhmIncQf is set to {ohmincqf}. You should switch to OhmIncQf=0.",
            suggested_value="Set OhmIncQf to 0"
        ))

    return results

Comprehensive validation and adjustment of surface types and parameters:
- Surface Fraction Totals: Must sum to 1.0 for each site - automatically adjusted if needed
- Seasonal LAI Adjustments: Automatic LAI calculation for deciduous trees based on season (only when surface fraction > 0)
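The surface fraction adjustment can be pictured with the minimal sketch below, which absorbs a small rounding residual into the largest fraction and declines to touch larger discrepancies. The function name and the tolerance value are assumptions made for illustration; the threshold Phase B actually applies is not stated here.

```python
from typing import Dict, Optional


def normalise_fractions(sfr: Dict[str, float],
                        tol: float = 1e-4) -> Optional[Dict[str, float]]:
    """Return fractions summing to 1.0, or None if the gap exceeds tol."""
    total = sum(sfr.values())
    if abs(total - 1.0) > tol:
        return None  # too far from 1.0: left for the user to resolve
    largest = max(sfr, key=sfr.get)
    fixed = dict(sfr)
    # Absorb the residual in the surface with the largest fraction.
    fixed[largest] = sfr[largest] + (1.0 - total)
    return fixed


print(normalise_fractions({"paved": 0.5, "bldgs": 0.3, "grass": 0.20001}))
print(normalise_fractions({"paved": 0.6, "bldgs": 0.42}))  # None: sum is 1.02
```

Adjusting only the largest surface keeps the relative change to any single fraction as small as possible.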
Location-dependent parameter validation (actual implemented checks):
- Coordinate Validity: Latitude (-90 to 90°), longitude (-180 to 180°) with numeric type validation
- Timezone Parameter: Warns if missing, can be calculated automatically from coordinates
- Daylight Saving Parameters: Automatically calculated using accurate timezone data (DLSCheck class with pytz and timezonefinder)
  - If startdls and enddls are None/incorrect, Phase B calculates them automatically based on geographic coordinates
  - Note: Phase C (Pydantic) validates the DOY range [1, 366] for cases when users do not run Phase B or the complete pipeline.
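The coordinate checks described above amount to a presence test, a type test, and a range test. The sketch below is one way to express them; check_coordinates is a hypothetical helper, not the Phase B function, and it returns plain (status, message) pairs instead of ValidationResult objects to stay self-contained.

```python
from numbers import Real
from typing import Any, List, Tuple


def check_coordinates(lat: Any, lng: Any) -> List[Tuple[str, str]]:
    """Return (status, message) pairs; an empty list means both values are valid."""
    issues = []
    for name, value, limit in (("lat", lat, 90.0), ("lng", lng, 180.0)):
        if value is None:
            issues.append(("ERROR", f"{name} is missing"))
        elif isinstance(value, bool) or not isinstance(value, Real):
            issues.append(("ERROR", f"{name} must be numeric, got {type(value).__name__}"))
        elif not -limit <= value <= limit:
            issues.append(("ERROR", f"{name}={value} outside [-{limit:g}, {limit:g}]"))
    return issues


print(check_coordinates(51.5074, -0.1278))
print(check_coordinates(95, "abc"))
```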
Validates irrigation timing parameters (ie_start and ie_end) for consistency and climatological appropriateness:
- Day-of-Year Range: Must be 1-365 (non-leap) or 1-366 (leap year), or both 0/None to disable
- Consistency Check: Both parameters must be set together or both disabled
- Hemisphere-Aware Seasonal Check:
  - Northern Hemisphere (lat ≥ 23.5°): Warm season May-September (DOY 121-273)
  - Southern Hemisphere (lat ≤ -23.5°): Warm season November-March (DOY 305-90)
  - Tropical regions (|lat| < 23.5°): No seasonal restrictions
- Year-Wrapping Pattern Detection:
  - Northern Hemisphere: Warns if ie_start > ie_end (unusual cold-season irrigation)
  - Southern Hemisphere: Warns if ie_start < ie_end (should wrap for warm season, e.g., DOY 305→60)
  - Helps identify potentially swapped start/end values
- Error Handling: Invalid DOY generates ERROR, out-of-season generates WARNING, unusual year-wrapping patterns generate WARNING
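The irrigation rules above can be sketched as a single function. This is an illustrative reading of the listed rules, not the code behind validate_irrigation_doy(): the function names are hypothetical, the ERROR severity for a half-set pair is an assumption, and the warm-season bounds are used unchanged in leap years.

```python
import calendar
from typing import List, Optional, Tuple

Issue = Tuple[str, str]  # (status, message)


def in_warm_season(doy: int, lat: float) -> bool:
    """Warm-season windows quoted above; the tropics have no restriction."""
    if lat >= 23.5:
        return 121 <= doy <= 273
    if lat <= -23.5:
        return doy >= 305 or doy <= 90  # window wraps over the new year
    return True


def check_irrigation(ie_start: Optional[int], ie_end: Optional[int],
                     lat: float, year: int) -> List[Issue]:
    def disabled(value: Optional[int]) -> bool:
        return value is None or value == 0

    if disabled(ie_start) and disabled(ie_end):
        return []  # irrigation switched off
    if disabled(ie_start) or disabled(ie_end):
        # Severity is an assumption; the text only says both must be set together.
        return [("ERROR", "ie_start and ie_end must both be set or both disabled")]

    pairs = (("ie_start", ie_start), ("ie_end", ie_end))
    max_doy = 366 if calendar.isleap(year) else 365
    issues = [("ERROR", f"{name}={doy} outside 1-{max_doy}")
              for name, doy in pairs if not 1 <= doy <= max_doy]
    if issues:
        return issues

    issues += [("WARNING", f"{name}={doy} falls outside the warm season")
               for name, doy in pairs if not in_warm_season(doy, lat)]
    if lat >= 23.5 and ie_start > ie_end:
        issues.append(("WARNING", "ie_start > ie_end: cold-season irrigation or swapped values?"))
    if lat <= -23.5 and ie_start < ie_end:
        issues.append(("WARNING", "ie_start < ie_end: expected a year-wrapping warm season"))
    return issues


print(check_irrigation(150, 250, 51.5, 2021))
print(check_irrigation(60, 305, -33.9, 2021))
```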
Phase B integrates CRU TS4.06 monthly climatological data (1991-2020) for accurate temperature initialisation of surface types and STEBBS parameters:
Phase B provides two CRU-based temperature functions for different use cases:
def get_mean_monthly_air_temperature(
    lat: float,
    lon: float,
    month: int,
    spatial_res: float = 0.5
) -> float:
    """Calculate mean monthly air temperature using CRU TS4.06 data."""
    # Loads CRU Parquet data from package resources
    # Finds nearest grid cell within spatial resolution
    # Returns climatological mean temperature for specified month

Used for initialising parameters that vary with seasons:
- Surface temperatures (tsfc, tin, temperature arrays)
- STEBBS outdoor surface temperatures
- Initial state temperatures for all surface types
def get_mean_annual_air_temperature(
    lat: float,
    lon: float,
    spatial_res: float = 0.5
) -> float:
    """Calculate annual mean air temperature using CRU TS4.06 climate normals."""
    # Computes average of all 12 monthly climate normals (1991-2020)
    # Returns stable long-term average annual temperature
    # Suitable for parameters that do not vary rapidly with seasons

Used for initialising stable, non-seasonal parameters that require representative annual values rather than month-specific temperatures.
- Coverage: Global land areas at 0.5° resolution
- Period: 1991-2020 climatological normals
- Variables: Monthly mean air temperature
- Accuracy: Location-specific estimates within 0.5° spatial resolution
- Validation: Ensures coordinates are within CRU coverage area
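To make the lookup concrete, here is a minimal sketch of a nearest-grid-cell search and the monthly and annual means derived from it. It runs against a tiny in-memory table with made-up temperatures instead of the CRU Parquet file, and the names TOY_NORMALS, nearest_cell, monthly_mean and annual_mean are hypothetical; the real functions may select cells differently.

```python
from typing import Dict, List, Tuple

Cell = Tuple[float, float]  # (lat, lon) of a grid-cell centre

# Made-up monthly means in °C for two 0.5° cells; NOT real CRU TS4.06 values.
TOY_NORMALS: Dict[Cell, List[float]] = {
    (51.25, -0.25): [4.0, 4.5, 6.5, 9.0, 12.0, 15.0, 17.5, 17.0, 14.5, 11.0, 7.0, 4.5],
    (51.75, -0.25): [3.5, 4.0, 6.0, 8.5, 11.5, 14.5, 17.0, 16.5, 14.0, 10.5, 6.5, 4.0],
}


def nearest_cell(lat: float, lon: float, spatial_res: float = 0.5,
                 table: Dict[Cell, List[float]] = TOY_NORMALS) -> Cell:
    """Pick the closest cell and reject it if it lies beyond spatial_res."""
    best = min(table, key=lambda c: (c[0] - lat) ** 2 + (c[1] - lon) ** 2)
    if abs(best[0] - lat) > spatial_res or abs(best[1] - lon) > spatial_res:
        raise ValueError("No CRU data found within spatial resolution")
    return best


def monthly_mean(lat: float, lon: float, month: int, spatial_res: float = 0.5,
                 table: Dict[Cell, List[float]] = TOY_NORMALS) -> float:
    if not 1 <= month <= 12:
        raise ValueError(f"Month must be between 1 and 12, got {month}")
    return table[nearest_cell(lat, lon, spatial_res, table)][month - 1]


def annual_mean(lat: float, lon: float, spatial_res: float = 0.5,
                table: Dict[Cell, List[float]] = TOY_NORMALS) -> float:
    """Average of the 12 monthly normals of the nearest cell."""
    values = table[nearest_cell(lat, lon, spatial_res, table)]
    return sum(values) / len(values)


print(monthly_mean(51.5074, -0.1278, 1))
print(round(annual_mean(51.5074, -0.1278), 2))
```

A point over open ocean finds no cell within the tolerance and raises ValueError, which mirrors the coverage check listed above.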
# Before Phase B processing
sites:
- properties:
    initial_states:
      paved:
        tsfc:
          value: null  # Uninitialised surface temperature
        temperature:
          value: null  # Uninitialised 5-layer temperatures

# After Phase B processing with CRU integration
sites:
- properties:
    initial_states:
      paved:
        tsfc:
          value: 15.8  # CRU-derived temperature for January at coordinates
        temperature:
          value: [15.8, 15.8, 15.8, 15.8, 15.8]  # 5-layer temperatures

Phase B makes scientific adjustments that improve model realism without changing user intent:
- CRU Integration: Initialises temperatures using climatological data
- Month-Aware: Uses correct month from simulation start date
- Coordinate-Based: Location-specific temperature from CRU grid
- Fraction Normalisation: Adjusts surface fractions to sum to 1.0 by rounding the surface with maximum fraction value
- Seasonal LAI Adjustments: Calculates LAI for deciduous trees based on seasonal parameters (laimin, laimax) when surface fraction > 0. When surface fraction is 0, existing lai_id values are preserved and validation is skipped with a warning
- Season-Aware alb_id: Updates initial_states.*.alb_id for grass, dectr, evetr
  - Summer Regime: alb_id(grass) = alb_min(grass); alb_id(dectr/evetr) = alb_max(dectr/evetr)
  - Winter Regime: alb_id(grass) = alb_max(grass); alb_id(dectr/evetr) = alb_min(dectr/evetr)
  - Transition Seasons: alb_id set to midpoint (alb_min + alb_max)/2 for grass, dectr, evetr
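The albedo rules above reduce to a small lookup. The sketch below takes the season as a ready-made label, because how Phase B derives the season from date and latitude is not described here; seasonal_alb_id and the label strings are hypothetical.

```python
def seasonal_alb_id(surface: str, season: str, alb_min: float, alb_max: float) -> float:
    """Pick alb_id for a vegetated surface following the regimes listed above."""
    if surface not in ("grass", "dectr", "evetr"):
        raise ValueError(f"alb_id adjustment does not apply to {surface!r}")
    if season == "summer":
        return alb_min if surface == "grass" else alb_max
    if season == "winter":
        return alb_max if surface == "grass" else alb_min
    # Any other label is treated as a transition season: use the midpoint.
    return (alb_min + alb_max) / 2


print(seasonal_alb_id("grass", "summer", 0.16, 0.20))
print(seasonal_alb_id("dectr", "summer", 0.12, 0.18))
```

Grass and trees move in opposite directions, which is why the surface name is checked before the season.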
- Conditional Logic: When stebbsmethod == 0, nullifies STEBBS parameters
- Parameter Cleanup: Removes unused STEBBS parameters for clarity
- Consistency: Ensures STEBBS configuration matches selected method
- Temperature Initialisation: When stebbsmethod == 1, automatically updates InitialOutdoorTemperature using CRU climatological data
- CRU-Based Updates: Uses location-specific mean monthly air temperature from CRU TS4.06 dataset
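A minimal sketch of the STEBBS branch follows, assuming each STEBBS entry is stored as a {'value': ...} mapping like the other parameters shown in this guide. adjust_stebbs and ExampleParam are hypothetical names, and the real cleanup may cover more fields than this does.

```python
import copy
from typing import Any, Dict


def adjust_stebbs(stebbs: Dict[str, Any], stebbsmethod: int,
                  cru_temp: float) -> Dict[str, Any]:
    """Return an adjusted copy of the STEBBS section; the input is left untouched."""
    out = copy.deepcopy(stebbs)
    if stebbsmethod == 0:
        # STEBBS is switched off: null every wrapped value.
        for entry in out.values():
            if isinstance(entry, dict) and "value" in entry:
                entry["value"] = None
    elif stebbsmethod == 1:
        # STEBBS is active: seed the outdoor temperature from climatology.
        out["InitialOutdoorTemperature"] = {"value": cru_temp}
    return out


stebbs = {"InitialOutdoorTemperature": {"value": 20.0}, "ExampleParam": {"value": 1.5}}
print(adjust_stebbs(stebbs, 1, 12.4))
print(adjust_stebbs(stebbs, 0, 12.4))
```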
Phase B includes enhanced validation logic with improved parameter handling:
- Improved get_value_safe Function: Better handling of nested parameter extraction
- Reduced False Positives: More accurate validation with safer parameter access
- Enhanced Error Handling: Better detection of actual configuration issues
- Automatic DLS Calculation: Computes daylight saving start/end days from coordinates
- Timezone Integration: Uses timezonefinder and pytz libraries for accurate calculations
Phase B uses the mode parameter for report formatting but applies the same validation to all modes:
- Same Validation: Both public and developer modes run identical validation checks
- Same Corrections: Both modes apply the same automatic adjustments
- Mode Difference: Only affects report header formatting ("Public" vs "Developer" in report title)
# Actual validation status values used in implementation
@dataclass
class ValidationResult:
    status: str  # "ERROR", "WARNING", "PASS"
    category: str  # "PHYSICS", "GEOGRAPHY", "LAND_COVER", "MODEL_OPTIONS"
    parameter: str
    message: str = ""

# ==============================================================================
# Updated YAML
# ==============================================================================
#
# This file has been updated by the SUEWS processor and is the updated version of the user provided YAML.
# Details of changes are in the generated report.
#
# ==============================================================================
name: Scientifically Validated Configuration
model:
  physics:
    netradiationmethod: 2
    emissionsmethod: 2
    stebbsmethod: 0
sites:
- properties:
    lat: 51.5074
    lng: -0.1278
    initial_states:
      paved:
        tsfc:
          value: 12.4  # CRU-derived for January at London coordinates

Phase B generates comprehensive reports with two main sections:
- ACTION NEEDED: Critical physics issues requiring user attention (ERROR status validation results)
- NO ACTION NEEDED: Automatic adjustments made by Phase B, warnings, and Phase A information
# SUEWS Validation Report
# ==================================================
# Mode: Public
# ==================================================
## ACTION NEEDED
- Found (2) critical scientific parameter error(s):
-- rslmethod-stabilitymethod: If rslmethod == 2, stabilitymethod must be 3
Location: model.physics.stabilitymethod
-- storageheatmethod-ohmincqf: StorageHeatMethod is set to 1 and OhmIncQf is set to 1. You should switch to OhmIncQf=0.
Location: model.physics.ohmincqf
## NO ACTION NEEDED
- Updated (11) parameter(s):
-- initial_states.paved at site [0]: temperature, tsfc, tin → 12.4 C (Set from CRU data for coordinates (51.51, -0.13) for month 1)
-- initial_states.bldgs at site [0]: temperature, tsfc, tin → 12.4 C (Set from CRU data for coordinates (51.51, -0.13) for month 1)
-- stebbs.InitialOutdoorTemperature at site [0]: 20.0 → 12.4 C (Set from CRU data for coordinates (51.51, -0.13) for month 1)
-- anthropogenic_emissions.startdls at site [0]: 15.0 → 86 (Calculated DLS start for coordinates (51.51, -0.13))
-- anthropogenic_emissions.enddls at site [0]: 12.0 → 303 (Calculated DLS end for coordinates (51.51, -0.13))
-- paved.sfr at site [0]: rounded to achieve sum of land cover fractions equal to 1.0
# ==================================================
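The split between the two report sections can be sketched as a filter on result status. The layout below imitates the sample report, but build_report is a hypothetical helper and the wording of the warnings line is an assumption, not the text Phase B emits.

```python
from typing import List, Tuple

Result = Tuple[str, str, str]  # (status, parameter, message)


def build_report(results: List[Result], adjustments: List[str],
                 mode: str = "public") -> str:
    """Assemble a two-section report: errors first, everything else after."""
    rule = "# " + "=" * 50
    title = "Developer" if mode == "dev" else "Public"
    errors = [r for r in results if r[0] == "ERROR"]
    warnings = [r for r in results if r[0] == "WARNING"]

    lines = ["# SUEWS Validation Report", rule, f"# Mode: {title}", rule]
    if errors:
        lines.append("## ACTION NEEDED")
        lines.append(f"- Found ({len(errors)}) critical scientific parameter error(s):")
        lines += [f"-- {param}: {msg}" for _, param, msg in errors]
    if adjustments or warnings:
        lines.append("## NO ACTION NEEDED")
        if adjustments:
            lines.append(f"- Updated ({len(adjustments)}) parameter(s):")
            lines += [f"-- {text}" for text in adjustments]
        if warnings:
            # Wording of this line is an assumption.
            lines.append(f"- Found ({len(warnings)}) warning(s):")
            lines += [f"-- {param}: {msg}" for _, param, msg in warnings]
    lines.append(rule)
    return "\n".join(lines)


print(build_report(
    [("ERROR", "rslmethod-stabilitymethod", "If rslmethod == 2, stabilitymethod must be 3")],
    ["paved.sfr at site [0]: rounded to achieve sum of land cover fractions equal to 1.0"],
))
```

Only the mode label in the header changes between public and dev; the sections are built from the same results.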
Phase B collects all initialisation errors and reports them together instead of stopping at the first failure:
def extract_simulation_parameters(yaml_data: dict) -> Tuple[int, str, str]:
    """Extract simulation parameters for validation."""
    # Excerpt: start_date and end_date are read from model.control before these checks
    # Collect all validation errors instead of failing on first error
    errors = []
    if not isinstance(start_date, str) or "-" not in str(start_date):
        errors.append("Missing or invalid 'start_time' in model.control - must be in 'YYYY-MM-DD' format")
    if not isinstance(end_date, str) or "-" not in str(end_date):
        errors.append("Missing or invalid 'end_time' in model.control - must be in 'YYYY-MM-DD' format")
    # If we have errors, combine them into a single error message for proper handling
    if errors:
        error_msg = "; ".join(errors)
        raise ValueError(error_msg)

When initialisation fails, Phase B creates an individual error entry for each issue and still generates a full report, so users always receive actionable guidance.
# Phase B handles CRU data access with proper error handling
def get_mean_monthly_air_temperature(lat: float, lon: float, month: int, spatial_res: float = 0.5) -> float:
    # Validate inputs
    if not (1 <= month <= 12):
        raise ValueError(f"Month must be between 1 and 12, got {month}")
    if not (-90 <= lat <= 90):
        raise ValueError(f"Latitude must be between -90 and 90, got {lat}")
    if not (-180 <= lon <= 180):
        raise ValueError(f"Longitude must be between -180 and 180, got {lon}")

- Coordinate Range Validation: Latitude (-90 to 90°), longitude (-180 to 180°)
- Missing Coordinate Handling: ERROR status for missing lat/lng parameters
- Invalid Coordinate Types: ERROR status for non-numeric coordinate values
- Timezone Warnings: WARNING status if timezone parameter is missing
- rslmethod-stabilitymethod Dependency: If rslmethod == 2, stabilitymethod must be 3
- storageheatmethod-ohmincqf Compatibility: If StorageHeatMethod == 1, OhmIncQf must be 0
- Missing Required Parameters: ERROR status for null physics parameters
- Physics Section Missing: WARNING status if entire physics section is empty
Phase B output serves as input to subsequent phases in the validation pipeline:
When Phase B runs as part of multi-phase pipelines, its output is processed internally and consolidated:
# Phase B in multi-phase workflows
User Input: config.yml
↓
Phase A (internal) → Phase B (internal) → ...
↓
Final Output: updated_config.yml, report_config.txt
# The final report consolidates Phase B findings with other phases
# File naming is standardised regardless of pipeline (AB, BC, ABC, etc.)

- Both modes (public and dev): Provide identical scientific validation; the mode only affects the report header
- Phase Consolidation: Integrates Phase A reports when available
- Multi-phase workflows (AB, BC, ABC): Phase B intermediate files preserved based on workflow success
- B-only workflow: Phase B files retained as final outputs
- Error Handling: Phase B outputs preserved if subsequent phases fail
- Report Consolidation: Phase B reports include Phase A information when available
Phase B includes comprehensive test coverage.
def test_cru_monthly_temperature_integration():
    """Test CRU monthly climatological temperature integration."""
    # Test known coordinates (London)
    lat, lng, month = 51.5074, -0.1278, 1
    temp = get_mean_monthly_air_temperature(lat, lng, month)
    # Check for a value first so the range comparison cannot fail on None
    assert temp is not None, "CRU lookup should return valid temperature"
    # London January temperature should be reasonable
    assert 0 <= temp <= 20, f"Unrealistic temperature: {temp}°C"


def test_cru_annual_temperature_integration():
    """Test CRU annual climatological temperature integration."""
    # Test known coordinates (London)
    lat, lng = 51.5074, -0.1278
    annual_temp = get_mean_annual_air_temperature(lat, lng)
    # London annual temperature should be reasonable
    assert 0 <= annual_temp <= 20, f"Unrealistic annual temperature: {annual_temp}°C"
    # Annual mean should be cooler than summer month
    summer_temp = get_mean_monthly_air_temperature(lat, lng, 7)
    assert annual_temp < summer_temp, "Annual mean should be cooler than summer"

- Run Phase B after Phase A to ensure scientific consistency of up-to-date parameters
- Review ACTION NEEDED items carefully - these require user decisions
- Trust scientific corrections - automatic adjustments improve model realism
- Validate coordinates: ensure latitude/longitude are correct for CRU integration
- Use AB or ABC workflows for comprehensive validation
- Mode selection is cosmetic - both modes run identical validation
- Add validation rules following the ValidationResult pattern (status: "ERROR"/"WARNING"/"PASS")
- Test CRU integration when adding location-dependent features
- Update adjustment logic using ScientificAdjustment records
- Maintain backward compatibility when modifying validation rules
Issue: "CRU data file not found"
Solution: Ensure CRU Parquet file is available in package
Check: Import should include ext_data/CRU_TS4.06_1991_2020.parquet
Fix: Reinstall SUEWS package or check data file integrity
Issue: "No CRU data found within spatial resolution"
Solution: Coordinates may be over ocean or outside CRU coverage
Check: Verify latitude/longitude are for land locations
Fix: Use land-based coordinates or pass a larger spatial_res so the search reaches a neighbouring land cell
Issue: "Physics option dependency violation"
Solution: Incompatible physics options selected
Check: Review physics option combinations in SUEWS documentation
Fix: Adjust physics options to compatible combination
Issue: "Surface fractions sum to 1.020000, should equal 1.0"
Solution: Land cover fractions are incomplete or incorrect
Check: Verify surface fractions in your configuration
Fix: Adjust surface fractions so total equals 1.0
Note: Only tiny floating-point errors are automatically corrected
# Direct Python usage for Phase B
from supy.data_model.science_check import run_science_check
# Function returns updated YAML data as dict
updated_data = run_science_check(
    uptodate_yaml_file="updatedA_my_config.yml",
    user_yaml_file="my_config.yml",
    standard_yaml_file="src/supy/sample_data/sample_config.yml",
    science_yaml_file="updatedB_my_config.yml",
    science_report_file="reportB_my_config.txt",
    mode="public",  # Mode only affects report header
    phase="B"
)
if updated_data:
    print("✅ Phase B scientific validation completed successfully")
else:
    print("❌ Phase B encountered errors")

# Public mode (default) - standard scientific validation
python src/supy/data_model/suews_yaml_processor.py user_config.yml --phase B --mode public
# Developer mode - identical validation with different report header
python src/supy/data_model/suews_yaml_processor.py user_config.yml --phase B --mode dev

# Phase B after Phase A (AB workflow)
python src/supy/data_model/suews_yaml_processor.py user_config.yml --phase AB
# Phase B before Phase C (BC workflow)
python src/supy/data_model/suews_yaml_processor.py user_config.yml --phase BC
# Complete pipeline including Phase B (ABC workflow)
python src/supy/data_model/suews_yaml_processor.py user_config.yml --phase ABC

- README - Overview of the complete three-phase validation system
- Orchestrator - Implementation and workflow coordination
- Phase A Detailed - Phase A parameter detection and structure validation
- Phase C Detailed - Phase C Pydantic validation and conditional rules
- Complete parameter specifications and validation details in the main SUEWS documentation
- All CRU data are from https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/