Skip to content

Mise en place de tests unitaires et correction de bogues dans la procédure de calcul#90

Open
EPMPaulPoly wants to merge 2 commits into
chairemobilite:mainfrom
EPMPaulPoly:erreur-agregation-lots
Open

Mise en place de tests unitaires et correction de bogues dans la procédure de calcul#90
EPMPaulPoly wants to merge 2 commits into
chairemobilite:mainfrom
EPMPaulPoly:erreur-agregation-lots

Conversation

@EPMPaulPoly

@EPMPaulPoly EPMPaulPoly commented May 6, 2026

Copy link
Copy Markdown
Collaborator

Pull request répondant à deux issues: #71 et #76

Le premier commit met en place une architecture de test et développe des workflows pour valider que les fonctions de calcul fonctionnent tel que prévu. Si les tests sont effectués à cette étape les bugs sont encore présents. Les tests ne valident que la portion python avec des données bidon sans jamais interagir avec la base de données pour éviter d'avoir à mettre en place une fixture

Le deuxième commit règle les enjeux reliés au calcul qui ont été constatés dans le issue 71:

  • include_groups=True est devenu deprecated dans pandas 3.0+ menant à des échecs de calcul
  • le filtre dans la fonction d'addition qui permettait de choisir une valeur fixe ou une valeur avec une pente échouait si la case de DataFrame n'était pas le type exact de null/None/np.nan. une fonction helper a été mise en place pour rendre cet état plus robuste et éviter l'échec.

Note importante il faudrait relancer les calculs pour Québec pour voir si certains lots n'étaient pas calculé. Il faudra documenter le delta à l'inventaire total.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added numeric input validation for parking threshold calculations.
    • Introduced database fields for rental property tracking.
  • Improvements

    • Enhanced inventory data aggregation for duplicate lots.
    • Improved parking regulation filtering to scope units appropriately.
    • Refined tax data visualization capabilities.
  • Testing

    • Established comprehensive test suite for parking calculations and regulations.
    • Configured automated CI validation pipeline.
  • Documentation

    • Updated municipal data extraction guidance.
  • Chores

    • Updated development environment and dependencies.

@EPMPaulPoly EPMPaulPoly requested a review from greenscientist May 6, 2026 21:51
@EPMPaulPoly EPMPaulPoly self-assigned this May 6, 2026
@EPMPaulPoly EPMPaulPoly linked an issue May 6, 2026 that may be closed by this pull request
@coderabbitai

coderabbitai Bot commented May 6, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@EPMPaulPoly has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 45 minutes and 33 seconds before requesting another review.

To continue reviewing without waiting, purchase usage credits in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 250852cd-6312-4db7-92e0-6827642c6143

📥 Commits

Reviewing files that changed from the base of the PR and between 5ac86fb and 9dcda9e.

📒 Files selected for processing (28)
  • .github/workflows/test-python-scripts.yml
  • .vscode/settings.json
  • docs/md/023-VerseDonneesRole.md
  • serveur_calcul_python/classes/parking_inventory.py
  • serveur_calcul_python/classes/parking_regs.py
  • serveur_calcul_python/classes/reg_set_territory.py
  • serveur_calcul_python/classes/tax_dataset.py
  • serveur_calcul_python/config/config_db.py
  • serveur_calcul_python/environment.yml
  • serveur_calcul_python/environment_python_conversion.yaml
  • serveur_calcul_python/pytest.ini
  • serveur_calcul_python/setup.py
  • serveur_calcul_python/tests/__init__.py
  • serveur_calcul_python/tests/conftest.py
  • serveur_calcul_python/tests/data_gen/calcs_input_data_store.py
  • serveur_calcul_python/tests/data_gen/inventory_data_store.py
  • serveur_calcul_python/tests/data_gen/land_use_table_data_store.py
  • serveur_calcul_python/tests/data_gen/reg_set_terr_data_store.py
  • serveur_calcul_python/tests/data_gen/regulation_data_store.py
  • serveur_calcul_python/tests/data_gen/regulation_set_data_store.py
  • serveur_calcul_python/tests/data_gen/tax_data_store.py
  • serveur_calcul_python/tests/unit/__init__.py
  • serveur_calcul_python/tests/unit/test_affectation_prs_td.py
  • serveur_calcul_python/tests/unit/test_aggregation_inventaire.py
  • serveur_calcul_python/tests/unit/test_conversion_unite.py
  • serveur_calcul_python/tests/unit/test_rule_calc.py
  • serveur_calcul_python/tests/unit/test_subset_calc.py
  • serveur_calcul_python/tests/unit/test_subset_operators.py

Walkthrough

This pull request introduces comprehensive testing infrastructure and continuous integration automation for a parking calculation system. It adds a GitHub Actions workflow to validate parking prediction calculations, configures pytest with environment setup, and establishes development environment specifications. The changes include modifications to core business logic in parking inventory, regulations, and territory management classes—refactoring threshold validation, aggregation behavior, and regulation filtering. Additionally, the PR introduces extensive new test data generators and unit tests covering various calculation scenarios (simple, threshold-based, addition-based, floor, and ceiling logic).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • greenscientist
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 4.23% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main purpose of the PR: implementing unit tests and fixing bugs in the calculation procedure.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 22

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
serveur_calcul_python/classes/reg_set_territory.py (1)

102-116: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the return type annotation — list does not accept two subscripts.

The signature claims ->list[list[RegSetTerritory],list[TD.TaxDataset]], but the function now returns a 2-tuple (line 116), and list[X, Y] is not valid Python typing.

🔧 Proposed fix
-def get_rst_by_tax_data(tax_data:TD.TaxDataset,db_eng=None)->list[list[RegSetTerritory],list[TD.TaxDataset]]:
+def get_rst_by_tax_data(tax_data:TD.TaxDataset,db_eng=None)->tuple[list[RegSetTerritory],list[TD.TaxDataset]]:
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/classes/reg_set_territory.py` around lines 102 - 116,
The return type annotation on get_rst_by_tax_data is invalid
(list[list[RegSetTerritory], list[TD.TaxDataset]]); update it to a tuple type
that matches the actual returned 2-tuple (relevant_rsts, tax_dataset_match) —
e.g. change the signature to -> tuple[list[RegSetTerritory],
list[TD.TaxDataset]] (or -> Tuple[list[RegSetTerritory], list[TD.TaxDataset]]
and import Tuple from typing) so the annotation correctly reflects the
two-element return.
serveur_calcul_python/classes/parking_inventory.py (1)

692-710: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Add missing parentheses: generate_thresh_data() in test file

At serveur_calcul_python/tests/unit/test_subset_calc.py:26, CIDS.generate_thresh_data is missing parentheses. This passes the function object instead of calling it, causing the pipeline failure AttributeError: 'function' object has no attribute 'get_by_reg' downstream.

Change:

input_data = CIDS.generate_thresh_data

to:

input_data = CIDS.generate_thresh_data()

(Compare with lines 18 and 38 in the same file, which correctly call the function with ().)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/classes/parking_inventory.py` around lines 692 - 710,
In the unit test where input_data is set, call the data factory instead of
assigning the function object: replace the assignment that uses
CIDS.generate_thresh_data (the function object) with a call to
CIDS.generate_thresh_data() so input_data becomes the returned data structure
used by calculate_threshold_based_subset_from_inputs_class and other helpers
(this fixes the AttributeError when code calls .get_by_reg on the test input).
serveur_calcul_python/classes/parking_regs.py (1)

161-184: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

DRY: the three get_reg_by_id overloads share identical filter logic.

Lines 165-166, 174-175, and 182-183 are byte-identical. Extracting a private helper keeps the dispatch shells minimal and avoids the three copies drifting.

♻️ Proposed refactor
+    def _build_filtered(self, data, long_regs):
+        units_out = self.units_table[
+            self.units_table[config_db.db_column_units_id].isin(
+                long_regs[config_db.db_column_parking_unit_id].unique().tolist()
+            )
+        ]
+        return ParkingRegulations(data, long_regs, units_out)
+
     `@get_reg_by_id.register`
     def _(self,id_to_get_:int):
         data = self.reg_head.loc[self.reg_head[config_db.db_column_parking_regs_id]==id_to_get_]
         long_regs = self.reg_def.loc[self.reg_def[config_db.db_column_parking_regs_id]==id_to_get_]
-        units_out = self.units_table[self.units_table[config_db.db_column_units_id].isin(long_regs[config_db.db_column_parking_unit_id].unique().tolist())]
-        object_out = ParkingRegulations(data,long_regs,units_out)
-        return object_out
+        return self._build_filtered(data, long_regs)

(apply analogous changes to the np.ndarray and list overloads)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/classes/parking_regs.py` around lines 161 - 184, The
three overloaded get_reg_by_id implementations duplicate the same
filtering/assembly logic; refactor by extracting a private helper method (e.g.,
_build_parking_regulations) that accepts an iterable of ids and returns
ParkingRegulations built from self.reg_head, self.reg_def, and self.units_table
using config_db.db_column_parking_regs_id and
config_db.db_column_parking_unit_id, then have each registered get_reg_by_id
(signatures with int, np.ndarray, list) normalize/convert their id_to_get_ input
to an iterable and call that helper, returning its result.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-python-scripts.yml:
- Around line 17-19: Remove the unnecessary full-history checkout by deleting
the fetch-depth: 0 setting under the actions/checkout@v6 step (or change it to
the default behavior) so the pytest job uses a shallow checkout; only keep
fetch-depth: 0 if a later step actually needs full git history
(tags/blame/etc.).
- Around line 27-29: The cache key uses the full
hashFiles('serveur_calcul_python/environment.yml') which makes the cache
unusable on any change; add a restore-keys fallback to allow prefix matches and
partial cache restores by supplying a restore-keys entry alongside the existing
key (use the same prefix based on ${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}
so workflows referencing the key can fall back to older caches when the full
hash differs). Ensure you update the workflow step that defines key to include
restore-keys using that prefix so hashFiles remains the primary key but prefix
restores are possible.
- Line 36: Remove the "use-only-tar-bz2: true" setting from the GitHub Actions
workflow so conda can install modern .conda packages (e.g., gdal/fiona); locate
the YAML entry named use-only-tar-bz2 and either delete that line or set it to
false in the workflow job where the conda installer is configured to avoid
forcing legacy .tar.bz2-only resolution.

In `@serveur_calcul_python/classes/tax_dataset.py`:
- Around line 75-76: The local alias tax_table = self.tax_table is redundant;
remove that assignment and call self.tax_table.explore(m=m1, color='red')
directly where tax_table.explore(...) is currently used so you keep the same
behavior while eliminating the unnecessary local variable (update the call site
referencing tax_table.explore to use self.tax_table.explore and leave parameters
m=m1, color='red' unchanged).

In `@serveur_calcul_python/environment.yml`:
- Around line 6-25: The environment.yml currently lists many unpinned packages
(python, numpy, pandas, geopandas, gdal, fiona, scikit-image, sqlalchemy, etc.);
update it to pin at least major.minor versions for those packages or generate
and commit a lockfile (e.g., conda-lock or conda env export/conda-lock.yml) used
by your CI workflow so builds are reproducible; specifically, choose stable
versions for python, numpy, pandas, geopandas, gdal and fiona (and other listed
packages like scikit-image, sqlalchemy, matplotlib) and either replace the bare
package names with versioned specifiers in environment.yml or add the lockfile
and update CI to use it.
- Line 17: The environment.yml currently includes the obsolete PyPI/conda
package "typing" which can shadow stdlib typing on Python 3.12; remove the
"typing" entry from the dependencies in environment.yml and keep
"typing-extensions" (already present) for any extra typing features so imports
use the standard library typing module instead.

In `@serveur_calcul_python/tests/data_gen/calcs_input_data_store.py`:
- Around line 3-11: In generate_all_required_inputs(), pd.concat currently
returns a plain DataFrame which can strip the ParkingCalculationInputs subclass;
wrap the concat result with ParkingCalculationInputs to preserve type: replace
the direct return of pd.concat([...]).reset_index(drop=True) by constructing
ParkingCalculationInputs(pd.concat([...]).reset_index(drop=True)) so downstream
methods like get_by_reg keep the subclass behavior (refer to function
generate_all_required_inputs and class ParkingCalculationInputs).

In `@serveur_calcul_python/tests/data_gen/reg_set_terr_data_store.py`:
- Line 13: The tuple unpack in line using
RSDS.generate_parking_regulation_sets() assigns four variables (prs_1, prs_2,
prs_3, prs_4) but prs_4 is unused; rename it to _prs_4 to signal intentional
unused variable per lint rule RUF059. Update the assignment where
RSDS.generate_parking_regulation_sets() is called so the fourth target is _prs_4
(keep prs_1, prs_2, prs_3 unchanged) to satisfy the linter.

In `@serveur_calcul_python/tests/data_gen/regulation_data_store.py`:
- Around line 28-215: The test fixture constructs DataFrames reg_header,
reg_def, and units using hard-coded column name strings (e.g., 'id_reg_stat',
'unite', 'seuil', 'oper') which diverges from the production constants; replace
those string literals with the shared schema constants (e.g.,
config_db.db_column_* names) so the test uses the same column identifiers as
PR.ParkingRegulations and will break at import time if columns are renamed —
update all column keys in reg_header, reg_def, and units to reference the
appropriate config_db.db_column_* constants.
- Around line 1-7: Remove the sys.path mutation (the
sys.path.append(str(Path(__file__).resolve().parents[2])) line) from
regulation_data_store.py and rely on proper package imports; replace the hack by
using a package-relative import (e.g. adjust to a relative import like from
...classes import parking_regs as PR if this module is inside a package) or
ensure tests set up the import path centrally (conftest/pytest.ini) instead of
modifying sys.path in this file; keep the existing reference to parking_regs as
PR and do not touch other logic.

In `@serveur_calcul_python/tests/data_gen/regulation_set_data_store.py`:
- Line 46: prs_3 and prs_4 use the same name label 'Deuxième_ensemble_reglement'
as prs_2 which causes duplicate names; update the name fields for prs_3 and
prs_4 (the variables/primitives named prs_3 and prs_4 in the data generation
code) to unique, descriptive labels (e.g., 'Troisième_ensemble_reglement' and
'Quatrième_ensemble_reglement' or other meaningful names) so each pr entry has a
distinct name for clearer debugging and name-based assertions.
- Around line 1-5: The imports in this module use absolute paths; switch them to
the same relative style as sibling modules by replacing "import
serveur_calcul_python.tests.data_gen.regulation_data_store as RDS" and "import
serveur_calcul_python.tests.data_gen.land_use_table_data_store as LUTDS" with
relative imports (e.g., "from tests.data_gen import regulation_data_store as
RDS" and "from tests.data_gen import land_use_table_data_store as LUTDS"),
replace "import classes.parking_reg_sets as PRS" with "from classes import
parking_reg_sets as PRS", and replace "import config.config_db as cf_db" with
"from config import config_db as cf_db"; keep "import pandas as pd" unchanged.

In `@serveur_calcul_python/tests/data_gen/tax_data_store.py`:
- Line 184: The loop variable "index" in the for statement iterating over
lons[0:len(basic_data)] is unused; rename it to "_" to indicate an intentionally
ignored value and silence the Ruff B007 warning (change "for index, lon in
enumerate(lons[0:len(basic_data)]):" to "for _, lon in
enumerate(lons[0:len(basic_data)]):"). Ensure references to "index" elsewhere in
the loop body are not present before applying the rename.

In `@serveur_calcul_python/tests/unit/test_affectation_prs_td.py`:
- Around line 8-23: The test_affectation_TD_RST contains three repeated blocks
of the same three DataFrame assertions for indices 0/1/2; refactor it to use
pytest.mark.parametrize so each case is a separate parameterized test and the
assertions are looped over attributes. Specifically, call
RSTDS.generate_reg_set_terr(), TDS.generate_tax_dataset() and
RST.split_td_by_rst(...) once, then parametrize over (idx, rst_id) mapping to
TDS.generate_tax_dataset(rst_id) and iterate the attributes tax_table,
lot_table, lot_association using cf_db.db_column_tax_id / cf_db.db_column_lot_id
for sorting; keep assertions using pd.testing.assert_frame_equal comparing
sorted/reset frames for split_tax_data_sets[idx] vs theoretical dataset to
collapse the duplicated code and improve failure locality.

In `@serveur_calcul_python/tests/unit/test_aggregation_inventaire.py`:
- Around line 7-8: Remove the duplicate import that creates the same alias PI:
keep only one import for the parking_inventory module (either "from classes
import parking_inventory as PI" or "import classes.parking_inventory as PI") and
delete the other so PI is not overwritten; update any tests that rely on the
specific import style if necessary to ensure PI refers to the parking_inventory
module used elsewhere in the file.

In `@serveur_calcul_python/tests/unit/test_conversion_unite.py`:
- Around line 23-25: The three pd.testing.assert_frame_equal calls (comparing
inputs_data_prs_1/2/3 to inputs_theoretical_1/2/3) should allow dtype
differences coming from Python literals; update each assert_frame_equal
invocation to pass check_dtype=False so integer/float dtype mismatches (e.g.,
valeur int64 vs float64) don't cause spurious test failures.
- Line 16: The tuple unpacking assigns prs_1, prs_2, prs_3, prs_4 =
generate_parking_regulation_sets() but prs_4 is unused; to silence the linter,
rename the unused variable to _prs_4 or simply _ (e.g., prs_1, prs_2, prs_3, _ =
generate_parking_regulation_sets()) in the test file so
generate_parking_regulation_sets() remains called but the unused value is
ignored — update the line where generate_parking_regulation_sets() is unpacked.

In `@serveur_calcul_python/tests/unit/test_subset_calc.py`:
- Around line 26-28: The test binds the function object instead of calling it:
change the assignment of input_data from CIDS.generate_thresh_data to calling
the function (e.g., CIDS.generate_thresh_data()) so
calculate_parking_subset_from_inputs_class(simple_park_reg, 1, input_data, 2)
receives the actual data; update the line where input_data is set to invoke
generate_thresh_data.
- Around line 46-48: The __main__ block currently calls test_subset_simple_min()
and test_subset_seuil_min() but omits running test_subset_addition_min(), so add
a call to test_subset_addition_min() in the if __name__=="__main__": block (or
replace the manual calls with a unittest runner such as unittest.main()) to
ensure the test_subset_addition_min function is executed when the file is run
directly; update the block containing test_subset_simple_min,
test_subset_seuil_min to include test_subset_addition_min (or switch to
unittest.main()) accordingly.
- Around line 20-21: The two assertions are comparing a Series which causes a
truth-value ambiguity; update the assertions that reference
inventaire.parking_frame.loc[...] for the 'n_places_min' column (using
cf_db.db_column_lot_id to filter) to extract a scalar before comparison (e.g.,
use .iloc[0] or .values[0] like other tests) so each assert compares a single
value (for the rows with lot_id 'a' and 'b').

In `@serveur_calcul_python/tests/unit/test_subset_operators.py`:
- Around line 33-87: The current helper functions
(min_left_smaller_min_right_simple_or, min_left_larger_min_right_simple_or,
min_left_smaller_min_right_most_const_or,
min_left_larger_min_right_most_const_or,
max_left_smaller_min_right_most_const_or,
max_left_larger_min_right_most_const_or,
min_left_smaller_max_right_most_const_or,
min_left_larger_max_right_most_const_or) are defined as plain helpers and are
all called from one test causing the first failure to abort others; fix by
turning each helper into an individual pytest test (rename each to start with
test_...) or replace them with a single pytest.mark.parametrize-based test over
the distinct scenarios using PI.subset_operation and assertions, ensuring each
scenario is reported separately (reference the functions above and the use of
PI.subset_operation and IDS.get_* helpers to structure cases).
- Around line 1-12: The test file has inconsistent import style and a duplicate
import of sys: remove the redundant "import sys" and change the absolute import
"serveur_calcul_python.tests.data_gen.inventory_data_store as IDS" to the same
relative/sibling import style used in other tests (e.g., import the
inventory_data_store module via a package-relative import such as from
tests.data_gen import inventory_data_store as IDS or the appropriate relative
form), and keep the existing Path/sys.path manipulation (or canonicalize it) so
imports for classes.parking_inventory (PI) resolve consistently; ensure only one
sys import remains and that IDS and PI references still resolve after the
change.

---

Outside diff comments:
In `@serveur_calcul_python/classes/parking_inventory.py`:
- Around line 692-710: In the unit test where input_data is set, call the data
factory instead of assigning the function object: replace the assignment that
uses CIDS.generate_thresh_data (the function object) with a call to
CIDS.generate_thresh_data() so input_data becomes the returned data structure
used by calculate_threshold_based_subset_from_inputs_class and other helpers
(this fixes the AttributeError when code calls .get_by_reg on the test input).

In `@serveur_calcul_python/classes/parking_regs.py`:
- Around line 161-184: The three overloaded get_reg_by_id implementations
duplicate the same filtering/assembly logic; refactor by extracting a private
helper method (e.g., _build_parking_regulations) that accepts an iterable of ids
and returns ParkingRegulations built from self.reg_head, self.reg_def, and
self.units_table using config_db.db_column_parking_regs_id and
config_db.db_column_parking_unit_id, then have each registered get_reg_by_id
(signatures with int, np.ndarray, list) normalize/convert their id_to_get_ input
to an iterable and call that helper, returning its result.

In `@serveur_calcul_python/classes/reg_set_territory.py`:
- Around line 102-116: The return type annotation on get_rst_by_tax_data is
invalid (list[list[RegSetTerritory], list[TD.TaxDataset]]); update it to a tuple
type that matches the actual returned 2-tuple (relevant_rsts, tax_dataset_match)
— e.g. change the signature to -> tuple[list[RegSetTerritory],
list[TD.TaxDataset]] (or -> Tuple[list[RegSetTerritory], list[TD.TaxDataset]]
and import Tuple from typing) so the annotation correctly reflects the
two-element return.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b15f5347-a691-4fd3-9c4c-5876a688e446

📥 Commits

Reviewing files that changed from the base of the PR and between 7a3249f and 5ac86fb.

📒 Files selected for processing (28)
  • .github/workflows/test-python-scripts.yml
  • .vscode/settings.json
  • docs/md/023-VerseDonneesRole.md
  • serveur_calcul_python/classes/parking_inventory.py
  • serveur_calcul_python/classes/parking_regs.py
  • serveur_calcul_python/classes/reg_set_territory.py
  • serveur_calcul_python/classes/tax_dataset.py
  • serveur_calcul_python/config/config_db.py
  • serveur_calcul_python/environment.yml
  • serveur_calcul_python/environment_python_conversion.yaml
  • serveur_calcul_python/pytest.ini
  • serveur_calcul_python/setup.py
  • serveur_calcul_python/tests/__init__.py
  • serveur_calcul_python/tests/conftest.py
  • serveur_calcul_python/tests/data_gen/calcs_input_data_store.py
  • serveur_calcul_python/tests/data_gen/inventory_data_store.py
  • serveur_calcul_python/tests/data_gen/land_use_table_data_store.py
  • serveur_calcul_python/tests/data_gen/reg_set_terr_data_store.py
  • serveur_calcul_python/tests/data_gen/regulation_data_store.py
  • serveur_calcul_python/tests/data_gen/regulation_set_data_store.py
  • serveur_calcul_python/tests/data_gen/tax_data_store.py
  • serveur_calcul_python/tests/unit/__init__.py
  • serveur_calcul_python/tests/unit/test_affectation_prs_td.py
  • serveur_calcul_python/tests/unit/test_aggregation_inventaire.py
  • serveur_calcul_python/tests/unit/test_conversion_unite.py
  • serveur_calcul_python/tests/unit/test_rule_calc.py
  • serveur_calcul_python/tests/unit/test_subset_calc.py
  • serveur_calcul_python/tests/unit/test_subset_operators.py
💤 Files with no reviewable changes (1)
  • serveur_calcul_python/environment_python_conversion.yaml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: pr-build-check (server, ., serveur/Dockerfile)
🧰 Additional context used
🪛 GitHub Actions: Check parking prediction calculations / 0_engine-test.txt
serveur_calcul_python/tests/unit/test_subset_calc.py

[error] 20-20: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

serveur_calcul_python/classes/parking_inventory.py

[error] 707-707: AttributeError: 'function' object has no attribute 'get_by_reg'.

🪛 GitHub Actions: Check parking prediction calculations / engine-test
serveur_calcul_python/tests/unit/test_subset_calc.py

[error] 20-20: Test failure in test_subset_simple_min: ValueError due to ambiguous truth value of a pandas Series when asserting equality (The truth value of a Series is ambiguous...).

serveur_calcul_python/classes/parking_inventory.py

[error] 707-707: AttributeError: 'function' object has no attribute 'get_by_reg' during threshold-based subset calculation; likely caused by passing the function CIDS.generate_thresh_data instead of its return value.

🪛 markdownlint-cli2 (0.22.1)
docs/md/023-VerseDonneesRole.md

[warning] 16-16: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 16-16: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.15.12)
serveur_calcul_python/tests/data_gen/tax_data_store.py

[warning] 30-30: Unnecessary assignment to tax_dataset before return statement

Remove unnecessary assignment

(RET504)


[warning] 31-31: Unnecessary elif after return statement

Remove unnecessary elif

(RET505)


[warning] 39-39: Unnecessary assignment to tax_dataset before return statement

Remove unnecessary assignment

(RET504)


[warning] 45-45: Unnecessary assignment to tax_dataset before return statement

Remove unnecessary assignment

(RET504)


[warning] 51-51: Unnecessary assignment to tax_dataset before return statement

Remove unnecessary assignment

(RET504)


[warning] 184-184: Loop control variable index not used within loop body

Rename unused index to _index

(B007)


[warning] 187-187: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

serveur_calcul_python/tests/data_gen/reg_set_terr_data_store.py

[warning] 13-13: Unpacked variable prs_4 is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)


[warning] 22-22: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


[warning] 33-33: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


[warning] 44-44: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

serveur_calcul_python/tests/unit/test_conversion_unite.py

[warning] 16-16: Unpacked variable prs_4 is never used

Prefix it with an underscore or any other dummy variable pattern

(RUF059)

serveur_calcul_python/classes/parking_regs.py

[warning] 167-167: Unnecessary assignment to object_out before return statement

Remove unnecessary assignment

(RET504)


[warning] 171-171: Missing return type annotation for private function _

(ANN202)


[warning] 176-176: Unnecessary assignment to object_out before return statement

Remove unnecessary assignment

(RET504)


[warning] 179-179: Missing return type annotation for private function _

(ANN202)

serveur_calcul_python/tests/data_gen/regulation_data_store.py

[warning] 133-133: Line with empty comment

Delete the empty comment

(PLR2044)


[warning] 217-217: Unnecessary assignment to big_pr before return statement

Remove unnecessary assignment

(RET504)

serveur_calcul_python/tests/data_gen/calcs_input_data_store.py

[warning] 11-11: Unnecessary assignment to output before return statement

Remove unnecessary assignment

(RET504)


[warning] 19-19: Unnecessary assignment to output before return statement

Remove unnecessary assignment

(RET504)


[warning] 23-23: Unnecessary assignment to ceil_data_thresh before return statement

Remove unnecessary assignment

(RET504)


[warning] 27-27: Unnecessary assignment to thresh_data before return statement

Remove unnecessary assignment

(RET504)

🔇 Additional comments (15)
serveur_calcul_python/pytest.ini (1)

1-7: LGTM!

Clean, minimal pytest configuration that aligns with the new test layout.

serveur_calcul_python/setup.py (1)

1-7: LGTM!

Minimal scaffolding is enough to make the package importable in CI and IDEs.

serveur_calcul_python/tests/conftest.py (1)

1-10: LGTM!

Lightweight session-scoped diagnostic. The print output relies on pytest being run with -s (already set in .vscode/settings.json); otherwise, output is captured — no impact on correctness.

.vscode/settings.json (1)

3-11: LGTM!

Settings are consistent with the new pytest + Conda setup.

docs/md/023-VerseDonneesRole.md (1)

15-18: LGTM!

Documentation correctly points to the consolidated environment.yml.

Optional: markdownlint hints (MD031/MD040) suggest blank lines around the fence and a language tag (e.g. ```bash) — purely cosmetic.

.github/workflows/test-python-scripts.yml (1)

35-35: ⚡ Quick win

Remove auto-activate — it is deprecated in favor of auto-activate-base.

The setup-miniconda action has deprecated auto-activate in favor of auto-activate-base. With activate-environment: serveur_calcul_python, the action will activate the named environment regardless, so you can simply remove this line.

🛠 Suggested fix
         with:
           activate-environment: serveur_calcul_python
           environment-file: serveur_calcul_python/environment.yml
-          auto-activate: true
-          use-only-tar-bz2: true
+          auto-activate-base: false
			> Likely an incorrect or invalid review comment.
serveur_calcul_python/classes/reg_set_territory.py (1)

118-123: split_td_by_rst extraction looks good.

Clean separation and the chained year_filter/territory_filter per RST correctly mirrors the prior inline logic.

serveur_calcul_python/classes/parking_inventory.py (3)

93-93: include_groups=False fix is correct.

Setting include_groups=True is deprecated since pandas 2.2.0, and only False will be allowed in a future version.


717-728: is_number usage looks correct and cleaner than the previous ad-hoc checks.


752-753: is_number implementation looks solid.

Correctly handles None, np.nan, np.inf, and numpy numeric types.

serveur_calcul_python/tests/data_gen/inventory_data_store.py (1)

1-79: Test data factory looks clean and well-structured.

serveur_calcul_python/config/config_db.py (1)

52-53: New constants follow the existing naming conventions.

serveur_calcul_python/tests/unit/test_rule_calc.py (1)

32-41: test_reglement_seuil_min is the test triggering the CI pipeline failure.

This test calls calculate_parking_specific_reg_from_inputs_class with generate_thresh_data() output, which eventually hits the failing line 707 in parking_inventory.py. Once the root cause there is resolved, rerun this test to confirm it passes.

serveur_calcul_python/tests/data_gen/land_use_table_data_store.py (1)

4-47: LGTM — straightforward fixture builder.

Minor nit (skipping): if config_db exposes a constant for the description column, using it would keep the fixture aligned with the rest of the codebase. Not a blocker.

serveur_calcul_python/classes/parking_regs.py (1)

165-165: ⚡ Quick win

No issue found — db_column_units_id correctly resolves to 'id_unite'.

The configuration confirms that config_db.db_column_units_id is indeed 'id_unite', which matches the column name used elsewhere in the codebase (e.g., line 227 joins and test fixtures). The code on line 165 is correct.

Comment on lines +17 to +19
- uses: actions/checkout@v6
with:
fetch-depth: 0

@coderabbitai coderabbitai Bot May 6, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Nit: fetch-depth: 0 is rarely needed for a pytest job.

Full history isn't required to run unit tests; the default shallow checkout will be a bit faster. Keep it only if a downstream step actually inspects git history (tags, blame, etc.).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-python-scripts.yml around lines 17 - 19, Remove the
unnecessary full-history checkout by deleting the fetch-depth: 0 setting under
the actions/checkout@v6 step (or change it to the default behavior) so the
pytest job uses a shallow checkout; only keep fetch-depth: 0 if a later step
actually needs full git history (tags/blame/etc.).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EPMPaulPoly Seem appropriate, other add comment on why its needed

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I double checked. The fetch-depth comes from this blog post I used to start building this. Looking back through the miniconda documentation, there doesn't appear to be a need for this. Will remove

Comment on lines +27 to +29
key:
${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-${{
hashFiles('serveur_calcul_python/environment.yml') }}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Cache key minor: consider adding restore-keys.

Without restore-keys, any change to environment.yml invalidates the cache completely with no fallback. A simple prefix fallback can speed up subsequent solves while you iterate.

♻️ Suggested tweak
           key:
             ${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-${{
             hashFiles('serveur_calcul_python/environment.yml') }}
+          restore-keys: |
+            ${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
key:
${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-${{
hashFiles('serveur_calcul_python/environment.yml') }}
key:
${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-${{
hashFiles('serveur_calcul_python/environment.yml') }}
restore-keys: |
${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-python-scripts.yml around lines 27 - 29, The cache
key uses the full hashFiles('serveur_calcul_python/environment.yml') which makes
the cache unusable on any change; add a restore-keys fallback to allow prefix
matches and partial cache restores by supplying a restore-keys entry alongside
the existing key (use the same prefix based on ${{ runner.os }}-conda-${{
env.CACHE_NUMBER }} so workflows referencing the key can fall back to older
caches when the full hash differs). Ensure you update the workflow step that
defines key to include restore-keys using that prefix so hashFiles remains the
primary key but prefix restores are possible.

activate-environment: serveur_calcul_python
environment-file: serveur_calcul_python/environment.yml
auto-activate: true
use-only-tar-bz2: true

@coderabbitai coderabbitai Bot May 6, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

conda-forge gdal fiona .conda vs .tar.bz2 use-only-tar-bz2 deprecated

💡 Result:

Conda-forge primarily produces and provides packages in the modern.conda format, especially for newer versions of packages like GDAL and Fiona. The traditional.tar.bz2 format is no longer generated by default for many feedstocks (unless specifically overridden per-feedstock).[1][2] The use_only_tar_bz2 configuration option in.condarc forces conda to only look for.tar.bz2 packages, which will prevent installation of conda-forge packages available only as.conda files, leading to PackagesNotFoundError.[3][1][4][2] This flag is effectively deprecated for conda-forge use because it causes users to miss packages.[1] To install GDAL and Fiona from conda-forge, ensure use_only_tar_bz2 is not enabled (set to false or remove it): conda config --set use_only_tar_bz2 false Then: conda install -c conda-forge gdal fiona GDAL and Fiona feedstocks build and upload packages to conda-forge channel, which now favor.conda.[5][6][7]

Citations:


Remove use-only-tar-bz2: true — will break solving for gdal/fiona.

This flag forces conda to use only legacy .tar.bz2 packages, but conda-forge now publishes most builds (especially geospatial ones like gdal and fiona) as .conda files. The flag will cause missing-package or unsatisfiable-environment errors.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-python-scripts.yml at line 36, Remove the
"use-only-tar-bz2: true" setting from the GitHub Actions workflow so conda can
install modern .conda packages (e.g., gdal/fiona); locate the YAML entry named
use-only-tar-bz2 and either delete that line or set it to false in the workflow
job where the conda installer is configured to avoid forcing legacy
.tar.bz2-only resolution.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm. Seems like caching is not going to be a thing then. The how-tos on the conda package indicate to use this for caching to work:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EPMPaulPoly, you're right — for caching to work properly, you need use-only-tar-bz2: true. My original concern was that conda-forge now favors .conda packages for geospatial libraries like gdal and fiona, which that flag would block.

There's a real trade-off here:

  • Keep use-only-tar-bz2: true → caching works, but if gdal/fiona are only available as .conda on conda-forge, the environment solve will fail.
  • Remove it → caching breaks, but the full .conda package catalog is available.

The pragmatic approach is to keep it and just try running the workflow. Many conda-forge packages still provide .tar.bz2 alongside .conda, so it may work fine in practice. If you see a PackagesNotFoundError for gdal or fiona, then you'd have to drop the caching block (remove the actions/cache step and use-only-tar-bz2) and accept longer CI times. I'm happy to withdraw my original concern given this context — sorry for the noise!

Comment on lines +75 to 76
tax_table = self.tax_table
tax_table.explore(m=m1,color='red')

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Tiny nit: the local alias is now redundant.

Since tax_table = self.tax_table no longer does anything (the previous drop is gone), you can call self.tax_table.explore(...) directly.

♻️ Suggested tweak
-        tax_table = self.tax_table
-        tax_table.explore(m=m1,color='red')
+        self.tax_table.explore(m=m1, color='red')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
tax_table = self.tax_table
tax_table.explore(m=m1,color='red')
self.tax_table.explore(m=m1, color='red')
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/classes/tax_dataset.py` around lines 75 - 76, The local
alias tax_table = self.tax_table is redundant; remove that assignment and call
self.tax_table.explore(m=m1, color='red') directly where tax_table.explore(...)
is currently used so you keep the same behavior while eliminating the
unnecessary local variable (update the call site referencing tax_table.explore
to use self.tax_table.explore and leave parameters m=m1, color='red' unchanged).

Comment on lines 6 to +25
- python=3.12
- numpy
- pandas
- geopandas
- typing
- debugpy
- scikit-image
- sqlalchemy
- pandas
- lxml
- openpyxl
- psycopg2
- sqlalchemy
- geoalchemy2
- xlrd
- git
- typing-extensions
- geopandas
- typing
- xlrd
- matplotlib
- debugpy No newline at end of file
- geoalchemy2
- typing-extensions
- pytest
- coverage
- gdal
- fiona

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚖️ Poor tradeoff

Optional: pin versions for reproducible CI builds.

Unpinned dependencies (pandas, geopandas, gdal, fiona, …) can cause CI to start failing without any code change when upstream releases land. Consider at least pinning major/minor versions, or committing a conda-lock/explicit lock file used by the workflow.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/environment.yml` around lines 6 - 25, The
environment.yml currently lists many unpinned packages (python, numpy, pandas,
geopandas, gdal, fiona, scikit-image, sqlalchemy, etc.); update it to pin at
least major.minor versions for those packages or generate and commit a lockfile
(e.g., conda-lock or conda env export/conda-lock.yml) used by your CI workflow
so builds are reproducible; specifically, choose stable versions for python,
numpy, pandas, geopandas, gdal and fiona (and other listed packages like
scikit-image, sqlalchemy, matplotlib) and either replace the bare package names
with versioned specifiers in environment.yml or add the lockfile and update CI
to use it.

Comment thread serveur_calcul_python/tests/unit/test_subset_calc.py Outdated
Comment thread serveur_calcul_python/tests/unit/test_subset_calc.py Outdated
Comment on lines +46 to +48
if __name__=="__main__":
test_subset_simple_min()
test_subset_seuil_min() No newline at end of file

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

__main__ block omits test_subset_addition_min.

Running the file directly will skip the third test.

✏️ Proposed fix
 if __name__=="__main__":
     test_subset_simple_min()
     test_subset_seuil_min()
+    test_subset_addition_min()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if __name__=="__main__":
test_subset_simple_min()
test_subset_seuil_min()
if __name__=="__main__":
test_subset_simple_min()
test_subset_seuil_min()
test_subset_addition_min()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/tests/unit/test_subset_calc.py` around lines 46 - 48,
The __main__ block currently calls test_subset_simple_min() and
test_subset_seuil_min() but omits running test_subset_addition_min(), so add a
call to test_subset_addition_min() in the if __name__=="__main__": block (or
replace the manual calls with a unittest runner such as unittest.main()) to
ensure the test_subset_addition_min function is executed when the file is run
directly; update the block containing test_subset_simple_min,
test_subset_seuil_min to include test_subset_addition_min (or switch to
unittest.main()) accordingly.

Comment on lines +1 to +12
# import des librairies
import pandas as pd
import sys, pathlib
import numpy as np
import os
import sys
import serveur_calcul_python.tests.data_gen.inventory_data_store as IDS
from pathlib import Path
# ajoute le dossier serveur_calcul_python (qui contient classes, calcs, etc.) au sys.path
#sys.path.append(str(Path(__file__).resolve().parents[2]))
# import des classes et fonctions à tester
from classes import parking_inventory as PI

@coderabbitai coderabbitai Bot May 6, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Same import-path inconsistency and a duplicate import sys.

Line 7 uses the absolute serveur_calcul_python.tests.data_gen... form (different from sibling test files), and sys is imported on both Line 3 and Line 6. Aligning the import style across tests/ avoids surprising ModuleNotFoundErrors depending on how pytest is launched.

✏️ Proposed cleanup
-import pandas as pd
-import sys, pathlib
-import numpy as np
-import os
-import sys
-import serveur_calcul_python.tests.data_gen.inventory_data_store as IDS
-from pathlib import Path
+import pandas as pd
+import numpy as np
+from tests.data_gen import inventory_data_store as IDS
 from classes import parking_inventory as PI
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/tests/unit/test_subset_operators.py` around lines 1 -
12, The test file has inconsistent import style and a duplicate import of sys:
remove the redundant "import sys" and change the absolute import
"serveur_calcul_python.tests.data_gen.inventory_data_store as IDS" to the same
relative/sibling import style used in other tests (e.g., import the
inventory_data_store module via a package-relative import such as from
tests.data_gen import inventory_data_store as IDS or the appropriate relative
form), and keep the existing Path/sys.path manipulation (or canonicalize it) so
imports for classes.parking_inventory (PI) resolve consistently; ensure only one
sys import remains and that IDS and PI references still resolve after the
change.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oups, will fix

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EPMPaulPoly No worries, thanks! 🐇

Comment on lines +33 to +87
def min_left_smaller_min_right_simple_or():
left_PI = IDS.get_min_only_small_PI()
right_PI = IDS.get_min_only_large_PI()
test_subject_1 = PI.subset_operation(left_PI,6,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_low_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_low_inventory()[1]

def min_left_larger_min_right_simple_or():
left_PI = IDS.get_min_only_large_PI()
right_PI = IDS.get_min_only_small_PI()
test_subject_1 = PI.subset_operation(left_PI,6,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_low_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_low_inventory()[1]

def min_left_smaller_min_right_most_const_or():
left_PI = IDS.get_min_only_small_PI()
right_PI = IDS.get_min_only_large_PI()
test_subject_1 = PI.subset_operation(left_PI,3,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_high_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_high_inventory()[1]

def min_left_larger_min_right_most_const_or():
left_PI = IDS.get_min_only_large_PI()
right_PI = IDS.get_min_only_small_PI()
test_subject_1 = PI.subset_operation(left_PI,3,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_high_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_high_inventory()[1]

def max_left_smaller_min_right_most_const_or():
left_PI = IDS.get_max_only_small_PI()
right_PI = IDS.get_min_only_large_PI()
test_subject_1 = PI.subset_operation(left_PI,3,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_low_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_low_inventory()[1]

def max_left_larger_min_right_most_const_or():
left_PI = IDS.get_max_only_large_PI()
right_PI = IDS.get_min_only_small_PI()
test_subject_1 = PI.subset_operation(left_PI,3,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_low_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_low_inventory()[1]

def min_left_smaller_max_right_most_const_or():
left_PI = IDS.get_min_only_small_PI()
right_PI = IDS.get_max_only_large_PI()
test_subject_1 = PI.subset_operation(left_PI,3,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_low_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_low_inventory()[1]

def min_left_larger_max_right_most_const_or():
left_PI = IDS.get_min_only_large_PI()
right_PI = IDS.get_max_only_small_PI()
test_subject_1 = PI.subset_operation(left_PI,3,right_PI)
assert test_subject_1.parking_frame.iloc[0]['n_places_min'] == IDS.get_low_inventory()[0]
assert test_subject_1.parking_frame.iloc[1]['n_places_min'] == IDS.get_low_inventory()[1]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Helpers couple sub-cases — failure obscures which one broke.

Calling 6 helpers inside one test_or_plus_contraignant means the first failure aborts the rest and the test name doesn't say which scenario blew up. Renaming each helper to test_... (or using pytest.mark.parametrize) would surface failures individually.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@serveur_calcul_python/tests/unit/test_subset_operators.py` around lines 33 -
87, The current helper functions (min_left_smaller_min_right_simple_or,
min_left_larger_min_right_simple_or, min_left_smaller_min_right_most_const_or,
min_left_larger_min_right_most_const_or,
max_left_smaller_min_right_most_const_or,
max_left_larger_min_right_most_const_or,
min_left_smaller_max_right_most_const_or,
min_left_larger_max_right_most_const_or) are defined as plain helpers and are
all called from one test causing the first failure to abort others; fix by
turning each helper into an individual pytest test (rename each to start with
test_...) or replace them with a single pytest.mark.parametrize-based test over
the distinct scenarios using PI.subset_operation and assertions, ensuring each
scenario is reported separately (reference the functions above and the use of
PI.subset_operation and IDS.get_* helpers to structure cases).

EPMPaulPoly added 2 commits May 6, 2026 18:04
Des tests unitaires sont mis en place pour les fonctions de calcul pour assurer
que ces fonctions se comportent tel qu'énoncé dans le mémoire. Cet ensemble de
test est voué à assurer que la correction de bug résout les enjeux et assurer
que de modifications futures sont cohérentes.

1. Mise en place des test unitaires

test_affectation_prs_td.py permet de valider que l'affectation de points du
TaxDataset(rôle + cadastre) se fait correctement en fonction de la variation
spatio-temporelle des règlements. L'espace-temps est divisé en deux périodes
de deux territoires. Le TaxDataset est construit de manière à se séparer par
date et par localisation (voir les entrées sur datagen). La fonction qui
effectue cette opération a été légèrement modifiée de manière à pouvoir tester
seulement la fonction d'affectation sans connexion à la base de données

test_conversion_unite.py permet de valider la deuxième étape qui est de vérifier
que la conversion d'unité se fait telle que prévu. Pour ce faire, les combinaisons
de TaxDataset RegSetTerritory générées à la première étape voient leurs unités
converties un assert vérifie que le résultat est selon les attentes.

test_rule_calc.py gère des tests pour sept règlements de plus en plus complexes
(lire rares): règlements simples, règlements par seuil, règlements par addition,
règlement avec plancher, règlement avec plancher et plafond implémenté avec des
booléens et règlement avec plancher et plafond implémenté avec des seuils

test_subset_calc.py gère des test pour un sous-ensemble particulier. On vise ici
à valider que les fonction d'addition et de seuil fonctionnent telles que prévu

test_subset_operators.py permet de valider que les opérateurs entre sous-ensemble
sont mis en place correctement selon la définition du mémoire. On test ici les
opérations entre sous ensembles.

test_aggregation_inventaire.py teste les fonctions d'agrégatin et de dissolution
des inventaires avec différents cas de figure (mins, min et max). Ceci ne représente
pas nécessairement l'ensemble des permutations possibles et ne gère pas du tout les
cas qui vont avoir des erreurs. Des ensembles distincts de ceux pour les autre tests
sont utilisés ici.

pytest.ini et setup.py ont été modifiés pour qu'on puisse rouler directement les
tests

2. Modifications aux classes
Les classes RegSetTerritory et ParkingRegulations ont été légèrement modifiées.
Ici, la fonction ParkingRegulations.get_reg_by_id a été modifiée pour filtrer
les unités qui ne sont pas utilisées dans le règlement filtré et la fonction
RegSetTerritory.get_rst_by_tax_data a été modifiée de manière à ce que la
séparation du TaxDataSet d'entrée en fonction des RegSetTerritory se fasse
dans une fonction unique pour éviter qu'on puisse implémenter les tests sans
avoir à créer une fixture.

3. Combinaison des environment.yml

Deux environment.yml existaient: un pour la conversion du rôle et l'autre pour
tout le reste. Ces derniers ont été combinés et la documentation modifiée
pour réduire la quantité de fichiers. Cela permet aussi de mettre en place
un seul environnement de test au besoin

4. Création d'une github Action pour que les tests soient roulés sur les PR
le fichier test-python-scripts.yml a été créé de manière a rouler pytest dans
un environnement conda créé à partir de environment.yml. À noter qu'il
serait pertinent de mettre en place un conda lock pour que le build soit
plus déterministe
1. Clause groupby
La fonction subset_operation lançait une erreur sur pandas 3.0+
car include_groups=True est deprecated. Le paramètre a été
changé à False sans enjeu

2. Type guard plus robuste
un type guard plus robuste a été mis en place pour slope et intercept
dans la fonction de calcul pour les règlements par seuil avec la création
de is_number. Pas très élégant mais ça fonctionne
@EPMPaulPoly EPMPaulPoly force-pushed the erreur-agregation-lots branch from 5ac86fb to 9dcda9e Compare May 6, 2026 22:05
@EPMPaulPoly EPMPaulPoly linked an issue May 6, 2026 that may be closed by this pull request

@greenscientist greenscientist left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small things from me.

Check the comment of the lapin for the unit tests, I did not check them in details, but some of its comment seemed useful (not all of them)

id: cache-env
uses: actions/cache@v5
env:
CACHE_NUMBER: 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add comment on my this value

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value serves to reset the cache if necessary, again pulled from miniconda documentation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, putting the link to the doc in the comment will be good.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@greenscientist when you say putting a link in comment you mean you want it in the yml file somewhere?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, when I say I want acomment it's in the file, not just here as an answer.

Comment on lines +17 to +19
- uses: actions/checkout@v6
with:
fetch-depth: 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EPMPaulPoly Seem appropriate, other add comment on why its needed

parking_frame_thresh = pd.DataFrame()
parking_frame_thresh[config_db.db_column_lot_id] = relevant_data[config_db.db_column_lot_id]
if zero_crossing_min is not None and slope_min is not None:
if is_number(zero_crossing_min) and is_number(slope_min):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably need some comments. I would understand the previous code, where if a value is not set, it's not available to be computed, but now I don't fully understand why a variable would have a value other than a number or None.

Should it be a case where an exception should be thrown instead ??

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sooo you're right. The value can either be a number or undefined (if it's a fixed value for example, the slope is [null] in the databse, or if the value isn't set at all (let's say if it's only a maximum condition and no minimum, the the slope and offset in the minimum will be null in the database). However, somewhere in the stack there seem to be some shenanigans where it can be either None or np.nan. The issue was that given that it was passing "is not None" because it was np.nan, we were getting issues with it going into the first if and returning a np.nan when it should skip the first if and go into the second one.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@greenscientist what do you want me to do here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mon grand questionnement ici. Si slope_min a une valeur qui n'est pas None, mais qui n'est pas un nombre, est-ce que c'est plutot un cas d'erreur ou bien le fall back sur juste le zero_crossing min est suffisant? Est-ce que l'intention était d'avoir une pente et que y'a un eu un processing autre qui a fail et donc on devrait plus donner une exception plutot qu'une 2e valeur par default? Peut-etre que mettre ce petit bout de if else dans une function permetterais de clarifier l'intention? Est-ce que le Else devrait plutot etre un elif zero_crossing_min == None && slope_min == None et le else Throw exception ??

else:
ValueError('Can only calculate one rule at a time')

def is_number(x):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment for the future, but a docstring would be nice for documenting functions.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok will put one in once we settle on all the things I need to cleanup here

if parking_final.empty:
parking_final = parking_frame_thresh
else:
parking_final = pd.concat([parking_final,parking_frame_thresh])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ca pas rapport au code que tu as changer, mais je me demandais, tu ne peux pas faire un pd.concat avec un empty parking_final? (Pourrait éviter un if else)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implémenter des tests sur les fonctions de calcul des règlements Erreur lors de l'agrégation par lot dans le calcul

2 participants