Closed
Changes from 2 commits
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -44,6 +44,10 @@
appear in the final output CSV. Keep columns that are not id, weight,
covariate, or outcome columns will be placed into ``ignore_columns`` during
processing but are still retained and available in the output.
+ - **Clarified `_prepare_input_model_matrix` argument docs**
+ - Updated docstrings in `balance.utils.model_matrix` with
+ explicit descriptions for `sample`, `target`, `variables`, and `add_na`
+ behavior when preparing model-matrix inputs.

## Bug Fixes

20 changes: 15 additions & 5 deletions balance/utils/model_matrix.py
@@ -285,11 +285,21 @@ def _prepare_input_model_matrix(
- Add na indicator if required.

Args:
- sample (pd.DataFrame | Any): This can either be a DataFrame or a Sample object. TODO: add text.
- target (pd.DataFrame | Any | None, optional): This can either be a DataFrame or a Sample object.. Defaults to None.
- variables (List[str] | None, optional): Defaults to None. TODO: add text.
- add_na (bool, optional): Defaults to True. TODO: add text.
- fix_columns_names (bool, optional): Defaults to True. If to fix the column names of the DataFrame by changing special characters to '_'.
+ sample (pd.DataFrame | Any): Input sample data as either a DataFrame or
+ a ``Sample``-like object that stores the data in ``._df``.
+ target (pd.DataFrame | Any | None, optional): Optional target data as
+ either a DataFrame or a ``Sample``-like object. If provided, rows
+ are concatenated with sample rows for downstream matrix creation.
+ Defaults to None.
+ variables (List[str] | None, optional): Explicit variables to keep from
+ ``sample``/``target`` before concatenation. If None, variables are
+ inferred via ``choose_variables`` on the provided inputs.
+ add_na (bool, optional): If True, add missingness indicator columns to
+ the concatenated data. If False, drop rows with missing values and
+ preserve target-only-all-NA validation behavior. Defaults to True.
+ fix_columns_names (bool, optional): Defaults to True. If to fix the
+ column names of the DataFrame by changing special characters to
+ '_'.

Raises:
Exception: "Variable names cannot contain characters '[' or ']'"
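
The argument descriptions added in this hunk can be illustrated with a small pandas-only sketch. It is not the `balance` implementation. The helper name `prepare_input_sketch`, the `_is_na_` column prefix, and the "shared columns" fallback used when `variables` is None are assumptions made for illustration (the real code infers variables via `choose_variables`). The sketch also leaves out `Sample` unwrapping, column-name fixing, and the target-only-all-NA validation mentioned in the docstring.

```python
import pandas as pd


def prepare_input_sketch(sample, target=None, variables=None, add_na=True):
    """Hypothetical illustration of the documented arguments (not balance code)."""
    frames = {"sample": sample}
    if target is not None:
        frames["target"] = target

    if variables is None:
        # Assumption: stand in for `choose_variables` by using shared columns.
        common = set.intersection(*(set(df.columns) for df in frames.values()))
        variables = [col for col in sample.columns if col in common]

    # Keep only the chosen variables, then stack sample rows above target rows.
    combined = pd.concat(
        [df[variables] for df in frames.values()],
        keys=list(frames),
        names=["source", "row"],
    )

    if add_na:
        # Add one boolean indicator column per variable that has missing values.
        for col in variables:
            if combined[col].isna().any():
                combined[f"_is_na_{col}"] = combined[col].isna()
    else:
        # Without indicators, rows with any missing value are dropped.
        combined = combined.dropna()
    return combined


sample_df = pd.DataFrame({"age": [25, None, 40], "id": [1, 2, 3]})
target_df = pd.DataFrame({"age": [30, 35], "extra": [1, 2]})

with_indicators = prepare_input_sketch(sample_df, target_df, variables=["age"])
without_missing = prepare_input_sketch(
    sample_df, target_df, variables=["age"], add_na=False
)
```

Here `with_indicators` keeps all five rows and gains an indicator column for `age`, while `without_missing` drops the sample row whose `age` is missing.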