facebookresearch · neuralsorcerer · Mar 1, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -44,6 +44,10 @@
     appear in the final output CSV. Keep columns that are not id, weight,
     covariate, or outcome columns will be placed into ``ignore_columns`` during
     processing but are still retained and available in the output.
+- **Clarified `_prepare_input_model_matrix` argument docs**
+  - Updated docstrings in `balance.utils.model_matrix` with
+    explicit descriptions for `sample`, `target`, `variables`, and `add_na`
+    behavior when preparing model-matrix inputs.
 
 ## Bug Fixes
 

diff --git a/balance/utils/model_matrix.py b/balance/utils/model_matrix.py
@@ -285,11 +285,21 @@ def _prepare_input_model_matrix(
         - Add na indicator if required.
 
     Args:
-        sample (pd.DataFrame | Any): This can either be a DataFrame or a Sample object. TODO: add text.
-        target (pd.DataFrame | Any | None, optional): This can either be a DataFrame or a Sample object.. Defaults to None.
-        variables (List[str] | None, optional): Defaults to None. TODO: add text.
-        add_na (bool, optional): Defaults to True. TODO: add text.
-        fix_columns_names (bool, optional): Defaults to True. If to fix the column names of the DataFrame by changing special characters to '_'.
+        sample (pd.DataFrame | Any): Input sample data as either a DataFrame or
+            a ``Sample``-like object that stores the data in ``._df``.
+        target (pd.DataFrame | Any | None, optional): Optional target data as
+            either a DataFrame or a ``Sample``-like object. If provided, rows
+            are concatenated with sample rows for downstream matrix creation.
+            Defaults to None.
+        variables (List[str] | None, optional): Explicit variables to keep from
+            ``sample``/``target`` before concatenation. If None, variables are
+            inferred via ``choose_variables`` on the provided inputs.
+        add_na (bool, optional): If True, add missingness indicator columns to
+            the concatenated data. If False, drop rows with missing values and
+            preserve target-only-all-NA validation behavior. Defaults to True.
+        fix_columns_names (bool, optional): If True, fix the column names of
+            the DataFrame by replacing special characters with '_'. Defaults
+            to True.
 
     Raises:
         Exception: "Variable names cannot contain characters '[' or ']'"