Add Clustered Standard Errors Support #996

msukiasyan · 2025-08-13T21:28:57Z

Adding support for cluster-robust standard errors.

Implementation:

- Add clustered variance computation methods to the StatsModelsLinearRegression and StatsModels2SLS classes, using the existing groups parameter plumbing. This matches the linearmodels package implementation.
- Add a cov_type parameter to the OrthoIV constructor to expose the option, since model_final isn't an argument.
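
For readers unfamiliar with the estimator, the following is a minimal NumPy sketch of a cluster-robust (sandwich) covariance for plain OLS, without any small-sample correction. It is an illustration only, not the code added in this PR; the function name `clustered_cov_ols` and its signature are hypothetical.

```python
import numpy as np


def clustered_cov_ols(X, y, groups):
    """Hypothetical helper: OLS coefficients and their cluster-robust covariance.

    No small-sample correction is applied here.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups).ravel()

    bread = np.linalg.inv(X.T @ X)      # (X'X)^{-1}
    beta = bread @ X.T @ y              # OLS coefficients
    resid = y - X @ beta

    # Map arbitrary cluster labels to integer codes 0..G-1.
    _, codes = np.unique(groups, return_inverse=True)
    codes = np.asarray(codes).ravel()
    n_groups = codes.max() + 1

    # Sum the per-observation scores x_i * e_i within each cluster.
    scores = X * resid[:, None]
    cluster_scores = np.zeros((n_groups, X.shape[1]))
    np.add.at(cluster_scores, codes, scores)

    meat = cluster_scores.T @ cluster_scores   # sum over clusters of s_g s_g'
    return beta, bread @ meat @ bread
```

When every observation is its own cluster, this reduces to the usual HC0 heteroskedasticity-robust covariance, which is a convenient sanity check.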

File modifications:

- econml/sklearn_extensions/linear_model.py - Added _compute_clustered_variance() methods.
- econml/iv/dml/_dml.py - Added cov_type parameter to OrthoIV class; pass groups parameter to model_final.
- econml/dml/_rlearner.py - Pass groups parameter to model_final.
- econml/tests/test_clustered_se.py - Tests validating against the statsmodels implementation (up to a small-sample correction).

First contribution, so please let me know if I've missed anything!

msukiasyan · 2025-09-12T22:03:44Z

@kbattocchi was wondering if you had a chance to look into this PR and have any feedback I can work on. Thank you!

kbattocchi · 2025-09-22T15:50:12Z

> @kbattocchi was wondering if you had a chance to look into this PR and have any feedback I can work on. Thank you!

Apologies for the slow response, I was traveling and missed this.

First of all, thanks for the contribution, this is definitely a valuable feature that we'd love to have.

In terms of high-level feedback, one thing is that I think we'd want to support federalization even with clustered standard errors, with the restriction that groups are assumed to be fully partitioned across the individual estimators (in other words, group "1" for the first estimator is assumed to be different from group "1" in the second estimator). I believe this should then be straightforward, because again you can just compute the moments locally and combine them, except that you may need to add a count to the state to keep track of the number of groups when aggregating later (I'm not sure).
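
One way to picture the "compute moments locally and combine them" idea is the sketch below. It assumes plain OLS and, for simplicity, has each estimator keep per-cluster moments (X_g'X_g and X_g'y_g), which is heavier than the summary state the comment has in mind and is not necessarily what the PR ends up storing. The function names `local_cluster_moments` and `combine_clustered` are hypothetical.

```python
import numpy as np


def local_cluster_moments(X, y, groups):
    """Hypothetical local step: per-cluster (X_g'X_g, X_g'y_g) for one estimator."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups).ravel()
    state = []
    for g in np.unique(groups):
        mask = groups == g
        state.append((X[mask].T @ X[mask], X[mask].T @ y[mask]))
    return state


def combine_clustered(states):
    """Hypothetical aggregation step.

    Cluster labels are assumed distinct across estimators, so the clusters
    are simply concatenated and never merged by label.
    """
    clusters = [moments for state in states for moments in state]
    XtX = sum(xx for xx, _ in clusters)
    Xty = sum(xy for _, xy in clusters)
    bread = np.linalg.inv(XtX)
    beta = bread @ Xty
    # Cluster score: s_g = X_g'(y_g - X_g beta) = X_g'y_g - X_g'X_g beta
    S = np.array([xy - xx @ beta for xx, xy in clusters])
    cov = bread @ (S.T @ S) @ bread    # uncorrected sandwich
    return beta, cov, len(clusters)    # group count is available for corrections
```

Because each cluster's score is linear in the pooled coefficients, the aggregator can rebuild the scores from the stored moments without access to the raw data. The returned group count is what a later small-sample correction would need.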

The only other thing that comes to mind immediately is a question: is the bias-correction for clustered errors completely standard, or would it make sense to provide options for this akin to HC0 vs HC1 for non-clustered errors, where we can optionally correct for the degrees of freedom?

msukiasyan · 2025-09-30T00:03:44Z

@kbattocchi thanks for the feedback!

On federalization: I think that makes sense, I'll try to add that soon.

On bias-correction: libraries do differ a bit on the defaults and on whether they allow switching bias-correction on/off. Across Statsmodels/linearmodels/Stata, there is only a single "cluster" string option. Statsmodels and linearmodels both allow toggling bias-correction via a separate argument but, from what I can tell, Stata just sticks with the (G/(G-1))*((N - 1)/(N - k)) correction (manual, page 51). I think it makes sense to have that same correction as the default, but I'm unsure whether we want to a) expose a separate use_correction argument in the high-level API, b) use modified strings like "cluster_HC0", or c) allow no toggling at all. Any thoughts on this?
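
For reference, the correction factor quoted above can be written as a small function, with G the number of clusters, N the number of observations, and k the number of estimated parameters. The function name and the two boolean flags are hypothetical and only illustrate what a toggle could look like; they are not the API of this PR or of any of the libraries mentioned.

```python
def cluster_correction(n_obs, n_params, n_groups, group_adjust=True, dof_adjust=True):
    """Hypothetical helper: scalar multiplier for an uncorrected clustered covariance.

    With both flags on this is (G / (G - 1)) * ((N - 1) / (N - k));
    with both off it is 1.0, i.e. no correction.
    """
    factor = 1.0
    if group_adjust:
        factor *= n_groups / (n_groups - 1)          # G / (G - 1)
    if dof_adjust:
        factor *= (n_obs - 1) / (n_obs - n_params)   # (N - 1) / (N - k)
    return factor
```

The corrected covariance is the uncorrected sandwich estimate multiplied by this factor. The factor approaches 1 as the number of clusters and observations grows.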

- Implement clustered variance calculation in StatsModelsLinearRegression and StatsModels2SLS
- Add cov_type='clustered' parameter to OrthoIV estimator
- Add tests validating against statsmodels implementation

Signed-off-by: Mikayel Sukiasyan <[email protected]>

…rection on/off for clustered covariance
2. Set both corrections on by default
3. Add new test for corrections and modify others with the new corrections defaults

Signed-off-by: Mikayel Sukiasyan <[email protected]>

msukiasyan · 2025-10-02T21:27:43Z

Updates:

1. Federated learning
   a. Added federated support for clustered covariance. This requires storing a few additional moments and n_groups for each learner. Added tests.
   b. Fixed an issue with summarized data (frequencies > 1) and clustered covariance. Added a test to catch this.
2. Small sample corrections
   a. Set the small sample correction to match the Stata/statsmodels defaults and edited the tests to reflect this.
   b. Added cov_options to allow toggling, like in statsmodels.

Will mark this ready for review for now.

msukiasyan marked this pull request as ready for review August 13, 2025 21:52

msukiasyan force-pushed the clustered-std-errors branch from a147e95 to b08b338 Compare October 2, 2025 07:37

msukiasyan marked this pull request as draft October 2, 2025 17:28

msukiasyan and others added 5 commits October 2, 2025 14:13

Fix StatsModels2SLS clustered SE bug: return groups from _check_input…

d94e34a

… to handle groups=None Signed-off-by: Mikayel Sukiasyan <[email protected]>

Tighten tolerance in test_clustered_se_matches_statsmodels

022b43d

Signed-off-by: Mikayel Sukiasyan <[email protected]>

1. Add support for federated learning in clustered SE computation; ad…

a4faabe

…d corresponding test 2. Fix clustered SE computation issue with summarized data; add corresponding test Signed-off-by: Mikayel Sukiasyan <[email protected]>

msukiasyan force-pushed the clustered-std-errors branch from 7576ff7 to 79cf45c Compare October 2, 2025 21:13

msukiasyan marked this pull request as ready for review October 2, 2025 21:27

Add missing federation test

3ffa0f1

Signed-off-by: Mikayel Sukiasyan <[email protected]>
