Skip to content

Improve simulation#254

Open
stemangiola wants to merge 13 commits into
masterfrom
improve-simulation
Open

Improve simulation#254
stemangiola wants to merge 13 commits into
masterfrom
improve-simulation

Conversation

@stemangiola
Copy link
Copy Markdown
Collaborator

No description provided.

…n validation

- Added support for variability formulas in the simulate_data function, allowing users to specify a formula for variability alongside the main formula.
- Implemented dimension validation to ensure that simulated data does not exceed the dimensions of the original model, preventing potential errors during simulation.
- Introduced a new utility function to normalize sum-to-zero vector parameters in posterior draws, addressing floating-point precision issues.
- Updated Stan model to accommodate new design matrices for random effects and variability, ensuring compatibility with existing functionality.
- Added comprehensive unit tests to validate the new features and ensure robustness against various model configurations.
- Introduced `sccomp_simulate` function for simulating data from fitted models, enhancing usability and flexibility.
- Deprecated the `simulate_data` function, providing a warning to users to transition to `sccomp_simulate`.
- Updated documentation to reflect changes and provide clear guidance on the new function's usage.
- Enhanced unit tests to validate the new simulation functionality and ensure compatibility with existing workflows.
- Adjusted Stan model to accommodate new parameters for user-provided coefficients, improving simulation accuracy and performance.
- Added validation to ensure user-provided coefficients sum to zero within a specified tolerance, crucial for compositional models.
- Implemented checks using tidyverse to identify and report invalid coefficients, improving error handling and user feedback.
- Updated comments to clarify the importance of coefficient normalization in the context of simulation.
- Added optional parameters `mean_dispersion_slope` and `mean_dispersion_intercept` to allow users to specify the slope and intercept for the mean-dispersion association, enhancing flexibility in simulations.
- Updated the `simulate_data` function to reflect changes in parameter defaults, setting `variability_multiplier` to 1 for consistency.
- Enhanced documentation to include new parameters and their usage, ensuring clarity for users.
- Adjusted the Stan model to accommodate the new mean-dispersion parameters, improving simulation accuracy.
- Changed default `variability_multiplier` from 5 to 1 in `sccomp_simulate()` and `simulate_data()`, preserving model variability by default.
- Enhanced simulation documentation for clarity on `mean_dispersion_slope` effects and updated vignette examples for consistency.
- Clarified behavior of mean-dispersion association parameters in simulation, detailing their impact on intercept and factor-level variability.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request improves the simulation functionality in the sccomp package by introducing a new sccomp_simulate() function with enhanced capabilities, deprecating the old simulate_data() function, and providing comprehensive documentation and tests.

Changes:

  • Introduced new sccomp_simulate() function with improved API design supporting separate coefficient tables, optional mean-dispersion association parameters, and better handling of new data scenarios
  • Changed default variability_multiplier from 5 to 1 to preserve fitted model variability by default
  • Added comprehensive test suite with 11 test cases covering various simulation scenarios
  • Updated Stan model to support user-provided coefficients and mean-dispersion parameters
  • Created extensive vignette with practical examples and use cases

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
R/simulate_data.R Major refactoring introducing sccomp_simulate() with new API; deprecated simulate_data() wrapper
inst/stan/glm_multi_beta_binomial_simulate_data.stan Enhanced Stan model with user-provided coefficients and mean-dispersion parameters support
tests/testthat/test-simulate_data.R New comprehensive test suite with 407 lines covering 11 different scenarios
man/sccomp_simulate.Rd New documentation for sccomp_simulate() function
man/simulate_data.Rd Updated to indicate deprecation
dev/sccomp_simulate.qmd New 859-line vignette with detailed examples and use cases
R/utilities.R Added normalize_sum_to_zero_params() utility function (not currently used)
inst/NEWS.rd Release notes documenting simulation improvements
NAMESPACE Updated exports for new function
DESCRIPTION Version bump to 2.1.25

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread inst/NEWS.rd Outdated
Comment thread R/simulate_data.R Outdated
Comment thread DESCRIPTION
Comment thread man/sccomp_simulate.Rd Outdated
Comment thread dev/sccomp_simulate.qmd Outdated
Comment thread R/simulate_data.R
Comment on lines +730 to +731
mean_dispersion_slope = mean_dispersion_slope,
mean_dispersion_intercept = mean_dispersion_intercept,
Copy link

Copilot AI Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deprecated simulate_data() function references mean_dispersion_slope and mean_dispersion_intercept parameters on lines 730-731, but these parameters are not defined in the function signature. This will cause an error when users call simulate_data(). These parameters should either be added to the function signature or removed from the call to sccomp_simulate().

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed from the call to sccomp_simulate()

Comment thread man/sccomp_simulate.Rd
stemangiola and others added 6 commits January 13, 2026 11:39
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Updated unit tests to check for the presence of the `cache_stan_model` parameter in `sccomp_simulate`, reflecting the recent function introduction.
- Maintained checks for the deprecated `simulate_data` function to ensure backward compatibility.
- Verified that the default value of `cache_stan_model` is set correctly across relevant functions.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread man/sccomp_simulate.Rd Outdated
Comment thread R/utilities.R
Comment on lines +697 to +698
first_indices = gsub(paste0("^", param_pattern, "\\[(\\d+),.*"), "\\1", param_names)
second_indices = gsub(paste0("^", param_pattern, "\\[\\d+,(\\d+)\\]"), "\\1", param_names)
Comment thread R/simulate_data.R
Comment on lines +802 to 806
# Unique variability design (for compatibility)
XA = Xa %>%
as_tibble() %>%
distinct()

matrix[A_simulated, A_simulated] XA_simulated;
matrix[C_simulated,M_simulated] beta_simulated;
matrix[N_simulated, A_simulated] Xa_simulated; // Variability design matrix for simulated data
matrix[A_simulated, A_simulated] XA_simulated; // Unique variability design (for compatibility with old code)
Comment thread inst/NEWS.rd
Comment on lines +6 to +8
\item **Simulation improvements:** Changed default \code{variability_multiplier} parameter from 5 to 1 in \code{sccomp_simulate()} and \code{simulate_data()}. The new default preserves the fitted model's variability by default, rather than artificially increasing it. Users can still specify custom values greater than 1 to increase variability for benchmarking or sensitivity analysis.
\item Enhanced simulation documentation. Improved clarity on how \code{mean_dispersion_slope} parameter affects both intercept and factor-level variability parameters in the variability formula. Updated vignette examples to use consistent \code{number_of_draws} values.
\item Clarified behavior of mean-dispersion association parameters in simulation. The \code{mean_dispersion_slope} affects the slope for both intercept columns (alpha1) and factor columns (alpha2, e.g., type), with intercept only added to intercept columns as per the model specification.
Comment thread R/simulate_data.R
#'
#' # # Simulate data
#' # simulate_data(counts_obj, estimate, ~type, ~1, sample, cell_group, c(b_0, b_1))
#' # sccomp_simulate(estimate, ~type, ~1, counts_obj, sample, cell_group, c(b_0, b_1))
Comment on lines +8 to +14
# Helper function to skip tests if cmdstan is not available
skip_cmdstan <- function() {
if (!instantiate::stan_cmdstan_exists()) {
skip("CmdStan not available")
}
}

Comment thread man/sccomp_simulate.Rd
Comment on lines +37 to +39
\item{mean_dispersion_slope}{Optional numeric value for the slope parameter of the mean-dispersion association. If NULL, the slope from the fitted model (prec_coeff\link{2}) will be used.}

\item{mean_dispersion_intercept}{Optional numeric value for the intercept parameter of the mean-dispersion association. If NULL, the intercept from the fitted model (prec_coeff\link{1}) will be used. This ensures that when only the slope is changed, the intercept remains the same.}
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants