Merged
49 changes: 32 additions & 17 deletions .Rbuildignore
@@ -1,21 +1,36 @@
^renv$
^renv\.lock$
^README\.Rmd$
^README\.html$
^LICENSE$
.ignore
.editorconfig
.gitignore
^.*\.Rproj$
^\.agents$
^\.ccache$
^\.clangd$
^\.claude$
^\.cspell$
^\.cursor$
^\.editorconfig$
^\.git$
^\.github$
^\.gitignore$
^\.ignore$
^\.lintr$
^\.Rproj\.user$
^man-roxygen$
^pkgdown$
^\.vscode$
^\.lintr$
^\.github$
^\.ccache$
^docs$
^revdep$
^.*\.Rproj$
^AGENTS.md$
^air.toml$
^attic$
^attic_local$
^CITATION.cff$
^CLAUDE.md$
^cspell.json$
^CONTRIBUTING.md$
^cran-comments\.md$
^CRAN-SUBMISSION$
^.claude$
^docs$
^inst/extdata/.+\.R$
^LICENSE$
^local_attic$
^man-roxygen$
^paper$
^pkgdown$
^README\.Rmd$
^README.html$
^revdep$
^tests/testthat/_object_snapshots$
132 changes: 132 additions & 0 deletions .agents/mlr3.md
@@ -0,0 +1,132 @@
### Architecture

This package uses R6 classes organized around a dictionary registry pattern.

#### Class hierarchy

- `Learner` > `LearnerClassif` / `LearnerRegr` > concrete (e.g., `LearnerClassifRpart`)
- `Task` > `TaskSupervised` > `TaskClassif` / `TaskRegr`
- `Measure` > `MeasureClassif` / `MeasureRegr` / `MeasureSimilarity`
- `Resampling` > `ResamplingCV`, `ResamplingHoldout`, etc.
- `DataBackend` > `DataBackendDataTable`, `DataBackendCbind`, etc.
- `Prediction` > `PredictionClassif` / `PredictionRegr`

#### File naming

- One R6 class per file, named exactly as the class: `LearnerClassifRpart.R` contains `LearnerClassifRpart`.
- Named dataset tasks use an underscore: `TaskClassif_iris.R`.
- Dictionary files: `mlr_learners.R`, `mlr_tasks.R`, etc.

#### Dictionary system

Objects are registered in dictionaries and accessed via sugar functions:

| Dictionary | Sugar | Example |
|-----------------------|----------------------|----------------------------------|
| `mlr_learners` | `lrn()` / `lrns()` | `lrn("classif.rpart", cp = 0.1)` |
| `mlr_tasks` | `tsk()` / `tsks()` | `tsk("iris")` |
| `mlr_measures` | `msr()` / `msrs()` | `msr("classif.ce")` |
| `mlr_resamplings` | `rsmp()` / `rsmps()` | `rsmp("cv", folds = 5)` |
| `mlr_task_generators` | `tgen()` / `tgens()` | `tgen("friedman1")` |
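
The sugar functions in the table can be combined into a small end-to-end run. This is a minimal sketch, assuming `mlr3` and `rpart` are installed; the variable names are illustrative only.

```r
library(mlr3)

# Look up objects by key; extra arguments set hyperparameters or fields.
task = tsk("iris")
learner = lrn("classif.rpart", cp = 0.1)
resampling = rsmp("cv", folds = 5)
measure = msr("classif.ce")

# The sugar call wraps a dictionary lookup; this retrieves the same
# learner class without setting any hyperparameters.
plain_learner = mlr_learners$get("classif.rpart")

# Run a 5-fold cross-validation and aggregate the classification error.
rr = resample(task, learner, resampling)
rr$aggregate(measure)
```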

Every new object **must** be registered at the bottom of its file:

```r
#' @include mlr_learners.R
mlr_learners$add("classif.rpart", function() LearnerClassifRpart$new())
```

#### Collation order

Derived classes must declare `#' @include ParentClass.R` in their roxygen header. This controls the `Collate:` field in DESCRIPTION so base classes load before derived classes.
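
A minimal sketch of such a header follows. The file and class `LearnerClassifExample` are hypothetical and only illustrate where the `@include` tag goes; inside the package the parent would be referenced as plain `LearnerClassif`.

```r
# R/LearnerClassifExample.R (hypothetical file and class, for illustration only)

# The @include tag asks roxygen2 to list LearnerClassif.R before this file
# in the Collate: field, so the parent class exists when this file is loaded.
#' @include LearnerClassif.R
LearnerClassifExample = R6::R6Class("LearnerClassifExample",
  inherit = mlr3::LearnerClassif
)
```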

#### Hyperparameters (paradox)

Parameters are defined with `paradox::ps()` and must be tagged `"train"` or `"predict"`:

```r
ps = ps(
cp = p_dbl(0, 1, default = 0.01, tags = "train"),
keep_model = p_lgl(default = FALSE, tags = "train")
)
```

In `.train()` / `.predict()`, retrieve values with `self$param_set$get_values(tags = "train")`.

There is a distinction between `default` and `init` values:
- `default` describes the behavior when a parameter is not set at all (i.e., the upstream function's default). It is informational only.
- `init` (via `p_xxx(init = ...)`) sets the parameter to a value upon construction. Use this when the mlr3 default should differ from the upstream default.
- A parameter tagged `"required"` causes an error if not set. A required parameter cannot have a `default` (that would be contradictory).
- paradox does type-checking and range-checking automatically; `get_values()` checks that required params are present. Additional feasibility checks are rarely needed.
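
The difference between `default` and `init` can be sketched as follows. This assumes a paradox version that supports the `init` argument; the `xval` parameter is added here purely for illustration.

```r
library(paradox)

param_set = ps(
  # `default` only documents the upstream default; the value stays unset.
  cp = p_dbl(0, 1, default = 0.01, tags = "train"),
  # `init` sets the value at construction, so it differs from the
  # documented upstream default of 10.
  xval = p_int(0L, default = 10L, init = 0L, tags = "train")
)

param_set$values                      # contains only `xval`
param_set$default                     # the documented upstream defaults
param_set$get_values(tags = "train")  # what `.train()` would retrieve
```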

#### Core dependencies

`data.table`, `checkmate`, `mlr3misc`, `paradox`, `R6`, and `cli` are imported wholesale. Use their functions directly without `::`. Key mlr3misc utilities: `map()`, `map_chr()`, `invoke()`, `calculate_hash()`, `str_collapse()`, `%nin%`, `%??%`.

#### Error handling

Use structured error/warning functions from mlr3misc: `error_config()`, `error_input()`, `error_learner_train()`, `error_learner_predict()`, `warning_config()`, `warning_input()`. These support `sprintf`-style formatting.
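
As a rough sketch of the pattern, assuming an mlr3misc version that exports `error_input()`; the helper `check_folds()` is hypothetical and not part of any package:

```r
library(mlr3misc)

# Hypothetical input check that raises a structured input error.
check_folds = function(folds) {
  if (folds < 2L) {
    # sprintf-style placeholders are filled from the extra arguments
    error_input("'folds' must be at least 2, but is %i", folds)
  }
  invisible(folds)
}

# The condition can be caught like any other R error.
msg = tryCatch(check_folds(1L), error = function(e) conditionMessage(e))
```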

#### Reflections

`mlr_reflections` is an environment that stores allowed types, properties, and roles. Extension packages modify it to register new task types. Check it when adding new properties or feature types.
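
A quick way to inspect it from an interactive session, assuming `mlr3` is attached:

```r
library(mlr3)

mlr_reflections$task_types          # registered task types and their classes
mlr_reflections$task_feature_types  # allowed feature types
mlr_reflections$learner_properties  # allowed learner properties, per task type
mlr_reflections$task_col_roles      # allowed column roles, per task type
```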

### Testing

- Tests for `R/{name}.R` go in `tests/testthat/test_{name}.R`.
- All new code should have an accompanying test.
- If there are existing tests, place new tests next to similar existing tests.
- Strive to keep your tests minimal with few comments.
- The full test suite takes a long time. Only run tests relevant to your changes with `devtools::test(filter = '^{name}')`.
- New learners must pass `run_autotest()` and `run_paramtest()`.
- Use shared assertion helpers: `expect_learner()`, `expect_task()`, `expect_resampling()`, `expect_measure()`, `expect_prediction()`.
- Shared test infrastructure lives in `inst/testthat/` and is sourced by extension packages too.
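
A test file following these conventions might look roughly like the sketch below. It assumes the helper files shipped in mlr3's `inst/testthat/` can be sourced as shown (the pattern extension packages typically use in their own `helper.R`); the file name is illustrative.

```r
# tests/testthat/test_LearnerClassifRpart.R (sketch)
library(testthat)
library(checkmate)
library(mlr3misc)
library(mlr3)

# Load the shared helpers shipped with mlr3, e.g. expect_learner() and
# run_autotest(). In a package this usually lives in tests/testthat/helper.R.
helpers = list.files(system.file("testthat", package = "mlr3"),
  pattern = "^helper.*\\.[rR]$", full.names = TRUE)
invisible(lapply(helpers, source))

test_that("autotest", {
  learner = lrn("classif.rpart")
  expect_learner(learner)
  result = run_autotest(learner)
  expect_true(result, info = result$error)
})
```

Only this file would then be run with `devtools::test(filter = '^LearnerClassifRpart')`.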

### Documentation

- Every user-facing function should be exported and have roxygen2 documentation.
- Wrap roxygen comments at 120 characters.
- Write one sentence per line.
- If a sentence exceeds the limit, break at a comma, "and", "or", "but", or other appropriate point.
- Internal functions should not have roxygen documentation.
- Whenever you add a new (non-internal) documentation topic, also add the topic to `_pkgdown.yml`.
- Always re-document the package after changing a roxygen2 comment.
- Use `pkgdown::check_pkgdown()` to check that all topics are included in the reference index.
- Don't hand-edit generated artifacts (`man/` and `NAMESPACE`).
- Roxygen templates live in `man-roxygen/` (e.g., `@template learner`, `@template param_id`). Use `@templateVar` to pass values.
- Bibliographic references go in `R/bibentries.R` and are cited with `` `r format_bib("key")` ``.
- Man page names for dictionary objects follow `mlr_learners_classif.rpart`, `mlr_tasks_iris`, etc.
- When you write examples, make sure they work.

### `NEWS.md`

- Every user-facing change should be given a bullet in `NEWS.md`. Do not add bullets for small documentation changes or internal refactorings.
- Each bullet should briefly describe the change to the end user and mention the related issue in parentheses.
- A bullet can consist of multiple sentences but should not contain any new lines (i.e. DO NOT line wrap).
- If the change is related to a function, put the name of the function early in the bullet.
- Order bullets alphabetically by function name. Put all bullets that don't mention function names at the beginning.

### GitHub

- If you use `gh` to retrieve information about an issue, always use `--comments` to read all the comments.

### Writing

- Use sentence case for headings.
- Use US English.

### Proofreading

If the user asks you to proofread a file, act as an expert proofreader and editor with a deep understanding of clear, engaging, and well-structured writing.

Work paragraph by paragraph, always starting by making a TODO list that includes individual items for each top-level heading.

Fix spelling, grammar, and other minor problems without asking the user. Label any unclear, confusing, or ambiguous sentences with a FIXME comment.

Only report what you have changed.

### References

- [mlr3book](https://mlr3book.mlr-org.com/) — comprehensive guide to the mlr3 ecosystem.
- [mlr3misc](https://github.com/mlr-org/mlr3misc) — helper functions used throughout the codebase.
- [paradox](https://github.com/mlr-org/paradox) — hyperparameter/configuration space definitions.
7 changes: 7 additions & 0 deletions .claude/settings.json
@@ -0,0 +1,7 @@
{
"permissions": {
"allow": [
"Bash(gh run view:*)"
]
}
}
2 changes: 2 additions & 0 deletions .cspell/project-words.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Project-specific words — commit and share with the team.
# Add words here (or via "Add to project dictionary" in VS Code / Cursor).
24 changes: 7 additions & 17 deletions .editorconfig
@@ -1,21 +1,11 @@
# See http://editorconfig.org
root = true

# settings for all files
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
indent_style = space
trim_trailing_whitespace = true

[*.{r,R,md,Rmd}]
indent_size = 2

[*.{c,h}]
indent_size = 4

[*.{cpp,hpp}]
indent_size = 4

[{NEWS.md,DESCRIPTION,LICENSE}]
max_line_length = 80
charset = utf-8 # Ensure all files are saved in UTF-8 encoding
end_of_line = lf # Use LF line endings (Unix style)
indent_style = space # Use spaces for indentation
indent_size = 2 # always use 2 spaces for indentation, R, C, python, etc.
max_line_length = 120 # max line length
trim_trailing_whitespace = true # Remove trailing whitespace
19 changes: 11 additions & 8 deletions .gitignore
@@ -1,6 +1,6 @@
# File created using '.gitignore Generator' for Visual Studio Code: https://bit.ly/vscode-gig
# Created by https://www.toptal.com/developers/gitignore/api/windows,visualstudiocode,r,macos,linux
# Edit at https://www.toptal.com/developers/gitignore?templates=windows,visualstudiocode,r,macos,linux
# Created by https://www.toptal.com/developers/gitignore/api/windows,visualstudiocode,macos,linux,r
# Edit at https://www.toptal.com/developers/gitignore?templates=windows,visualstudiocode,macos,linux,r

### Linux ###
*~
@@ -150,13 +150,17 @@ $RECYCLE.BIN/
# Windows shortcuts
*.lnk

# End of https://www.toptal.com/developers/gitignore/api/windows,visualstudiocode,r,macos,linux
# End of https://www.toptal.com/developers/gitignore/api/windows,visualstudiocode,macos,linux,r

# Custom rules (everything added below won't be overriden by 'Generate .gitignore File' if you use 'Update' option)

# R
.Rprofile
README.html
src/*.o
src/*.so
src/*.dll
.clangd

# CRAN
cran-comments.md
@@ -170,10 +174,9 @@ docs/
renv/
renv.lock

# vscode
.vscode

# revdep
revdep/
check/*
.claude/

# AI
.claude/settings.local.json
CLAUDE.md
10 changes: 7 additions & 3 deletions .lintr
@@ -1,9 +1,13 @@
linters: linters_with_defaults(
# lintr defaults: https://github.com/jimhester/lintr#available-linters
# lintr defaults: https://lintr.r-lib.org/reference/default_linters.html
# the following setup changes/removes certain linters
assignment_linter = NULL, # do not force using <- for assignments
object_name_linter = object_name_linter(c("snake_case", "CamelCase")), # only allow snake case and camel case object names
object_name_linter = object_name_linter(c("snake_case", "CamelCase", "SNAKE_CASE")), # only allow snake case and camel case object names
cyclocomp_linter = NULL, # do not check function complexity
commented_code_linter = NULL, # allow code in comments
line_length_linter = line_length_linter(2000)
line_length_linter = line_length_linter(120L), # same as .editorconfig
# use indent=2 as in .editorconfig; also use block-aligned continuation with 2 space,
# not “align under first argument” style.
indentation_linter = indentation_linter(indent = 2L, hanging_indent_style = "never")
)

34 changes: 34 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,34 @@
{

// ********** settings git / gitlens **********

// disable "blame hover", to remove visual noise
"gitlens.currentLine.enabled": false,

// ********** settings for cspell *************
// show spelling errors as hints (not in problems panel)
"cSpell.diagnosticLevel": "Hint",
// file type whitelist, useGitignore, and languageSettings live in cspell.json

// ********** settings for R *************

// format on save so we dont have to manually format, use AIR for formatting
"[r]": {
"editor.formatOnSave": true,
"editor.defaultFormatter": "Posit.air-vscode",
// disable hover for R, to remove visual noise
"editor.hover.enabled": false
},

// ********** settings for C / C++ **********

"[c]": {
"editor.formatOnSave": true,
"editor.defaultFormatter": "llvm-vs-code-extensions.vscode-clangd"
},
"[cpp]": {
"editor.formatOnSave": true,
"editor.defaultFormatter": "llvm-vs-code-extensions.vscode-clangd"
}
}
