37 short vs long #42

stephaniereinders · 2025-03-27T13:30:43Z

Update functions to allow users to input an optional second data frame of writer profiles. These changes let users train models for comparisons between writing from short and long writing prompts.

…aframes

Distance functions now accept an optional second dataframe parameter. When provided, the functions calculate distances between all pairs of rows from the first and second dataframes, rather than within a single dataframe. The primary use case is to calculate distances between pairs of handwriting samples where one sample is short and the other is long, without also calculating the distances between two long samples or two short samples.
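The cross-dataframe pairing described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper `cross_abs_distances()`, the column names `docname`, `cluster1`, and `cluster2`, and the use of absolute differences are all assumptions made for the example.

```r
# Hypothetical helper (not part of handwriterRF): absolute differences between
# every row of df1 and every row of df2, never between rows of the same dataframe.
cross_abs_distances <- function(df1, df2, feature_cols) {
  # All (i, j) index pairs: i indexes rows of df1, j indexes rows of df2
  pairs <- expand.grid(i = seq_len(nrow(df1)), j = seq_len(nrow(df2)))
  out <- data.frame(
    docname1 = df1$docname[pairs$i],
    docname2 = df2$docname[pairs$j],
    stringsAsFactors = FALSE
  )
  # One distance column per feature column
  for (col in feature_cols) {
    out[[col]] <- abs(df1[[col]][pairs$i] - df2[[col]][pairs$j])
  }
  out
}

# Illustrative writer profiles: two short samples and three long samples
short <- data.frame(
  docname = c("s1", "s2"),
  cluster1 = c(0.2, 0.5),
  cluster2 = c(0.8, 0.5),
  stringsAsFactors = FALSE
)
long <- data.frame(
  docname = c("l1", "l2", "l3"),
  cluster1 = c(0.1, 0.4, 0.6),
  cluster2 = c(0.9, 0.6, 0.4),
  stringsAsFactors = FALSE
)

d <- cross_abs_distances(short, long, c("cluster1", "cluster2"))
nrow(d)  # 6: one row per (short, long) pair, no short-short or long-long pairs
```

With a single dataframe of 5 samples, all within-dataframe pairs would number 10; restricting to short vs long pairs leaves only the 6 comparisons of interest.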

Fixed an issue where the ntrees parameter was defined but not used in the train_rf() function. The function was previously using a hardcoded number of trees regardless of the parameter value passed by the user.
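The pattern behind this kind of bug can be sketched as follows. Everything here is hypothetical: `fit_forest()` is a stand-in that only records the requested number of trees, the `train_before()` and `train_after()` names are invented for the example, and the hardcoded value 200 is illustrative rather than the value the package used.

```r
# Stand-in for a model-fitting call; it only records how many trees were requested.
fit_forest <- function(n) list(n_trees = n)

# Before (hypothetical): ntrees is accepted but ignored in favor of a literal value.
train_before <- function(ntrees = 200) {
  fit_forest(200)
}

# After (hypothetical): the user's ntrees value is passed through.
train_after <- function(ntrees = 200) {
  fit_forest(ntrees)
}

train_before(ntrees = 50)$n_trees  # 200, the argument has no effect
train_after(ntrees = 50)$n_trees   # 50
```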

…een two dataframes

The train_rf() function now accepts an optional second dataframe parameter. When provided, train_rf() calculates distances between all pairs of rows from the first and second dataframes, rather than within a single dataframe. A random forest is then trained on those distances. The primary use case is to train a random forest on distances between pairs of handwriting samples where one sample is short and the other is long, without also calculating the distances between two long samples or two short samples.

The get_ref_scores() function now accepts an optional second dataframe parameter. When provided, the function calculates similarity scores between all pairs of rows from the first and second dataframes, rather than within a single dataframe. The primary use case is to calculate scores between pairs of handwriting samples where one sample is short and the other is long, without also calculating the scores between two long samples or two short samples.

…dataframes

compare_writer_profiles() now accepts an optional second dataframe parameter. When provided, the function compares all pairs of writer profiles from the first and second dataframes, rather than within a single dataframe. The primary use case is to compare handwriting samples where one sample is short and the other is long, without also comparing two long samples or two short samples.

More tests need to be performed before a random forest for short vs long comparisons is added to the handwriterRF package. The short_vs_long.R script is now in the handwriterRF_short_vs_long repo for those tests.

Merge branch 'main' into 37-short-vs-long # Conflicts: # tests/testthat/test-compare.R

stephaniereinders added 11 commits February 28, 2025 10:25

fix(train): use ntrees parameter instead of hardcoded value

a715a7c

Fixed an issue where the ntrees parameter was defined but not used in the train_rf() function. The function was previously using a hardcoded number of trees regardless of the parameter value passed by the user.

Fix(distance): get_distances() now correctly adds a writer column to …

796b101

…2nd dataframe

docs: Added an optional 2nd dataframe parameter to documentation

704df99

test: add tests to check functions that have an optional 2nd datafram…

3b4eb85

…e argument

docs: adds comment about functions that now accept optional 2nd dataf…

51de823

…rame of writer profiles

chore: move short_vs_long.R script to handwriterRF_short_vs_long repo

3a64add

More tests need to be performed before a random forest for short vs long comparisons is added to the handwriterRF package. short_vs_long.R script is now in the handwriterRF_short_vs_long repo for those tests.

chore: fix merge conflicts

d7446c9

Merge branch 'main' into 37-short-vs-long # Conflicts: # tests/testthat/test-compare.R

stephaniereinders linked an issue Mar 27, 2025 that may be closed by this pull request

Does training a random forest on short vs long samples improve the results on short vs long comparisons? #37

Open

stephaniereinders merged commit be0d812 into main Mar 27, 2025
1 of 6 checks passed

stephaniereinders deleted the 37-short-vs-long branch March 27, 2025 13:31

2 participants
