37 short vs long #42
Merged
…aframes Distance functions now accept an optional second dataframe parameter. When provided, functions calculate distances between all pairs of rows from the first and second dataframes, rather than within a single dataframe. The primary use case for this is to calculate distances between pairs of handwriting samples where one sample is short and the other is long, without also calculating the distances between two long samples or two short samples.
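The difference between the single-dataframe and two-dataframe modes can be sketched as follows. This is a language-neutral illustration in Python, not the package's R code: the helper name `pairwise_distances`, the list-of-tuples stand-in for a dataframe, the sample IDs, and the use of Euclidean distance are all assumptions made for the example.

```python
from itertools import combinations, product
from math import dist


def pairwise_distances(df1, df2=None):
    """Hypothetical helper. Each 'dataframe' is a list of (doc_id, profile) rows."""
    if df2 is None:
        # One dataframe: every unordered pair of rows within it.
        pairs = combinations(df1, 2)
    else:
        # Two dataframes: every row of df1 paired with every row of df2.
        pairs = product(df1, df2)
    # Euclidean distance is an assumption; the real distance measures may differ.
    return [(id1, id2, dist(v1, v2)) for (id1, v1), (id2, v2) in pairs]


# Made-up writer profiles for illustration only.
short = [("w1_short", (0.2, 0.8)), ("w2_short", (0.6, 0.4))]
long_ = [("w1_long", (0.25, 0.75)), ("w2_long", (0.55, 0.45)), ("w3_long", (0.9, 0.1))]

within = pairwise_distances(short + long_)  # includes short-short and long-long pairs
across = pairwise_distances(short, long_)   # short-long pairs only
print(len(within), len(across))
```

With two short and three long samples, pooling everything into one dataframe gives 10 pairs, while the two-dataframe call gives only the 6 short-versus-long pairs.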
Fixed an issue where the ntrees parameter was defined but not used in the train_rf() function. The function was previously using a hardcoded number of trees regardless of the parameter value passed by the user.
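The bug pattern described here, a parameter that is accepted but shadowed by a literal, can be sketched as below. This is a Python stand-in, not the package's R code: the function names, the default of 200 trees, and the dictionary that stands in for a trained model are all hypothetical.

```python
def train_forest_buggy(rows, ntrees=200):
    # Bug pattern: ntrees is accepted but a hardcoded literal is used instead.
    return {"ntrees": 200, "n_rows": len(rows)}


def train_forest_fixed(rows, ntrees=200):
    # Fix: pass the caller's value through to the model.
    return {"ntrees": ntrees, "n_rows": len(rows)}


rows = [(0.1, 0.3), (0.4, 0.2), (0.7, 0.9)]  # made-up distance rows
buggy = train_forest_buggy(rows, ntrees=50)
fixed = train_forest_fixed(rows, ntrees=50)
print(buggy["ntrees"], fixed["ntrees"])
```

A test that calls the function with a non-default value, as in the last lines, is one way to catch this kind of regression, because both versions behave identically when the default is used.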
…een two dataframes train_rf() function now accepts an optional second dataframe parameter. When provided, train_rf() calculates distances between all pairs of rows from the first and second dataframes, rather than within a single dataframe. Then, a random forest is trained on the distances. The primary use case for this is to train a random forest on distances between pairs of handwriting samples where one sample is short and the other is long, without also calculating the distances between two long samples or two short samples.
The get_ref_scores() function now accepts an optional second dataframe parameter. When provided, the function calculates similarity scores between all pairs of rows from the first and second dataframes, rather than within a single dataframe. The primary use case for this is to calculate scores between pairs of handwriting samples where one sample is short and the other is long, without also calculating the scores between two long samples or two short samples.
…dataframes compare_writer_profiles() now accepts an optional second dataframe parameter. When provided, the function compares all pairs of writer profiles from the first and second dataframes, rather than within a single dataframe. The primary use case for this is to compare handwriting samples where one sample is short and the other is long, without also comparing two long samples or two short samples.
…rame of writer profiles
More tests need to be performed before a random forest for short vs long comparisons is added to the handwriterRF package. The short_vs_long.R script is now in the handwriterRF_short_vs_long repo for those tests.
Merge branch 'main' into 37-short-vs-long # Conflicts: # tests/testthat/test-compare.R
Update functions to accept an optional second data frame of writer profiles. This will allow users to train models that compare handwriting from short and long writing prompts.