@stephaniereinders

## Changes to `get_clusters_batch()`
If the user supplies writer and doc indices, the output data frame contains the columns 'docname', 'writer', and 'doc', along with the columns for cluster assignment and graph measurements. If the user does not supply writer and doc indices, the output data frame contains 'docname' but neither 'writer' nor 'doc'. The cluster assignment and graph measurement columns are the same in both cases.
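
The two output shapes can be illustrated with a minimal base R sketch. This is not the code in `get_clusters_batch()`: the helper `add_ids_sketch()` is hypothetical, the example file name and the single 'cluster' column are made up for illustration, and the sketch assumes the indices are start and stop character positions within 'docname', which this description does not state.

```r
# Hypothetical helper, for illustration only; not the package implementation.
# Assumption: writer_indices and doc_indices are c(start, stop) character
# positions within docname.
add_ids_sketch <- function(df, writer_indices = NULL, doc_indices = NULL) {
  if (!is.null(writer_indices) && !is.null(doc_indices)) {
    df$writer <- substr(df$docname, writer_indices[1], writer_indices[2])
    df$doc <- substr(df$docname, doc_indices[1], doc_indices[2])
    # Put the identifier columns first; all other columns keep their order
    other <- setdiff(names(df), c("docname", "writer", "doc"))
    df <- df[c("docname", "writer", "doc", other)]
  }
  df
}

# Made-up input with one document and one cluster assignment column
clusters <- data.frame(
  docname = "w0001_s01_pWOZ_r01",
  cluster = 3L,
  stringsAsFactors = FALSE
)

# With indices: docname, writer, doc, cluster
with_ids <- add_ids_sketch(clusters, writer_indices = c(2, 5), doc_indices = c(7, 18))

# Without indices: docname, cluster
without_ids <- add_ids_sketch(clusters)
```

In both calls the non-identifier columns are untouched, which mirrors the statement that the cluster assignment and graph measurement columns do not depend on whether indices are supplied.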

## Changes to `get_cluster_fill_counts()`
`get_cluster_fill_counts()` now checks whether the 'writer' and 'doc' columns are present in the input data frame. If they are present, the function groups by them, along with 'docname' and 'cluster'. If they are not present, the function groups by 'docname' and 'cluster' alone.
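
The conditional grouping can be sketched in base R as follows. The helper `count_fills_sketch()` is hypothetical and only counts rows per group in long format; the actual `get_cluster_fill_counts()` may compute and shape its output differently. The example data are made up.

```r
# Hypothetical helper, for illustration only; not the package implementation.
count_fills_sketch <- function(df) {
  # Group by writer and doc only when both columns are present
  if (all(c("writer", "doc") %in% names(df))) {
    group_cols <- c("docname", "writer", "doc", "cluster")
  } else {
    group_cols <- c("docname", "cluster")
  }
  # Count the rows that fall into each group
  aggregate(
    list(n = rep(1L, nrow(df))),
    by = df[group_cols],
    FUN = sum
  )
}

# Made-up input without 'writer' or 'doc' columns
clusters_no_ids <- data.frame(
  docname = c("w0001_doc1", "w0001_doc1", "w0001_doc1", "w0002_doc1"),
  cluster = c(1L, 1L, 2L, 1L),
  stringsAsFactors = FALSE
)
counts_no_ids <- count_fills_sketch(clusters_no_ids)

# The same input with 'writer' and 'doc' columns added
clusters_ids <- cbind(
  clusters_no_ids,
  writer = c("0001", "0001", "0001", "0002"),
  doc = "doc1",
  stringsAsFactors = FALSE
)
counts_ids <- count_fills_sketch(clusters_ids)
```

Checking for both columns together, rather than each one separately, keeps the sketch to the two cases described above.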

## Tests
I added tests for `get_clusters_batch()` and `get_cluster_fill_counts()` that cover the case where writer and doc indices are not supplied.

I also renamed the test files to match the corresponding R files to make finding things easier.

@stephaniereinders merged commit e15d397 into master on Nov 6, 2024

1 check passed

@stephaniereinders deleted the 201-clusters-no-indices branch on November 6, 2024 23:36
Successfully merging this pull request may close these issues.

Change get_clusters_batch() to not require writer and doc indices
