Package Review
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.
Documentation
The package includes all the following forms of documentation:
Readme file requirements
The package meets the readme requirements below:
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than high. A badge for pyOpenSci peer review will be provided when the package is accepted.
Usability
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
The package structure should follow the general community best practices. In general, please consider whether:
Functionality
For packages also submitting to JOSS
Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md matching JOSS's requirements with:
Final approval (post-review)
Estimated hours spent reviewing:
1
Review Comments
- In the 'development setup' portion of the readme, you say to create the environment with conda env create -f environment.yml. However, the repo does not include an environment.yml file yet; even after cloning it I didn't have one.
- Also in the development setup portion of the readme, the tests and coverage can't be run before pip install -e ., so you should consider clarifying this or changing the order of the steps.
- I like how the data comparison section outputs a clear, human-readable report. I wonder whether you might consider doing the same for the other functions, especially for a function like load_optimized_csv. A report-generating function could be super cool here and would really take your package above and beyond.
- I think the function documentation could be clearer about what data_correction is and what exactly it does.
- For the function data_version_diff, I don't really see why it is important, and I think you could expand on this in the documentation.
- Claude usage is referenced only in the generate_report function documentation. It would be good to clarify this for the other functions as well.
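To make the report suggestion for load_optimized_csv a bit more concrete, here is a minimal sketch of what such a report could look like. It is only an illustration: format_load_report and all of its parameters are hypothetical names of mine, not part of your package, and I am assuming that load_optimized_csv saves memory by changing column dtypes, which I have not verified against your implementation.

```python
def format_load_report(dtypes_before, dtypes_after, bytes_before, bytes_after):
    """Hypothetical helper: build a plain-text summary of an optimized CSV load.

    dtypes_before / dtypes_after map column names to dtype names (as strings);
    bytes_before / bytes_after are the memory footprints in bytes.
    """
    saved = bytes_before - bytes_after
    # Guard against division by zero for an empty dataset.
    pct = (saved / bytes_before * 100) if bytes_before else 0.0

    lines = [
        "Load report",
        f"  memory before: {bytes_before} bytes",
        f"  memory after:  {bytes_after} bytes",
        f"  saved:         {saved} bytes ({pct:.1f}%)",
        "  dtype changes:",
    ]
    # List only the columns whose dtype actually changed.
    for name, old in dtypes_before.items():
        new = dtypes_after.get(name, old)
        if new != old:
            lines.append(f"    {name}: {old} -> {new}")
    return "\n".join(lines)


# Example call with made-up numbers, purely for illustration.
print(format_load_report(
    {"id": "int64", "label": "object"},
    {"id": "int8", "label": "object"},
    1000,
    250,
))
```

Returning a string rather than printing inside the helper would let users log the report, write it to a file, or show it in a notebook, similar to the report in your data comparison section.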