Peer Review - Wendy Frankel #103

## Package Review

*Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide*

- [x] As the reviewer, I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor _before_ starting your review).

#### Documentation

The package includes all the following forms of documentation:

- [x] **A statement of need** clearly stating problems the software is designed to solve and its target audience in the README file.
- [x] **Installation instructions:** for the development version of the package and any non-standard dependencies in README.
- [x] **Short quickstart tutorials** demonstrating significant functionality that successfully runs locally.
- [x] **Function Documentation:** for all user-facing functions.
- [x] **Examples** for all user-facing functions.
- [x] **Community guidelines** including contribution guidelines in the README or CONTRIBUTING.
- [ ] **Metadata** including author(s), author e-mail(s), a URL, and any other relevant metadata, for example, in a `pyproject.toml` file or elsewhere.

#### Readme file requirements

The package meets the readme requirements below:

- [x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

- [x] The package name
- [ ] Badges for:
    - [x] Continuous integration and test coverage,
    - [ ] Docs building (if you have a documentation website),
    - [x] Python versions supported,
    - [x] Current package version (on PyPI / Conda).

*NOTE: If the README has many more badges, you might want to consider using a table for badges: [see this example](https://github.com/ropensci/drake). Such a table should be wider than high. A badge for pyOpenSci peer review will be provided when the package is accepted.*

- [x] Short description of package goals.
- [ ] Package installation instructions
- [ ] Any additional setup required to use the package (authentication tokens, etc.)
- [x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
    - [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
- [x] Link to your documentation website.
- [x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
- [ ] Citation information

#### Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole.
The package structure should follow the general community best practices. In general, please consider whether:

- [x] Package documentation is clear and easy to find and use.
- [x] The need for the package is clear
- [x] All functions have documentation and associated examples for use
- [x] The package is easy to install


#### Functionality

- [ ] **Installation:** Installation succeeds as documented.
- [ ] **Functionality:** Any functional claims of the software have been confirmed.
- [x] **Performance:** Any performance claims of the software have been confirmed.
- [x] **Automated tests:**
  - [x] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
  - [x] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
- [x] **Continuous Integration:** Has continuous integration set up (we suggest using GitHub Actions, but any CI platform is acceptable for review).
- [x] **Packaging guidelines**: The package conforms to the pyOpenSci [packaging guidelines](https://www.pyopensci.org/python-package-guide).
    A few notable highlights to look at:
    - [x] Package supports modern versions of Python and not [End of life versions](https://endoflife.date/python).
    - [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

#### For packages also submitting to JOSS

- [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements](http://joss.theoj.org/about#submission_requirements).

*Note:* Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a `paper.md` matching [JOSS's requirements](http://joss.theoj.org/about#paper_structure) with:

- [x] **A short summary** describing the high-level functionality of the software
- [ ] **Authors:** A list of authors with their affiliations
- [x] **A statement of need** clearly stating problems the software is designed to solve and its target audience.
- [ ] **References:** With DOIs for all those that have one (e.g. papers, datasets, software).

#### Final approval (post-review)

- [ ] **The author has responded to my review and made changes to my satisfaction. I recommend approving this package.**

Estimated hours spent reviewing:

1

#### Review Comments

1. In the 'development setup' portion of your README, you say to create the environment with `conda env create -f environment.yml`. However, there is no `environment.yml` file yet; it was not present even after I cloned your repo.

2. Also in the development setup portion of your README, the tests and coverage cannot be run before `pip install -e .`, so you should consider clarifying this or reordering the steps.

3. I like that the data comparison section outputs a clear, human-readable report. Might you consider doing the same for the other functions, especially a function like `load_optimized_csv`? A report-generating function there could be super cool and really take your package above and beyond.

4. I think the function documentation could be clearer about what `data_correction` is and exactly what it does.

5. For the function `data_version_diff`, it is not clear to me why it is important, and I think you could expand on this.

6. Claude usage is referenced only in the `generate_report` function documentation; it would be good to clarify this for the other functions as well.
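
To make comment 3 more concrete, here is a minimal sketch of the kind of report I have in mind. It is only an illustration: `format_load_report` is a hypothetical name that does not exist in the package, and I am assuming that `load_optimized_csv` could expose per-column memory sizes before and after loading, which I have not verified against the code.

```python
def format_load_report(before_bytes, after_bytes):
    """Return a plain-text memory summary for a loaded table.

    Hypothetical helper, not part of the reviewed package. Both arguments
    are assumed to map column name -> size in bytes, measured before and
    after whatever optimization the loader applies.
    """

    def percent_saved(before, after):
        # Guard against empty columns so we never divide by zero.
        return 0.0 if before == 0 else 100.0 * (before - after) / before

    lines = [f"{'Column':<12}{'Before':>12}{'After':>12}{'Saved':>9}"]
    total_before = 0
    total_after = 0
    for name, before in before_bytes.items():
        # A column missing from after_bytes is treated as unchanged.
        after = after_bytes.get(name, before)
        total_before += before
        total_after += after
        saved = percent_saved(before, after)
        lines.append(f"{name:<12}{before:>12,}{after:>12,}{saved:>8.1f}%")

    total_saved = percent_saved(total_before, total_after)
    lines.append(
        f"{'Total':<12}{total_before:>12,}{total_after:>12,}{total_saved:>8.1f}%"
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up numbers, purely to show the layout of the report.
    print(format_load_report({"id": 800, "price": 800}, {"id": 100, "price": 400}))
```

A plain string return value keeps the sketch easy to test and lets the caller decide whether to print it, log it, or write it to a file, similar in spirit to the report in the data comparison section.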
