Upload the files explaining the algorithm and the benchmark results#15

Merged
ypriverol merged 18 commits into
bigbio:mainfrom
weizhongchun:main
Nov 20, 2025

Conversation

@weizhongchun

@weizhongchun weizhongchun commented Nov 20, 2025

Collaborator

PR Type

Documentation


Description

  • Add comprehensive README documentation for onsite package

  • Document three PTM localization algorithms: AScore, PhosphoRS, LucXor

  • Include installation instructions, CLI usage, and command-line options

  • Add benchmark results comparing four phosphorylation site localization tools

  • Provide example result files and algorithm parameter documentation


Diagram Walkthrough

flowchart LR
  A["Documentation Files"] --> B["README.md"]
  A --> C["benchmark.md"]
  B --> D["Algorithm Details"]
  B --> E["Installation & Usage"]
  B --> F["CLI Examples"]
  C --> G["PXD000138 Results"]
  C --> H["Tool Comparison"]
  D --> I["AScore, PhosphoRS, LucXor"]

File Walkthrough

Relevant files
Documentation
README.md
Main package documentation and user guide                               

docs/README.md

  • Comprehensive package documentation with feature overview and key
    capabilities
  • Installation instructions for Poetry, pip, and development setup
  • Detailed CLI usage examples for all three algorithms with parameter
    options
  • Algorithm-specific documentation covering AScore, PhosphoRS, and
    LucXor implementations
  • Command-line options tables with defaults and descriptions for each
    tool
  • Contributing guidelines, license information, and acknowledgments
+335/-0 
benchmark.md
Benchmark results and tool comparison analysis                     

docs/benchmark.md

  • Benchmark comparison of four phosphorylation site localization tools
    on PXD000138 dataset
  • Methodology section describing data processing pipeline and filtering
    criteria
  • Three result tables comparing well-resolved sites, uncertain sites,
    and overall quality metrics
  • Analysis of Global FLR, well-resolved sites count, and uncertain sites
    distribution
  • Trade-off analysis between sensitivity and specificity for each tool
  • Conclusions with recommendations for tool selection based on use case
    requirements
+108/-0 
1_ascore_result.idXML
AScore algorithm example results                                                 

data/1_ascore_result.idXML

  • Example result file from AScore algorithm execution
  • Contains phosphorylation site localization results in idXML format
  • Demonstrates AScore output structure and scoring metrics
+52123/-0
1_lucxor_result.idXML
LucXor algorithm example results                                                 

data/1_lucxor_result.idXML

  • Example result file from LucXor algorithm execution
  • Contains PTM localization results with FLR values in idXML format
  • Demonstrates LucXor output structure and scoring metrics
+47968/-0
1_phosphors_result.idXML
PhosphoRS algorithm example results                                           

data/1_phosphors_result.idXML

  • Example result file from PhosphoRS algorithm execution
  • Contains phosphorylation site localization results in idXML format
  • Demonstrates PhosphoRS output structure and scoring metrics
+43125/-0

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive project README with installation, unified CLI, per-algorithm pipelines, usage examples, contribution/citation/license info, and streamlined examples.
    • Added a benchmark guide comparing phosphorylation-site localization tools with methods, results, analysis, and recommendations.
    • Added and updated citation records and documentation layout; rebranded name casing to "onsite" and standardized docs for a CLI-first structure.

@coderabbitai

coderabbitai Bot commented Nov 20, 2025

Contributor

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds new and updated documentation: a new docs/README.md and docs/benchmark.md, large edits to the root README.md, and updates to docs/citations.md. All changes are documentation-only; no code, tests, or public API modifications.

Changes

| Cohort / File(s) | Summary |
|---|---|
| New comprehensive docs: `docs/README.md` | Adds a full project README documenting purpose, features, supported PTM localization algorithms (AScore, PhosphoRS, LucXor), installation and development setup (Poetry/pip), unified CLI and per-algorithm pipelines, CLI options and algorithm-specific parameters, example results, documentation links, contribution guidelines, license, and citation info. |
| Benchmark report: `docs/benchmark.md` | Adds benchmarking documentation comparing phosphorylation site localization tools on dataset PXD000138: data processing pipeline, filtering thresholds, uncertainty classification, result tables (well-resolved, uncertain, localization quality), Global FLR analysis, and references. |
| Root README overhaul: `README.md` | Rebrands project to "onsite", reorganizes content for CLI-first usage, consolidates installation/usage examples, removes extended Python API examples, updates features/algorithm descriptions, and revises documentation, contribution, and license sections. |
| Citations and metadata: `docs/citations.md` | Normalizes project name to "onsite", updates and expands citation entries (authors, years, DOIs, BibTeX), adjusts URLs/references, and refines formatting and acknowledgments. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Documentation-only changes with consistent patterns and moderate volume.
  • Spot-check recommendations:
    • Verify CLI examples and flag names match actual repository commands.
    • Confirm benchmark dataset identifiers, file paths, thresholds, and table values.
    • Validate citation metadata and BibTeX formatting.

Suggested labels

Review effort 2/5

Suggested reviewers

  • ypriverol

Poem

🐰 I hopped through lines and nudged each phrase so bright,
README and bench now snug, all set for flight,
Citations gleam, the docs are neatly spun,
Tip‑tap my paw — a tidy job well done,
Hop in and read, the project's warm as night.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately describes the main changes: adding documentation files (docs/README.md, docs/benchmark.md) and updating existing documentation files explaining algorithms and benchmark results. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b44899c and 30106b5.

📒 Files selected for processing (2)
  • docs/README.md (1 hunks)
  • docs/benchmark.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/benchmark.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
docs/README.md (1)

256-259: The referenced documentation files all exist in the repository.

Verification confirms that all four files referenced in lines 256-259 are present:

  • docs/algorithms/ascore.md
  • docs/algorithms/phosphors.md
  • docs/algorithms/lucxor.md
  • docs/citations.md

The documentation links are valid and will not result in 404 errors. No action is required.
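
A check like the one described above can be reproduced locally. The following is a minimal sketch, assuming it is run against a checkout of the repository; the helper name `find_missing_docs` is hypothetical and not part of onsite or CodeRabbit.

```python
from pathlib import Path

# Paths quoted in the review comment above.
REFERENCED_DOCS = [
    "docs/algorithms/ascore.md",
    "docs/algorithms/phosphors.md",
    "docs/algorithms/lucxor.md",
    "docs/citations.md",
]


def find_missing_docs(repo_root, relative_paths=REFERENCED_DOCS):
    """Return the referenced paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [p for p in relative_paths if not (root / p).is_file()]
```

An empty list from `find_missing_docs(".")` would correspond to the "all four files are present" result reported here.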

Likely an incorrect or invalid review comment.


@qodo-code-review

qodo-code-review Bot commented Nov 20, 2025


PR Compliance Guide 🔍

(Compliance updated until commit 30106b5)

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No runtime logs: The PR only adds documentation and does not introduce or modify application code to log
critical actions, so audit trail compliance cannot be verified from the diff.

Referred Code
# onsite

[![Python application](https://github.com/bigbio/onsite/actions/workflows/python-app.yml/badge.svg?branch=main)](https://github.com/bigbio/onsite/actions/workflows/python-app.yml)
![PyPI - Version](https://img.shields.io/pypi/v/onsite?style=flat)
![PyPI - Downloads](https://img.shields.io/pypi/dm/onsite)
![Pepy Total Downloads](https://img.shields.io/pepy/dt/onsite)
![GitHub Repo stars](https://img.shields.io/github/stars/bigbio/onsite)

## What is onsite?

**onsite** is a comprehensive Python package for mass spectrometry post-translational modification (PTM) localization. It provides algorithms for confident phosphorylation site localization and scoring, including implementations of AScore, PhosphoRS, and LucXor (LuciPHOr2).

### Key Features

- **Multiple Algorithms**: AScore, PhosphoRS, and LucXor implementations
- **Statistical Validation**: Probability-based scoring with FLR estimation
- **Unified CLI**: Single command-line interface for all algorithms
- **Multi-threading**: Parallel processing for improved performance
- **PyOpenMS Integration**: Seamless integration with the OpenMS ecosystem
- **High Accuracy**: Confident site localization with statistical validation
- **Flexible API**: Both command-line and Python API support


 ... (clipped 285 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status:
No code added: This PR adds documentation only and does not include new identifiers or code constructs to
assess naming conventions.

Referred Code
# onsite 🔬🎯

[![Python application](https://github.com/bigbio/onsite/actions/workflows/python-app.yml/badge.svg?branch=main)](https://github.com/bigbio/onsite/actions/workflows/python-app.yml)
![PyPI - Version](https://img.shields.io/pypi/v/onsite?style=flat)
![PyPI - Downloads](https://img.shields.io/pypi/dm/onsite)
![Pepy Total Downloads](https://img.shields.io/pepy/dt/onsite)
![GitHub Repo stars](https://img.shields.io/github/stars/bigbio/onsite)

## 🚀 What is onsite?

**onsite** is a comprehensive Python package for mass spectrometry post-translational modification (PTM) localization. It provides algorithms for confident phosphorylation site localization and scoring, including implementations of AScore, PhosphoRS, and LucXor (LuciPHOr2).

### ✨ Key Features

- 🎯 **Multiple Algorithms**: AScore, PhosphoRS, and LucXor implementations
- 📊 **Statistical Validation**: Probability-based scoring with FLR estimation
- 💻 **Unified CLI**: Single command-line interface for all algorithms
- **Multi-threading**: Parallel processing for improved performance
- 🔬 **PyOpenMS Integration**: Seamless integration with the OpenMS ecosystem
- 📈 **High Accuracy**: Confident site localization with statistical validation
- 🧩 **Flexible API**: Both command-line and Python API support


 ... (clipped 284 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
No error handling: The changes are documentation-only and do not show new code paths or error handling, so
compliance cannot be determined from this diff.

Referred Code
### Command Line Interface

onsite provides a unified command-line interface for all algorithms:

#### Unified onsite CLI

```bash
# AScore algorithm
onsite ascore -in spectra.mzML -id identifications.idXML -out results.idXML

# PhosphoRS algorithm
onsite phosphors -in spectra.mzML -id identifications.idXML -out results.idXML

# LucXor algorithm
onsite lucxor -in spectra.mzML -id identifications.idXML -out results.idXML
```

#### Individual Pipeline Tools

##### AScore Pipeline

 ... (clipped 95 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status:
[User errors unclear](https://github.com/bigbio/onsite/pull/15/files#diff-0b5ca119d2be595aa307d34512d9679e49186307ef94201e4b3dfa079aa89938R145-R156): Documentation mentions a --debug flag but no user-facing error messaging or safeguards are
added in code, so secure error handling cannot be assessed from this PR.

Referred Code
```bash
# Basic usage
python -m onsite.lucxor.cli -in spectra.mzML -id identifications.idXML -out results.idXML

# With custom parameters
python -m onsite.lucxor.cli -in spectra.mzML -id identifications.idXML -out results.idXML \
    --fragment-method HCD \
    --fragment-mass-tolerance 0.5 \
    --fragment-error-units Da \
    --threads 8 \
    --debug
```

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
[Debug flag noted](https://github.com/bigbio/onsite/pull/15/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R155-R171): The docs reference enabling debug logging but no logging implementation is changed in this
PR, so we cannot verify that logs avoid sensitive data.

Referred Code
```markdown
    --debug
```

### Command-line Options

#### AScore Options

| Option | Default | Description |
|---|---|---|
| `-in` | - | Input mzML file with spectra |
| `-id` | - | Input idXML file with identifications |
| `-out` | - | Output idXML file with scores |
| `--fragment-mass-tolerance` | 0.05 | Fragment mass tolerance |
| `--fragment-mass-unit` | Da | Tolerance unit (Da or ppm) |
| `--threads` | 1 | Number of threads for parallel processing |
| `--add-decoys` | False | Include decoy sites for validation |
| `--debug` | False | Enable debug logging |

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
[Docs only change](https://github.com/bigbio/onsite/pull/15/files#diff-3d312199f1da21c13b99d654a88a249129c2d8d05712a224085c98f16ff19315R1-R108): The PR adds documentation and benchmark content without modifying input handling or
validation code, so security-first input validation cannot be evaluated from this diff.

Referred Code
# Benchmark Results: PXD000138 Dataset

## Overview

This document presents the benchmark results of four phosphorylation site localization tools (LuciPHOr, Ascore, pyLucXor, and PhosphoRS) on the PXD000138 dataset. All tools were tested using identical input files (mzML and idXML) to ensure fair comparison.

## Methodology

### Data Processing Pipeline

1. **Initial Processing**: LuciPHOr results were obtained using quantms workflow on PXD000138 dataset
2. **Comparative Testing**: Ascore, pyLucXor, and PhosphoRS were tested using the same mzML and idXML files as LuciPHOr
3. **Filtering Criteria**: 
   - **LuciPHOr & pyLucXor**: local_flr < 0.01
   - **Ascore**: Ascore_site > 20
   - **PhosphoRS**: site_prob > 99%
   - **All tools**: FDR < 0.01
4. **Post-filtering**: Removed peptides with ambiguous sites and decoy peptides
5. **Validation**: Matched filtered results from each tool against the ground truth/reference dataset (see line 102) to calculate True Positives (TP) and False Positives (FP)

### Uncertainty Classification


 ... (clipped 87 lines)

Learn more about managing compliance generic rules or creating your own custom rules
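
The filtering and validation steps quoted from docs/benchmark.md above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (`peptide`, `site`, `fdr`, `is_decoy`, `ambiguous`) and the helper names are hypothetical, and the PhosphoRS probability is assumed to be stored as a fraction rather than a percentage. The thresholds themselves are the ones listed in the excerpt.

```python
# Per-tool localization thresholds as listed in the benchmark excerpt.
# Field names and the record layout are assumptions for illustration only.
SITE_FILTERS = {
    "LuciPHOr": ("local_flr", lambda v: v < 0.01),
    "pyLucXor": ("local_flr", lambda v: v < 0.01),
    "Ascore": ("Ascore_site", lambda v: v > 20),
    "PhosphoRS": ("site_prob", lambda v: v > 0.99),  # 99% expressed as a fraction
}
FDR_CUTOFF = 0.01  # applied to all tools


def passes_filters(tool, record):
    """Apply the FDR cut, drop decoy/ambiguous records, then the tool's site filter."""
    field, predicate = SITE_FILTERS[tool]
    if record["fdr"] >= FDR_CUTOFF:
        return False
    if record.get("is_decoy") or record.get("ambiguous"):
        return False
    return predicate(record[field])


def count_tp_fp(tool, records, reference_sites):
    """Count true and false positives against a set of (peptide, site) tuples."""
    tp = fp = 0
    for record in records:
        if not passes_filters(tool, record):
            continue
        if (record["peptide"], record["site"]) in reference_sites:
            tp += 1
        else:
            fp += 1
    return tp, fp
```

Keeping the thresholds in one table makes it easy to confirm that every tool is filtered by its own score while sharing the same FDR cut.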

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Previous compliance checks

Compliance check up to commit 0128a5a
Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No runtime logs: The PR adds only documentation and does not introduce or modify application code to add
audit logging for critical actions, so audit trail compliance cannot be verified from the
diff.

Referred Code
# onsite 🔬🎯

[![Python application](https://github.com/bigbio/onsite/actions/workflows/python-app.yml/badge.svg?branch=main)](https://github.com/bigbio/onsite/actions/workflows/python-app.yml)
![PyPI - Version](https://img.shields.io/pypi/v/onsite?style=flat)
![PyPI - Downloads](https://img.shields.io/pypi/dm/onsite)
![Pepy Total Downloads](https://img.shields.io/pepy/dt/onsite)
![GitHub Repo stars](https://img.shields.io/github/stars/bigbio/onsite)

## 🚀 What is onsite?

**onsite** is a comprehensive Python package for mass spectrometry post-translational modification (PTM) localization. It provides algorithms for confident phosphorylation site localization and scoring, including implementations of AScore, PhosphoRS, and LucXor (LuciPHOr2).

### ✨ Key Features

- 🎯 **Multiple Algorithms**: AScore, PhosphoRS, and LucXor implementations
- 📊 **Statistical Validation**: Probability-based scoring with FLR estimation
- 💻 **Unified CLI**: Single command-line interface for all algorithms
- **Multi-threading**: Parallel processing for improved performance
- 🔬 **PyOpenMS Integration**: Seamless integration with the OpenMS ecosystem
- 📈 **High Accuracy**: Confident site localization with statistical validation
- 🧩 **Flexible API**: Both command-line and Python API support


 ... (clipped 314 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
No error paths: The PR introduces documentation only and does not add executable code where error handling
and edge case management could be assessed.

Referred Code
## 🛠️ Usage

### Command Line Interface

onsite provides a unified command-line interface for all algorithms:

#### Unified onsite CLI

```bash
# AScore algorithm
onsite ascore -in spectra.mzML -id identifications.idXML -out results.idXML

# PhosphoRS algorithm
onsite phosphors -in spectra.mzML -id identifications.idXML -out results.idXML

# LucXor algorithm
onsite lucxor -in spectra.mzML -id identifications.idXML -out results.idXML
```

#### Individual Pipeline Tools

 ... (clipped 96 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
[Logging unspecified](https://github.com/bigbio/onsite/pull/15/files#diff-0b5ca119d2be595aa307d34512d9679e49186307ef94201e4b3dfa079aa89938R169-R210): The documentation references debug flags but does not change logging code, so we cannot
verify secure logging practices from this PR.

Referred Code
| `--threads` | 1 | Number of threads for parallel processing |
| `--add-decoys` | False | Include decoy sites for validation |
| `--debug` | False | Enable debug logging |

#### PhosphoRS Options

| Option | Default | Description |
|---|---|---|
| `-in` | - | Input mzML file with spectra |
| `-id` | - | Input idXML file with identifications |
| `-out` | - | Output idXML file with scores |
| `--fragment-mass-tolerance` | 0.05 | Fragment mass tolerance |
| `--fragment-mass-unit` | Da | Tolerance unit (Da or ppm) |
| `--threads` | 1 | Number of threads for parallel processing |
| `--add-decoys` | False | Include decoy sites for validation |
| `--debug` | False | Enable debug logging |

#### LucXor Options

| Option | Default | Description |
|---|---|---|


 ... (clipped 21 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Input validation N/A: The PR only adds docs describing CLI parameters and does not include code changes to
assess input validation or secure data handling.

Referred Code
### Command-line Options

#### AScore Options

| Option | Default | Description |
|---|---|---|
| `-in` | - | Input mzML file with spectra |
| `-id` | - | Input idXML file with identifications |
| `-out` | - | Output idXML file with scores |
| `--fragment-mass-tolerance` | 0.05 | Fragment mass tolerance |
| `--fragment-mass-unit` | Da | Tolerance unit (Da or ppm) |
| `--threads` | 1 | Number of threads for parallel processing |
| `--add-decoys` | False | Include decoy sites for validation |
| `--debug` | False | Enable debug logging |

#### PhosphoRS Options

| Option | Default | Description |
|---|---|---|
| `-in` | - | Input mzML file with spectra |
| `-id` | - | Input idXML file with identifications |


 ... (clipped 32 lines)

Learn more about managing compliance generic rules or creating your own custom rules
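
For scripted runs, the documented options can be assembled into an argument list before anything is executed. The sketch below only builds the command; it assumes that `--debug` and `--add-decoys` are value-less switches (the tables list them with a default of False, but this thread does not confirm how they are parsed). The helper name `build_onsite_command` is hypothetical.

```python
import shlex

ALGORITHMS = ("ascore", "phosphors", "lucxor")


def build_onsite_command(algorithm, spectra, identifications, output, **options):
    """Build an argv list for the unified `onsite` CLI from keyword options.

    Keyword names map to flags by replacing underscores with hyphens, e.g.
    fragment_mass_unit="Da" becomes `--fragment-mass-unit Da`.
    """
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    argv = ["onsite", algorithm,
            "-in", str(spectra), "-id", str(identifications), "-out", str(output)]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)  # assumed to be a switch without a value
        elif value is False or value is None:
            continue  # leave the documented default in place
        else:
            argv.extend([flag, str(value)])
    return argv


if __name__ == "__main__":
    cmd = build_onsite_command(
        "ascore", "spectra.mzML", "identifications.idXML", "results.idXML",
        fragment_mass_tolerance=0.05, fragment_mass_unit="Da", threads=4, debug=True,
    )
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps the result safe to pass to `subprocess.run` without shell quoting issues.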

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40bc6ef and 0128a5a.

📒 Files selected for processing (2)
  • docs/README.md (1 hunks)
  • docs/benchmark.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/README.md

28, 29, 30, 31, 34, 35, 36, 37, 40, 41, 42, 43: Unordered list indentation
Expected: 0; Actual: 3

(MD007, ul-indent)


313-313: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (3)
docs/README.md (1)

1-22: Strong introduction and feature overview.

The project description and key features are clear, well-organized, and effectively communicate the package's value proposition.

docs/benchmark.md (2)

1-28: Well-documented benchmark methodology.

The overview and methodology sections clearly explain the data processing pipeline, filtering criteria for each tool, and uncertainty classification approach. The setup ensures fair comparison across tools.


87-108: Strong conclusions with clear tool recommendations.

The analysis clearly articulates each tool's strengths and provides evidence-based recommendations. The trade-off analysis between precision and sensitivity is particularly helpful for users choosing between tools.

Comment thread docs/benchmark.md
Comment thread docs/README.md Outdated
Comment thread docs/README.md
Comment thread docs/README.md Outdated
Comment thread docs/README.md
@qodo-code-review

qodo-code-review Bot commented Nov 20, 2025


PR Code Suggestions ✨

Explore these optional code suggestions:

Category Suggestion Impact
High-level
Documentation added for non-existent code

The PR adds documentation for the onsite package, but the package's source code
is missing. It is suggested to include the implementation along with its
documentation to avoid misleading users.

Examples:

docs/README.md [1-335]
# onsite 🔬🎯

[![Python application](https://github.com/bigbio/onsite/actions/workflows/python-app.yml/badge.svg?branch=main)](https://github.com/bigbio/onsite/actions/workflows/python-app.yml)
![PyPI - Version](https://img.shields.io/pypi/v/onsite?style=flat)
![PyPI - Downloads](https://img.shields.io/pypi/dm/onsite)
![Pepy Total Downloads](https://img.shields.io/pepy/dt/onsite)
![GitHub Repo stars](https://img.shields.io/github/stars/bigbio/onsite)

## 🚀 What is onsite?


 ... (clipped 325 lines)

Solution Walkthrough:

#### Before:

````markdown
// PR contains only documentation files for a software package.
// File: docs/README.md
## 💾 Installation
```bash
# Install from PyPI (when available)
pip install onsite
```

## 🛠️ Usage

### Command Line Interface

```bash
# AScore algorithm
onsite ascore -in spectra.mzML -id identifications.idXML -out results.idXML
```

// No Python source code (e.g., onsite/ascore/cli.py) is included in the PR.
````

#### After:

````markdown
// PR should include source code alongside documentation.
// File: docs/README.md
## 💾 Installation
```bash
pip install .
```

## 🛠️ Usage

### Command Line Interface

```bash
# AScore algorithm
onsite ascore -in spectra.mzML -id identifications.idXML -out results.idXML
```

// File: onsite/ascore/cli.py (and other source files)
def main():
    # Implementation for the ascore CLI tool
    ...
````


Suggestion importance[1-10]: 10

__

Why: The suggestion correctly identifies a critical and fundamental issue: the PR adds extensive documentation for a software package whose source code is not included, rendering the documentation unusable and misleading.

High
General
Standardize a command-line parameter name

Standardize the LucXor parameter `--fragment-error-units` to `--fragment-mass-unit`
to be consistent with the AScore and PhosphoRS algorithms.

[docs/README.md [195]](https://github.com/bigbio/onsite/pull/15/files#diff-0b5ca119d2be595aa307d34512d9679e49186307ef94201e4b3dfa079aa89938R195-R195)

```diff
-| `--fragment-error-units` | Da | Tolerance units (Da or ppm) |
+| `--fragment-mass-unit` | Da | Tolerance units (Da or ppm) |
```
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies an inconsistency in CLI parameter names across different sub-commands, and unifying them would improve the tool's usability and align with the stated goal of a "Unified CLI".

Low
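
If the rename were adopted in code as well as in the docs, one way to avoid breaking existing scripts is to accept both spellings. The sketch below uses the standard library `argparse`; whether onsite's CLI is actually built on `argparse` is not stated in this thread, and `build_lucxor_parser` is a hypothetical name.

```python
import argparse


def build_lucxor_parser():
    """Parser fragment that accepts the new flag name and the old one as an alias."""
    parser = argparse.ArgumentParser(prog="onsite lucxor")
    parser.add_argument(
        "--fragment-mass-unit",      # spelling used by AScore and PhosphoRS
        "--fragment-error-units",    # current LucXor spelling, kept as an alias
        dest="fragment_mass_unit",
        choices=["Da", "ppm"],
        default="Da",
        help="Tolerance unit (Da or ppm)",
    )
    return parser
```

Both `--fragment-mass-unit ppm` and `--fragment-error-units ppm` would then populate the same `fragment_mass_unit` attribute.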
Possible issue
Correct a typo in a default parameter

Correct the likely typo sty to S,T,Y in the default value for the
--neutral-losses option in the documentation to ensure correctness.

docs/README.md [198]

-| `--neutral-losses` | sty -H3PO4 -97.97690 | Neutral loss definitions applied during scoring |
+| `--neutral-losses` | S,T,Y -H3PO4 -97.97690 | Neutral loss definitions applied during scoring |
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies a likely typo (sty) in a default parameter value within the documentation, improving clarity and preventing potential user error.

Low
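
Whether `sty` is a typo depends on how the tool parses the value, which this thread does not show. As a rough sketch, assuming the definition is three whitespace-separated fields (residues, label, mass shift), both spellings can be parsed and compared. The names `NeutralLoss`, `parse_neutral_loss`, and `residue_set` are hypothetical.

```python
from typing import NamedTuple


class NeutralLoss(NamedTuple):
    residues: str      # e.g. "sty" or "S,T,Y"
    label: str         # e.g. "-H3PO4"
    mass_shift: float  # e.g. -97.9769


def parse_neutral_loss(definition):
    """Split '<residues> <label> <mass>' into a NeutralLoss (assumed format)."""
    parts = definition.split()
    if len(parts) != 3:
        raise ValueError(f"expected '<residues> <label> <mass>', got {definition!r}")
    residues, label, mass = parts
    return NeutralLoss(residues, label, float(mass))


def residue_set(neutral_loss):
    """Normalize the residue field so 'sty' and 'S,T,Y' compare equal."""
    return set(neutral_loss.residues.replace(",", "").upper())
```

Under this assumed format the two spellings describe the same residues, so the documented default should simply match whatever the parser in the code base expects.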

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 0

🧹 Nitpick comments (1)
README.md (1)

309-316: Citation format is simplified; consider adding more metadata.

The citation section (lines 309-316) provides a simple text-based citation format. While acceptable, consider expanding it to include: authors, publication year, DOI or GitHub URL, and optionally a BibTeX entry for users who need structured citation formats. This would improve usability for academic citations.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed0fea1 and f3a6fd5.

📒 Files selected for processing (1)
  • README.md (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (7)
README.md (7)

1-43: Documentation introduction and feature overview are clear and well-structured.

The opening sections effectively communicate the package purpose, key features, and algorithm overview with proper academic citations. Structure is logical and helps new users quickly understand onsite's capabilities.


45-156: Installation and usage documentation is comprehensive and well-presented.

Clear step-by-step instructions for multiple installation methods (Poetry, pip, development) and comprehensive CLI examples showing both basic and advanced usage patterns with custom parameters. The distinction between unified CLI and individual tool invocation is helpful for users.


158-209: Command-line options tables are well-structured and internally consistent.

The tables clearly document parameter defaults and descriptions for all three algorithms. Parameter names match the CLI usage examples, and defaults appear sensible (e.g., conservative mass tolerances, single-threaded by default). The increased complexity of LucXor (14 options vs 8 for others) appropriately reflects its two-stage workflow.


229-242: Verify PhosphoRS parameter documentation matches CLI interface.

Lines 234-237 document "Window size: 100.0" and "Max depth: 8" as key parameters for PhosphoRS, but these parameters do not appear in the PhosphoRS command-line options table (lines 173-184). This discrepancy suggests either: (1) these are internal defaults not exposed via CLI, or (2) the documentation is outdated/inconsistent. Please clarify whether these parameters should be included in the CLI options table or if the algorithm details section should be updated to reflect only the exposed CLI parameters.


278-280: Verify example result file paths are correct.

Lines 278-280 reference example files with paths like ../data/1_ascore_result.idXML. If this README is in the repository root (standard practice), the relative paths should use data/... without the ../ prefix. The ../ suggests the README may be in a subdirectory (e.g., docs/), but the PR description indicates updates to the root README. Please verify these paths match the actual file locations.


282-289: Documentation links may reference non-existent files.

Lines 286-289 link to algorithm documentation and citations files (algorithms/ascore.md, algorithms/phosphors.md, algorithms/lucxor.md, citations.md) that are not mentioned in the PR description. The PR adds docs/README.md and docs/benchmark.md, but no algorithm-specific documentation files under an algorithms/ directory. Please verify these files exist or update the links to point to the correct documentation locations (e.g., docs/README.md or files in docs/).


305-307: Verify LICENSE file path.

Line 307 references the LICENSE file with path ../LICENSE. If this README is in the repository root, the path should be LICENSE without the ../ prefix. This is consistent with the earlier path inconsistency noted in the example results section. Please verify the relative path is correct for the actual location of the README file.

Removed detailed project description and installation instructions from README.md.
Updated references and citations for algorithms in the onsite package, including corrections to author names, publication years, and DOIs.

Copilot AI left a comment


Pull Request Overview

This PR adds comprehensive documentation for the onsite package, including algorithm descriptions, installation instructions, usage examples, and benchmark results comparing phosphorylation site localization tools.

Key Changes

  • Added detailed package documentation covering AScore, PhosphoRS, and LucXor algorithms with installation and CLI usage instructions
  • Included benchmark analysis of four phosphorylation site localization tools on the PXD000138 dataset with methodology and comparative results
  • Updated root README.md with streamlined content and added example result file references

Reviewed Changes

Copilot reviewed 4 out of 7 changed files in this pull request and generated 9 comments.

| File | Description |
|------|-------------|
| docs/README.md | Comprehensive package documentation with algorithm details, installation, CLI usage, and command-line options for all three algorithms |
| docs/benchmark.md | Benchmark methodology and comparative results for LuciPHOr, Ascore, pyLucXor, and PhosphoRS tools on PXD000138 dataset |
| README.md | Streamlined root README with key features, algorithm summaries, and references to detailed documentation |
| data/*.idXML | Example result files demonstrating output format for each algorithm |

Comment thread docs/README.md
Comment on lines +286 to +289
- [AScore Algorithm Documentation](algorithms/ascore.md)
- [PhosphoRS Algorithm Documentation](algorithms/phosphors.md)
- [LucXor Algorithm Documentation](algorithms/lucxor.md)
- [Citations and References](citations.md)

Copilot AI Nov 20, 2025


Broken relative links: The documentation links point to files that don't exist in this PR (algorithms/ascore.md, algorithms/phosphors.md, algorithms/lucxor.md, citations.md). Either these files should be included in this PR, or the links should be removed/updated to point to existing documentation.

Suggested change
- [AScore Algorithm Documentation](algorithms/ascore.md)
- [PhosphoRS Algorithm Documentation](algorithms/phosphors.md)
- [LucXor Algorithm Documentation](algorithms/lucxor.md)
- [Citations and References](citations.md)
<!-- Documentation links removed due to missing files. -->
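
Broken relative links of this kind can be caught before review with a small check. The following is a minimal sketch, not part of the onsite repository: the function name `broken_relative_links` is hypothetical, and it only handles inline `[text](target)` links, ignoring images and reference-style links.

```python
import re
from pathlib import Path

# Inline Markdown links: [text](target). Image links (![alt](src)) are skipped.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)")
# Anything that starts with a URL scheme such as https: or mailto:.
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(markdown_file: Path) -> list[str]:
    """Return relative link targets in markdown_file that do not exist on disk."""
    base = markdown_file.parent
    missing = []
    for target in LINK_RE.findall(markdown_file.read_text(encoding="utf-8")):
        # Skip absolute URLs, mail links, and in-page anchors.
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue
        path_part = target.split("#", 1)[0]  # drop any fragment
        if not (base / path_part).exists():
            missing.append(target)
    return missing


if __name__ == "__main__":
    for target in broken_relative_links(Path("docs/README.md")):
        print(f"missing: {target}")
```

Because targets are resolved against the directory of the Markdown file, the same check also covers the `../data/...` and `../LICENSE` path questions raised earlier in the review.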


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3a6fd5 and 72565c0.

📒 Files selected for processing (2)
  • README.md (6 hunks)
  • docs/citations.md (9 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (3)
README.md (3)

1-7: Branding updates are consistent and well-applied.

The root README successfully rebrands from "OnSite" to "onsite" with updated badge URLs and consistent lowercase usage throughout. The structure and formatting are clear and professional.


242-250: All example result files verified.

The verification confirms that all three example result files referenced in lines 242–250 exist in the data/ directory:

  • data/1_ascore_result.idXML
  • data/1_phosphors_result.idXML
  • data/1_lucxor_result.idXML

The relative paths in the README are correct and resolve properly.


102-111: Documentation verified as accurate.

All three CLI commands (ascore, phosphors, lucxor) are properly registered as entry points in pyproject.toml and added as subcommands in onsite/onsitec.py. The documented parameter names (-in, -id, -out) match the actual CLI implementation, and the example invocations are executable as shown in the README.

Comment thread docs/citations.md
Comment on lines +195 to 202
- **onsite Documentation**: https://github.com/bigbio/onsite/docs
- **PyOpenMS Documentation**: https://pyopenms.readthedocs.io/
- **OpenMS Documentation**: https://openms.readthedocs.io/

### Tutorials and Examples
- **OnSite Tutorials**: https://github.com/bigbio/onsite/docs/tutorials
- **onsite Tutorials**: https://github.com/bigbio/onsite/docs/tutorials
- **Algorithm Comparisons**: https://github.com/bigbio/onsite/docs/benchmarks
- **API Reference**: https://github.com/bigbio/onsite/docs/api


⚠️ Potential issue | 🟡 Minor

Fix malformed GitHub documentation URLs.

Lines 195–202 contain GitHub URLs that don't follow standard GitHub path conventions. These links will not resolve correctly.

Correct the URLs to use proper GitHub tree or blob paths:

### Online Documentation
-
- **onsite Documentation**: https://github.com/bigbio/onsite/docs
- **PyOpenMS Documentation**: https://pyopenms.readthedocs.io/
- **OpenMS Documentation**: https://openms.readthedocs.io/

### Tutorials and Examples
-
- **onsite Tutorials**: https://github.com/bigbio/onsite/docs/tutorials
- **Algorithm Comparisons**: https://github.com/bigbio/onsite/docs/benchmarks
- **API Reference**: https://github.com/bigbio/onsite/docs/api
+
+- **onsite Documentation**: https://github.com/bigbio/onsite/tree/main/docs
+- **PyOpenMS Documentation**: https://pyopenms.readthedocs.io/
+- **OpenMS Documentation**: https://openms.readthedocs.io/
+
+### Tutorials and Examples
+
+- **onsite Tutorials**: https://github.com/bigbio/onsite/tree/main/docs
+- **Algorithm Comparisons**: https://github.com/bigbio/onsite/blob/main/docs/benchmark.md
+- **API Reference**: https://github.com/bigbio/onsite/tree/main/docs

Alternatively, if you plan to publish documentation on ReadTheDocs or GitHub Pages, update these URLs to point to the proper documentation host.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In docs/citations.md around lines 195 to 202, the GitHub documentation URLs are
malformed and won't resolve; update them to point to the correct repository
paths (e.g. use tree/blob with branch like /tree/main/docs,
/tree/main/docs/tutorials, /tree/main/docs/benchmarks, /tree/main/docs/api) or
replace them with the proper ReadTheDocs/GitHub Pages URLs if docs are hosted
elsewhere; ensure each link uses the full correct path including branch (main or
current default) and verify each URL resolves.
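
The rewrite described above is mechanical enough to sketch in a few lines. This is an illustration only: the helper `to_tree_url` and the `RESERVED` set are hypothetical, the default branch is assumed to be `main`, and the function does not check that the resulting path exists or whether it is a file (which would use `blob`) or a directory.

```python
import re

# Matches https://github.com/<owner>/<repo>/<rest>, where <rest> is non-empty.
BARE_PATH_RE = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/(?P<rest>.+)$"
)
# First path segments that GitHub itself defines; such URLs are left unchanged.
# This set is an assumption for the sketch and is not exhaustive.
RESERVED = {
    "tree", "blob", "raw", "pull", "issues",
    "releases", "actions", "wiki", "commit", "compare",
}


def to_tree_url(url: str, branch: str = "main") -> str:
    """Insert /tree/<branch>/ into a GitHub URL that uses a bare in-repo path."""
    match = BARE_PATH_RE.match(url)
    if match is None:
        return url  # not a GitHub repo sub-path; leave it alone
    rest = match.group("rest")
    if rest.split("/", 1)[0] in RESERVED:
        return url  # already a well-formed GitHub route
    owner, repo = match.group("owner"), match.group("repo")
    return f"https://github.com/{owner}/{repo}/tree/{branch}/{rest}"


if __name__ == "__main__":
    print(to_tree_url("https://github.com/bigbio/onsite/docs"))
```

Each rewritten URL still needs to be opened once to confirm it resolves, since paths such as `docs/tutorials` may not exist in the repository at all.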

weizhongchun and others added 12 commits November 20, 2025 15:56
Updated README to include project title, badges, and improved algorithm descriptions.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Removed key parameters and two-stage workflow details for AScore, PhosphoRS, and LucXor algorithms from the README.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@ypriverol ypriverol merged commit 87ca96f into bigbio:main Nov 20, 2025
2 checks passed