Commit 4471612

new features
1 parent d4c3baf commit 4471612

8 files changed

Lines changed: 212 additions & 2 deletions

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+name: Bug report
+description: Report a reproducible problem in ingestion, retrieval, UI, or deployment.
+title: "[Bug]: "
+labels:
+  - bug
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for taking the time to file a bug.
+        Please do not include PHI, copyrighted textbook content, API keys, or other secrets.
+
+  - type: textarea
+    id: summary
+    attributes:
+      label: What happened?
+      description: Describe the bug and what you expected instead.
+      placeholder: The UI uploads the PDF, but /library/sync fails and the document never appears.
+    validations:
+      required: true
+
+  - type: textarea
+    id: steps
+    attributes:
+      label: Steps to reproduce
+      description: Give the shortest reproduction path you can.
+      placeholder: |
+        1. Run `bash scripts/deploy_portable.sh --allow-no-openai`
+        2. Open `/ui`
+        3. Upload a PDF
+        4. Click sync
+    validations:
+      required: true
+
+  - type: textarea
+    id: environment
+    attributes:
+      label: Environment
+      description: OS, Python version, Docker version, and any other relevant runtime details.
+      placeholder: macOS 15, Python 3.12, Docker 29.x, Colima 0.x
+    validations:
+      required: true
+
+  - type: textarea
+    id: logs
+    attributes:
+      label: Logs or traceback
+      description: Paste sanitized logs only. Remove PHI, copyrighted text, and secrets first.
+      render: shell
+
+  - type: textarea
+    id: corpus
+    attributes:
+      label: Corpus details
+      description: If relevant, describe the document type at a high level without uploading copyrighted content.
+      placeholder: Two local pathology PDFs, both text-based and non-OCR.
+

.github/ISSUE_TEMPLATE/config.yml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+blank_issues_enabled: false
+contact_links:
+  - name: Security issue or unsafe data exposure
+    url: https://github.com/hutaobo/pathology-rag-workbench/blob/main/SECURITY.md
+    about: Please report security and sensitive-data issues privately first.
+  - name: Contribution guidelines
+    url: https://github.com/hutaobo/pathology-rag-workbench/blob/main/CONTRIBUTING.md
+    about: Read this before opening a pull request.

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+name: Documentation improvement
+description: Suggest a README, architecture, setup, or usage docs improvement.
+title: "[Docs]: "
+labels:
+  - documentation
+body:
+  - type: textarea
+    id: gap
+    attributes:
+      label: What is unclear, missing, or incorrect?
+      placeholder: The README explains local PDF folders, but it does not clearly show how `config/library.local.yaml` relates to browser uploads.
+    validations:
+      required: true
+
+  - type: textarea
+    id: suggestion
+    attributes:
+      label: What should the docs say instead?
+      placeholder: Add a short section that contrasts upload-driven and config-driven library management.
+    validations:
+      required: true
+
+  - type: textarea
+    id: location
+    attributes:
+      label: Relevant file or section
+      placeholder: README.md Quick Start, docs/ARCHITECTURE.md, or a specific API route description
+

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+name: Feature request
+description: Propose a new capability or improvement.
+title: "[Feature]: "
+labels:
+  - enhancement
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for the suggestion.
+        Please keep requests focused on the workbench itself and avoid posting licensed corpus material.
+
+  - type: textarea
+    id: problem
+    attributes:
+      label: What problem are you trying to solve?
+      placeholder: It is hard to limit answers to one selected book from the browser UI.
+    validations:
+      required: true
+
+  - type: textarea
+    id: proposal
+    attributes:
+      label: What would you like to happen?
+      placeholder: Add a multi-select document filter in the browser UI and persist the selection between questions.
+    validations:
+      required: true
+
+  - type: textarea
+    id: alternatives
+    attributes:
+      label: Alternatives considered
+      placeholder: I can do this from the CLI today with repeated `--document-id` flags, but not from the UI.
+
+  - type: textarea
+    id: context
+    attributes:
+      label: Additional context
+      description: Screenshots, workflow notes, or implementation ideas are welcome as long as they do not include sensitive data.
+

.github/pull_request_template.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+## Summary
+
+- Describe the change in a few sentences.
+
+## Why this change?
+
+- Explain the user-facing or maintainer-facing reason.
+
+## Validation
+
+- [ ] I ran `bash scripts/bootstrap.sh` if dependencies changed
+- [ ] I ran `pytest`
+- [ ] I ran `ruff check .`
+- [ ] I updated docs if behavior or setup changed
+
+## Data and safety checks
+
+- [ ] I did not add copyrighted PDFs, extracted textbook text, embeddings, or page previews
+- [ ] I did not add PHI, secrets, or `.env` content
+- [ ] I preserved the evidence guardrails for ungrounded medical answers
+
+## Notes for reviewers
+
+- Mention any tradeoffs, follow-up work, or areas where you want extra review.

.github/workflows/ci.yml

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+name: CI
+
+on:
+  push:
+    branches:
+      - main
+      - master
+  pull_request:
+  workflow_dispatch:
+
+permissions:
+  contents: read
+
+concurrency:
+  group: ci-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  test:
+    name: Python ${{ matrix.python-version }}
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.11", "3.12"]
+
+    steps:
+      - name: Check out repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+          cache: pip
+
+      - name: Bootstrap virtualenv
+        shell: bash
+        run: bash scripts/bootstrap.sh
+
+      - name: Lint
+        shell: bash
+        run: |
+          source .venv/bin/activate
+          ruff check .
+
+      - name: Test
+        shell: bash
+        run: |
+          source .venv/bin/activate
+          pytest
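
For readers unfamiliar with `strategy.matrix`, the following is a rough sketch of how the single `python-version` axis in this workflow fans out into separate jobs. The helper `expand_matrix` is hypothetical and only mimics the idea; GitHub Actions performs the real expansion internally.

```python
from itertools import product


def expand_matrix(matrix: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one dict per combination of matrix values (hypothetical helper)."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


# The matrix declared in the workflow.
matrix = {"python-version": ["3.11", "3.12"]}

jobs = expand_matrix(matrix)

# Mirrors the job-level `name: Python ${{ matrix.python-version }}`.
job_names = [f"Python {job['python-version']}" for job in jobs]
```

With `fail-fast: false`, a failure in one of these jobs does not cancel the other, so both Python versions report a result.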

README.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # Pathology RAG Workbench
 
+[![CI](https://github.com/hutaobo/pathology-rag-workbench/actions/workflows/ci.yml/badge.svg)](https://github.com/hutaobo/pathology-rag-workbench/actions/workflows/ci.yml)
+[![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-3776AB.svg)](https://www.python.org/)
+[![License: PolyForm Noncommercial](https://img.shields.io/badge/license-PolyForm%20Noncommercial%201.0.0-5B4B8A.svg)](LICENSE)
+
 Pathology RAG Workbench is a local-first retrieval and question-answering workspace for pathology PDFs. It combines PDF ingestion, chunking, pgvector-backed retrieval, a FastAPI service, a browser UI, and optional OpenClaw integration so you can build a citation-backed pathology assistant around your own licensed corpus.
 
 This repository is public, but it is not open source in the OSI sense. The code is released under the PolyForm Noncommercial 1.0.0 license, so commercial use is not allowed.

src/pathology_ai/embeddings.py

Lines changed: 0 additions & 2 deletions
@@ -1,7 +1,5 @@
 from __future__ import annotations
 
-from typing import Iterable
-
 from openai import OpenAI
 from sqlalchemy import func, select
 
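
The hunk above drops `from typing import Iterable`. The commit does not say why; a plausible reading is that the name was no longer used, which is the kind of finding `ruff check .` typically reports as an unused import (rule F401). The sketch below shows one way to detect such imports with the standard library alone. `find_unused_imports` is a hypothetical helper, not part of this repository or of ruff, and the last two lines of `sample` are made up for illustration.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that never appear as a plain name in the module."""
    tree = ast.parse(source)
    imported: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.module == "__future__":
                continue  # Compiler directives, never referenced by name.
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


# Import block as it looked before this commit, plus two invented usage lines.
sample = '''
from __future__ import annotations

from typing import Iterable

from openai import OpenAI
from sqlalchemy import func, select

client = OpenAI()
stmt = select(func.count())
'''

unused = find_unused_imports(sample)
```

The source is only parsed, never executed, so `openai` and `sqlalchemy` do not need to be installed. This simplified check ignores cases a real linter handles, such as `__all__` re-exports and names used only inside string annotations.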