feat: End-to-End Medical Report Extraction Pipeline with GGUF-LoRA Integration #179

pranjal9091 wants to merge 2 commits into AOSSIE-Org:main from
Conversation
📝 Walkthrough

This pull request adds an OCR-based medical report scanning feature with LLM-powered data extraction to the backend and introduces a corresponding UI screen in the frontend. The backend gains image processing capabilities via Tesseract OCR and Llama model inference for structured data extraction, while new endpoints store extracted weight and blood pressure metrics. The frontend introduces a ScanReport screen with image selection and data persistence workflows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Frontend as Frontend<br/>(ScanReportScreen)
    participant ImagePicker as Image Picker
    participant Backend as Backend API<br/>(/api/ocr-scan)
    participant OCR as Tesseract OCR
    participant LLM as Llama LLM
    participant Database as Database

    User->>Frontend: Tap "Scan Medical Report"
    Frontend->>ImagePicker: Launch image picker
    ImagePicker-->>Frontend: Return base64 image
    Frontend->>Frontend: Set loading state
    Frontend->>Backend: POST base64 image
    Backend->>OCR: Process image (grayscale, upscale, threshold)
    OCR-->>Backend: Raw extracted text
    Backend->>LLM: Invoke inference with OCR text
    LLM-->>Backend: JSON structured data<br/>(weight, BP, appointment)
    Backend-->>Frontend: Extracted metrics
    Frontend->>User: Show alert with values
    User->>Frontend: Confirm & Save
    Frontend->>Backend: POST /api/weight
    Backend->>Database: Insert weight record
    Frontend->>Backend: POST /api/blood-pressure
    Backend->>Database: Insert BP record
    Database-->>Backend: Confirmation
    Backend-->>Frontend: Success response
    Frontend->>User: Show success alert
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 16
🤖 Fix all issues with AI agents
In @.gitignore:
- Around line 32-34: The .gitignore has a malformed entry where the patterns
"*.log" and "Backend/db/chromadb/chroma.sqlite3" are concatenated on one line;
split them into two separate lines so each pattern is on its own line (ensure
"*.log" is one line and "Backend/db/chromadb/chroma.sqlite3" is the next) so
both patterns match as intended; verify the surrounding entries like
"ocr_debug.txt" remain unchanged.
In `@Backend/agent/handlers/appointment.py`:
- Around line 83-105: The parse_date function no longer normalizes common
relative terms ("today", "tomorrow", "next week"), causing those strings to be
stored raw; update parse_date to detect (case-insensitive) "today" -> return
today's date in '%Y-%m-%d', "tomorrow" -> today + 1 day, and "next week" ->
today + 7 days (use the existing datetime/timedelta imports and the same
strftime formatting), keeping the existing handling for weekdays and "next
month"; ensure the checks occur before falling back to returning the original
string so all terms matched by the date_patterns regex are converted to
canonical ISO dates.
In `@Backend/agent/llm.py`:
- Around line 4-9: The module currently creates a global Llama instance (llm) at
import, which can crash or block startup; change to lazy initialization: remove
the top-level llm instantiation and implement a get_llm() (or init_llm())
function that resolves model paths using Path(__file__).resolve().parent /
"models" for model_path and lora_path, performs the Llama(...) creation inside a
try/except, logs and returns None or raises a clear error on failure, and caches
the created instance for subsequent calls to avoid repeated loads; optionally
make the initializer callable from async code or kick off loading in a
background thread if non-blocking startup is required.
In `@Backend/app.py`:
- Around line 139-148: The save_weight route opens a DB connection via open_db()
and never closes it, causing a connection leak; update save_weight to ensure the
connection (db) is closed after use by using a try/finally or a context manager
around open_db(), move cursor usage (cursor = db.cursor(), cursor.execute(...),
db.commit()) inside that block, and call db.close() in the finally (or rely on
context manager exit) so the DB connection is always released even if an
exception occurs.
- Around line 150-159: The save_bp function opens a DB connection via open_db()
but never closes it, causing a connection leak; wrap the DB usage in a
try/finally (mirroring save_weight) so that cursor operations and db.commit()
happen in try and db.close() is called in finally; ensure you reference the same
open_db(), save_bp function and close the connection regardless of success or
exception (and optionally close the cursor inside the try before commit).
- Around line 243-244: The current entrypoint under the if __name__ ==
'__main__' guard calls app.run(host='0.0.0.0', port=5050, debug=True), which
must not be used in production; change it to derive debug and host from
environment variables (e.g., FLASK_DEBUG or APP_ENV) so debug=True is only
enabled for local/dev runs and default to host='127.0.0.1' otherwise, and ensure
app.run uses those variables instead of hardcoded debug=True and host='0.0.0.0'.
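One way to express the environment-driven entrypoint described above is a small resolver function. `FLASK_DEBUG` is the assumed toggle variable (the prompt also suggests `APP_ENV`); adjust to whichever key the project standardizes on.

```python
# Sketch of an env-derived run config; FLASK_DEBUG is an assumed variable name.
import os

def resolve_run_config():
    """Return (host, debug): debug only when explicitly enabled, safe bind otherwise."""
    debug = os.environ.get("FLASK_DEBUG", "0").strip().lower() in ("1", "true", "yes")
    # Only expose the server on all interfaces in explicit debug/dev mode.
    host = "0.0.0.0" if debug else "127.0.0.1"
    return host, debug

# The entrypoint would then read:
# if __name__ == '__main__':
#     host, debug = resolve_run_config()
#     app.run(host=host, port=5050, debug=debug)
```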
- Around line 81-85: Validate and error-handle the incoming base64 image before
processing: after calling request.get_json() and before
base64.b64decode(data['image']), check that data is not None and contains the
'image' key; catch decoding errors from base64.b64decode (e.g.,
binascii.Error/TypeError/ValueError) and return a 400/controlled error; after
np.frombuffer and cv2.imdecode (reference cv2.imdecode and the downstream
cv2.cvtColor use), verify that img is not None and return a clear 400 error if
image decoding failed; do not allow unhandled exceptions to propagate.
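The validation steps above can be sketched as a helper that returns either decoded bytes or a 400-style error. `decode_image_field` is a hypothetical name for illustration, not the actual handler code.

```python
# Sketch of the request validation described above; decode_image_field is a
# hypothetical helper. The cv2.imdecode None-check would follow in the route.
import base64
import binascii

def decode_image_field(data):
    """Return (raw_bytes, None) on success or (None, (message, 400)) on failure."""
    if not data or "image" not in data:
        return None, ("Missing 'image' field in request body", 400)
    try:
        raw = base64.b64decode(data["image"], validate=True)
    except (binascii.Error, TypeError, ValueError):
        return None, ("Invalid base64 image data", 400)
    return raw, None
```

In the route, a `None` result from `cv2.imdecode` should likewise map to a clear 400 ("image decoding failed") instead of letting `cv2.cvtColor` raise on a `None` input.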
- Around line 34-35: Remove the duplicate imports of the module and class by
deleting the redundant "import os" and "from llama_cpp import Llama" entries so
only the original imports remain; locate the duplicate occurrences (the second
"import os" and the second "from llama_cpp import Llama") and remove them,
leaving the first declarations intact to resolve the Ruff F811 duplicate-import
error.
- Line 7: The hardcoded assignment to pytesseract.pytesseract.tesseract_cmd
should be replaced with a cross-platform resolution: check an environment
variable (e.g., TESSERACT_CMD) and if not set, auto-detect the binary using
shutil.which("tesseract") (or platform-specific common paths) and only assign
pytesseract.pytesseract.tesseract_cmd if a valid path is found; update the
initialization where pytesseract.pytesseract.tesseract_cmd is set to implement
this fallback and log or raise a clear error if tesseract cannot be located.
- Line 109: The synchronous call to the Llama model via model(prompt,
max_tokens=150, stop=["}"], temperature=0) can block the Flask request thread
indefinitely; update the inference path to enforce an external timeout by
running Llama inference in a separate monitored execution context (e.g., spawn a
subprocess that runs the Llama call and kill it on timeout, or run model
invocation inside a worker thread/process with a timeout and cancel/terminate if
exceeded). Concretely, move the Llama invocation (the model(...) call) into a
helper that executes in a subprocess or background worker, implement a
configurable timeout value, and ensure the Flask handler for generating output
returns an error response if the helper times out and that any orphaned process
is terminated. Ensure the helper is referenced where output is produced so
callers use the timed wrapper instead of calling model(...) directly.
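A minimal sketch of the timed-wrapper idea, using a worker thread with a result timeout. `run_inference` is a stand-in for the real `model(...)` call; note that a thread cannot be forcibly killed, so the subprocess variant the prompt describes is the stronger option for truly terminating stuck inference.

```python
# Sketch only: thread-based timeout wrapper around a blocking inference call.
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=1)

def run_inference(prompt):
    # Stand-in for: model(prompt, max_tokens=150, stop=["}"], temperature=0)
    return '{"weight": "65 kg"}'

def timed_inference(prompt, timeout=30.0):
    future = _executor.submit(run_inference, prompt)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        future.cancel()  # best effort; a running thread keeps executing
        return None      # the Flask handler maps None to a timeout error response
```

With a `multiprocessing.Process` instead of a thread, the timeout branch can call `proc.terminate()` to actually reclaim the stuck worker, at the cost of reloading the model per process or keeping a long-lived inference subprocess.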
In `@Frontend/ios/BabyNest/PrivacyInfo.xcprivacy`:
- Around line 33-40: Remove the unused DiskSpace privacy declaration by deleting
the <dict> entry that contains the NSPrivacyAccessedAPIType key with value
NSPrivacyAccessedAPICategoryDiskSpace and its NSPrivacyAccessedAPITypeReasons
array (the reason "85F4.1"); ensure no other privacy entries are altered and
keep the XML well-formed after removal in PrivacyInfo.xcprivacy.
In `@Frontend/package.json`:
- Line 47: Remove the unused client-side OCR dependency "tesseract.js" from
package.json (delete the "tesseract.js": "^7.0.0" entry), run your package
manager to update the lockfile (npm install / yarn install) and ensure there are
no remaining imports/usages of "tesseract.js" in the frontend codebase (search
for "tesseract" or import statements) before committing the change.
In `@Frontend/src/Screens/ScanReportScreen.jsx`:
- Around line 11-19: pickImage currently opens the image library and passes
imageUri to uploadAndScan, but uploadAndScan re-opens the picker causing the
user to select twice; fix by having pickImage request includeBase64
(ImagePicker.launchImageLibrary with includeBase64: true) and pass the picked
asset (uri and base64) into uploadAndScan, then modify uploadAndScan to accept
an image parameter and skip any internal calls to ImagePicker.launchImageLibrary
so it uses the provided image data (reference functions: pickImage,
uploadAndScan, and the ImagePicker.launchImageLibrary call).
- Around line 27-41: The two POST requests that send weight and bp use fetch but
don't check HTTP status—update the weight and bp request blocks to capture the
fetch responses (e.g., const resp = await fetch(...)), verify resp.ok, and
handle non-OK cases by logging/reporting the error or throwing so failures
aren't treated as successes; apply this change to both the weight POST and the
blood-pressure POST (the fetch calls in ScanReportScreen.jsx) and ensure any
downstream code awaits/handles thrown errors or error state appropriately.
- Around line 69-77: The fetch to `${BASE_URL}/api/ocr-scan` in
ScanReportScreen.jsx doesn't check the HTTP status; add a check on the response
(const res = await fetch(...); if (!res.ok) { const errText = await
res.text().catch(()=>res.statusText); throw new Error(`OCR request failed:
${res.status} ${errText}`); }) before calling res.json(), or instead handle
non-ok by showing the error to the user/state; keep references to BASE_URL, the
/api/ocr-scan endpoint, the res variable and the parsed data when updating state
or UI.
- Line 6: Replace the hardcoded BASE_URL in ScanReportScreen.jsx with the same
environment variable pattern used in HomeScreen.jsx (import { API_URL } from
'@env' or the project's env key) and use that variable wherever BASE_URL is
referenced; ensure you remove the literal 'http://10.72.82.230:5050' and add a
sensible fallback or error if the env var is missing so the app doesn't crash at
runtime.
🧹 Nitpick comments (3)
Backend/agent/llm.py (1)
11-21: Add exception handling for inference failures.

The `llm()` call can raise exceptions (e.g., OOM, malformed input). Unhandled exceptions will propagate and crash the request. The JSON repair logic on line 21 only handles a missing closing brace but won't fix other malformations (missing quotes, truncated keys).

Proposed fix
```diff
 def run_llm(prompt: str) -> str:
     """Actual inference logic for medical extraction."""
+    try:
-    output = llm(
+        output = get_llm()(
             prompt,
             max_tokens=256,
             stop=["}"],
             temperature=0
         )
-    response = output['choices'][0]['text'].strip()
-    # Ensuring valid JSON structure
-    return response + "}" if not response.endswith("}") else response
+        response = output['choices'][0]['text'].strip()
+        # Ensure valid JSON structure
+        if not response.endswith("}"):
+            response += "}"
+        return response
+    except Exception as e:
+        # Return a safe fallback JSON on inference failure
+        return '{"error": "extraction_failed"}'
```

Backend/agent/handlers/appointment.py (2)
84-84: Style: multiple statements on one line.

Per static analysis (E701), split the guard clause for readability.
Proposed fix
```diff
-    if not date_str: return None
+    if not date_str:
+        return None
```
100-103: Style: multiple statements on one line.

Per static analysis (E701), split for readability.
Proposed fix
```diff
     if date_str_lower in day_mapping:
         days_ahead = day_mapping[date_str_lower] - today.weekday()
-        if days_ahead <= 0: days_ahead += 7
+        if days_ahead <= 0:
+            days_ahead += 7
         return (today + timedelta(days=days_ahead)).strftime('%Y-%m-%d')
```
```
# Debugging
ocr_debug.txt
*.logBackend/db/chromadb/chroma.sqlite3
```
Malformed gitignore entry — missing newline.
Line 34 concatenates two patterns: *.log and Backend/db/chromadb/chroma.sqlite3. Neither pattern will match correctly.
Proposed fix
```diff
 # Debugging
 ocr_debug.txt
-*.logBackend/db/chromadb/chroma.sqlite3
+*.log
+Backend/db/chromadb/chroma.sqlite3
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```
# Debugging
ocr_debug.txt
*.log
Backend/db/chromadb/chroma.sqlite3
```
```diff
 def parse_date(date_str):
-    """Parse date string to ISO format."""
-    if not date_str:
-        return None
+    if not date_str: return None
     today = datetime.now()
-    date_str_lower = date_str.lower()
+    date_str_lower = date_str.lower().strip()
 
-    if date_str_lower == 'today':
-        return today.strftime('%Y-%m-%d')
-    elif date_str_lower == 'tomorrow':
-        return (today + timedelta(days=1)).strftime('%Y-%m-%d')
-    elif date_str_lower == 'next week':
-        return (today + timedelta(days=7)).strftime('%Y-%m-%d')
-
-    # Try to parse as MM/DD or MM/DD/YYYY
-    try:
-        if '/' in date_str:
-            parts = date_str.split('/')
-            if len(parts) == 2:
-                month, day = int(parts[0]), int(parts[1])
-                year = today.year
-                if month < today.month or (month == today.month and day < today.day):
-                    year += 1
-                return f"{year}-{month:02d}-{day:02d}"
-            elif len(parts) == 3:
-                month, day, year = int(parts[0]), int(parts[1]), int(parts[2])
-                return f"{year}-{month:02d}-{day:02d}"
-    except:
-        pass
-
-    # Try to parse as YYYY-MM-DD
-    try:
-        datetime.strptime(date_str, '%Y-%m-%d')
-        return date_str
-    except:
-        pass
-
-    return None
+    if date_str_lower == 'next month':
+        next_month = (today.month % 12) + 1
+        year = today.year + (1 if today.month == 12 else 0)
+        return f"{year}-{next_month:02d}-01"
+
+    day_mapping = {
+        'monday': 0, 'mon': 0, 'tuesday': 1, 'tue': 1,
+        'wednesday': 2, 'wed': 2, 'thursday': 3, 'thu': 3, 'thurs': 3,
+        'friday': 4, 'fri': 4, 'saturday': 5, 'sat': 5, 'sunday': 6, 'sun': 6,
+    }
+
+    if date_str_lower in day_mapping:
+        days_ahead = day_mapping[date_str_lower] - today.weekday()
+        if days_ahead <= 0: days_ahead += 7
+        return (today + timedelta(days=days_ahead)).strftime('%Y-%m-%d')
+
+    return date_str
```
Missing "today", "tomorrow", and "next week" parsing — regression risk.
The date_patterns regex (lines 39-41) still matches today, tomorrow, and next week, but parse_date no longer handles them. These inputs will pass through unchanged and be stored as raw strings (e.g., "tomorrow") instead of canonical dates.
Based on learnings, ensure parse_date normalizes all date strings the model may produce.
Proposed fix to restore common date handling
```diff
 def parse_date(date_str):
     if not date_str: return None
     today = datetime.now()
     date_str_lower = date_str.lower().strip()
 
+    if date_str_lower == 'today':
+        return today.strftime('%Y-%m-%d')
+
+    if date_str_lower == 'tomorrow':
+        return (today + timedelta(days=1)).strftime('%Y-%m-%d')
+
+    if date_str_lower == 'next week':
+        return (today + timedelta(days=7)).strftime('%Y-%m-%d')
+
     if date_str_lower == 'next month':
         next_month = (today.month % 12) + 1
         year = today.year + (1 if today.month == 12 else 0)
         return f"{year}-{next_month:02d}-01"
```

📝 Committable suggestion
```python
def parse_date(date_str):
    if not date_str: return None
    today = datetime.now()
    date_str_lower = date_str.lower().strip()

    if date_str_lower == 'today':
        return today.strftime('%Y-%m-%d')

    if date_str_lower == 'tomorrow':
        return (today + timedelta(days=1)).strftime('%Y-%m-%d')

    if date_str_lower == 'next week':
        return (today + timedelta(days=7)).strftime('%Y-%m-%d')

    if date_str_lower == 'next month':
        next_month = (today.month % 12) + 1
        year = today.year + (1 if today.month == 12 else 0)
        return f"{year}-{next_month:02d}-01"

    day_mapping = {
        'monday': 0, 'mon': 0, 'tuesday': 1, 'tue': 1,
        'wednesday': 2, 'wed': 2, 'thursday': 3, 'thu': 3, 'thurs': 3,
        'friday': 4, 'fri': 4, 'saturday': 5, 'sat': 5, 'sunday': 6, 'sun': 6,
    }

    if date_str_lower in day_mapping:
        days_ahead = day_mapping[date_str_lower] - today.weekday()
        if days_ahead <= 0: days_ahead += 7
        return (today + timedelta(days=days_ahead)).strftime('%Y-%m-%d')

    return date_str
```
🧰 Tools
🪛 Ruff (0.14.13)
84-84: Multiple statements on one line (colon)
(E701)
102-102: Multiple statements on one line (colon)
(E701)
```python
llm = Llama(
    model_path="./models/qwen2-0_5b-instruct-q4_k_m.gguf",
    lora_path="./models/adapter_model.bin",
    n_ctx=512,
    n_gpu_layers=-1
)
```
Global model instantiation risks startup failure and blocks import.
Loading the model at module import time has several issues:
- No error handling — if model files are missing, the import crashes the entire application
- Blocks startup — model loading is synchronous and potentially slow
- Relative paths — `"./models/..."` depends on the current working directory, which is fragile
Consider lazy initialization with error handling:
Proposed refactor with lazy loading
```diff
 from llama_cpp import Llama
+import os
-
-llm = Llama(
-    model_path="./models/qwen2-0_5b-instruct-q4_k_m.gguf",
-    lora_path="./models/adapter_model.bin",
-    n_ctx=512,
-    n_gpu_layers=-1
-)
+_llm = None
+
+def get_llm():
+    global _llm
+    if _llm is None:
+        base_dir = os.path.dirname(os.path.abspath(__file__))
+        model_path = os.path.join(base_dir, "..", "models", "qwen2-0_5b-instruct-q4_k_m.gguf")
+        lora_path = os.path.join(base_dir, "..", "models", "adapter_model.bin")
+
+        if not os.path.exists(model_path):
+            raise FileNotFoundError(f"Model not found: {model_path}")
+
+        _llm = Llama(
+            model_path=model_path,
+            lora_path=lora_path if os.path.exists(lora_path) else None,
+            n_ctx=512,
+            n_gpu_layers=-1
+        )
+    return _llm
```
```python
import json
import re
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'/opt/homebrew/bin/tesseract'
```
Hardcoded Tesseract path breaks cross-platform portability.
This macOS-specific path will fail on Linux/Windows deployments. Consider using environment variables or auto-detection.
Suggested fix
```diff
-pytesseract.pytesseract.tesseract_cmd = r'/opt/homebrew/bin/tesseract'
+import shutil
+pytesseract.pytesseract.tesseract_cmd = os.environ.get(
+    'TESSERACT_CMD',
+    shutil.which('tesseract') or r'/opt/homebrew/bin/tesseract'
+)
```
```python
import os
from llama_cpp import Llama
```
Remove duplicate imports.
os and Llama are already imported at lines 2 and 13. As flagged by static analysis (Ruff F811).
Suggested fix
```diff
-import os
-from llama_cpp import Llama
-
 def get_llm():
```

📝 Committable suggestion
```python
def get_llm():
```
🧰 Tools
🪛 Ruff (0.14.13)
34-34: Redefinition of unused os from line 2: os redefined here
Remove definition: os
(F811)
35-35: Redefinition of unused Llama from line 13: Llama redefined here
Remove definition: Llama
(F811)
```diff
     "react-navigation": "^4.4.4",
-    "react-navigation-bottom-tabs": "^1.0.2"
+    "react-navigation-bottom-tabs": "^1.0.2",
+    "tesseract.js": "^7.0.0"
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Search for tesseract usage in frontend code
echo "=== Searching for tesseract imports/requires in JS/TS files ==="
rg 'tesseract' Frontend/src/ --type=js --type=ts -i
echo -e "\n=== Searching for tesseract in all files ==="
rg 'tesseract' Frontend/ -i --exclude-dir=node_modules --exclude-dir=.git
echo -e "\n=== Checking package.json for tesseract ==="
cat Frontend/package.json | grep -i tesseract
```

Repository: AOSSIE-Org/BabyNest
Length of output: 679
🏁 Script executed:

```bash
#!/bin/bash
# Check if tesseract.js is imported anywhere, even indirectly
echo "=== Searching for any 'require' or 'import' statements that might load tesseract ==="
rg 'from.*tesseract|import.*tesseract|require.*tesseract' Frontend/src/ --type=js --type=ts -i
echo -e "\n=== Check OCR-related imports to see what's actually being used ==="
rg 'OCR|ocr' Frontend/src/ --type=js --type=ts -A 2 -B 2
```

Repository: AOSSIE-Org/BabyNest
Length of output: 1853
🏁 Script executed:

```bash
#!/bin/bash
# Get context on when tesseract.js was added and why
echo "=== Check git history for tesseract.js addition ==="
cd Frontend && git log --oneline --all -- package.json | head -20
```

Repository: AOSSIE-Org/BabyNest
Length of output: 272
🏁 Script executed:

```bash
#!/bin/bash
# Get the dependencies section of Frontend/package.json to understand context
cat Frontend/package.json | grep -A 5 -B 5 'tesseract.js'
# Check if there are any comments or documentation about this dependency
echo -e "\n=== Full dependencies and devDependencies sections ==="
jq '.dependencies, .devDependencies' Frontend/package.json 2>/dev/null | head -80
```

Repository: AOSSIE-Org/BabyNest
Length of output: 2520
Remove unused dependency: tesseract.js is not referenced anywhere in the frontend code.
OCR processing is handled entirely server-side via the /api/ocr-scan endpoint. This dependency adds unnecessary bundle size and should be removed unless client-side OCR is planned for future offline support.
```javascript
import * as ImagePicker from 'react-native-image-picker';

// Using your actual machine IP for simulator-to-backend connectivity
const BASE_URL = 'http://10.72.82.230:5050';
```
Hardcoded IP address breaks deployment.
BASE_URL should use environment variables like HomeScreen.jsx does with @env. This will fail on any device not on the same network.
Suggested fix
```diff
-// Using your actual machine IP for simulator-to-backend connectivity
-const BASE_URL = 'http://10.72.82.230:5050';
+import { BASE_URL } from '@env';
```

📝 Committable suggestion
```javascript
import { BASE_URL } from '@env';
```
```javascript
const pickImage = () => {
  ImagePicker.launchImageLibrary({ mediaType: 'photo', quality: 0.5 }, async (response) => {
    if (response.assets) {
      setLoading(true);
      const imageUri = response.assets[0].uri;
      uploadAndScan(imageUri);
    }
  });
};
```
Critical bug: User is forced to select an image twice.
pickImage launches the image library and passes imageUri to uploadAndScan, but uploadAndScan ignores that parameter and launches the image library again (line 61). The user must select the same image twice.
Suggested fix — use includeBase64 in the initial picker call
```diff
 const pickImage = () => {
-  ImagePicker.launchImageLibrary({ mediaType: 'photo', quality: 0.5 }, async (response) => {
+  ImagePicker.launchImageLibrary({ mediaType: 'photo', quality: 0.8, includeBase64: true }, async (response) => {
+    if (response.didCancel || response.errorCode) {
+      return;
+    }
     if (response.assets) {
       setLoading(true);
-      const imageUri = response.assets[0].uri;
-      uploadAndScan(imageUri);
+      const base64Image = response.assets[0].base64;
+      await uploadAndScan(base64Image);
     }
   });
 };

-  const uploadAndScan = async (uri) => {
-    setLoading(true);
-    try {
-      // Base64 conversion using react-native-image-picker capability
-      const options = {
-        mediaType: 'photo',
-        quality: 0.8,
-        includeBase64: true,
-      };
-
-      ImagePicker.launchImageLibrary(options, async (response) => {
-        if (response.didCancel || response.errorCode) {
-          setLoading(false);
-          return;
-        }
-
-        const base64Image = response.assets[0].base64;
-
+  const uploadAndScan = async (base64Image) => {
+    try {
       const res = await fetch(`${BASE_URL}/api/ocr-scan`, {
         method: 'POST',
         headers: {
           'Content-Type': 'application/json',
         },
         body: JSON.stringify({ image: base64Image }),
       });

       const data = await res.json();
       setLoading(false);
       // ... rest of success/error handling
-      });
     } catch (err) {
       setLoading(false);
       Alert.alert("Network Error", "Verify backend is running and reachable.");
     }
   };
```

Also applies to: 51-98
if (weight && weight !== "N/A") {
  await fetch(`${BASE_URL}/api/weight`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ weight: weight, user_id: 'default' }),
  });
}

if (bp && bp !== "N/A") {
  await fetch(`${BASE_URL}/api/blood-pressure`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ bp: bp, user_id: 'default' }),
  });
}
No error handling for failed HTTP responses.
fetch doesn't throw on HTTP errors (4xx/5xx). If the backend returns an error status, the code proceeds as if successful. Check response.ok before parsing JSON.
Suggested fix
if (weight && weight !== "N/A") {
- await fetch(`${BASE_URL}/api/weight`, {
+ const weightRes = await fetch(`${BASE_URL}/api/weight`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ weight: weight, user_id: 'default' }),
});
+ if (!weightRes.ok) throw new Error('Failed to save weight');
}
if (bp && bp !== "N/A") {
- await fetch(`${BASE_URL}/api/blood-pressure`, {
+ const bpRes = await fetch(`${BASE_URL}/api/blood-pressure`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ bp: bp, user_id: 'default' }),
});
+ if (!bpRes.ok) throw new Error('Failed to save blood pressure');
}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (weight && weight !== "N/A") {
  const weightRes = await fetch(`${BASE_URL}/api/weight`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ weight: weight, user_id: 'default' }),
  });
  if (!weightRes.ok) throw new Error('Failed to save weight');
}

if (bp && bp !== "N/A") {
  const bpRes = await fetch(`${BASE_URL}/api/blood-pressure`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ bp: bp, user_id: 'default' }),
  });
  if (!bpRes.ok) throw new Error('Failed to save blood pressure');
}
🤖 Prompt for AI Agents
In `@Frontend/src/Screens/ScanReportScreen.jsx` around lines 27 - 41, The two POST
requests that send weight and bp use fetch but don't check HTTP status—update
the weight and bp request blocks to capture the fetch responses (e.g., const
resp = await fetch(...)), verify resp.ok, and handle non-OK cases by
logging/reporting the error or throwing so failures aren't treated as successes;
apply this change to both the weight POST and the blood-pressure POST (the fetch
calls in ScanReportScreen.jsx) and ensure any downstream code awaits/handles
thrown errors or error state appropriately.
const res = await fetch(`${BASE_URL}/api/ocr-scan`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ image: base64Image }),
});

const data = await res.json();
Same issue: missing response.ok check for OCR endpoint.
Suggested fix
const res = await fetch(`${BASE_URL}/api/ocr-scan`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ image: base64Image }),
});
+ if (!res.ok) {
+ throw new Error(`Server error: ${res.status}`);
+ }
+
const data = await res.json();

🤖 Prompt for AI Agents
In `@Frontend/src/Screens/ScanReportScreen.jsx` around lines 69 - 77, The fetch to
`${BASE_URL}/api/ocr-scan` in ScanReportScreen.jsx doesn't check the HTTP
status; add a check on the response (const res = await fetch(...); if (!res.ok)
{ const errText = await res.text().catch(()=>res.statusText); throw new
Error(`OCR request failed: ${res.status} ${errText}`); }) before calling
res.json(), or instead handle non-ok by showing the error to the user/state;
keep references to BASE_URL, the /api/ocr-scan endpoint, the res variable and
the parsed data when updating state or UI.
@pranjal9091
Overview
This PR implements a robust pipeline for scanning medical reports and extracting key vitals (Weight, Blood Pressure, and Appointment dates) into a structured format. The system is optimized for local execution using quantized GGUF models, bridging the gap between raw OCR data and structured health logs.
Key Technical Contributions
-> Advanced OCR Preprocessing: To handle the noise and distortion typical in handheld camera captures, I implemented an OpenCV-based preprocessing layer. This includes grayscale conversion, cubic resizing for better character definition, and Gaussian adaptive thresholding to maximize Tesseract’s extraction accuracy.
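A minimal sketch of that preprocessing chain is below. It assumes `opencv-python` and `pytesseract` are installed; the function names and the `blockSize`/`C` threshold values are illustrative, not necessarily the ones used in this PR:

```python
def upscale_dims(width, height, factor=2):
    """Target dimensions for cubic resizing; larger glyphs improve Tesseract accuracy."""
    return int(width * factor), int(height * factor)

def preprocess_for_ocr(image_path, scale=2):
    """Grayscale read -> cubic upscale -> Gaussian adaptive threshold -> OCR."""
    import cv2          # lazy imports: require opencv-python and pytesseract,
    import pytesseract  # plus the Tesseract binary on the system PATH

    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    h, w = img.shape
    img = cv2.resize(img, upscale_dims(w, h, scale), interpolation=cv2.INTER_CUBIC)
    # Adaptive (Gaussian) thresholding copes with the uneven lighting of phone photos
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                cv2.THRESH_BINARY, 31, 10)
    return pytesseract.image_to_string(img)
```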
-> Custom LLM Inference Engine: Integrated llama-cpp-python to support GGUF-based inference. Successfully implemented a workflow to load custom fine-tuned LoRA adapters optimized for medical terminology over a Qwen-0.5B base model.
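For reference, llama-cpp-python can attach a LoRA adapter at load time via the `Llama` constructor's `lora_path` argument. The sketch below shows the shape of such a loader; the file paths, context size, and decoding settings are illustrative assumptions, not the exact values from this PR:

```python
# Decoding settings biased toward short, deterministic, schema-shaped output
GENERATION_KWARGS = {
    "max_tokens": 256,
    "temperature": 0.1,
}

def load_llm(base_model_path, adapter_path):
    """Load a quantized GGUF base model and attach a fine-tuned LoRA adapter."""
    from llama_cpp import Llama  # lazy import: requires llama-cpp-python
    return Llama(
        model_path=base_model_path,  # e.g. a Qwen-0.5B GGUF quantization
        lora_path=adapter_path,      # medical-terminology LoRA adapter
        n_ctx=2048,                  # context window; illustrative value
    )
```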
-> Structured Data Extraction: Developed a "Few-Shot" prompting strategy with rigid schema enforcement to ensure consistent JSON output. Added a regex-based fallback parser in the backend to maintain data integrity even if the LLM output is partially malformed due to OCR noise.
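A pure-Python sketch of such a fallback parser follows. The field names and regex patterns are illustrative stand-ins for the PR's actual schema:

```python
import json
import re

def parse_llm_output(raw):
    """Try strict JSON first; fall back to regex if the LLM output is malformed."""
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        pass
    # Fallback: recover individual fields from noisy, partially formed text
    weight = re.search(r"(\d+(?:\.\d+)?)\s*kg", raw, re.IGNORECASE)
    bp = re.search(r"(\d{2,3}\s*/\s*\d{2,3})", raw)
    return {
        "weight": weight.group(1) if weight else "N/A",
        "bp": bp.group(1).replace(" ", "") if bp else "N/A",
    }
```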
-> Environment Portability: Refactored all backend model and adapter paths to resolve dynamically via os.path. This ensures the pipeline is reproducible across different development environments without requiring hardcoded system paths.
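The idea can be sketched in a few lines; the `models` directory name here is an assumption for illustration:

```python
import os

# Resolve model assets relative to this file instead of hardcoding system paths.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))

def model_path(*parts):
    """Join path segments under the project's model directory (name assumed)."""
    return os.path.join(BASE_DIR, "models", *parts)
```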
-> Frontend Integration: Developed a dedicated ScanReportScreen in React Native that manages image encoding (Base64) and real-time state updates for displaying extracted values directly to the user.
Technical Trade-offs & Future Scope
-> Model Selection: Opted for Qwen-0.5B to ensure low latency and high memory efficiency on mobile-connected backends.
-> Future Improvements: Planning to experiment with 1.5B/3B models to further increase reasoning depth and implement automated image deskewing (rotation correction) for even higher OCR precision.