feat: End-to-End Medical Report Extraction Pipeline with GGUF-LoRA Integration #179

pranjal9091 wants to merge 2 commits into AOSSIE-Org:main from
Conversation
📝 Walkthrough

This pull request adds an OCR-based medical report scanning feature with LLM-powered data extraction to the backend and introduces a corresponding UI screen in the frontend. The backend gains image processing capabilities via Tesseract OCR and Llama model inference for structured data extraction, while new endpoints store extracted weight and blood pressure metrics. The frontend introduces a ScanReport screen with image selection and data persistence workflows.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Frontend as Frontend<br/>(ScanReportScreen)
    participant ImagePicker as Image Picker
    participant Backend as Backend API<br/>(/api/ocr-scan)
    participant OCR as Tesseract OCR
    participant LLM as Llama LLM
    participant Database as Database

    User->>Frontend: Tap "Scan Medical Report"
    Frontend->>ImagePicker: Launch image picker
    ImagePicker-->>Frontend: Return base64 image
    Frontend->>Frontend: Set loading state
    Frontend->>Backend: POST base64 image
    Backend->>OCR: Process image (grayscale, upscale, threshold)
    OCR-->>Backend: Raw extracted text
    Backend->>LLM: Invoke inference with OCR text
    LLM-->>Backend: JSON structured data<br/>(weight, BP, appointment)
    Backend-->>Frontend: Extracted metrics
    Frontend->>User: Show alert with values
    User->>Frontend: Confirm & Save
    Frontend->>Backend: POST /api/weight
    Backend->>Database: Insert weight record
    Frontend->>Backend: POST /api/blood-pressure
    Backend->>Database: Insert BP record
    Database-->>Backend: Confirmation
    Backend-->>Frontend: Success response
    Frontend->>User: Show success alert
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 16
🤖 Fix all issues with AI agents
In @.gitignore:
- Around line 32-34: The .gitignore has a malformed entry where the patterns
"*.log" and "Backend/db/chromadb/chroma.sqlite3" are concatenated on one line;
split them into two separate lines so each pattern is on its own line (ensure
"*.log" is one line and "Backend/db/chromadb/chroma.sqlite3" is the next) so
both patterns match as intended; verify the surrounding entries like
"ocr_debug.txt" remain unchanged.
In `@Backend/agent/handlers/appointment.py`:
- Around line 83-105: The parse_date function no longer normalizes common
relative terms ("today", "tomorrow", "next week"), causing those strings to be
stored raw; update parse_date to detect (case-insensitive) "today" -> return
today's date in '%Y-%m-%d', "tomorrow" -> today + 1 day, and "next week" ->
today + 7 days (use the existing datetime/timedelta imports and the same
strftime formatting), keeping the existing handling for weekdays and "next
month"; ensure the checks occur before falling back to returning the original
string so all terms matched by the date_patterns regex are converted to
canonical ISO dates.
In `@Backend/agent/llm.py`:
- Around line 4-9: The module currently creates a global Llama instance (llm) at
import, which can crash or block startup; change to lazy initialization: remove
the top-level llm instantiation and implement a get_llm() (or init_llm())
function that resolves model paths using Path(__file__).resolve().parent /
"models" for model_path and lora_path, performs the Llama(...) creation inside a
try/except, logs and returns None or raises a clear error on failure, and caches
the created instance for subsequent calls to avoid repeated loads; optionally
make the initializer callable from async code or kick off loading in a
background thread if non-blocking startup is required.
In `@Backend/app.py`:
- Around line 139-148: The save_weight route opens a DB connection via open_db()
and never closes it, causing a connection leak; update save_weight to ensure the
connection (db) is closed after use by using a try/finally or a context manager
around open_db(), move cursor usage (cursor = db.cursor(), cursor.execute(...),
db.commit()) inside that block, and call db.close() in the finally (or rely on
context manager exit) so the DB connection is always released even if an
exception occurs.
- Around line 150-159: The save_bp function opens a DB connection via open_db()
but never closes it, causing a connection leak; wrap the DB usage in a
try/finally (mirroring save_weight) so that cursor operations and db.commit()
happen in try and db.close() is called in finally; ensure you reference the same
open_db(), save_bp function and close the connection regardless of success or
exception (and optionally close the cursor inside the try before commit).
- Around line 243-244: The current entrypoint under the if __name__ ==
'__main__' guard calls app.run(host='0.0.0.0', port=5050, debug=True), which
must not be used in production; change it to derive debug and host from
environment variables (e.g., FLASK_DEBUG or APP_ENV) so debug=True is only
enabled for local/dev runs and default to host='127.0.0.1' otherwise, and ensure
app.run uses those variables instead of hardcoded debug=True and host='0.0.0.0'.
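One way to express the environment-driven entrypoint described above is a small resolver function. `FLASK_DEBUG` is the assumed toggle variable (the prompt also suggests `APP_ENV`); adjust to whichever key the project standardizes on.

```python
# Sketch of an env-derived run config; FLASK_DEBUG is an assumed variable name.
import os

def resolve_run_config():
    """Return (host, debug): debug only when explicitly enabled, safe bind otherwise."""
    debug = os.environ.get("FLASK_DEBUG", "0").strip().lower() in ("1", "true", "yes")
    # Only expose the server on all interfaces in explicit debug/dev mode.
    host = "0.0.0.0" if debug else "127.0.0.1"
    return host, debug

# The entrypoint would then read:
# if __name__ == '__main__':
#     host, debug = resolve_run_config()
#     app.run(host=host, port=5050, debug=debug)
```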
- Around line 81-85: Validate and error-handle the incoming base64 image before
processing: after calling request.get_json() and before
base64.b64decode(data['image']), check that data is not None and contains the
'image' key; catch decoding errors from base64.b64decode (e.g.,
binascii.Error/TypeError/ValueError) and return a 400/controlled error; after
np.frombuffer and cv2.imdecode (reference cv2.imdecode and the downstream
cv2.cvtColor use), verify that img is not None and return a clear 400 error if
image decoding failed; do not allow unhandled exceptions to propagate.
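The validation steps above can be sketched as a helper that returns either decoded bytes or a 400-style error. `decode_image_field` is a hypothetical name for illustration, not the actual handler code.

```python
# Sketch of the request validation described above; decode_image_field is a
# hypothetical helper. The cv2.imdecode None-check would follow in the route.
import base64
import binascii

def decode_image_field(data):
    """Return (raw_bytes, None) on success or (None, (message, 400)) on failure."""
    if not data or "image" not in data:
        return None, ("Missing 'image' field in request body", 400)
    try:
        raw = base64.b64decode(data["image"], validate=True)
    except (binascii.Error, TypeError, ValueError):
        return None, ("Invalid base64 image data", 400)
    return raw, None
```

In the route, a `None` result from `cv2.imdecode` should likewise map to a clear 400 ("image decoding failed") instead of letting `cv2.cvtColor` raise on a `None` input.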
- Around line 34-35: Remove the duplicate imports of the module and class by
deleting the redundant "import os" and "from llama_cpp import Llama" entries so
only the original imports remain; locate the duplicate occurrences (the second
"import os" and the second "from llama_cpp import Llama") and remove them,
leaving the first declarations intact to resolve the Ruff F811 duplicate-import
error.
- Line 7: The hardcoded assignment to pytesseract.pytesseract.tesseract_cmd
should be replaced with a cross-platform resolution: check an environment
variable (e.g., TESSERACT_CMD) and if not set, auto-detect the binary using
shutil.which("tesseract") (or platform-specific common paths) and only assign
pytesseract.pytesseract.tesseract_cmd if a valid path is found; update the
initialization where pytesseract.pytesseract.tesseract_cmd is set to implement
this fallback and log or raise a clear error if tesseract cannot be located.
- Line 109: The synchronous call to the Llama model via model(prompt,
max_tokens=150, stop=["}"], temperature=0) can block the Flask request thread
indefinitely; update the inference path to enforce an external timeout by
running Llama inference in a separate monitored execution context (e.g., spawn a
subprocess that runs the Llama call and kill it on timeout, or run model
invocation inside a worker thread/process with a timeout and cancel/terminate if
exceeded). Concretely, move the Llama invocation (the model(...) call) into a
helper that executes in a subprocess or background worker, implement a
configurable timeout value, and ensure the Flask handler for generating output
returns an error response if the helper times out and that any orphaned process
is terminated. Ensure the helper is referenced where output is produced so
callers use the timed wrapper instead of calling model(...) directly.
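A minimal sketch of the timed-wrapper idea, using a worker thread with a result timeout. `run_inference` is a stand-in for the real `model(...)` call; note that a thread cannot be forcibly killed, so the subprocess variant the prompt describes is the stronger option for truly terminating stuck inference.

```python
# Sketch only: thread-based timeout wrapper around a blocking inference call.
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=1)

def run_inference(prompt):
    # Stand-in for: model(prompt, max_tokens=150, stop=["}"], temperature=0)
    return '{"weight": "65 kg"}'

def timed_inference(prompt, timeout=30.0):
    future = _executor.submit(run_inference, prompt)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        future.cancel()  # best effort; a running thread keeps executing
        return None      # the Flask handler maps None to a timeout error response
```

With a `multiprocessing.Process` instead of a thread, the timeout branch can call `proc.terminate()` to actually reclaim the stuck worker, at the cost of reloading the model per process or keeping a long-lived inference subprocess.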
In `@Frontend/ios/BabyNest/PrivacyInfo.xcprivacy`:
- Around line 33-40: Remove the unused DiskSpace privacy declaration by deleting
the <dict> entry that contains the NSPrivacyAccessedAPIType key with value
NSPrivacyAccessedAPICategoryDiskSpace and its NSPrivacyAccessedAPITypeReasons
array (the reason "85F4.1"); ensure no other privacy entries are altered and
keep the XML well-formed after removal in PrivacyInfo.xcprivacy.
In `@Frontend/package.json`:
- Line 47: Remove the unused client-side OCR dependency "tesseract.js" from
package.json (delete the "tesseract.js": "^7.0.0" entry), run your package
manager to update the lockfile (npm install / yarn install) and ensure there are
no remaining imports/usages of "tesseract.js" in the frontend codebase (search
for "tesseract" or import statements) before committing the change.
In `@Frontend/src/Screens/ScanReportScreen.jsx`:
- Around line 11-19: pickImage currently opens the image library and passes
imageUri to uploadAndScan, but uploadAndScan re-opens the picker causing the
user to select twice; fix by having pickImage request includeBase64
(ImagePicker.launchImageLibrary with includeBase64: true) and pass the picked
asset (uri and base64) into uploadAndScan, then modify uploadAndScan to accept
an image parameter and skip any internal calls to ImagePicker.launchImageLibrary
so it uses the provided image data (reference functions: pickImage,
uploadAndScan, and the ImagePicker.launchImageLibrary call).
- Around line 27-41: The two POST requests that send weight and bp use fetch but
don't check HTTP status—update the weight and bp request blocks to capture the
fetch responses (e.g., const resp = await fetch(...)), verify resp.ok, and
handle non-OK cases by logging/reporting the error or throwing so failures
aren't treated as successes; apply this change to both the weight POST and the
blood-pressure POST (the fetch calls in ScanReportScreen.jsx) and ensure any
downstream code awaits/handles thrown errors or error state appropriately.
- Around line 69-77: The fetch to `${BASE_URL}/api/ocr-scan` in
ScanReportScreen.jsx doesn't check the HTTP status; add a check on the response
(const res = await fetch(...); if (!res.ok) { const errText = await
res.text().catch(()=>res.statusText); throw new Error(`OCR request failed:
${res.status} ${errText}`); }) before calling res.json(), or instead handle
non-ok by showing the error to the user/state; keep references to BASE_URL, the
/api/ocr-scan endpoint, the res variable and the parsed data when updating state
or UI.
- Line 6: Replace the hardcoded BASE_URL in ScanReportScreen.jsx with the same
environment variable pattern used in HomeScreen.jsx (import { API_URL } from
'@env' or the project's env key) and use that variable wherever BASE_URL is
referenced; ensure you remove the literal 'http://10.72.82.230:5050' and add a
sensible fallback or error if the env var is missing so the app doesn't crash at
runtime.
🧹 Nitpick comments (3)
Backend/agent/llm.py (1)
11-21: Add exception handling for inference failures.

The `llm()` call can raise exceptions (e.g., OOM, malformed input). Unhandled exceptions will propagate and crash the request. The JSON repair logic on line 21 only handles a missing closing brace but won't fix other malformations (missing quotes, truncated keys).

Proposed fix
```diff
 def run_llm(prompt: str) -> str:
     """Actual inference logic for medical extraction."""
+    try:
-    output = llm(
+        output = get_llm()(
             prompt,
             max_tokens=256,
             stop=["}"],
             temperature=0
         )
-    response = output['choices'][0]['text'].strip()
-    # Ensuring valid JSON structure
-    return response + "}" if not response.endswith("}") else response
+        response = output['choices'][0]['text'].strip()
+        # Ensure valid JSON structure
+        if not response.endswith("}"):
+            response += "}"
+        return response
+    except Exception as e:
+        # Return a safe fallback JSON on inference failure
+        return '{"error": "extraction_failed"}'
```

Backend/agent/handlers/appointment.py (2)
84-84: Style: multiple statements on one line.

Per static analysis (E701), split the guard clause for readability.
Proposed fix
```diff
-    if not date_str: return None
+    if not date_str:
+        return None
```
100-103: Style: multiple statements on one line.

Per static analysis (E701), split for readability.
Proposed fix
```diff
     if date_str_lower in day_mapping:
         days_ahead = day_mapping[date_str_lower] - today.weekday()
-        if days_ahead <= 0: days_ahead += 7
+        if days_ahead <= 0:
+            days_ahead += 7
         return (today + timedelta(days=days_ahead)).strftime('%Y-%m-%d')
```
```
# Debugging
ocr_debug.txt
*.logBackend/db/chromadb/chroma.sqlite3
```
Malformed gitignore entry — missing newline.
Line 34 concatenates two patterns: *.log and Backend/db/chromadb/chroma.sqlite3. Neither pattern will match correctly.
Proposed fix
```diff
 # Debugging
 ocr_debug.txt
-*.logBackend/db/chromadb/chroma.sqlite3
+*.log
+Backend/db/chromadb/chroma.sqlite3
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```
# Debugging
ocr_debug.txt
*.log
Backend/db/chromadb/chroma.sqlite3
```
```diff
 def parse_date(date_str):
-    """Parse date string to ISO format."""
-    if not date_str:
-        return None
+    if not date_str: return None
     today = datetime.now()
-    date_str_lower = date_str.lower()
+    date_str_lower = date_str.lower().strip()
 
-    if date_str_lower == 'today':
-        return today.strftime('%Y-%m-%d')
-    elif date_str_lower == 'tomorrow':
-        return (today + timedelta(days=1)).strftime('%Y-%m-%d')
-    elif date_str_lower == 'next week':
-        return (today + timedelta(days=7)).strftime('%Y-%m-%d')
-
-    # Try to parse as MM/DD or MM/DD/YYYY
-    try:
-        if '/' in date_str:
-            parts = date_str.split('/')
-            if len(parts) == 2:
-                month, day = int(parts[0]), int(parts[1])
-                year = today.year
-                if month < today.month or (month == today.month and day < today.day):
-                    year += 1
-                return f"{year}-{month:02d}-{day:02d}"
-            elif len(parts) == 3:
-                month, day, year = int(parts[0]), int(parts[1]), int(parts[2])
-                return f"{year}-{month:02d}-{day:02d}"
-    except:
-        pass
-
-    # Try to parse as YYYY-MM-DD
-    try:
-        datetime.strptime(date_str, '%Y-%m-%d')
-        return date_str
-    except:
-        pass
-
-    return None
+    if date_str_lower == 'next month':
+        next_month = (today.month % 12) + 1
+        year = today.year + (1 if today.month == 12 else 0)
+        return f"{year}-{next_month:02d}-01"
+
+    day_mapping = {
+        'monday': 0, 'mon': 0, 'tuesday': 1, 'tue': 1,
+        'wednesday': 2, 'wed': 2, 'thursday': 3, 'thu': 3, 'thurs': 3,
+        'friday': 4, 'fri': 4, 'saturday': 5, 'sat': 5, 'sunday': 6, 'sun': 6,
+    }
+
+    if date_str_lower in day_mapping:
+        days_ahead = day_mapping[date_str_lower] - today.weekday()
+        if days_ahead <= 0: days_ahead += 7
+        return (today + timedelta(days=days_ahead)).strftime('%Y-%m-%d')
+
+    return date_str
```
Missing "today", "tomorrow", and "next week" parsing — regression risk.
The date_patterns regex (lines 39-41) still matches today, tomorrow, and next week, but parse_date no longer handles them. These inputs will pass through unchanged and be stored as raw strings (e.g., "tomorrow") instead of canonical dates.
Based on learnings, ensure parse_date normalizes all date strings the model may produce.
Proposed fix to restore common date handling
```diff
 def parse_date(date_str):
     if not date_str: return None
     today = datetime.now()
     date_str_lower = date_str.lower().strip()
 
+    if date_str_lower == 'today':
+        return today.strftime('%Y-%m-%d')
+
+    if date_str_lower == 'tomorrow':
+        return (today + timedelta(days=1)).strftime('%Y-%m-%d')
+
+    if date_str_lower == 'next week':
+        return (today + timedelta(days=7)).strftime('%Y-%m-%d')
+
     if date_str_lower == 'next month':
         next_month = (today.month % 12) + 1
         year = today.year + (1 if today.month == 12 else 0)
         return f"{year}-{next_month:02d}-01"
```

📝 Committable suggestion
```python
def parse_date(date_str):
    if not date_str: return None
    today = datetime.now()
    date_str_lower = date_str.lower().strip()

    if date_str_lower == 'today':
        return today.strftime('%Y-%m-%d')

    if date_str_lower == 'tomorrow':
        return (today + timedelta(days=1)).strftime('%Y-%m-%d')

    if date_str_lower == 'next week':
        return (today + timedelta(days=7)).strftime('%Y-%m-%d')

    if date_str_lower == 'next month':
        next_month = (today.month % 12) + 1
        year = today.year + (1 if today.month == 12 else 0)
        return f"{year}-{next_month:02d}-01"

    day_mapping = {
        'monday': 0, 'mon': 0, 'tuesday': 1, 'tue': 1,
        'wednesday': 2, 'wed': 2, 'thursday': 3, 'thu': 3, 'thurs': 3,
        'friday': 4, 'fri': 4, 'saturday': 5, 'sat': 5, 'sunday': 6, 'sun': 6,
    }

    if date_str_lower in day_mapping:
        days_ahead = day_mapping[date_str_lower] - today.weekday()
        if days_ahead <= 0: days_ahead += 7
        return (today + timedelta(days=days_ahead)).strftime('%Y-%m-%d')

    return date_str
```
🧰 Tools
🪛 Ruff (0.14.13)
84-84: Multiple statements on one line (colon)
(E701)
102-102: Multiple statements on one line (colon)
(E701)
```python
llm = Llama(
    model_path="./models/qwen2-0_5b-instruct-q4_k_m.gguf",
    lora_path="./models/adapter_model.bin",
    n_ctx=512,
    n_gpu_layers=-1
)
```
Global model instantiation risks startup failure and blocks import.
Loading the model at module import time has several issues:
- No error handling — if model files are missing, the import crashes the entire application
- Blocks startup — model loading is synchronous and potentially slow
- Relative paths — `"./models/..."` depends on the current working directory, which is fragile
Consider lazy initialization with error handling:
Proposed refactor with lazy loading
```diff
 from llama_cpp import Llama
+import os
-
-llm = Llama(
-    model_path="./models/qwen2-0_5b-instruct-q4_k_m.gguf",
-    lora_path="./models/adapter_model.bin",
-    n_ctx=512,
-    n_gpu_layers=-1
-)
+_llm = None
+
+def get_llm():
+    global _llm
+    if _llm is None:
+        base_dir = os.path.dirname(os.path.abspath(__file__))
+        model_path = os.path.join(base_dir, "..", "models", "qwen2-0_5b-instruct-q4_k_m.gguf")
+        lora_path = os.path.join(base_dir, "..", "models", "adapter_model.bin")
+
+        if not os.path.exists(model_path):
+            raise FileNotFoundError(f"Model not found: {model_path}")
+
+        _llm = Llama(
+            model_path=model_path,
+            lora_path=lora_path if os.path.exists(lora_path) else None,
+            n_ctx=512,
+            n_gpu_layers=-1
+        )
+    return _llm
```
```python
import json
import re
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'/opt/homebrew/bin/tesseract'
```
Hardcoded Tesseract path breaks cross-platform portability.
This macOS-specific path will fail on Linux/Windows deployments. Consider using environment variables or auto-detection.
Suggested fix
```diff
-pytesseract.pytesseract.tesseract_cmd = r'/opt/homebrew/bin/tesseract'
+import shutil
+pytesseract.pytesseract.tesseract_cmd = os.environ.get(
+    'TESSERACT_CMD',
+    shutil.which('tesseract') or r'/opt/homebrew/bin/tesseract'
+)
```
```python
import os
from llama_cpp import Llama
```
Remove duplicate imports.
os and Llama are already imported at lines 2 and 13. As flagged by static analysis (Ruff F811).
Suggested fix
```diff
-import os
-from llama_cpp import Llama
-
 def get_llm():
```

📝 Committable suggestion
```python
def get_llm():
```
🧰 Tools
🪛 Ruff (0.14.13)
34-34: Redefinition of unused os from line 2: os redefined here
Remove definition: os
(F811)
35-35: Redefinition of unused Llama from line 13: Llama redefined here
Remove definition: Llama
(F811)
```diff
     "react-navigation": "^4.4.4",
-    "react-navigation-bottom-tabs": "^1.0.2"
+    "react-navigation-bottom-tabs": "^1.0.2",
+    "tesseract.js": "^7.0.0"
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Search for tesseract usage in frontend code
echo "=== Searching for tesseract imports/requires in JS/TS files ==="
rg 'tesseract' Frontend/src/ --type=js --type=ts -i
echo -e "\n=== Searching for tesseract in all files ==="
rg 'tesseract' Frontend/ -i --exclude-dir=node_modules --exclude-dir=.git
echo -e "\n=== Checking package.json for tesseract ==="
cat Frontend/package.json | grep -i tesseract
```

Repository: AOSSIE-Org/BabyNest
Length of output: 679
🏁 Script executed:

```bash
#!/bin/bash
# Check if tesseract.js is imported anywhere, even indirectly
echo "=== Searching for any 'require' or 'import' statements that might load tesseract ==="
rg 'from.*tesseract|import.*tesseract|require.*tesseract' Frontend/src/ --type=js --type=ts -i
echo -e "\n=== Check OCR-related imports to see what's actually being used ==="
rg 'OCR|ocr' Frontend/src/ --type=js --type=ts -A 2 -B 2
```

Repository: AOSSIE-Org/BabyNest
Length of output: 1853
🏁 Script executed:

```bash
#!/bin/bash
# Get context on when tesseract.js was added and why
echo "=== Check git history for tesseract.js addition ==="
cd Frontend && git log --oneline --all -- package.json | head -20
```

Repository: AOSSIE-Org/BabyNest
Length of output: 272
🏁 Script executed:

```bash
#!/bin/bash
# Get the dependencies section of Frontend/package.json to understand context
cat Frontend/package.json | grep -A 5 -B 5 'tesseract.js'
# Check if there are any comments or documentation about this dependency
echo -e "\n=== Full dependencies and devDependencies sections ==="
jq '.dependencies, .devDependencies' Frontend/package.json 2>/dev/null | head -80
```

Repository: AOSSIE-Org/BabyNest
Length of output: 2520
Remove unused dependency: tesseract.js is not referenced anywhere in the frontend code.
OCR processing is handled entirely server-side via the /api/ocr-scan endpoint. This dependency adds unnecessary bundle size and should be removed unless client-side OCR is planned for future offline support.
```javascript
import * as ImagePicker from 'react-native-image-picker';

// Using your actual machine IP for simulator-to-backend connectivity
const BASE_URL = 'http://10.72.82.230:5050';
```
Hardcoded IP address breaks deployment.
BASE_URL should use environment variables like HomeScreen.jsx does with @env. This will fail on any device not on the same network.
Suggested fix
```diff
-// Using your actual machine IP for simulator-to-backend connectivity
-const BASE_URL = 'http://10.72.82.230:5050';
+import { BASE_URL } from '@env';
```

📝 Committable suggestion
```javascript
import { BASE_URL } from '@env';
```
```javascript
const pickImage = () => {
  ImagePicker.launchImageLibrary({ mediaType: 'photo', quality: 0.5 }, async (response) => {
    if (response.assets) {
      setLoading(true);
      const imageUri = response.assets[0].uri;
      uploadAndScan(imageUri);
    }
  });
};
```
Critical bug: User is forced to select an image twice.
pickImage launches the image library and passes imageUri to uploadAndScan, but uploadAndScan ignores that parameter and launches the image library again (line 61). The user must select the same image twice.
Suggested fix — use includeBase64 in the initial picker call
```diff
 const pickImage = () => {
-  ImagePicker.launchImageLibrary({ mediaType: 'photo', quality: 0.5 }, async (response) => {
+  ImagePicker.launchImageLibrary({ mediaType: 'photo', quality: 0.8, includeBase64: true }, async (response) => {
+    if (response.didCancel || response.errorCode) {
+      return;
+    }
     if (response.assets) {
       setLoading(true);
-      const imageUri = response.assets[0].uri;
-      uploadAndScan(imageUri);
+      const base64Image = response.assets[0].base64;
+      await uploadAndScan(base64Image);
     }
   });
 };

-  const uploadAndScan = async (uri) => {
-    setLoading(true);
-    try {
-      // Base64 conversion using react-native-image-picker capability
-      const options = {
-        mediaType: 'photo',
-        quality: 0.8,
-        includeBase64: true,
-      };
-
-      ImagePicker.launchImageLibrary(options, async (response) => {
-        if (response.didCancel || response.errorCode) {
-          setLoading(false);
-          return;
-        }
-
-        const base64Image = response.assets[0].base64;
-
+  const uploadAndScan = async (base64Image) => {
+    try {
       const res = await fetch(`${BASE_URL}/api/ocr-scan`, {
         method: 'POST',
         headers: {
           'Content-Type': 'application/json',
         },
         body: JSON.stringify({ image: base64Image }),
       });

       const data = await res.json();
       setLoading(false);
       // ... rest of success/error handling
-      });
     } catch (err) {
       setLoading(false);
       Alert.alert("Network Error", "Verify backend is running and reachable.");
     }
   };
```

Also applies to: 51-98
if (weight && weight !== "N/A") {
  await fetch(`${BASE_URL}/api/weight`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ weight: weight, user_id: 'default' }),
  });
}

if (bp && bp !== "N/A") {
  await fetch(`${BASE_URL}/api/blood-pressure`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ bp: bp, user_id: 'default' }),
  });
}
No error handling for failed HTTP responses.
fetch doesn't throw on HTTP errors (4xx/5xx). If the backend returns an error status, the code proceeds as if successful. Check response.ok before parsing JSON.
Suggested fix
if (weight && weight !== "N/A") {
- await fetch(`${BASE_URL}/api/weight`, {
+ const weightRes = await fetch(`${BASE_URL}/api/weight`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ weight: weight, user_id: 'default' }),
});
+ if (!weightRes.ok) throw new Error('Failed to save weight');
}
if (bp && bp !== "N/A") {
- await fetch(`${BASE_URL}/api/blood-pressure`, {
+ const bpRes = await fetch(`${BASE_URL}/api/blood-pressure`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ bp: bp, user_id: 'default' }),
});
+ if (!bpRes.ok) throw new Error('Failed to save blood pressure');
}

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if (weight && weight !== "N/A") {
  const weightRes = await fetch(`${BASE_URL}/api/weight`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ weight: weight, user_id: 'default' }),
  });
  if (!weightRes.ok) throw new Error('Failed to save weight');
}

if (bp && bp !== "N/A") {
  const bpRes = await fetch(`${BASE_URL}/api/blood-pressure`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ bp: bp, user_id: 'default' }),
  });
  if (!bpRes.ok) throw new Error('Failed to save blood pressure');
}
🤖 Prompt for AI Agents
In `@Frontend/src/Screens/ScanReportScreen.jsx` around lines 27 - 41, The two POST
requests that send weight and bp use fetch but don't check HTTP status—update
the weight and bp request blocks to capture the fetch responses (e.g., const
resp = await fetch(...)), verify resp.ok, and handle non-OK cases by
logging/reporting the error or throwing so failures aren't treated as successes;
apply this change to both the weight POST and the blood-pressure POST (the fetch
calls in ScanReportScreen.jsx) and ensure any downstream code awaits/handles
thrown errors or error state appropriately.
const res = await fetch(`${BASE_URL}/api/ocr-scan`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ image: base64Image }),
});

const data = await res.json();
Same issue: missing response.ok check for OCR endpoint.
Suggested fix
const res = await fetch(`${BASE_URL}/api/ocr-scan`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ image: base64Image }),
});
+ if (!res.ok) {
+ throw new Error(`Server error: ${res.status}`);
+ }
+
const data = await res.json();

🤖 Prompt for AI Agents
In `@Frontend/src/Screens/ScanReportScreen.jsx` around lines 69 - 77, The fetch to
`${BASE_URL}/api/ocr-scan` in ScanReportScreen.jsx doesn't check the HTTP
status; add a check on the response (const res = await fetch(...); if (!res.ok)
{ const errText = await res.text().catch(()=>res.statusText); throw new
Error(`OCR request failed: ${res.status} ${errText}`); }) before calling
res.json(), or instead handle non-ok by showing the error to the user/state;
keep references to BASE_URL, the /api/ocr-scan endpoint, the res variable and
the parsed data when updating state or UI.
@pranjal9091
Overview
This PR implements a robust pipeline for scanning medical reports and extracting key vitals (Weight, Blood Pressure, and Appointment dates) into a structured format. The system is optimized for local execution using quantized GGUF models, bridging the gap between raw OCR data and structured health logs.
Key Technical Contributions
-> Advanced OCR Preprocessing: To handle the noise and distortion typical in handheld camera captures, I implemented an OpenCV-based preprocessing layer. This includes grayscale conversion, cubic resizing for better character definition, and Gaussian adaptive thresholding to maximize Tesseract’s extraction accuracy.
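A minimal sketch of that preprocessing chain is below. It assumes `opencv-python` and `pytesseract` are installed; the function names and the `blockSize`/`C` threshold values are illustrative, not necessarily the ones used in this PR:

```python
def upscale_dims(width, height, factor=2):
    """Target dimensions for cubic resizing; larger glyphs improve Tesseract accuracy."""
    return int(width * factor), int(height * factor)

def preprocess_for_ocr(image_path, scale=2):
    """Grayscale read -> cubic upscale -> Gaussian adaptive threshold -> OCR."""
    import cv2          # lazy imports: require opencv-python and pytesseract,
    import pytesseract  # plus the Tesseract binary on the system PATH

    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    h, w = img.shape
    img = cv2.resize(img, upscale_dims(w, h, scale), interpolation=cv2.INTER_CUBIC)
    # Adaptive (Gaussian) thresholding copes with the uneven lighting of phone photos
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                cv2.THRESH_BINARY, 31, 10)
    return pytesseract.image_to_string(img)
```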
-> Custom LLM Inference Engine: Integrated llama-cpp-python to support GGUF-based inference. Successfully implemented a workflow to load custom fine-tuned LoRA adapters optimized for medical terminology over a Qwen-0.5B base model.
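For reference, llama-cpp-python can attach a LoRA adapter at load time via the `Llama` constructor's `lora_path` argument. The sketch below shows the shape of such a loader; the file paths, context size, and decoding settings are illustrative assumptions, not the exact values from this PR:

```python
# Decoding settings biased toward short, deterministic, schema-shaped output
GENERATION_KWARGS = {
    "max_tokens": 256,
    "temperature": 0.1,
}

def load_llm(base_model_path, adapter_path):
    """Load a quantized GGUF base model and attach a fine-tuned LoRA adapter."""
    from llama_cpp import Llama  # lazy import: requires llama-cpp-python
    return Llama(
        model_path=base_model_path,  # e.g. a Qwen-0.5B GGUF quantization
        lora_path=adapter_path,      # medical-terminology LoRA adapter
        n_ctx=2048,                  # context window; illustrative value
    )
```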
-> Structured Data Extraction: Developed a "Few-Shot" prompting strategy with rigid schema enforcement to ensure consistent JSON output. Added a regex-based fallback parser in the backend to maintain data integrity even if the LLM output is partially malformed due to OCR noise.
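A pure-Python sketch of such a fallback parser follows. The field names and regex patterns are illustrative stand-ins for the PR's actual schema:

```python
import json
import re

def parse_llm_output(raw):
    """Try strict JSON first; fall back to regex if the LLM output is malformed."""
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        pass
    # Fallback: recover individual fields from noisy, partially formed text
    weight = re.search(r"(\d+(?:\.\d+)?)\s*kg", raw, re.IGNORECASE)
    bp = re.search(r"(\d{2,3}\s*/\s*\d{2,3})", raw)
    return {
        "weight": weight.group(1) if weight else "N/A",
        "bp": bp.group(1).replace(" ", "") if bp else "N/A",
    }
```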
-> Environment Portability: Refactored all backend model and adapter paths to resolve dynamically via os.path. This ensures the pipeline is reproducible across different development environments without requiring hardcoded system paths.
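The idea can be sketched in a few lines; the `models` directory name here is an assumption for illustration:

```python
import os

# Resolve model assets relative to this file instead of hardcoding system paths.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))

def model_path(*parts):
    """Join path segments under the project's model directory (name assumed)."""
    return os.path.join(BASE_DIR, "models", *parts)
```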
-> Frontend Integration: Developed a dedicated ScanReportScreen in React Native that manages image encoding (Base64) and real-time state updates for displaying extracted values directly to the user.
Technical Trade-offs & Future Scope
-> Model Selection: Opted for Qwen-0.5B to ensure low latency and high memory efficiency on mobile-connected backends.
-> Future Improvements: Planning to experiment with 1.5B/3B models to further increase reasoning depth and implement automated image deskewing (rotation correction) for even higher OCR precision.