Conversation
Warning: Rate limit exceeded

⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered using the `@coderabbitai review` command. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/neuralnav/api/routes/reference_data.py (1)
29-31: ⚠️ Potential issue | 🟠 Major: Internal error details still exposed, contradicting PR objective.
The PR objective states "stop exposing internal error details in API responses," but this file still exposes exception messages to clients via `str(e)` in multiple endpoints:

- Line 31: `detail=str(e)`
- Line 43: `detail=str(e)`
- Line 58: `detail=str(e)`
- Lines 91-92: `detail=f"Failed to load benchmarks: {str(e)}"`
- Line 121: `detail=str(e)`
- Lines 173-174: `detail=f"Failed to load weighted scores: {str(e)}"`

Meanwhile, `database.py` in this same PR uses generic messages like "Failed to load benchmarks" without exposing `str(e)`. This creates an inconsistency and security gap where stack traces or internal state could leak to clients.

🔒 Suggested fix to use generic error messages
  @router.get("/models")
  async def list_models():
      """Get list of available models."""
      try:
          model_catalog = get_model_catalog()
          models = model_catalog.get_all_models()
          return {"models": [model.to_dict() for model in models], "count": len(models)}
      except Exception as e:
-         logger.error(f"Failed to list models: {e}")
-         raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e)) from e
+         logger.error(f"Failed to list models: {e}", exc_info=True)
+         raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail="Failed to list models") from e

Apply similar changes to `list_gpu_types`, `list_use_cases`, `get_benchmarks`, `get_priority_weights`, and `get_weighted_scores`.

Also applies to: 41-43, 56-58, 87-92, 119-121, 169-174
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/neuralnav/api/routes/reference_data.py` around lines 29 - 31, The handlers list_models, list_gpu_types, list_use_cases, get_benchmarks, get_priority_weights, and get_weighted_scores are returning HTTPExceptions with detail=str(e) which leaks internal errors; instead, keep the detailed error in server logs (use logger.exception or logger.error with exc_info) and raise HTTPException with a generic message like "Failed to list models" / "Failed to list GPU types" / "Failed to list use cases" / "Failed to load benchmarks" / "Failed to load priority weights" / "Failed to load weighted scores" respectively; locate the exception blocks in reference_data.py (the except blocks currently doing raise HTTPException(status_code=..., detail=str(e))) and replace detail=str(e) or f"...{str(e)}" with the appropriate generic message while preserving logging of e for diagnostics.
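The pattern this review keeps asking for can be sketched in plain Python. Here `APIError` and `list_models` are hypothetical stand-ins for `fastapi.HTTPException` and the real route handler, not the project's actual code: the full exception stays in the server log, while the client only ever sees a generic detail string.

```python
import logging

logger = logging.getLogger("neuralnav.sketch")


class APIError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def list_models(load_models):
    """Illustrative handler: detailed log server-side, generic detail client-side."""
    try:
        models = load_models()
        return {"models": models, "count": len(models)}
    except Exception as e:
        # Full exception and traceback stay in the server log only.
        logger.error("Failed to list models: %s", e, exc_info=True)
        # The client sees a generic, non-sensitive message.
        raise APIError(status_code=500, detail="Failed to list models") from e
```

With this shape, a failing loader never leaks its message into `APIError.detail`, yet the traceback is still recorded for diagnostics.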
🧹 Nitpick comments (2)
src/neuralnav/knowledge_base/model_catalog.py (1)
316-336: Consider reusing `get_cost_for_provider` for consistency.

The fallback logic `(cost_per_hour_aws if cost_per_hour_aws is not None else cost_per_hour_usd)` duplicates the behavior in `get_cost_for_provider`. If the fallback strategy changes, both locations would need updates.

♻️ Optional refactor to reduce duplication

  "cost_per_hour_total": gpu.cost_per_hour_usd * total_gpus,
  "cost_per_month_base": gpu.cost_per_hour_usd * total_gpus * hours_per_month,
- "cost_per_month_aws": (
-     gpu.cost_per_hour_aws
-     if gpu.cost_per_hour_aws is not None
-     else gpu.cost_per_hour_usd
- )
- * total_gpus
- * hours_per_month,
- "cost_per_month_gcp": (
-     gpu.cost_per_hour_gcp
-     if gpu.cost_per_hour_gcp is not None
-     else gpu.cost_per_hour_usd
- )
- * total_gpus
- * hours_per_month,
- "cost_per_month_azure": (
-     gpu.cost_per_hour_azure
-     if gpu.cost_per_hour_azure is not None
-     else gpu.cost_per_hour_usd
- )
- * total_gpus
- * hours_per_month,
+ "cost_per_month_aws": gpu.get_cost_for_provider("aws") * total_gpus * hours_per_month,
+ "cost_per_month_gcp": gpu.get_cost_for_provider("gcp") * total_gpus * hours_per_month,
+ "cost_per_month_azure": gpu.get_cost_for_provider("azure") * total_gpus * hours_per_month,
  }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/neuralnav/knowledge_base/model_catalog.py` around lines 316 - 336, The cost-per-month calculations duplicate the provider fallback logic; replace the inline ternaries for "cost_per_month_aws", "cost_per_month_gcp", and "cost_per_month_azure" with calls to get_cost_for_provider(gpu, "aws"|"gcp"|"azure") respectively, then multiply the returned hourly cost by total_gpus and hours_per_month (using the existing total_gpus and hours_per_month variables) so the fallback behavior is centralized in get_cost_for_provider.

src/neuralnav/configuration/generator.py (1)
123-124: Consider replacing `assert` with explicit validation.

`assert` statements can be disabled when Python runs with optimization flags (`-O` or `-OO`). If `gpu_config` is `None` in production with assertions disabled, this will silently pass and cause a downstream `AttributeError`.

Proposed fix

- assert gpu_config is not None, "gpu_config is required for template context"
+ if gpu_config is None:
+     raise ValueError("gpu_config is required for template context")

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/neuralnav/configuration/generator.py` around lines 123 - 124, Replace the runtime-only assert with explicit validation for gpu_config: instead of "assert gpu_config is not None, ..." add a check like "if gpu_config is None: raise ValueError('gpu_config is required for template context')" so the code always fails fast even when Python assertions are disabled; update any surrounding code in the function that uses gpu_config (the gpu_config variable/parameter in the template generation routine) to rely on this explicit exception.
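The difference the nitpick points at can be shown in a minimal sketch (function names are illustrative, not the project's actual code): under `python -O` the `assert` body is compiled out entirely, while the explicit check always runs.

```python
def build_context_with_assert(gpu_config):
    # Disappears under `python -O` / `-OO`: a None slips straight through.
    assert gpu_config is not None, "gpu_config is required for template context"
    return {"gpu": gpu_config}


def build_context_validated(gpu_config):
    # Survives optimization flags: the check is ordinary control flow.
    if gpu_config is None:
        raise ValueError("gpu_config is required for template context")
    return {"gpu": gpu_config}
```

Both behave identically under normal interpretation; only the validated version keeps its guarantee when assertions are stripped.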
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/neuralnav/api/routes/configuration.py`:
- Around line 211-215: The except Exception block that currently logs and raises
HTTPException(status_code=404) should instead raise a 500 for unexpected errors:
in the except Exception as e handler (the block that logs "Failed to get
deployment status"), change the HTTPException status_code to
status.HTTP_500_INTERNAL_SERVER_ERROR and include the error detail in the
message; reserve HTTP_404_NOT_FOUND only for explicit "deployment not found"
checks (e.g., when a NotFound/KeyError is detected earlier in the get deployment
status logic). Ensure the logger.error still records the exception and the
raised HTTPException contains a clear message referencing deployment_id and the
original error.
In `@src/neuralnav/api/routes/recommendation.py`:
- Around line 250-255: The handler in recommendation.py currently returns
internal error details to clients by using str(e) in the HTTPException detail;
instead keep the full error in the server log (logger.error(..., exc_info=True))
but change the HTTPException detail to a generic message (e.g. "Internal server
error while generating recommendations") and re-raise the HTTPException without
exposing e. Update the except block around the ranked recommendation generation
(the code using logger.error and raising HTTPException) to only surface a
generic client-safe detail while retaining the original exception for
logging/traceability.
- Around line 146-149: The endpoint currently returns internal exception details
by using str(e) in the HTTPException detail; instead, keep the existing
server-side logging (logger.error(..., exc_info=True)) and change the
HTTPException raised in the recommendation route to a generic client-facing
message (e.g., "Failed to generate recommendation" or "Internal server error")
without including str(e) or exception data; update the raise HTTPException(...)
call (the one using status.HTTP_500_INTERNAL_SERVER_ERROR and variable e) to use
the generic detail string while leaving logging unchanged.
In `@src/neuralnav/api/routes/specification.py`:
- Around line 91-98: Replace the handlers that currently log and return internal
exception details (in get_slo_defaults and the other two endpoints referenced)
with generic client-facing messages and detailed internal logs: change
logger.error(...) to log a generic message like "Failed to get SLO defaults for
{use_case}" (or analogous "Failed to get workload/reference data" for the other
endpoints) and add exc_info=True so the stacktrace is recorded; change the
HTTPException detail to a non-sensitive message such as "Internal server error"
(or "Failed to retrieve SLO/workload data") instead of including str(e) or the
missing key. Apply this same pattern to the three exception blocks that
currently use f"Missing SLO/workload data: {e}" and detail=str(e).
---
Outside diff comments:
In `@src/neuralnav/api/routes/reference_data.py`:
- Around line 29-31: The handlers list_models, list_gpu_types, list_use_cases,
get_benchmarks, get_priority_weights, and get_weighted_scores are returning
HTTPExceptions with detail=str(e) which leaks internal errors; instead, keep the
detailed error in server logs (use logger.exception or logger.error with
exc_info) and raise HTTPException with a generic message like "Failed to list
models" / "Failed to list GPU types" / "Failed to list use cases" / "Failed to
load benchmarks" / "Failed to load priority weights" / "Failed to load weighted
scores" respectively; locate the exception blocks in reference_data.py (the
except blocks currently doing raise HTTPException(status_code=...,
detail=str(e))) and replace detail=str(e) or f"...{str(e)}" with the appropriate
generic message while preserving logging of e for diagnostics.
---
Nitpick comments:
In `@src/neuralnav/configuration/generator.py`:
- Around line 123-124: Replace the runtime-only assert with explicit validation
for gpu_config: instead of "assert gpu_config is not None, ..." add a check like
"if gpu_config is None: raise ValueError('gpu_config is required for template
context')" so the code always fails fast even when Python assertions are
disabled; update any surrounding code in the function that uses gpu_config (the
gpu_config variable/parameter in the template generation routine) to rely on
this explicit exception.
In `@src/neuralnav/knowledge_base/model_catalog.py`:
- Around line 316-336: The cost-per-month calculations duplicate the provider
fallback logic; replace the inline ternaries for "cost_per_month_aws",
"cost_per_month_gcp", and "cost_per_month_azure" with calls to
get_cost_for_provider(gpu, "aws"|"gcp"|"azure") respectively, then multiply the
returned hourly cost by total_gpus and hours_per_month (using the existing
total_gpus and hours_per_month variables) so the fallback behavior is
centralized in get_cost_for_provider.
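The centralization described in this prompt can be sketched as follows. The `GPUType` fields and the `get_cost_for_provider` signature are assumed from the review text, not taken from the actual codebase:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GPUType:
    cost_per_hour_usd: float
    cost_per_hour_aws: Optional[float] = None
    cost_per_hour_gcp: Optional[float] = None
    cost_per_hour_azure: Optional[float] = None

    def get_cost_for_provider(self, provider: str) -> float:
        # Single home for the fallback: provider price if known, else base USD price.
        provider_cost = getattr(self, f"cost_per_hour_{provider}", None)
        return provider_cost if provider_cost is not None else self.cost_per_hour_usd


def monthly_costs(gpu: GPUType, total_gpus: int, hours_per_month: int = 730) -> dict:
    # Each monthly figure reuses the centralized fallback instead of inlining it.
    return {
        f"cost_per_month_{p}": gpu.get_cost_for_provider(p) * total_gpus * hours_per_month
        for p in ("aws", "gcp", "azure")
    }
```

If the fallback strategy ever changes (say, to raise on unknown providers), only `get_cost_for_provider` needs editing.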
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 51c2c618-f8c3-45b1-a1f4-03a710558948
📒 Files selected for processing (22)
- pyproject.toml
- src/neuralnav/api/routes/configuration.py
- src/neuralnav/api/routes/database.py
- src/neuralnav/api/routes/recommendation.py
- src/neuralnav/api/routes/reference_data.py
- src/neuralnav/api/routes/specification.py
- src/neuralnav/configuration/generator.py
- src/neuralnav/knowledge_base/benchmarks.py
- src/neuralnav/knowledge_base/model_catalog.py
- src/neuralnav/knowledge_base/slo_templates.py
- src/neuralnav/llm/ollama_client.py
- src/neuralnav/llm/prompts.py
- src/neuralnav/llm/prompts_experimental.py
- src/neuralnav/orchestration/workflow.py
- src/neuralnav/recommendation/quality/usecase_scorer.py
- src/neuralnav/recommendation/scorer.py
- src/neuralnav/shared/schemas/intent.py
- src/neuralnav/shared/schemas/recommendation.py
- src/neuralnav/shared/schemas/specification.py
- tests/test_postgresql_integration.py
- tests/test_recommendation_workflow.py
- tests/test_yaml_generation.py
except Exception as e:
    logger.error(f"Failed to get deployment status: {e}")
    raise HTTPException(
        status_code=status.HTTP_404_NOT_FOUND, detail=f"Deployment not found: {deployment_id}"
    ) from e
Incorrect HTTP status code for generic exceptions.
Returning 404 Not Found for any exception is semantically incorrect. An unexpected error (e.g., internal bug, external service failure) should return 500 Internal Server Error, not 404. The current code masks genuine server errors as "not found" responses.
Proposed fix
except Exception as e:
logger.error(f"Failed to get deployment status: {e}")
raise HTTPException(
- status_code=status.HTTP_404_NOT_FOUND, detail=f"Deployment not found: {deployment_id}"
+ status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+ detail="Failed to get deployment status",
) from e

🧰 Tools
🪛 Ruff (0.15.5)
[warning] 212-212: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
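The TRY400 hint Ruff raises here can be shown in isolation: inside an `except` block, `logger.exception(...)` records the traceback without an explicit `exc_info=True`. Logger name and message below are illustrative only.

```python
import io
import logging

# Capture log output in memory so the effect is visible.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
logger = logging.getLogger("neuralnav.try400_demo")
logger.addHandler(handler)
logger.setLevel(logging.ERROR)

try:
    raise KeyError("latency_p95")
except KeyError as e:
    # Equivalent to logger.error(..., exc_info=True): message plus traceback.
    logger.exception("Missing SLO data: %s", e)

output = log_stream.getvalue()
```

The captured output contains both the formatted message and the full `Traceback`, which is exactly what the review wants kept server-side.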
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/neuralnav/api/routes/configuration.py` around lines 211 - 215, The except
Exception block that currently logs and raises HTTPException(status_code=404)
should instead raise a 500 for unexpected errors: in the except Exception as e
handler (the block that logs "Failed to get deployment status"), change the
HTTPException status_code to status.HTTP_500_INTERNAL_SERVER_ERROR and include
the error detail in the message; reserve HTTP_404_NOT_FOUND only for explicit
"deployment not found" checks (e.g., when a NotFound/KeyError is detected
earlier in the get deployment status logic). Ensure the logger.error still
records the exception and the raised HTTPException contains a clear message
referencing deployment_id and the original error.
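The status-code split this prompt describes can be sketched with hypothetical stand-ins (`APIError` for `fastapi.HTTPException`, a dict for the real deployment store): catch the specific "not found" case explicitly and map everything else to a generic 500.

```python
class APIError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def get_deployment_status(deployment_id: str, store):
    try:
        return store[deployment_id]  # explicit lookup
    except KeyError as e:
        # A genuinely missing resource -> 404, id is safe to echo.
        raise APIError(404, f"Deployment not found: {deployment_id}") from e
    except Exception as e:
        # Any other failure is a server error -> 500 with a generic detail.
        raise APIError(500, "Failed to get deployment status") from e
```

A broken backing store now surfaces as a 500 instead of masquerading as "not found", and its internal error text never reaches the client.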
raise HTTPException(
    status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
    detail=f"Failed to generate recommendation: {str(e)}",
) from e
Internal error details still exposed in API response.
The PR objective states this change should "stop exposing internal error details in API responses," but this endpoint still includes str(e) in the HTTP 500 response detail. This exposes stack trace information and internal state to clients, which is a security concern.
The error is already logged server-side with exc_info=True (line 145), so client-facing messages should be generic.
🔒 Proposed fix to use generic error message
except Exception as e:
logger.error(f"Failed to generate recommendation: {e}", exc_info=True)
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
- detail=f"Failed to generate recommendation: {str(e)}",
+ detail="Failed to generate recommendation. Please try again or contact support.",
) from e📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
raise HTTPException(
    status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
    detail="Failed to generate recommendation. Please try again or contact support.",
) from e
🧰 Tools
🪛 Ruff (0.15.5)
[warning] 148-148: Use explicit conversion flag
Replace with conversion flag
(RUF010)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/neuralnav/api/routes/recommendation.py` around lines 146 - 149, The
endpoint currently returns internal exception details by using str(e) in the
HTTPException detail; instead, keep the existing server-side logging
(logger.error(..., exc_info=True)) and change the HTTPException raised in the
recommendation route to a generic client-facing message (e.g., "Failed to
generate recommendation" or "Internal server error") without including str(e) or
exception data; update the raise HTTPException(...) call (the one using
status.HTTP_500_INTERNAL_SERVER_ERROR and variable e) to use the generic detail
string while leaving logging unchanged.
except Exception as e:
    logger.error(f"Failed to generate ranked recommendations from spec: {e}", exc_info=True)
    raise HTTPException(
        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
        detail=f"Failed to generate ranked recommendations: {str(e)}",
    ) from e
Same security issue: internal error details exposed.
Same issue as above—str(e) is exposed to API clients in the error response.
🔒 Proposed fix
except Exception as e:
logger.error(f"Failed to generate ranked recommendations from spec: {e}", exc_info=True)
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
- detail=f"Failed to generate ranked recommendations: {str(e)}",
+ detail="Failed to generate ranked recommendations. Please try again or contact support.",
) from e

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except Exception as e:
    logger.error(f"Failed to generate ranked recommendations from spec: {e}", exc_info=True)
    raise HTTPException(
        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
        detail="Failed to generate ranked recommendations. Please try again or contact support.",
    ) from e
🧰 Tools
🪛 Ruff (0.15.5)
[warning] 254-254: Use explicit conversion flag
Replace with conversion flag
(RUF010)
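The RUF010 hint refers to Python's f-string conversion flags: `{e!s}` is the explicit spelling of a redundant `str(e)` call inside an f-string, and the rendered text is identical.

```python
# RUF010: prefer the conversion flag over an explicit str() call in f-strings.
e = ValueError("bad input")
before = f"Failed to generate ranked recommendations: {str(e)}"
after = f"Failed to generate ranked recommendations: {e!s}"
assert before == after  # same rendered message either way
```

This is purely stylistic; it does not change what the review's main finding objects to, namely that the exception text reaches the client at all.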
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/neuralnav/api/routes/recommendation.py` around lines 250 - 255, The
handler in recommendation.py currently returns internal error details to clients
by using str(e) in the HTTPException detail; instead keep the full error in the
server log (logger.error(..., exc_info=True)) but change the HTTPException
detail to a generic message (e.g. "Internal server error while generating
recommendations") and re-raise the HTTPException without exposing e. Update the
except block around the ranked recommendation generation (the code using
logger.error and raising HTTPException) to only surface a generic client-safe
detail while retaining the original exception for logging/traceability.
except KeyError as e:
    logger.error(f"Missing SLO data for {use_case}: {e}")
    raise HTTPException(
        status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=f"Missing SLO data: {e}"
    ) from e
except Exception as e:
    logger.error(f"Failed to get SLO defaults for {use_case}: {e}")
    raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e)) from e
Internal error details still exposed in exception handlers.
Similar to reference_data.py, this file exposes internal details to clients:

- Lines 94, 155, 227: `f"Missing SLO/workload data: {e}"` reveals which configuration key is missing
- Lines 98, 159, 230: `detail=str(e)` exposes the full exception message
For consistency with the PR objective and database.py, use generic messages and add exc_info=True to logging.
🔒 Suggested fix for get_slo_defaults (apply similar pattern to other endpoints)
except KeyError as e:
- logger.error(f"Missing SLO data for {use_case}: {e}")
+ logger.error(f"Missing SLO data for {use_case}: {e}", exc_info=True)
raise HTTPException(
- status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=f"Missing SLO data: {e}"
+ status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail="Invalid SLO configuration"
) from e
except Exception as e:
- logger.error(f"Failed to get SLO defaults for {use_case}: {e}")
- raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e)) from e
+ logger.error(f"Failed to get SLO defaults for {use_case}: {e}", exc_info=True)
+    raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail="Failed to get SLO defaults") from e

Also applies to: 152-159, 223-230
🧰 Tools
🪛 Ruff (0.15.5)
[warning] 92-92: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
[warning] 97-97: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/neuralnav/api/routes/specification.py` around lines 91 - 98, Replace the
handlers that currently log and return internal exception details (in
get_slo_defaults and the other two endpoints referenced) with generic
client-facing messages and detailed internal logs: change logger.error(...) to
log a generic message like "Failed to get SLO defaults for {use_case}" (or
analogous "Failed to get workload/reference data" for the other endpoints) and
add exc_info=True so the stacktrace is recorded; change the HTTPException detail
to a non-sensitive message such as "Internal server error" (or "Failed to
retrieve SLO/workload data") instead of including str(e) or the missing key.
Apply this same pattern to the three exception blocks that currently use
f"Missing SLO/workload data: {e}" and detail=str(e).
Replace f-string error details with generic messages in HTTP exceptions to avoid leaking stack traces and internal state to clients. Errors are still logged server-side with exc_info=True. Signed-off-by: Amit Oren <amoren@redhat.com>
Force-pushed from 99c0b20 to aa7e703
Summary by CodeRabbit
Release Notes
Bug Fixes
Improvements