Nebula: Introduce Nebula Transcription Service #108
Conversation
Nebula: Lecture transcription using Whisper and GPT-4o
Nebula: Introduce Nebula Transcription Service (Local Whisper + GPT Vision Integration)
The first version looks good in my opinion. I added some comments. Also consider adding some doc comments, and maybe split the functionality across more files instead of keeping it all in one file. Please also have a look at the linting check, as some tests are failing.
Added security.py and health.py in Nebula so other services can use them.
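For illustration only — a minimal sketch of what a shared health module could look like and how another service might mount it. The router-based layout and function body here are assumptions; the actual health.py may expose a plain handler instead.

```python
# nebula/health.py (hypothetical sketch)
from fastapi import APIRouter

router = APIRouter()


@router.get("/health")
async def health():
    # Report basic liveness; extend with dependency checks as needed
    return {"status": "ok"}
```

A service would then include it with something like `app.include_router(router)` in its FastAPI app.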
Actionable comments posted: 0
♻️ Duplicate comments (1)
nebula/src/nebula/transcript/README.md (1)
`95-115`: Clarify and number local run steps

Running Locally lacks prerequisites and ordered steps — this was flagged in earlier feedback.

````diff
 ## ▶️ Running Locally
+### Prerequisites
+- Ensure you’re in the `nebula` directory
+- FFmpeg is installed and in your PATH
+
+### Steps
 ```bash
-# Set environment variable
+# 1. Set environment variables
 # Windows PowerShell
 $env:APPLICATION_YML_PATH=...
 # macOS / Linux
 export APPLICATION_YML_PATH=...

-# Run the transcription service
+# 2. Run the transcription service
 poetry run uvicorn nebula.transcript.app:app --reload --port 5000

-# In a separate terminal, run the gateway
+# 3. In a separate terminal, run the gateway
 poetry run uvicorn nebula.gateway.main:app --reload --port 8000
 ```
````

🧹 Nitpick comments (2)

nebula/src/nebula/transcript/README.md (2)

`54-54`: Wrap bare URL in markdown

The raw URL triggers MD034; wrap it in angle brackets or use a link label.

```diff
-- Download from https://ffmpeg.org/download.html (or use chocolatey: choco install ffmpeg)
+- Download from <https://ffmpeg.org/download.html> (or install via Chocolatey: `choco install ffmpeg`)
```

`133-133`: Add language to fenced code block

The project-structure tree needs a code-fence language to satisfy MD040. For example:

````diff
-```
+```text
 nebula/
 ├── docker/   # …
````
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- nebula/.gitignore (1 hunks)
- nebula/src/nebula/transcript/README.md (1 hunks)
- nebula/src/nebula/transcript/config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- nebula/.gitignore
- nebula/src/nebula/transcript/config.py
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
nebula/src/nebula/transcript/README.md
54-54: Bare URL used
null
(MD034, no-bare-urls)
133-133: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
🔇 Additional comments (2)
nebula/src/nebula/transcript/README.md (2)
`67-71`: Approve: Explicit paths for example configs
Specifying the full paths relative to the repo root makes it crystal-clear where to find and copy these YAMLs.
`73-76`: Approve: Explicit paths for LLM config

Great improvement — users no longer have to guess where llm_config.example.nebula.yml lives.
…nebula/feature/transcript
Actionable comments posted: 11
🧹 Nitpick comments (12)
nebula/src/nebula/gateway/grpc_server.py (1)
`12-12`: Fix formatting: Add required blank lines before function definition.

```diff
+
 def serve():
```

nebula/src/nebula/faq/faq_server.py (2)

`12-12`: Fix formatting: Add required blank lines before function definition.

```diff
+
 def serve():
```

`22-23`: Fix formatting: Add required blank lines after function definition.

```diff
 if __name__ == "__main__":
     serve()
+
```

nebula/src/nebula/faq/rewriter_servicer.py (1)

`7-7`: Fix formatting: Add required blank lines before class definition.

```diff
+
 class FAQRewriterService(faq_pb2_grpc.FAQServiceServicer):
```
nebula/docker/faq/Dockerfile (2)
`7-7`: Consider installing the root package for proper module resolution.

The `--no-root` flag might cause import issues. If the package needs to be importable, consider removing this flag.

```diff
-RUN pip install poetry && poetry install --only main --no-root
+RUN pip install poetry && poetry install --only main
```

`29-29`: Remove unnecessary empty line.
nebula/src/nebula/gateway/faq_handler.py (2)
`12-12`: Remove unnecessary empty line.

`28-33`: Improve error handling and avoid information leakage.

The current error handling is good, but consider providing more specific error messages and avoid exposing internal details.

```diff
 except Exception as e:
     # Log and return internal server error in case of failure
     logger.error("❌ Failed to forward request: %s", str(e))
-    context.set_details(str(e))
+    context.set_details("Failed to process FAQ rewriting request")
     context.set_code(grpc.StatusCode.INTERNAL)
-    return faq_pb2.FaqRewritingResponse(result="Internal error while forwarding request")
+    return faq_pb2.FaqRewritingResponse(result="Service temporarily unavailable. Please try again later.")
```

nebula/src/nebula/protos/faq.proto (3)
`5-7`: Consider adding explicit error handling RPCs / Status

A single unary RPC is fine for an MVP, but production gRPC APIs often return `google.rpc.Status` details or define `option (google.api.http)` mappings for gateway translation. Evaluate whether the `RewriteFAQ` call should:

- Return a `FaqRewritingResponse` wrapped in a `oneof { FaqRewritingResponse ok = 1; google.rpc.Status error = 2; }`, as sketched below
- Or propagate errors via gRPC status codes with rich details.

Not required for merge but worth planning.
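For illustration only — a minimal sketch of the first option. The request/response messages below are simplified stand-ins for the real definitions in faq.proto, and the import assumes the googleapis protos are on the include path.

```proto
syntax = "proto3";

package faqservice;

import "google/rpc/status.proto";  // assumes googleapis protos are available

// Simplified stand-ins for the existing messages in faq.proto.
message FaqRewritingRequest {
  string input_text = 1;
}

message FaqRewritingResponse {
  string result = 1;
}

// Hypothetical wrapper carrying either a successful result or a rich error.
message FaqRewritingResult {
  oneof outcome {
    FaqRewritingResponse ok = 1;
    google.rpc.Status error = 2;
  }
}

service FAQService {
  // The RPC would then return the wrapper instead of the bare response.
  rpc RewriteFAQ(FaqRewritingRequest) returns (FaqRewritingResult);
}
```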
`9-12`: Nit: follow Proto style for acronyms

Google style recommends `FAQ` ➜ `Faq` (only the first letter capitalised) so message names remain PascalCase. Consistency aids code-gen readability.

```diff
-message FAQ {
+message Faq {
   string question_title = 1;
   string question_answer = 2;
 }
```

Update the repeated field type accordingly.
`14-17`: Field naming: plural collection already implied

`repeated FAQ faqs = 1;` – the field name could be a simple `faq_list` or `items` because plurals are implicit in `repeated`. Not blocking, just readability.

nebula/docker-compose.yml (1)
`20-37`: Add health-checks for gateway & transcriber

`restart: unless-stopped` can mask crash-loops if the service starts but becomes unhealthy. Define simple health-checks (e.g. `curl -f http://localhost:5000/health`) so orchestrators can surface issues early.

```yaml
transcriber:
  ...
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:5000/health"]
    interval: 30s
    timeout: 5s
    retries: 3
```

Mirror the pattern for `gateway` and `faq`.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
nebula/poetry.lock is excluded by `!**/*.lock`
📒 Files selected for processing (10)
- nebula/docker-compose.yml (1 hunks)
- nebula/docker/faq/Dockerfile (1 hunks)
- nebula/docker/gateway/Dockerfile (1 hunks)
- nebula/pyproject.toml (1 hunks)
- nebula/src/nebula/faq/faq_server.py (1 hunks)
- nebula/src/nebula/faq/rewriter_servicer.py (1 hunks)
- nebula/src/nebula/gateway/faq_handler.py (1 hunks)
- nebula/src/nebula/gateway/grpc_server.py (1 hunks)
- nebula/src/nebula/gateway/main.py (1 hunks)
- nebula/src/nebula/protos/faq.proto (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- nebula/src/nebula/gateway/main.py
- nebula/docker/gateway/Dockerfile
- nebula/pyproject.toml
🧰 Additional context used
🪛 Flake8 (7.2.0)
nebula/src/nebula/gateway/grpc_server.py
[error] 12-12: expected 2 blank lines, found 1
(E302)
[error] 17-17: block comment should start with '# '
(E265)
nebula/src/nebula/faq/faq_server.py
[error] 12-12: expected 2 blank lines, found 1
(E302)
[error] 22-22: expected 2 blank lines after class or function definition, found 1
(E305)
nebula/src/nebula/faq/rewriter_servicer.py
[error] 7-7: expected 2 blank lines, found 1
(E302)
🪛 Pylint (3.3.7)
nebula/src/nebula/gateway/grpc_server.py
[error] 5-5: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
nebula/src/nebula/faq/faq_server.py
[error] 5-5: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
nebula/src/nebula/gateway/faq_handler.py
[error] 4-4: No name 'faq_pb2' in module 'nebula.grpc_stubs'
(E0611)
[error] 4-4: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
[refactor] 13-13: Too few public methods (1/2)
(R0903)
nebula/src/nebula/faq/rewriter_servicer.py
[error] 3-3: No name 'faq_pb2' in module 'nebula.grpc_stubs'
(E0611)
[error] 3-3: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
[refactor] 7-7: Too few public methods (1/2)
(R0903)
🪛 Buf (1.54.0)
nebula/src/nebula/protos/faq.proto
3-3: Files with package "faqservice" must be within a directory "faqservice" relative to root but were in directory "nebula/src/nebula/protos".
(PACKAGE_DIRECTORY_MATCH)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Lint
- GitHub Check: Lint
🔇 Additional comments (4)
nebula/src/nebula/faq/rewriter_servicer.py (1)
`11-14`: LGTM! Mocked implementation is appropriate for initial development.

The placeholder logic is clearly documented and provides a reasonable response structure for testing.
nebula/docker/faq/Dockerfile (1)
`11-11`: Verify if the configuration file is actually used.

The configuration file is copied, but it's unclear whether the FAQ service uses it. Consider removing it if unnecessary.

```bash
#!/bin/bash
# Check if the configuration file is referenced in the FAQ service code
rg -A 3 -B 3 "application_local\.nebula\.yml" nebula/src/nebula/faq/
```

nebula/src/nebula/gateway/faq_handler.py (1)
`14-14`: Fix critical type annotation error.

The return type annotation is incorrect – it should return `FaqRewritingResponse`, not `FaqRewritingRequest`.

```diff
-    def RewriteFAQ(self, request: faq_pb2.FaqRewritingRequest, context: ServicerContext) -> faq_pb2.FaqRewritingResponse:
+    def RewriteFAQ(self, request: faq_pb2.FaqRewritingRequest, context: ServicerContext) -> faq_pb2.FaqRewritingResponse:
```

Likely an incorrect or invalid review comment.
nebula/docker-compose.yml (1)
`38-49`: Minor: environment variable consistency

`faq` lacks `LLM_CONFIG_PATH`, yet `rewriter_servicer.py` might rely on the same config. Double-check and add it if required to avoid run-time KeyErrors; a possible addition is sketched below.
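For illustration, a minimal sketch of what that addition could look like in nebula/docker-compose.yml, assuming the `faq` service reads `LLM_CONFIG_PATH` the same way the other services do (the mount path shown is an assumption):

```yaml
faq:
  # ...existing build/image settings...
  environment:
    # Assumed variable name; point it at the mounted LLM config
    LLM_CONFIG_PATH: /config/llm_config.nebula.yml
  volumes:
    - ./llm_config.nebula.yml:/config/llm_config.nebula.yml:ro
```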
logger = logging.getLogger("nebula.gateway") | ||
logging.basicConfig(level=logging.INFO) |
🛠️ Refactor suggestion
Avoid overriding global logging configuration.
The basicConfig
call may interfere with other logging configurations in the application. Consider using a more targeted approach or checking if logging is already configured.
-logger = logging.getLogger("nebula.gateway")
-logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger("nebula.gateway")
+if not logger.handlers:
+ handler = logging.StreamHandler()
+ handler.setFormatter(logging.Formatter('%(asctime)s [%(levelname)s] %(name)s: %(message)s'))
+ logger.addHandler(handler)
+ logger.setLevel(logging.INFO)
🤖 Prompt for AI Agents
In nebula/src/nebula/gateway/grpc_server.py around lines 9 to 10, avoid calling
logging.basicConfig directly as it overrides global logging settings and may
interfere with other configurations. Instead, check if logging is already
configured before calling basicConfig or configure the logger specifically by
setting its level and handlers without using basicConfig to prevent unintended
side effects.
```python
# FAQ-Handler registrieren
faq_pb2_grpc.add_FAQServiceServicer_to_server(FAQServiceHandler(), server)
#(später) Weitere Handler registrieren
# transcription_pb2_grpc.add_TranscriptionServiceServicer_to_server(TranscriptionServiceHandler(), server)
logger.info("🔗 gRPC-Handler für FAQ rewriting registriert")
server.add_insecure_port("[::]:50051")
logger.info("🚀 gRPC server run on Port 50051")
```
🛠️ Refactor suggestion
Address multiple issues: Comment formatting, language consistency, and security.
The code has mixed German/English comments and a potential security concern with the server binding to all interfaces.
- # FAQ-Handler registrieren
+ # Register FAQ handler
faq_pb2_grpc.add_FAQServiceServicer_to_server(FAQServiceHandler(), server)
- #(später) Weitere Handler registrieren
+ # TODO: Register additional handlers later
# transcription_pb2_grpc.add_TranscriptionServiceServicer_to_server(TranscriptionServiceHandler(), server)
- logger.info("🔗 gRPC-Handler für FAQ rewriting registriert")
- server.add_insecure_port("[::]:50051")
- logger.info("🚀 gRPC server run on Port 50051")
+ logger.info("🔗 gRPC handler for FAQ rewriting registered")
+ server.add_insecure_port("0.0.0.0:50051") # Consider using localhost for security
+ logger.info("🚀 gRPC server running on port 50051")
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 17-17: block comment should start with '# '
(E265)
🤖 Prompt for AI Agents
In nebula/src/nebula/gateway/grpc_server.py lines 15 to 21, unify all comments
to English for consistency and clarity, replacing German comments with English
equivalents. Additionally, instead of binding the gRPC server to all interfaces
with "[::]:50051", restrict the binding to localhost or a specific secure
interface to avoid exposing the server publicly. Update the comments accordingly
to reflect these changes.
nebula/src/nebula/faq/faq_server.py
Outdated
logger = logging.getLogger("nebula.faq.server") | ||
logging.basicConfig(level=logging.INFO) |
🛠️ Refactor suggestion
Avoid overriding global logging configuration.
Same issue as in the gateway server - the basicConfig
call may interfere with application-wide logging setup.
-logger = logging.getLogger("nebula.faq.server")
-logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger("nebula.faq.server")
+if not logger.handlers:
+ handler = logging.StreamHandler()
+ handler.setFormatter(logging.Formatter('%(asctime)s [%(levelname)s] %(name)s: %(message)s'))
+ logger.addHandler(handler)
+ logger.setLevel(logging.INFO)
🤖 Prompt for AI Agents
In nebula/src/nebula/faq/faq_server.py around lines 9 to 10, remove the call to
logging.basicConfig(level=logging.INFO) to avoid overriding the global logging
configuration. Instead, configure the logger instance directly if needed without
affecting the global logging setup.
nebula/src/nebula/faq/faq_server.py
Outdated
```python
server.add_insecure_port("[::]:50052")
logging.info("🚀 FAQ gRPC server running on port 50052")
server.start()
server.wait_for_termination()
```
🛠️ Refactor suggestion
Add error handling and improve server binding security.
Consider binding to localhost instead of all interfaces for better security in containerized environments.
- server.add_insecure_port("[::]:50052")
- logging.info("🚀 FAQ gRPC server running on port 50052")
+ server.add_insecure_port("0.0.0.0:50052") # Explicit binding for containers
+ logger.info("🚀 FAQ gRPC server running on port 50052")
+
+ try:
server.start()
+ logger.info("✅ FAQ gRPC server started successfully")
server.wait_for_termination()
+ except Exception as e:
+ logger.error("❌ Failed to start FAQ gRPC server: %s", str(e))
+ raise
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In nebula/src/nebula/faq/faq_server.py around lines 17 to 20, the server is
currently binding to all network interfaces using "[::]:50052", which can expose
it unnecessarily. Change the binding address to "localhost:50052" to restrict
access to the local machine for improved security. Additionally, add error
handling around the server start and wait calls to catch and log exceptions,
ensuring the server fails gracefully if binding or startup encounters issues.
```python
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
)
```
🛠️ Refactor suggestion
Avoid overriding global logging configuration.
The basicConfig
call may interfere with other logging configurations. Consider a more targeted approach.
-logging.basicConfig(
- level=logging.INFO,
- format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
-)
+# Configure logger only if not already configured
+if not logger.handlers:
+ handler = logging.StreamHandler()
+ handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s"))
+ logger.addHandler(handler)
+ logger.setLevel(logging.INFO)
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In nebula/src/nebula/gateway/faq_handler.py around lines 7 to 10, avoid using
logging.basicConfig as it overrides global logging settings and can interfere
with other configurations. Instead, create a logger instance specific to this
module using logging.getLogger(__name__) and configure its level and handlers
directly without calling basicConfig. This approach ensures the logging setup is
localized and does not affect global logging behavior.
Actionable comments posted: 0
♻️ Duplicate comments (5)
nebula/src/nebula/gateway/grpc_server.py (2)
`10-13`: Avoid overriding global logging configuration.

The `basicConfig` call may interfere with other logging configurations in the application. Consider using a more targeted approach or checking if logging is already configured.

```diff
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
-)
+if not logger.handlers:
+    handler = logging.StreamHandler()
+    handler.setFormatter(logging.Formatter('%(asctime)s [%(levelname)s] %(name)s: %(message)s'))
+    logger.addHandler(handler)
+    logger.setLevel(logging.INFO)
```

`26-27`: Add graceful shutdown handling and proper error handling.

The server lacks graceful shutdown mechanisms and error handling for startup failures.

```diff
+    try:
         server.start()
+        logger.info("✅ gRPC server started successfully")
         server.wait_for_termination()
+    except Exception as e:
+        logger.error("❌ Failed to start gRPC server: %s", str(e))
+        raise
+    finally:
+        logger.info("🛑 gRPC server shutting down")
```

nebula/src/nebula/faq/faq_server.py (2)

`10-13`: Avoid overriding global logging configuration.

Same issue as in the gateway server – the `basicConfig` call may interfere with application-wide logging setup.

```diff
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
-)
+if not logger.handlers:
+    handler = logging.StreamHandler()
+    handler.setFormatter(logging.Formatter('%(asctime)s [%(levelname)s] %(name)s: %(message)s'))
+    logger.addHandler(handler)
+    logger.setLevel(logging.INFO)
```

`26-27`: Add error handling and improve server reliability.

The server lacks proper error handling for startup failures and graceful shutdown mechanisms.

```diff
+    try:
         server.start()
+        logger.info("✅ FAQ gRPC server started successfully")
         server.wait_for_termination()
+    except Exception as e:
+        logger.error("❌ Failed to start FAQ gRPC server: %s", str(e))
+        raise
+    finally:
+        logger.info("🛑 FAQ gRPC server shutting down")
```

nebula/src/nebula/faq/rewriter_servicer.py (1)

`6-9`: Avoid overriding global logging configuration.

The `basicConfig` call may interfere with other logging configurations in the application.

```diff
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
-)
+if not logger.handlers:
+    handler = logging.StreamHandler()
+    handler.setFormatter(logging.Formatter('%(asctime)s [%(levelname)s] %(name)s: %(message)s'))
+    logger.addHandler(handler)
+    logger.setLevel(logging.INFO)
```
🧹 Nitpick comments (4)
nebula/src/nebula/gateway/grpc_server.py (1)
`18-18`: Fix formatting: Add required blank lines before function definition.

Static analysis indicates missing blank lines before the function definition.

```diff
+
 def serve():
```

nebula/src/nebula/faq/faq_server.py (2)

`18-18`: Fix formatting: Add required blank lines before function definition.

Static analysis indicates missing blank lines before the function definition.

```diff
+
 def serve():
```

`29-30`: Fix formatting: Add required blank lines after function definition.

Static analysis indicates missing blank lines after the function definition.

```diff
     logger.info(f"FAQ gRPC server running on port {FAQ_SERVICE_PORT}")
     server.start()
     server.wait_for_termination()
+
 if __name__ == "__main__":
```

nebula/src/nebula/faq/rewriter_servicer.py (1)

`11-11`: Fix formatting: Add required blank lines before class definition.

Static analysis indicates missing blank lines before the class definition.

```diff
+
 class FAQRewriterService(faq_pb2_grpc.FAQServiceServicer):
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- nebula/docker-compose.yml (1 hunks)
- nebula/src/nebula/faq/faq_server.py (1 hunks)
- nebula/src/nebula/faq/rewriter_servicer.py (1 hunks)
- nebula/src/nebula/gateway/faq_handler.py (1 hunks)
- nebula/src/nebula/gateway/grpc_server.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- nebula/docker-compose.yml
🧰 Additional context used
🪛 Flake8 (7.2.0)
nebula/src/nebula/faq/faq_server.py
[error] 18-18: expected 2 blank lines, found 1
(E302)
[error] 29-29: expected 2 blank lines after class or function definition, found 1
(E305)
nebula/src/nebula/faq/rewriter_servicer.py
[error] 11-11: expected 2 blank lines, found 1
(E302)
nebula/src/nebula/gateway/grpc_server.py
[error] 18-18: expected 2 blank lines, found 1
(E302)
🪛 Pylint (3.3.7)
nebula/src/nebula/faq/faq_server.py
[error] 6-6: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
nebula/src/nebula/faq/rewriter_servicer.py
[error] 3-3: No name 'faq_pb2' in module 'nebula.grpc_stubs'
(E0611)
[error] 3-3: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
[refactor] 11-11: Too few public methods (1/2)
(R0903)
nebula/src/nebula/gateway/faq_handler.py
[error] 5-5: No name 'faq_pb2' in module 'nebula.grpc_stubs'
(E0611)
[error] 5-5: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
[refactor] 15-15: Too few public methods (1/2)
(R0903)
nebula/src/nebula/gateway/grpc_server.py
[error] 6-6: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Lint
- GitHub Check: Lint
🔇 Additional comments (4)
nebula/src/nebula/faq/rewriter_servicer.py (1)
`15-18`: Mock implementation is appropriate for initial development.

The placeholder implementation with mocked response is suitable for the initial development phase. The TODO comment clearly indicates future enhancement plans.
nebula/src/nebula/gateway/faq_handler.py (3)
`7-12`: Good configuration management with environment variables.

The implementation properly uses environment variables for service configuration, addressing previous feedback about hardcoded addresses. The default values provide sensible fallbacks.
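As a reference for the pattern being praised here — a minimal sketch, assuming the handler resolves the downstream address roughly like this (the default value is an assumption):

```python
import os

# Downstream FAQ service address, overridable via environment with a sensible fallback
FAQ_SERVICE_ADDRESS = os.getenv("FAQ_SERVICE_ADDRESS", "faq:50052")
```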
`19-30`: Excellent error handling and gRPC status management.

The error handling implementation is comprehensive:
- Proper exception catching and logging
- Appropriate gRPC status codes and details
- Graceful fallback with meaningful error response
This demonstrates good practices for gRPC service integration.
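For readers outside the diff, a minimal sketch of the error-handling pattern being described, assuming the generated faq_pb2/faq_pb2_grpc stubs; the downstream address and exact wording are assumptions, not the handler's actual code.

```python
import logging

import grpc
from grpc import ServicerContext

from nebula.grpc_stubs import faq_pb2, faq_pb2_grpc

logger = logging.getLogger("nebula.gateway")


class FAQServiceHandler(faq_pb2_grpc.FAQServiceServicer):
    """Forwards FAQ rewriting requests to the downstream FAQ service."""

    def RewriteFAQ(self, request, context: ServicerContext) -> faq_pb2.FaqRewritingResponse:
        try:
            # Address is an assumption; in practice it comes from FAQ_SERVICE_ADDRESS
            with grpc.insecure_channel("faq:50052") as channel:
                stub = faq_pb2_grpc.FAQServiceStub(channel)
                return stub.RewriteFAQ(request)
        except Exception as e:
            # Log the failure, attach a gRPC status, and return a graceful fallback
            logger.error("Failed to forward request: %s", e)
            context.set_details("Failed to process FAQ rewriting request")
            context.set_code(grpc.StatusCode.INTERNAL)
            return faq_pb2.FaqRewritingResponse(result="Service temporarily unavailable.")
```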
`3-5`:

```bash
#!/bin/bash
set -e

echo "==> Full protoc invocation in nebula/docker/gateway/Dockerfile (lines 1–30):"
sed -n '1,30p' nebula/docker/gateway/Dockerfile

echo -e "\n==> Full protoc invocation in nebula/docker/faq/Dockerfile (lines 1–30):"
sed -n '1,30p' nebula/docker/faq/Dockerfile
```
Actionable comments posted: 3
♻️ Duplicate comments (2)
nebula/src/nebula/gateway/grpc_server.py (2)
`9-13`: Avoid overriding global logging configuration.

The `basicConfig` call may interfere with other logging configurations in the application. Consider using a more targeted approach.

```diff
 logger = logging.getLogger("nebula.gateway")
-logging.basicConfig(
-    level=logging.INFO,
-    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
-)
+if not logger.handlers:
+    handler = logging.StreamHandler()
+    handler.setFormatter(logging.Formatter('%(asctime)s [%(levelname)s] %(name)s: %(message)s'))
+    logger.addHandler(handler)
+    logger.setLevel(logging.INFO)
```

`18-26`: Add graceful shutdown handling and error management.

The server lacks proper shutdown mechanisms and error handling for startup failures.

```diff
+import signal
+import sys
+
 def serve():
     server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
-    # Register FAQ handler
     faq_pb2_grpc.add_FAQServiceServicer_to_server(FAQServiceHandler(), server)
     logger.info("Registered gRPC handler for FAQ rewriting")
-    server.add_insecure_port(f"[::]:{GATEWAY_SERVICE_PORT}")
-    logger.info(f"gRPC server running on port {GATEWAY_SERVICE_PORT}")
-    server.start()
-    server.wait_for_termination()
+
+    # Bind to localhost by default for security
+    bind_address = os.getenv("GATEWAY_BIND_ADDRESS", "127.0.0.1")
+    server.add_insecure_port(f"{bind_address}:{GATEWAY_SERVICE_PORT}")
+
+    def handle_shutdown(signum, frame):
+        logger.info("Received shutdown signal, stopping server...")
+        server.stop(grace=5)
+        sys.exit(0)
+
+    signal.signal(signal.SIGINT, handle_shutdown)
+    signal.signal(signal.SIGTERM, handle_shutdown)
+
+    try:
+        server.start()
+        logger.info(f"gRPC server running on {bind_address}:{GATEWAY_SERVICE_PORT}")
+        server.wait_for_termination()
+    except Exception as e:
+        logger.error("Failed to start gRPC server: %s", str(e))
+        raise
```
🧹 Nitpick comments (6)
nebula/src/nebula/gateway/grpc_server.py (1)
`17-18`: Add missing blank line for PEP 8 compliance.

```diff
 GATEWAY_SERVICE_PORT = os.getenv("GATEWAY_SERVICE_PORT", "50051")
+
 def serve():
```

nebula/src/nebula/llm/llm_config.py (1)

`1-25`: Apply formatting fixes for pipeline compliance.

The file needs formatting adjustments as indicated by pipeline failures. Run the following to fix formatting issues:

```bash
isort nebula/src/nebula/llm/llm_config.py
echo >> nebula/src/nebula/llm/llm_config.py  # Add newline at end of file
```

nebula/src/nebula/faq/faq_llm_utils.py (1)

`1-3`: Fix formatting issues.

```diff
 from nebula.llm.openai_client import get_openai_client
+
 def rewrite_faq_text(input_text: str) -> str:
```

nebula/src/nebula/llm/openai_client.py (1)

`1-16`: Apply formatting fixes.

```diff
-from openai import AzureOpenAI
 from nebula.llm.llm_config import load_llm_config
+from openai import AzureOpenAI
+
 def get_openai_client(llm_id="azure-gpt-4-omni"):
```

Add a newline at the end of the file as well.

nebula/src/nebula/faq/rewriter_servicer.py (2)

`14-15`: Add class docstring and fix formatting.

```diff
+
+
 class FAQRewriterService(faq_pb2_grpc.FAQServiceServicer):
+    """gRPC service for rewriting FAQ text using Azure OpenAI."""
+
     def RewriteFAQ(self, request: faq_pb2.FaqRewritingRequest, context: ServicerContext) -> faq_pb2.FaqRewritingResponse:
```

`16-16`: Use lazy formatting in logging.

```diff
-        logger.info(f"Received RewriteFAQ request with input text: '{request.input_text}' and {len(request.faqs)} FAQ(s)")
+        logger.info("Received RewriteFAQ request with input text: '%s' and %d FAQ(s)",
+                    request.input_text, len(request.faqs))
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- nebula/docker-compose.yml (1 hunks)
- nebula/docker/faq/Dockerfile (1 hunks)
- nebula/src/nebula/faq/faq_llm_utils.py (1 hunks)
- nebula/src/nebula/faq/prompts/rewrite_faq_prompt.py (1 hunks)
- nebula/src/nebula/faq/rewriter_servicer.py (1 hunks)
- nebula/src/nebula/gateway/grpc_server.py (1 hunks)
- nebula/src/nebula/llm/llm_config.py (1 hunks)
- nebula/src/nebula/llm/openai_client.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- nebula/src/nebula/faq/prompts/rewrite_faq_prompt.py
🚧 Files skipped from review as they are similar to previous changes (2)
- nebula/docker/faq/Dockerfile
- nebula/docker-compose.yml
🧰 Additional context used
🪛 Flake8 (7.2.0)
nebula/src/nebula/llm/openai_client.py
[error] 4-4: expected 2 blank lines, found 1
(E302)
[error] 14-14: expected 2 blank lines, found 1
(E302)
nebula/src/nebula/faq/rewriter_servicer.py
[error] 14-14: expected 2 blank lines, found 1
(E302)
nebula/src/nebula/faq/faq_llm_utils.py
[error] 3-3: expected 2 blank lines, found 1
(E302)
nebula/src/nebula/gateway/grpc_server.py
[error] 18-18: expected 2 blank lines, found 1
(E302)
🪛 GitHub Actions: Nebula - Test
nebula/src/nebula/llm/openai_client.py
[error] 1-1: Black formatting check failed. File was reformatted by black.
[error] 1-1: isort hook modified this file to fix import sorting.
nebula/src/nebula/faq/rewriter_servicer.py
[error] 1-1: Black formatting check failed. File was reformatted by black.
[error] 1-1: isort hook modified this file to fix import sorting.
[warning] 15-15: pylint C0115: Missing class docstring (missing-class-docstring).
[warning] 19-19: pylint W1203: Use lazy % formatting in logging functions (logging-fstring-interpolation).
[error] 3-3: mypy error: Skipping analyzing "grpc": module is installed, but missing library stubs or py.typed marker (import-untyped).
[error] 6-6: mypy error: Module "nebula.grpc_stubs" has no attribute "faq_pb2" or "faq_pb2_grpc" (attr-defined).
nebula/src/nebula/llm/llm_config.py
[error] 1-1: end-of-file-fixer hook modified this file to fix end of file issues.
[error] 1-1: isort hook modified this file to fix import sorting.
nebula/src/nebula/faq/faq_llm_utils.py
[error] 1-1: Black formatting check failed. File was reformatted by black.
nebula/src/nebula/gateway/grpc_server.py
[error] 1-1: Black formatting check failed. File was reformatted by black.
[error] 1-1: isort hook modified this file to fix import sorting.
[warning] 25-25: pylint W1203: Use lazy % formatting in logging functions (logging-fstring-interpolation).
[error] 5-5: mypy error: Skipping analyzing "grpc": module is installed, but missing library stubs or py.typed marker (import-untyped).
[error] 8-8: mypy error: Module "nebula.grpc_stubs" has no attribute "faq_pb2_grpc" (attr-defined).
🪛 Pylint (3.3.7)
nebula/src/nebula/faq/rewriter_servicer.py
[error] 3-3: No name 'faq_pb2' in module 'nebula.grpc_stubs'
(E0611)
[error] 3-3: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
[refactor] 14-14: Too few public methods (1/2)
(R0903)
nebula/src/nebula/gateway/grpc_server.py
[error] 6-6: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
🔇 Additional comments (3)
nebula/src/nebula/llm/llm_config.py (1)
`5-10`: Use the filename parameter consistently.

The function accepts a `filename` parameter but doesn't use it when constructing the config path.

```diff
 def load_llm_config(filename="llm_config.nebula.yml", llm_id="azure-gpt-4-omni"):
     """Load LLM configuration from a YAML file."""
     config_path = os.getenv("LLM_CONFIG_PATH")
     if not config_path:
         this_dir = os.path.dirname(os.path.abspath(__file__))
-        config_path = os.path.abspath(os.path.join(this_dir, "..", filename))
+        config_path = os.path.abspath(os.path.join(this_dir, "..", "..", filename))
```

Likely an incorrect or invalid review comment.
nebula/src/nebula/llm/openai_client.py (1)
`14-15`: Fix function call parameters.

```diff
 def get_azure_whisper_config(llm_id="azure-whisper"):
-    return load_llm_config(llm_id)
+    """Get Azure Whisper configuration.
+
+    Args:
+        llm_id: The LLM configuration ID to load
+
+    Returns:
+        The Whisper configuration dictionary
+    """
+    return load_llm_config(llm_id=llm_id)
```

Likely an incorrect or invalid review comment.
nebula/src/nebula/faq/rewriter_servicer.py (1)
`33-38`: Good error handling implementation!

The error handling with proper logging and gRPC context error details is well implemented.
```python
def rewrite_faq_text(input_text: str) -> str:
    client, deployment = get_openai_client("azure-gpt-4-omni")

    response = client.chat.completions.create(
        model=deployment,
        messages=[
            {"role": "system", "content": "You are a helpful assistant that rewrites FAQ input clearly and concisely."},
            {"role": "user", "content": input_text},
        ],
    )

    return response.choices[0].message.content.strip()
```
🛠️ Refactor suggestion
Add error handling and use consistent prompts.
The function lacks error handling and uses a different prompt than the gRPC service implementation.
+from nebula.faq.prompts.rewrite_faq_prompt import system_prompt_faq_rewriting
+import logging
+
+logger = logging.getLogger(__name__)
+
+
def rewrite_faq_text(input_text: str) -> str:
- client, deployment = get_openai_client("azure-gpt-4-omni")
-
- response = client.chat.completions.create(
- model=deployment,
- messages=[
- {"role": "system", "content": "You are a helpful assistant that rewrites FAQ input clearly and concisely."},
- {"role": "user", "content": input_text},
- ],
- )
-
- return response.choices[0].message.content.strip()
+ """Rewrite FAQ text using OpenAI API.
+
+ Args:
+ input_text: The FAQ text to rewrite
+
+ Returns:
+ The rewritten FAQ text
+
+ Raises:
+ Exception: If the OpenAI API call fails
+ """
+ try:
+ client, deployment = get_openai_client("azure-gpt-4-omni")
+
+ response = client.chat.completions.create(
+ model=deployment,
+ messages=[
+ {
+ "role": "system",
+ "content": system_prompt_faq_rewriting.format(rewritten_text=input_text),
+ },
+ ],
+ )
+
+ return response.choices[0].message.content.strip()
+ except Exception as e:
+ logger.error("Failed to rewrite FAQ text: %s", str(e))
+ raise
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 3-3: expected 2 blank lines, found 1
(E302)
🤖 Prompt for AI Agents
In nebula/src/nebula/faq/faq_llm_utils.py around lines 3 to 14, the
rewrite_faq_text function lacks error handling and uses a prompt that differs
from the gRPC service implementation. To fix this, add try-except blocks to
catch and handle potential exceptions from the OpenAI client call, and update
the system prompt to match the wording used in the gRPC service for consistency.
```python
def get_openai_client(llm_id="azure-gpt-4-omni"):
    config = load_llm_config(llm_id)
    client = AzureOpenAI(
        azure_endpoint=config["endpoint"],
        azure_deployment=config["azure_deployment"],
        api_version=config["api_version"],
        api_key=config["api_key"],
    )
    return client, config["azure_deployment"]
```
Fix function call and add error handling.
The load_llm_config
call is incorrect and there's no error handling for missing configuration keys.
def get_openai_client(llm_id="azure-gpt-4-omni"):
- config = load_llm_config(llm_id)
- client = AzureOpenAI(
- azure_endpoint=config["endpoint"],
- azure_deployment=config["azure_deployment"],
- api_version=config["api_version"],
- api_key=config["api_key"],
- )
- return client, config["azure_deployment"]
+ """Get an Azure OpenAI client for the specified LLM ID.
+
+ Args:
+ llm_id: The LLM configuration ID to load
+
+ Returns:
+ Tuple of (client, deployment_name)
+
+ Raises:
+ ValueError: If configuration is missing required fields
+ """
+ config = load_llm_config(llm_id=llm_id)
+
+ required_fields = ["endpoint", "azure_deployment", "api_version", "api_key"]
+ missing_fields = [field for field in required_fields if field not in config]
+ if missing_fields:
+ raise ValueError(f"Missing required fields in LLM config: {missing_fields}")
+
+ client = AzureOpenAI(
+ azure_endpoint=config["endpoint"],
+ azure_deployment=config["azure_deployment"],
+ api_version=config["api_version"],
+ api_key=config["api_key"],
+ )
+ return client, config["azure_deployment"]
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 4-4: expected 2 blank lines, found 1
(E302)
🤖 Prompt for AI Agents
In nebula/src/nebula/llm/openai_client.py around lines 4 to 12, the call to
load_llm_config is incorrect and lacks error handling for missing keys in the
configuration. Fix the function call to correctly load the LLM configuration,
then add error handling to check for the presence of required keys like
"endpoint", "azure_deployment", "api_version", and "api_key". If any key is
missing, raise an appropriate exception or handle the error gracefully before
creating the AzureOpenAI client.
logger = logging.getLogger("nebula.faq.rewriter") | ||
logging.basicConfig( | ||
level=logging.INFO, | ||
format="%(asctime)s [%(levelname)s] %(name)s: %(message)s" | ||
) | ||
|
Avoid overriding global logging configuration.
Similar to the grpc_server.py file, this uses basicConfig
which can interfere with other logging configurations.
logger = logging.getLogger("nebula.faq.rewriter")
-logging.basicConfig(
- level=logging.INFO,
- format="%(asctime)s [%(levelname)s] %(name)s: %(message)s"
-)
+if not logger.handlers:
+ handler = logging.StreamHandler()
+ handler.setFormatter(logging.Formatter('%(asctime)s [%(levelname)s] %(name)s: %(message)s'))
+ logger.addHandler(handler)
+ logger.setLevel(logging.INFO)
🤖 Prompt for AI Agents
In nebula/src/nebula/faq/rewriter_servicer.py between lines 8 and 13, remove the
call to logging.basicConfig to avoid overriding the global logging
configuration. Instead, configure the logger instance directly by setting its
level and adding handlers if needed, ensuring it does not interfere with other
parts of the application that may configure logging.
Actionable comments posted: 9
♻️ Duplicate comments (5)
nebula/src/nebula/gateway/grpc_server.py (2)
`10-13`: Avoid overriding global logging configuration.

The `basicConfig` call may interfere with other logging configurations in the application.

`25-26`: Add graceful shutdown handling and proper error handling.

The server lacks graceful shutdown mechanisms and error handling for startup failures.
nebula/src/nebula/faq/faq_server.py (2)
`11-14`: Avoid overriding global logging configuration.

Same issue as in the gateway server – the `basicConfig` call may interfere with application-wide logging setup.

`27-28`: Add error handling and improve server binding security.

Consider binding to localhost instead of all interfaces for better security in containerized environments.
nebula/src/nebula/faq/services/rewriter_servicer.py (1)
`9-12`: Avoid overriding global logging configuration.

The `basicConfig` call may interfere with application-wide logging setup.
🧹 Nitpick comments (1)
nebula/src/nebula/gateway/handlers/faq_handler.py (1)
`15-15`: Add class docstring.

The class is missing documentation explaining its purpose and functionality.

```diff
 class FAQServiceHandler(faq_pb2_grpc.FAQServiceServicer):
+    """gRPC handler that forwards FAQ rewriting requests to the downstream FAQ service."""
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- nebula/src/nebula/faq/faq_server.py (1 hunks)
- nebula/src/nebula/faq/services/rewriter_servicer.py (1 hunks)
- nebula/src/nebula/gateway/grpc_server.py (1 hunks)
- nebula/src/nebula/gateway/handlers/faq_handler.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
nebula/src/nebula/faq/faq_server.py (2)
- nebula/src/nebula/faq/services/rewriter_servicer.py (1): FAQRewriterService (14-38)
- nebula/src/nebula/gateway/grpc_server.py (1): serve (18-26)
🪛 Pylint (3.3.7)
nebula/src/nebula/gateway/handlers/faq_handler.py
[error] 5-5: No name 'faq_pb2' in module 'nebula.grpc_stubs'
(E0611)
[error] 5-5: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
[refactor] 15-15: Too few public methods (1/2)
(R0903)
nebula/src/nebula/faq/services/rewriter_servicer.py
[refactor] 14-14: Too few public methods (1/2)
(R0903)
nebula/src/nebula/gateway/grpc_server.py
[error] 6-6: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
🪛 GitHub Actions: Nebula - Test
nebula/src/nebula/gateway/handlers/faq_handler.py
[error] 1-1: Black formatting check failed. File was reformatted by the black hook.
[error] 1-1: isort hook modified this file to fix import sorting issues.
[warning] 17-17: pylint C0115: Missing class docstring (missing-class-docstring).
[warning] 21-21: pylint W1203: Use lazy % formatting in logging functions (logging-fstring-interpolation).
[error] 4-7: mypy import errors: Skipping analyzing 'grpc' (missing library stubs or py.typed marker); Module 'nebula.grpc_stubs' missing attributes 'faq_pb2' and 'faq_pb2_grpc'.
nebula/src/nebula/faq/services/rewriter_servicer.py
[error] 1-1: Black formatting check failed. File was reformatted by the black hook.
[error] 1-1: isort hook modified this file to fix import sorting issues.
[warning] 15-15: pylint C0115: Missing class docstring (missing-class-docstring).
[warning] 19-19: pylint W1203: Use lazy % formatting in logging functions (logging-fstring-interpolation).
[error] 3-6: mypy import errors: Skipping analyzing 'grpc' (missing library stubs or py.typed marker); Module 'nebula.grpc_stubs' missing attributes 'faq_pb2' and 'faq_pb2_grpc'.
nebula/src/nebula/faq/faq_server.py
[error] 1-1: Black formatting check failed. File was reformatted by the black hook.
[error] 1-1: isort hook modified this file to fix import sorting issues.
[warning] 26-26: pylint W1203: Use lazy % formatting in logging functions (logging-fstring-interpolation).
[error] 5-8: mypy import errors: Skipping analyzing 'grpc' (missing library stubs or py.typed marker); Module 'nebula.grpc_stubs' has no attribute 'faq_pb2_grpc'.
nebula/src/nebula/gateway/grpc_server.py
[error] 1-1: Black formatting check failed. File was reformatted by the black hook.
[error] 1-1: isort hook modified this file to fix import sorting issues.
[warning] 25-25: pylint W1203: Use lazy % formatting in logging functions (logging-fstring-interpolation).
[error] 5-8: mypy import errors: Skipping analyzing 'grpc' (missing library stubs or py.typed marker); Module 'nebula.grpc_stubs' has no attribute 'faq_pb2_grpc'.
🪛 Flake8 (7.2.0)
nebula/src/nebula/faq/services/rewriter_servicer.py
[error] 14-14: expected 2 blank lines, found 1
(E302)
nebula/src/nebula/faq/faq_server.py
[error] 19-19: expected 2 blank lines, found 1
(E302)
[error] 30-30: expected 2 blank lines after class or function definition, found 1
(E305)
nebula/src/nebula/gateway/grpc_server.py
[error] 18-18: expected 2 blank lines, found 1
(E302)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
🔇 Additional comments (2)
nebula/src/nebula/gateway/handlers/faq_handler.py (1)
`19-30`: Excellent error handling implementation.

The gRPC error handling follows best practices with proper status codes and error details.
nebula/src/nebula/faq/services/rewriter_servicer.py (1)
`18-38`: Well-structured error handling and OpenAI integration.

The try-catch block properly handles exceptions and sets appropriate gRPC error details.
```python
class FAQServiceHandler(faq_pb2_grpc.FAQServiceServicer):
    def RewriteFAQ(self, request: faq_pb2.FaqRewritingRequest, context: ServicerContext) -> faq_pb2.FaqRewritingResponse:
        logger.info(f"Received RewriteFAQ request – forwarding to downstream service at {FAQ_SERVICE_ADDRESS}")
```
Use lazy formatting in logging.
Replace f-string with lazy formatting for better performance and consistency.
- logger.info(f"Received RewriteFAQ request – forwarding to downstream service at {FAQ_SERVICE_ADDRESS}")
+ logger.info("Received RewriteFAQ request – forwarding to downstream service at %s", FAQ_SERVICE_ADDRESS)
🧰 Tools
🪛 GitHub Actions: Nebula - Test
[warning] 17-17: pylint C0115: Missing class docstring (missing-class-docstring).
🤖 Prompt for AI Agents
In nebula/src/nebula/gateway/handlers/faq_handler.py at line 17, replace the
f-string used in the logger.info call with lazy formatting by passing the format
string and arguments separately. This means changing the log statement to use a
format string with placeholders and passing FAQ_SERVICE_ADDRESS as an argument
to defer string interpolation until it is needed.
Actionable comments posted: 3
♻️ Duplicate comments (3)
nebula/src/nebula/gateway/grpc_server.py (2)
`10-13`: Avoid overriding global logging configuration.

This issue was already raised in a previous review. The `basicConfig` call may interfere with other logging configurations in the application.

`19-29`: Fix PEP 8 formatting and duplicate issues.

- Missing blank lines before function definitions (PEP 8)
- Language consistency and security issues were already raised in a previous review

```diff
 GATEWAY_SERVICE_PORT = os.getenv("GATEWAY_SERVICE_PORT", "50051")
-server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
+
+
 def serve():
     ...
     server.wait_for_termination()
+
+
 def stop():
-    server.stop(grace=None)  # grace=None = sofort
+    server.stop(grace=None)  # grace=None = immediately
```

nebula/src/nebula/transcript/routes/transcribe_routes.py (1)

`71-89`: Improve cleanup error handling and visibility.

This cleanup logic issue was already raised in a previous review. The current implementation only logs at warning level and could hide critical cleanup failures.
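For context, a minimal sketch of what more visible cleanup handling could look like; the helper name and escalation strategy are assumptions, not the actual route code.

```python
import logging
import shutil

logger = logging.getLogger("nebula.transcript")


def cleanup_job_dirs(*dirs: str) -> None:
    """Remove temporary chunk directories, surfacing failures instead of hiding them."""
    failed = []
    for path in dirs:
        try:
            shutil.rmtree(path, ignore_errors=False)
        except OSError as err:
            failed.append((path, err))
    if failed:
        # Escalate to error level so cleanup problems are visible in monitoring
        for path, err in failed:
            logger.error("Failed to remove %s: %s", path, err)
```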
🧹 Nitpick comments (5)
nebula/src/nebula/transcript/app.py (1)
`20-22`: Add blank lines before function definition.

PEP 8 requires two blank lines before top-level function definitions.

```diff
 app.include_router(transcribe_router, prefix="/transcribe", tags=["Transcription"])
+
+
 # Root endpoint
 @app.get("/", include_in_schema=False)
 async def root():
```
nebula/src/nebula/transcript/routes/transcribe_routes.py (3)
`23-90`: Consider refactoring to reduce function complexity.

The function has 21 local variables (exceeds 15). Consider extracting the slide detection logic into a separate function.

```diff
+async def detect_slide_numbers(frames: list[tuple[float, str]]) -> list[tuple[float, int]]:
+    """Detect slide numbers from frames using GPT Vision."""
+    slide_timestamps = []
+    for ts, img_b64 in frames:
+        slide_number = ask_gpt_for_slide_number(img_b64)
+        if slide_number is not None:
+            slide_timestamps.append((ts, slide_number))
+        await asyncio.sleep(2)  # Rate limiting
+    return slide_timestamps
+
+
 def run_transcription(req: TranscribeRequestDTO, job_id: str):
```

Then replace lines 42-47 with:

```python
slide_timestamps = asyncio.run(detect_slide_numbers(frames))
```
`47-47`: Optimize the GPT API rate limiting.

The fixed 2-second sleep between GPT calls can accumulate to significant delays. Consider implementing a more sophisticated rate limiting approach.

```diff
-        time.sleep(2)
+        # Implement adaptive rate limiting based on API response headers
+        time.sleep(0.5)  # Reduced delay, adjust based on your API limits
```

Alternatively, consider processing frames in parallel batches with proper rate limiting, as sketched below.
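For illustration only — a minimal sketch of batched, rate-limited frame processing, assuming the existing `ask_gpt_for_slide_number` helper; the batch size and delay are assumptions to tune against the actual API limits.

```python
import asyncio


async def detect_slides_in_batches(frames, batch_size=5, delay_between_batches=2.0):
    """Process frames in small batches, pausing between batches to respect rate limits."""
    slide_timestamps = []
    for i in range(0, len(frames), batch_size):
        batch = frames[i : i + batch_size]
        # Run the blocking GPT calls for one batch concurrently in worker threads
        results = await asyncio.gather(
            *(asyncio.to_thread(ask_gpt_for_slide_number, img_b64) for _, img_b64 in batch)
        )
        for (ts, _), slide_number in zip(batch, results):
            if slide_number is not None:
                slide_timestamps.append((ts, slide_number))
        if i + batch_size < len(frames):
            await asyncio.sleep(delay_between_batches)
    return slide_timestamps
```

A caller in synchronous code could then use `slide_timestamps = asyncio.run(detect_slides_in_batches(frames))`.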
`23-23`: Add missing blank lines for PEP 8 compliance.

Multiple functions are missing the required two blank lines before their definitions.

```diff
 router = APIRouter()
+
+
 def run_transcription(req: TranscribeRequestDTO, job_id: str):
```

```diff
     logging.warning("⚠️ Failed to remove chunk directories: %s", cleanup_err)
+
+
 @router.post("/start-transcribe", tags=["internal"])
```

```diff
     return {"status": "processing", "transcriptionId": job_id}
+
+
 @router.get("/status/{job_id}", tags=["internal"])
```

```diff
     return get_job_status(job_id)
+
+
 @router.get("/test")
```

Also applies to: 91-91, 100-100, 104-104
nebula/envoy.yaml (1)
119-119
: Add newline at end of file. YAML files should end with a newline character for POSIX compliance.
     address: 0.0.0.0
-    port_value: 9901
+    port_value: 9901
+
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
nebula/docker-compose.yml
(1 hunks)nebula/envoy.yaml
(1 hunks)nebula/src/nebula/gateway/grpc_server.py
(1 hunks)nebula/src/nebula/gateway/main.py
(1 hunks)nebula/src/nebula/transcript/app.py
(1 hunks)nebula/src/nebula/transcript/routes/transcribe_routes.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- nebula/src/nebula/gateway/main.py
- nebula/docker-compose.yml
🧰 Additional context used
🧬 Code Graph Analysis (2)
nebula/src/nebula/transcript/routes/transcribe_routes.py (6)
nebula/src/nebula/transcript/align_utils.py (1)
- align_slides_with_segments (4-26)
nebula/src/nebula/transcript/dto.py (2)
- TranscribeRequestDTO (7-9)
- TranscriptionSegmentDTO (12-18)
nebula/src/nebula/transcript/jobs.py (4)
- create_job (11-19)
- fail_job (28-35)
- get_job_status (38-41)
- save_job_result (22-25)
nebula/src/nebula/transcript/slide_utils.py (1)
- ask_gpt_for_slide_number (6-48)
nebula/src/nebula/transcript/video_utils.py (3)
- download_video (8-30)
- extract_audio (33-43)
- extract_frames_at_timestamps (46-66)
nebula/src/nebula/transcript/whisper_utils.py (1)
- transcribe_with_azure_whisper (17-66)
nebula/src/nebula/transcript/app.py (2)
nebula/src/nebula/transcript/config.py (3)
- Config (11-59)
- get_log_level (53-55)
- ensure_dirs (58-59)
nebula/src/nebula/health.py (1)
- health (8-9)
🪛 Flake8 (7.2.0)
nebula/src/nebula/transcript/routes/transcribe_routes.py
[error] 23-23: expected 2 blank lines, found 1
(E302)
[error] 91-91: expected 2 blank lines, found 1
(E302)
[error] 100-100: expected 2 blank lines, found 1
(E302)
[error] 104-104: expected 2 blank lines, found 1
(E302)
nebula/src/nebula/gateway/grpc_server.py
[error] 19-19: expected 2 blank lines, found 1
(E302)
[error] 28-28: expected 2 blank lines, found 1
(E302)
nebula/src/nebula/transcript/app.py
[error] 20-20: expected 2 blank lines, found 1
(E302)
🪛 Pylint (3.3.7)
nebula/src/nebula/transcript/routes/transcribe_routes.py
[refactor] 23-23: Too many local variables (21/15)
(R0914)
nebula/src/nebula/gateway/grpc_server.py
[error] 6-6: No name 'faq_pb2_grpc' in module 'nebula.grpc_stubs'
(E0611)
🪛 YAMLlint (1.37.1)
nebula/envoy.yaml
[error] 119-119: no new line character at the end of file
(new-line-at-end-of-file)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Lint
- GitHub Check: Lint
# Read port from environment variable (default: 50051)
GATEWAY_SERVICE_PORT = os.getenv("GATEWAY_SERVICE_PORT", "50051")
server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
🛠️ Refactor suggestion
Encapsulate the global server variable.
The global server
variable at module level can cause issues with testing and concurrent usage. Consider encapsulating it within a class or function scope.
-server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
+_server = None
+
+def get_server():
+    global _server
+    if _server is None:
+        _server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
+    return _server
Then update the functions to use get_server() instead of the global server.
Committable suggestion skipped: line range outside the PR's diff.
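Sketched usage after that refactor, within the existing grpc_server.py module (the servicer registration and port binding details are assumptions based on the snippet above, not the PR's actual code):

```python
def serve():
    server = get_server()
    # ... register servicers here, as the module already does ...
    server.add_insecure_port(f"[::]:{GATEWAY_SERVICE_PORT}")
    server.start()
    server.wait_for_termination()


def stop():
    # grace=None stops immediately
    get_server().stop(grace=None)
```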
🤖 Prompt for AI Agents
In nebula/src/nebula/gateway/grpc_server.py at line 17, the global variable
'server' should be encapsulated to avoid issues with testing and concurrency.
Refactor by moving the server creation inside a function named 'get_server()'
that returns the server instance. Then update all functions in the module to
call 'get_server()' instead of directly accessing the global 'server' variable.
Actionable comments posted: 2
♻️ Duplicate comments (4)
nebula/src/nebula/faq/services/rewriter_servicer.py (4)
17-17
: Use lazy formatting in logging. This issue was already identified in a previous review. Replace f-string with lazy formatting for consistency and performance.
32-32
: Add validation for OpenAI response structure. This issue was already identified in a previous review. Accessing
response.choices[0]
without checking if choices exist could cause an IndexError.
14-15
: Fix formatting and add class docstring. This issue was already identified in a previous review. Add required blank lines and class documentation.
72-72
: Add validation for OpenAI response structure in streaming method. Similar to the non-streaming method, this line also needs validation before accessing response.choices[0].
-        rewritten_text = response.choices[0].message.content.strip()
+        if not response.choices:
+            raise ValueError("No response choices returned from OpenAI")
+        rewritten_text = response.choices[0].message.content.strip()
🧹 Nitpick comments (3)
nebula/src/nebula/faq/services/rewriter_servicer.py (3)
20-30
: Extract OpenAI client interaction into a helper method. The OpenAI client setup and chat completion logic is duplicated between RewriteFAQ and RewriteFAQStream methods. This violates the DRY principle and makes maintenance harder.
+    def _rewrite_with_openai(self, input_text: str) -> str:
+        """Helper method to rewrite text using OpenAI client."""
+        client, deployment = get_openai_client("azure-gpt-4-omni")
+
+        response = client.chat.completions.create(
+            model=deployment,
+            messages=[
+                {
+                    "role": "system",
+                    "content": system_prompt_faq_rewriting.format(rewritten_text=input_text),
+                },
+            ],
+        )
+
+        if not response.choices:
+            raise ValueError("No response choices returned from OpenAI")
+        return response.choices[0].message.content.strip()
Also applies to: 55-71
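For example, the unary handler could then shrink to something like the following (the request field name `input_text` is an assumption; the response type and error handling mirror the existing code):

```python
    def RewriteFAQ(self, request, context):
        try:
            rewritten = self._rewrite_with_openai(request.input_text)
            return faq_pb2.FaqRewritingResponse(result=rewritten)
        except Exception as e:  # broad handler, mirroring the current implementation
            logger.error("Failed to rewrite FAQ: %s", str(e), exc_info=True)
            context.set_details("Error while contacting the LLM")
            return faq_pb2.FaqRewritingResponse(result="")
```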
20-20
: Consider making the deployment name configurable. The hardcoded deployment name "azure-gpt-4-omni" reduces flexibility for different environments or model versions.
Consider making this configurable through environment variables or configuration files to support different deployments across environments.
Also applies to: 55-55
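A minimal sketch of the env-var approach (the variable name NEBULA_OPENAI_DEPLOYMENT is hypothetical; get_openai_client is the existing helper):

```python
import os

# Fall back to the current hardcoded deployment when the variable is unset
DEPLOYMENT_NAME = os.getenv("NEBULA_OPENAI_DEPLOYMENT", "azure-gpt-4-omni")

client, deployment = get_openai_client(DEPLOYMENT_NAME)
```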
36-37
: Add error code for consistency with streaming method. The non-streaming method doesn't set a gRPC error code, while the streaming method does (line 77). For consistency in error handling, consider adding an appropriate error code.
             logger.error("Failed to rewrite FAQ: %s", str(e), exc_info=True)
             context.set_details("Error while contacting the LLM")
+            context.set_code(13)  # INTERNAL error code
             return faq_pb2.FaqRewritingResponse(result="")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
nebula/src/nebula/faq/services/rewriter_servicer.py
(1 hunks)nebula/src/nebula/protos/faq.proto
(1 hunks)
🧰 Additional context used
🪛 Flake8 (7.2.0)
nebula/src/nebula/faq/services/rewriter_servicer.py
[error] 14-14: expected 2 blank lines, found 1
(E302)
🪛 Buf (1.54.0)
nebula/src/nebula/protos/faq.proto
3-3: Files with package "faqservice" must be within a directory "faqservice" relative to root but were in directory "nebula/src/nebula/protos".
(PACKAGE_DIRECTORY_MATCH)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/amd64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Build and Push to GitHub Container Registry / Build linux/arm64 Docker Image for ghcr.io/ls1intum/edutelligence/iris
- GitHub Check: Lint
- GitHub Check: Lint
🔇 Additional comments (3)
nebula/src/nebula/protos/faq.proto (3)
5-9
: LGTM! Well-defined gRPC service interface. The service definition clearly separates unary and streaming operations for FAQ rewriting, providing flexibility for different client needs.
25-30
: LGTM! Comprehensive status update message design. The
FaqRewriteStatusUpdate
message provides good coverage for streaming progress updates with status messages, progress percentage, completion flag, and final result.
3-3
: Ignore package–directory mismatch for .proto files. Both hyperion.proto and faq.proto declare their packages yet reside together under a common protos folder by project convention. The build relies on explicit
java_package
(and similar) options, so enforcing a one-to-one folder name for each proto package is unnecessary and would break consistency with existing files. Leave the directory structure as-is.
Likely an incorrect or invalid review comment.
Nebula Transcription Service - Pull Request Description
✨ Add Nebula Transcription Service (Initial Version)
This PR introduces the Nebula Transcription System — a modular FastAPI-based backend for transcribing lecture videos and detecting slide numbers using AI.
🚀 Main Components
🎥 Transcription Service (nebula.transcript)
🌉 Gateway (nebula.gateway)
Exposed Endpoints:
🧠 Job Management
🧪 API Usage
How to run (Local)
See README.md for full setup and config.
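As an illustrative sketch of the API flow once the services are running (the base URL/port and the request body fields are assumptions; see TranscribeRequestDTO in nebula/src/nebula/transcript/dto.py and the README for the real schema):

```python
import time

import requests

BASE_URL = "http://localhost:5000"  # assumed local port for the transcription service

# Kick off a transcription job
resp = requests.post(
    f"{BASE_URL}/transcribe/start-transcribe",
    json={"videoUrl": "https://example.com/lecture.mp4"},  # illustrative body only
    timeout=30,
)
job_id = resp.json()["transcriptionId"]

# Poll the job until it leaves the "processing" state
while True:
    status = requests.get(f"{BASE_URL}/transcribe/status/{job_id}", timeout=30).json()
    if status.get("status") != "processing":
        break
    time.sleep(5)
print(status)
```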
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests
Chores
.gitignore and code style configurations.