Epic 2.16: Presentation Generator with New Functionality (PoC) #184
base: ai-squad-001
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -9,16 +9,108 @@ | |
| from app.services.logger import setup_logger | ||
| from app.api.error_utilities import InputValidationError, ErrorResponse | ||
| from app.tools.utils.tool_utilities import load_tool_metadata, execute_tool, finalize_inputs | ||
| from fastapi.responses import FileResponse | ||
| from starlette.background import BackgroundTask | ||
| from app.tools.presentation_generator.tools.slides_generator import SlidesGenerator | ||
| import uuid | ||
| from fastapi import FastAPI | ||
| from fastapi import Request | ||
| import json | ||
| from app.services.cache_service import CacheInterface | ||
|
|
||
| logger = setup_logger(__name__) | ||
| router = APIRouter() | ||
| app = FastAPI() | ||
|
|
||
| # Initialize presentation contexts in app state if not exists | ||
| if not hasattr(app.state, "presentation_contexts"): | ||
| app.state.presentation_contexts = {} | ||
|
|
||
|
|
||
| # Dependency injection | ||
| async def get_cache_service(request: Request) -> CacheInterface: | ||
| return request.app.state.cache_service | ||
|
|
||
| @router.get("/") | ||
| def read_root(): | ||
| return {"Hello": "World"} | ||
|
|
||
| # Handles two-step presentation generation: | ||
| # 1. Generate outline with initial inputs | ||
| # 2. Generate slides using stored outline and inputs | ||
| @router.post("/generate-outline", response_model=Union[ToolResponse, ErrorResponse]) | ||
|
Contributor
Please, no additional endpoints are needed. Use the
Author
According to our Gen AI instruction document, we are supposed to create two separate endpoints for generating the outline and slides. The goal is to emulate a user experience similar to Gamma.ai where,
In order for the
Contributor
Thanks for letting us know about this, but no additional endpoint is required. All the tools must be called under the |
||
| async def generate_outline( | ||
| data: ToolRequest, | ||
| cache: CacheInterface = Depends(get_cache_service), | ||
| _ = Depends(key_check) | ||
| ): | ||
| try: | ||
| # Potential Bottleneck: execute tool can be a blocking operation | ||
| # Solution: Use Redis Queue or Celery for background tasks | ||
| # Execute outline generation and store context for slides generation | ||
| request_data = data.tool_data | ||
| requested_tool = load_tool_metadata(request_data.tool_id) | ||
| request_inputs_dict = finalize_inputs(request_data.inputs, requested_tool['inputs']) | ||
| result = execute_tool(request_data.tool_id, request_inputs_dict) | ||
|
|
||
| # Store in app cache, to use as context for slides generation | ||
| presentation_id = str(uuid.uuid4()) | ||
|
|
||
| await cache.set( | ||
| f"presentation:{presentation_id}", | ||
| json.dumps({"outline": result, "inputs": request_inputs_dict}) | ||
| ) | ||
|
|
||
| return ToolResponse(data={ | ||
| "outline": result, | ||
| "presentation_id": presentation_id | ||
| }) | ||
|
|
||
| except InputValidationError as e: | ||
| logger.error(f"InputValidationError: {e}") | ||
| return JSONResponse( | ||
| status_code=400, | ||
| content=jsonable_encoder(ErrorResponse(status=400, message=e.message)) | ||
| ) | ||
|
|
||
| except HTTPException as e: | ||
| logger.error(f"HTTPException: {e}") | ||
| return JSONResponse( | ||
| status_code=e.status_code, | ||
| content=jsonable_encoder(ErrorResponse(status=e.status_code, message=e.detail)) | ||
| ) | ||
|
|
||
| @router.post("/generate-slides/{presentation_id}", response_model=Union[ToolResponse, ErrorResponse]) | ||
| async def generate_slides( | ||
| presentation_id: str, | ||
| cache: CacheInterface = Depends(get_cache_service), | ||
| _ = Depends(key_check) | ||
| ): | ||
| try: | ||
| context_str = await cache.get(f"presentation:{presentation_id}") | ||
| if not context_str: | ||
| raise HTTPException(status_code=404) | ||
|
|
||
| context = json.loads(context_str) | ||
| slides = SlidesGenerator( | ||
| outline=context["outline"], | ||
| inputs=context["inputs"] | ||
| ).compile() | ||
|
|
||
| return ToolResponse(data=slides) | ||
|
|
||
| except InputValidationError as e: | ||
| logger.error(f"InputValidationError: {e}") | ||
| return JSONResponse( | ||
| status_code=400, | ||
| content=jsonable_encoder(ErrorResponse(status=400, message=e.message)) | ||
| ) | ||
|
|
||
| except HTTPException as e: | ||
| logger.error(f"HTTPException: {e}") | ||
| return JSONResponse( | ||
| status_code=e.status_code, | ||
| content=jsonable_encoder(ErrorResponse(status=e.status_code, message=e.detail)) | ||
| ) | ||
|
|
||
| @router.post("/submit-tool", response_model=Union[ToolResponse, ErrorResponse]) | ||
| async def submit_tool( data: ToolRequest, _ = Depends(key_check)): | ||
| try: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,6 +5,7 @@ | |
| from contextlib import asynccontextmanager | ||
| from app.api.router import router | ||
| from app.services.logger import setup_logger | ||
| from app.services.cache_service import RedisService | ||
| from app.api.error_utilities import ErrorResponse | ||
|
|
||
| import os | ||
|
|
@@ -18,6 +19,10 @@ | |
| async def lifespan(app: FastAPI): | ||
| logger.info(f"Initializing Application Startup") | ||
| logger.info(f"Successfully Completed Application Startup") | ||
|
|
||
| app.state.cache_service = RedisService() | ||
|
Contributor
Why is Redis needed in this approach? We are not currently using Redis, but if a good justification is provided, we can definitely evaluate its integration.
Author
As I mentioned in the comment above, I decided to use caching to pass context between requests. I did some research of my own:
There are two strategies for passing context between requests:
I learned that browser-side caching is faster, but only negligibly so (this still needs to be tested). Server-side caching offers much better persistence (if the browser cache is cleared, if the server fails, or if the user switches browsers or devices and needs synchronization) and scalability, which is why I preferred Redis. The persistence and scalability offered by Redis outweigh the small latency advantage of browser caching, since our end goal is for teachers to have a stable and accurate assistant rather than a fast one. Beyond that, I still believe some form of caching is needed to make our tools more efficient in the long run, whether it is server-side or client
Contributor
That's a good point, but how does it help us maximize the quality of the responses? Caching and lower latency improve request speed, but our main business priority is to enhance the context-awareness of the LLM. How can Redis support that goal? |
||
|
|
||
| logger.info(f"Cache Service Initialized") | ||
|
|
||
| yield | ||
| logger.info("Application shutdown") | ||
|
|
||
|
Contributor
An evaluation for Redis imp. is needed. |
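The thread above is about using a cache to pass context between the outline request and the slides request. The following is a minimal sketch of that round trip, not the PR's implementation. `DictCache`, `save_context`, `load_context`, and `CONTEXT_TTL_SECONDS` are hypothetical names introduced only for illustration; the key format and JSON payload mirror the `cache.set` call in the router diff. The one difference is that a `ttl` is passed here, on the assumption that abandoned outlines should eventually expire; the diff's call passes none.

```python
import asyncio
import json
import uuid
from typing import Any, Optional

CONTEXT_TTL_SECONDS = 3600  # assumed value, not taken from the PR


class DictCache:
    """Hypothetical stand-in with the same async get/set shape as the PR's cache service."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}
        self.ttls: dict[str, Optional[int]] = {}

    async def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    async def set(self, key: str, value: str, ttl: Optional[int] = None) -> None:
        self.data[key] = value
        self.ttls[key] = ttl  # recorded only; this stub does not expire keys


async def save_context(cache, outline: Any, inputs: dict) -> str:
    """Step 1: store the outline and inputs, return the id the client sends back."""
    presentation_id = str(uuid.uuid4())
    payload = json.dumps({"outline": outline, "inputs": inputs})
    await cache.set(f"presentation:{presentation_id}", payload, ttl=CONTEXT_TTL_SECONDS)
    return presentation_id


async def load_context(cache, presentation_id: str) -> Optional[dict]:
    """Step 2: fetch the stored context, or None if the id is unknown."""
    raw = await cache.get(f"presentation:{presentation_id}")
    return json.loads(raw) if raw else None


async def demo() -> Optional[dict]:
    cache = DictCache()
    pid = await save_context(cache, {"slides": []}, {"topic": "Photosynthesis"})
    return await load_context(cache, pid)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A `None` result from `load_context` corresponds to the 404 branch in `generate_slides`.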
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| from redis.asyncio import Redis | ||
| from abc import ABC, abstractmethod | ||
| import os | ||
|
|
||
| # Abstract interface (SOLID - Interface Segregation) | ||
| class CacheInterface(ABC): | ||
| @abstractmethod | ||
| async def get(self, key: str): pass | ||
|
|
||
| @abstractmethod | ||
| async def set(self, key: str, value: str, ttl: int = None): pass | ||
|
|
||
| # Concrete implementation | ||
| class RedisService(CacheInterface): | ||
| def __init__(self, redis_client: Redis = None): | ||
| self.client = redis_client or Redis( | ||
| host=os.getenv('REDIS_HOST', 'redis'), | ||
| port=int(os.getenv('REDIS_PORT', 6379)), | ||
| db=int(os.getenv('REDIS_DB', 0)), | ||
| decode_responses=True | ||
| ) | ||
|
|
||
| async def get(self, key: str): | ||
| return await self.client.get(key) | ||
|
|
||
| async def set(self, key: str, value: str, ttl: int = None): | ||
| return await self.client.set(key, value, ex=ttl) |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
|
|
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,74 @@ | ||
| from pydantic import BaseModel, Field | ||
| from typing import List, Optional | ||
| import os | ||
| from app.services.logger import setup_logger | ||
| from langchain_chroma import Chroma | ||
| from langchain_core.prompts import PromptTemplate | ||
| from langchain_core.runnables import RunnablePassthrough, RunnableParallel | ||
| from langchain_core.output_parsers import JsonOutputParser | ||
| from langchain_google_genai import GoogleGenerativeAI | ||
| from langchain_google_genai import GoogleGenerativeAIEmbeddings | ||
| from langchain_core.documents import Document | ||
| from app.services.schemas import PresentationGeneratorArgs | ||
| from fastapi import HTTPException | ||
|
|
||
| logger = setup_logger(__name__) | ||
|
|
||
| class OutlineGenerator: | ||
| def __init__(self, args: PresentationGeneratorArgs, verbose=False): | ||
| # Initialize LLM and parser for outline generation | ||
| self.args = args | ||
| self.verbose = verbose | ||
| self.model = GoogleGenerativeAI(model="gemini-1.5-pro") | ||
| self.parser = JsonOutputParser(pydantic_object=OutlineSchema) | ||
|
|
||
| def compile(self) -> dict: | ||
| try: | ||
| # Create prompt for outline generation with learning objectives and structure | ||
| prompt = PromptTemplate( | ||
| template=( | ||
| "Generate a coherent presentation outline for grade {grade_level} students.\n\n" | ||
| "Topic: {topic}\n" | ||
| "Number of slides needed: {n_slides}\n" | ||
| "Learning objectives: {objectives}\n" | ||
| "Language: {lang}\n\n" | ||
| "Create an outline where:\n" | ||
| "1. Each slide has a clear topic\n" | ||
| "2. Include a brief description of the content\n" | ||
| "3. Add transitions between slides for smooth flow\n" | ||
| "4. Ensure content builds progressively\n" | ||
| "5. Match the grade level's comprehension\n" | ||
| "6. Generate exactly {n_slides} slides\n\n" | ||
| "{format_instructions}" | ||
| ), | ||
| input_variables=["grade_level", "topic", "n_slides", "objectives", "lang"], | ||
| partial_variables={"format_instructions": self.parser.get_format_instructions()} | ||
| ) | ||
|
|
||
| chain = prompt | self.model | self.parser | ||
|
|
||
| result = chain.invoke({ | ||
| "grade_level": self.args.grade_level, | ||
| "topic": self.args.topic, | ||
| "n_slides": self.args.n_slides, | ||
| "objectives": self.args.objectives, | ||
| "lang": self.args.lang | ||
| }) | ||
|
|
||
| if self.verbose: | ||
| logger.info(f"Generated outline successfully") | ||
|
|
||
| return dict(result) | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Failed to generate outline: {str(e)}") | ||
| raise HTTPException(status_code=500, detail=f"Failed to generate outline: {str(e)}") | ||
|
|
||
| # Defines expected structure for each slide in the outline | ||
| class OutlineSlide(BaseModel): | ||
| topic: str = Field(description="The main topic or title of the slide") | ||
| description: str = Field(description="Brief description of the slide content") | ||
| transition: str = Field(description="How this slide connects to the next one for smooth flow") | ||
|
|
||
| class OutlineSchema(BaseModel): | ||
| slides: List[OutlineSlide] = Field(description="List of slides with their topics and descriptions") |
Why is this required?
This dependency injection follows the SOLID principles (this was an attempt at writing clean code, so hopefully I'm doing it right). Basically, if a dev decides to use another cache service instead of Redis, they can create a new service class that implements the CacheInterface in cache_service.py and initialize their chosen service object in main.py, without changing any of the code in router.py. Similarly, the person working on the endpoints in router.py does not have to worry about the configuration of the cache service, which demonstrates loose coupling. As our codebase gets larger, I thought it would be better to write code with these abstractions in mind, so that open source contributors can focus on individual units without worrying too much about the dependencies. The tradeoff is extra function calls, which can add performance overhead, but it will pay off in the long run if we keep our codebase neat.
Yeah, although I think it is more related to the Liskov Substitution Principle. I think the tradeoff won't be a problem if we have a correct justification and pathway for integrating Redis, given the cost/opportunity analysis that we have to perform for this feature.
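To make the substitution argument concrete, here is a minimal sketch of a second implementation of the cache contract. The `CacheInterface` below repeats the two-method interface from cache_service.py so the block is self-contained; `InMemoryCache` is a hypothetical class that is not part of this PR, and it only works within a single process, so it is closer to a test double than a Redis replacement.

```python
import asyncio
import time
from abc import ABC, abstractmethod
from typing import Optional


class CacheInterface(ABC):
    """Same two-method contract as the interface in cache_service.py."""

    @abstractmethod
    async def get(self, key: str) -> Optional[str]: ...

    @abstractmethod
    async def set(self, key: str, value: str, ttl: Optional[int] = None): ...


class InMemoryCache(CacheInterface):
    """Hypothetical dict-backed substitute for RedisService (single process only)."""

    def __init__(self) -> None:
        # key -> (value, expiry on the monotonic clock, or None for no expiry)
        self._store: dict[str, tuple[str, Optional[float]]] = {}

    async def get(self, key: str) -> Optional[str]:
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # drop expired entries lazily on read
            return None
        return value

    async def set(self, key: str, value: str, ttl: Optional[int] = None):
        expires_at = (time.monotonic() + ttl) if ttl is not None else None
        self._store[key] = (value, expires_at)
        return True


async def demo() -> Optional[str]:
    # Callers depend only on CacheInterface, so either implementation fits here.
    cache: CacheInterface = InMemoryCache()
    await cache.set("presentation:example", '{"outline": {}}')
    return await cache.get("presentation:example")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Under this sketch, switching implementations would mean changing only the object assigned to `app.state.cache_service` in the lifespan handler; the router code, which sees only `get` and `set`, would stay the same.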