This document serves as the internal guide for the AI PDF Generator application.
This project is a full-stack web application that allows users to generate styled PDF documents from natural language prompts. It uses a Next.js frontend and API layer that communicates with a Python/FastAPI backend for all AI-powered heavy lifting. The system supports context-aware generation by allowing users to upload their own documents, which are stored and queried from a vector database.
- AI-Powered PDF Generation: Users can write natural language prompts to generate complete, styled documents.
- Context-Aware Generation: Ability to upload personal PDF files, which are vectorized and stored, allowing the AI to use them as a knowledge base for generation.
- PDF Management Dashboard: A user-specific dashboard that lists all previously generated documents.
- User Authentication & Plans: Secure user authentication with a credit-based system for free and premium tiers.
- Live Preview: A WYSIWYG editor that displays the generated HTML in real-time before downloading the final PDF.
| Category | Technology | Purpose |
|---|---|---|
| Frontend | Next.js, React, Tailwind CSS | UI, client-side logic, and previewing documents. |
| Backend (BFF) | Next.js API Routes | User auth, CRUD operations, proxy to Python. |
| AI Backend | FastAPI (Python) | AI logic, PDF generation, context processing. |
| AI / LangChain | LangChain, Gemini API | LLM calls, document processing, embeddings. |
| Authentication | BetterAuth | User authentication and management. |
| Database (Primary) | Supabase (PostgreSQL) | Storing user data, PDF metadata, credits. |
| Database (Vector) | Pinecone | Storing text embeddings for context search. |
The application is split into two main services: the Next.js App and the FastAPI Server.
- Next.js App: Serves the user-facing UI and a "Backend for Frontend" (BFF) via its API routes. The BFF handles all direct communication from the client, manages authentication with BetterAuth, performs CRUD operations on the Supabase database (e.g., fetching PDF lists), and acts as a secure proxy for all requests to the FastAPI server.
- FastAPI Server: A stateless Python service that exposes endpoints for all AI-related tasks. It is never directly accessed by the end user. Its sole purpose is to receive requests from the Next.js BFF, perform complex operations using LangChain and the Gemini API, and return the result.
The generation process is a multi-layered pipeline designed to produce high-quality, well-formatted documents.
- Request: The Next.js API sends a user prompt to the `/generate` endpoint on the FastAPI server.
- Workflow Execution: The main `workflow` function calls four distinct layers in sequence:
  - Layer 1: Prompt Refinement (`prompt_refine.py`): The initial prompt is sent to Gemini to be broken down into three parts: a detailed content description, specific formatting instructions (e.g., colors, fonts), and structural instructions (e.g., "use a table").
  - Layer 2: Content Draft (`content_draft.py`): The content description and structural instructions are used to generate the initial Markdown draft. If context is required, the workflow performs a similarity search against Pinecone and injects the results into the prompt.
  - Layer 3: Structure Refinement (`refine_structure.py`): The Markdown draft is passed through another LLM call to clean up the structure, improve flow, and ensure adherence to the structural instructions.
  - Layer 4: Formatting & Rendering (`formatting.py`, `renderer.py`): The formatting instructions are used to generate a JSON object of CSS styles. The refined Markdown is converted to HTML, and the CSS styles are injected as inline styles on each HTML element.
- Response: The final, self-contained HTML string is returned to the Next.js BFF, which then sends it to the frontend for rendering via `dangerouslySetInnerHTML`.
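The four-layer pipeline can be sketched as a sequential chain of functions. This is an illustrative sketch only: the function bodies stand in for the real Gemini calls, and the return values are invented placeholders, but the data flow between layers mirrors the description above.

```python
from typing import Optional

def refine_prompt(user_prompt: str) -> dict:
    """Layer 1: split the raw prompt into content, formatting, and structure.
    In production this is an LLM call; here the decomposition is stubbed."""
    return {
        "content": f"Write about: {user_prompt}",
        "formatting": {"h1": "color: #1a1a2e;"},  # placeholder CSS
        "structure": "use a table",
    }

def draft_content(content: str, structure: str, context: Optional[str] = None) -> str:
    """Layer 2: produce a Markdown draft, optionally injecting retrieved context."""
    prefix = f"> Retrieved: {context}\n\n" if context else ""
    return f"{prefix}# Draft\n\n{content} ({structure})"

def refine_structure(markdown: str) -> str:
    """Layer 3: clean up the draft's structure (stubbed as a trim pass)."""
    return markdown.strip()

def render(markdown: str, styles: dict) -> str:
    """Layer 4: convert Markdown to HTML and inline the generated CSS."""
    html = markdown.replace("# Draft", f'<h1 style="{styles["h1"]}">Draft</h1>')
    return f"<div>{html}</div>"

def workflow(user_prompt: str, context: Optional[str] = None) -> str:
    """Run all four layers in sequence and return self-contained HTML."""
    parts = refine_prompt(user_prompt)
    draft = draft_content(parts["content"], parts["structure"], context)
    refined = refine_structure(draft)
    return render(refined, parts["formatting"])
```

Keeping each layer a pure string-to-string function makes the pipeline easy to test in isolation and to re-order or extend later.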
- Upload: The user uploads a PDF. The Next.js BFF streams this file to the `/context/upload` endpoint on the FastAPI server.
- Processing: The FastAPI server uses LangChain's document loaders to parse the PDF, split it into text chunks, generate embeddings for each chunk using an embedding model, and upsert the resulting vectors into the Pinecone database, tagged with the user's ID.
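In practice LangChain's text splitters handle the chunking step, but the core idea (fixed-size chunks with overlap, so sentences straddling a boundary stay retrievable from either neighbor) can be sketched in plain Python. Chunk and overlap sizes here are illustrative, not the project's actual settings.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    """Split extracted PDF text into overlapping chunks for embedding.

    Each chunk is at most chunk_size characters; consecutive chunks
    share `overlap` characters so no sentence is lost at a boundary.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk would then be embedded and upserted to Pinecone with metadata such as the user's ID, so similarity searches at generation time can be filtered to that user's documents.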
This section defines the primary communication contract between the services.
- `POST /api/generateHTML`: Receives the prompt from the client, proxies it to FastAPI, and saves metadata to Supabase.
- `POST /api/(context)`: Houses the addContext and removeContext APIs.
- `POST /api/editHTML`: Edits an existing document; proxies to FastAPI.
- `GET /api/(pdfs)`: All PDF CRUD APIs.
- `POST /api/downloadPdf`: Downloads the generated PDF.
- `POST /ai/generate`: Expects a JSON payload with `userPrompt`, `userId`, etc. Executes the full AI workflow and returns a final HTML string.
- `POST /context/upload`: Expects a file upload. Processes the PDF and stores its embeddings in Pinecone.
- `POST /context/remove`: Identifies the previously uploaded PDF and deletes its vectors from Pinecone.
- `POST /ai/edit`: Expects `userPrompt`. Returns a modified HTML string.
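The `/ai/generate` payload can be pinned down with a small dataclass. Only `userPrompt` and `userId` are named in this guide; `useContext` is a hypothetical extra field standing in for the "etc." above.

```python
from dataclasses import dataclass

@dataclass
class GenerateRequest:
    """Assumed shape of the POST /ai/generate JSON payload."""
    userPrompt: str
    userId: str
    useContext: bool = False  # hypothetical flag: run the Pinecone similarity search

    def validate(self) -> None:
        """Reject payloads the workflow cannot act on."""
        if not self.userPrompt.strip():
            raise ValueError("userPrompt must be non-empty")
        if not self.userId:
            raise ValueError("userId is required")
```

In the real service this role would likely be played by a Pydantic model, which FastAPI validates automatically before the handler runs.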
- Free Plan: Users start on a free plan with 20 credits.
- Premium Plan: Users can upgrade to a premium plan, which also provides 20 credits. (Note: Clarify if this is a monthly refresh or a one-time allocation).
- Credit Deduction: One credit is deducted for each successful call to the `/generate` or `/edit` endpoints. Context uploads are free.
- Management: Each user's credit count is stored and managed in the Supabase `user` table.
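The deduction rule above can be sketched as a single pure function. The endpoint set and error type are assumptions; in production the decrement would be an atomic UPDATE on the Supabase `user` table to avoid races between parallel requests.

```python
BILLABLE_ENDPOINTS = {"/generate", "/edit"}  # context uploads are free

def deduct_credit(credits: int, endpoint: str) -> int:
    """Return the user's new credit balance after a successful call.

    Non-billable endpoints (e.g. /context/upload) leave the balance
    unchanged; a billable call with zero credits is rejected.
    """
    if endpoint not in BILLABLE_ENDPOINTS:
        return credits
    if credits <= 0:
        raise PermissionError("out of credits; upgrade required")
    return credits - 1
```

Checking the balance and decrementing in one database statement (rather than read-then-write in application code) is what keeps two simultaneous generations from both succeeding on a single remaining credit.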