# 🖼️ AI-Powered Image Editor with LangGraph Agent

Edit images intelligently using natural language! This application combines a standard Streamlit image editor with a conversational AI assistant powered by LangGraph. The agent understands your requests and uses the editor's functions (like brightness, contrast, filters) as tools to modify the image.

## Features

### Manual Editor
- Adjustments: Brightness, Contrast, Rotation.
- Operations: Zoom/Crop, Binarization, Negative (Invert), RGB Channel Selection, Highlight Light/Dark Areas.
- Merging: Alpha blend the primary image with a second uploaded image (automatic resizing).
- Analysis: View RGB & Luminosity Histogram.
- Export: Download the edited image (default: PNG).
### AI Assistant

- Natural Language Control: Instruct the editor via chat (e.g., "Make it brighter", "Apply a blur filter, then rotate 90 degrees").
- LangGraph Agent: Manages the conversation flow and decides which image editing tools to use based on your request.
- Tool Integration: Uses functions from `core/processing.py` and potentially `core/ai_services.py` as tools (e.g., `adjust_brightness`, `apply_filter`, `remove_background_ai`).
- Multi-Step Execution: Handles sequential commands within a single conversational turn.
- Context Aware: Remembers the conversation history within the current session using LangGraph's state management.
- (Optional) Voice Interaction: Includes foundations for Text-to-Speech (TTS) output. Voice input (STT) can be integrated.
## Requirements

- Python 3.8+
- Streamlit (`pip install streamlit`); check `requirements.txt` for the specific version.
- Core libraries: `pillow>=9.5.0`, `numpy`, `matplotlib`, `opencv-python>=4.7.0`.
- AI & agent libraries: `langchain`, `langchain-core`, `langgraph`, `openai`.
- An OpenAI API key is required for the AI Assistant.
- Potentially other API keys (e.g., Stability AI) if using features from `core/ai_services.py`.
- `ffmpeg` might be needed if audio features are extended (check the specific library requirements).

See `requirements.txt` for a detailed list of Python packages and versions.
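A `requirements.txt` consistent with the list above might look like the following; only the two pinned versions come from this README, the unpinned entries are assumptions:

```text
streamlit
pillow>=9.5.0
numpy
matplotlib
opencv-python>=4.7.0
langchain
langchain-core
langgraph
openai
```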
## Installation

1. Clone the repository:

   ```bash
   # Replace with your actual repository URL
   git clone https://github.com/yourusername/image-editor.git
   cd image-editor
   ```

2. Create and activate a virtual environment (recommended):

   ```bash
   python -m venv .venv
   # On Linux/macOS:
   source .venv/bin/activate
   # On Windows:
   # .venv\Scripts\activate
   ```
3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
4. Configure API keys:
   - The AI Assistant requires an OpenAI API key.
   - Method 1 (environment variable): Set the `OPENAI_API_KEY` environment variable in your system or terminal session.
   - Method 2 (Streamlit secrets, recommended, especially for deployment): Create a file named `.streamlit/secrets.toml` in your project root and add the key:

     ```toml
     # .streamlit/secrets.toml
     OPENAI_API_KEY = "sk-..."
     ```

   - Add any other required keys (e.g., `STABILITY_API_KEY`) similarly if using those AI services.
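Either configuration method can be checked with a small helper; this sketch reads the environment only (the `st.secrets` fallback is omitted so it runs outside Streamlit, and the function name is illustrative, not part of the project):

```python
import os
from typing import Optional


def get_openai_api_key() -> Optional[str]:
    """Return the OpenAI API key from the environment, or None if unset.

    When running under Streamlit, the app can additionally fall back to
    st.secrets["OPENAI_API_KEY"]; that branch is left out here.
    """
    return os.environ.get("OPENAI_API_KEY")
```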
## Running the App

1. Ensure your virtual environment is active and API keys are configured.
2. Run the Streamlit app from the project's root directory:

   ```bash
   streamlit run app.py
   ```

3. The application will open in your browser. Use the sidebar to navigate between the "Image Editor Pro" (manual editor) and "AI Assistant" pages.
## Using the AI Assistant

1. First, load an image on the "Image Editor Pro" page via the sidebar uploader.
2. Navigate to the "🤖 AI Assistant" page using the sidebar.
3. In the chat input, type your image editing commands in natural language (e.g., "Increase contrast by 0.5", "Rotate 90 degrees clockwise", "Apply a sharpen filter", "Remove the background").
4. Press Enter.
5. Observe the chat history for the conversation flow: your request, the assistant's status/tool-usage messages, and the final response.
6. The image preview on the AI Assistant page updates automatically after the agent successfully applies edits.
7. (If TTS is implemented) Use the controls (likely in the sidebar) to manage voice output.
## Project Structure

```text
image-editor/
├── app.py                        # Main Streamlit app entry point
├── pages/
│   └── 1_🤖_AI_Assistant.py      # UI & logic for the AI Assistant page
├── agent/                        # LangGraph agent implementation
│   ├── __init__.py
│   ├── agent_graph.py            # Defines & compiles the LangGraph graph, nodes, edges
│   ├── graph_state.py            # Defines the AgentState schema for the graph
│   └── tools.py                  # Defines tools (schemas & mapping to implementations) for the LLM
├── core/                         # Core image processing & AI service logic
│   ├── __init__.py
│   ├── ai_services.py            # Functions calling external AI APIs (Stability, rembg)
│   ├── histogram.py              # Histogram generation logic
│   ├── image_io.py               # Image loading/saving utilities
│   └── processing.py             # Image manipulation functions (used as agent tools)
├── state/                        # Streamlit session state management
│   ├── __init__.py
│   └── session_state_manager.py  # Helper functions for managing st.session_state
├── ui/                           # UI components (potentially reusable)
│   ├── __init__.py
│   └── interface.py              # UI elements primarily for the manual editor page
├── utils/                        # Utility functions and constants
│   ├── __init__.py
│   └── constants.py              # Shared constants (e.g., image types)
├── tests/                        # Unit/integration tests (pytest setup included)
│   ├── ...                       # Test files mirroring project structure
├── requirements.txt              # Python dependencies
├── pytest.ini                    # Pytest configuration
├── .gitignore                    # Git ignore rules
└── README.md                     # This file
```
## How It Works

- The user interacts with the Streamlit UI (`app.py` and `pages/1_🤖_AI_Assistant.py`).
- Session state (`st.session_state`) holds the current images, widget values, and chat history, managed via helpers in `state/session_state_manager.py`.
- On the AI Assistant page, user chat input is formatted and passed to the compiled LangGraph graph (`agent/agent_graph.py`).
- The graph, using the `AgentState` schema (`agent/graph_state.py`), manages the execution flow:
  - An LLM node (using an OpenAI model) interprets the user request and decides the next action (respond directly or use a tool).
  - Conditional edges route the flow based on the LLM's decision.
  - Tool nodes execute specific actions defined in `agent/tools.py`. These tools call the underlying image processing functions (`core/processing.py`, `core/ai_services.py`).
  - Image modifications happen by updating the image data within `st.session_state`.
  - Tool results are passed back into the graph.
  - The graph continues until the task is complete, culminating in a final response from the LLM node.
- The Streamlit UI updates reactively based on changes in `st.session_state` (displaying new chat messages and the modified image).
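The decide/route/execute loop described above can be sketched without any framework. Here `decide()` stands in for the LLM node and the tool names, signatures, and step limit are illustrative, not the project's actual definitions:

```python
def run_agent(decide, tools, user_message, max_steps=5):
    """Minimal sketch of the agent loop: the LLM decision either names a
    tool to call (tool node) or produces a final answer (end of graph)."""
    history = [("user", user_message)]
    for _ in range(max_steps):
        decision = decide(history)                # LLM node
        if decision["type"] == "final":           # conditional edge: finish
            history.append(("assistant", decision["text"]))
            return history
        result = tools[decision["tool"]](**decision["args"])  # tool node
        history.append(("tool", result))          # result fed back to the LLM
    return history
```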
## Customizing the Agent

- Graph: Edit `agent/agent_graph.py` to change:
  - The agent's prompt or LLM models
  - Node functions or routing logic
  - Graph structure and flow
- Tools: Modify tool definitions (schemas) or how they map to implementation functions in `agent/tools.py`.
- State: Adjust `agent/graph_state.py` if the information flowing through the graph needs to change.
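A state schema of this kind is typically a `TypedDict`. The fields below are illustrative placeholders, not the actual contents of `agent/graph_state.py`:

```python
from typing import List, TypedDict


class AgentState(TypedDict):
    """Illustrative state schema; the real fields live in agent/graph_state.py."""

    messages: List[dict]  # running chat history passed between graph nodes
    image_key: str        # session-state key of the image being edited
```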
## Adding a New Tool

1. Implement the core image processing logic in `core/processing.py` or `core/ai_services.py`.
2. Define the tool schema and its link to the implementation function within `agent/tools.py`.
3. Ensure the new tool is registered and available to the agent graph in `agent/agent_graph.py`.
4. (Optional but recommended) Update the agent's system prompt to make it aware of the new tool's capabilities.
5. Test thoroughly by asking the assistant to use the new tool.
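Step 2 (linking a schema to its implementation) often boils down to a registry pattern. This sketch is hypothetical: the blur function is a placeholder for the real implementation in `core/processing.py`, and the registry shape is an assumption, not the structure `agent/tools.py` actually uses:

```python
def apply_blur(radius: float = 2.0) -> str:
    """Placeholder implementation; the real one would modify the image."""
    return f"blurred with radius {radius}"


# Map each tool name to a JSON-schema-style description plus its callable.
TOOL_REGISTRY = {
    "apply_blur": {
        "description": "Apply a Gaussian blur to the current image.",
        "parameters": {"radius": {"type": "number", "default": 2.0}},
        "fn": apply_blur,
    },
}


def call_tool(name: str, **kwargs):
    """Dispatch an LLM-chosen tool call to its registered implementation."""
    return TOOL_REGISTRY[name]["fn"](**kwargs)
```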
## Contributing

- Found a bug? Open an issue on GitHub.
- Have an idea? Suggest it via GitHub Issues.
- Want to contribute code? Fork the repository and submit a pull request.
## Built With

- Streamlit
- Pillow
- LangChain & LangGraph
- OpenCV
- NumPy
- Matplotlib
- OpenAI (or other LLM/TTS providers)
Made with ❤️ by @josefdc @Esteban8482