This project implements a simple DeepSearch agent that leverages Large Language Models (LLMs) to automate and enhance research tasks. Deep Research (also called deep search) systems drive large-scale online retrieval through multi-step reasoning, merge evidence across sources, and perform structured writing to produce research-grade results with citations.
A Deep Research Agent is an LLM-driven AI agent that integrates dynamic reasoning, adaptive planning, iterative external data retrieval and tool use, and the generation of comprehensive analytical reports for information research tasks.
For a more detailed technical deep dive, please refer to `deepsearch_summary_en.md`.
- LangChain Integration: Advanced agent orchestration with ReAct prompting and tool usage
- Multi-LLM Support: Seamlessly switch between Ollama (local) and OpenAI-compatible APIs
- Modular Architecture: Clean separation between LLM clients, search engines, and core logic
- Extensible Design: Easy to add new LLM providers and search engines
- Environment-based Configuration: Secure management of API keys and settings
- Modern Python: Uses OpenAI SDK v1.0.0+ and best practices
- Dynamic Engine Discovery: Automatically discovers all search engines inheriting from `BaseSearch` (see the sketch after this list)
- Dynamic Registration: Register new engines without modifying existing code
- Configuration Validation: Automatically checks engine configuration status
- Error Handling: Comprehensive error handling with clear messages
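Conceptually, discovery can be as simple as walking the subclasses of the abstract base at runtime. The sketch below is illustrative only, assuming a `BaseSearch` with a `name` attribute and an `is_configured()` check; the project's actual `base_search.py` may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class BaseSearch(ABC):
    """Stand-in for the abstract class in search_engines/base_search.py."""

    name: str = "base"

    @abstractmethod
    def search(self, query: str) -> List[dict]:
        ...

    def is_configured(self) -> bool:
        return True  # concrete engines override this to check their API keys


class PlaceholderSearch(BaseSearch):
    """Mock engine: defining the subclass is all it takes to register it."""

    name = "placeholder"

    def search(self, query: str) -> List[dict]:
        return [{"title": "Mock result", "url": "https://example.com", "snippet": query}]


def discover_engines() -> Dict[str, Type[BaseSearch]]:
    """Build the engine registry by enumerating BaseSearch subclasses."""
    return {cls.name: cls for cls in BaseSearch.__subclasses__()}
```

Because registration happens through inheritance, adding an engine never requires touching the discovery code itself.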
- `placeholder`: Mock search engine for development and testing
- `custom_google`: Prioritized Google Custom Search API integration (uses `GOOGLE_API_KEY` and `GOOGLE_CSE_ID`; selection order sketched below)
- `google`: Google Custom Search API integration (fallback if `custom_google` is not available)
- `bing`: Bing Search API integration
- `brave`: Brave Search API integration
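Given such a registry, the `auto` option used in the commands below plausibly reduces to a priority scan. This continues the sketch above; the exact order after `custom_google` is an assumption.

```python
# Hypothetical priority order behind `--search-engine auto`.
PRIORITY = ["custom_google", "google", "bing", "brave", "placeholder"]


def auto_select(registry):
    """Return the first engine in PRIORITY whose API keys are configured."""
    for name in PRIORITY:
        engine_cls = registry.get(name)
        if engine_cls is None:
            continue
        engine = engine_cls()
        if engine.is_configured():
            return engine
    raise RuntimeError("no configured search engine available")
```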
```bash
# List all available engines and their status
python main.py --list-engines

# Use auto-selected search engine (recommended)
python main.py --search-engine auto "your query"

# Use specific search engine
python main.py --search-engine google "AI advancements"

# Use placeholder for testing
python main.py --search-engine placeholder "test query"
```

- Multi-Engine Search: Support simultaneous searches across multiple search engines
- Customizable Results: Configurable minimum result count per search engine
- Result Aggregation: Intelligent merging and deduplication of results from multiple sources
- Performance Optimization: Parallel search execution with configurable timeouts (see the sketch below)
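A minimal sketch of the fan-out and merge steps, assuming engine objects that expose `search(query)` and a `name` attribute (the project's actual aggregation logic may differ):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List


def multi_search(engines: list, query: str) -> List[dict]:
    """Query every engine in parallel, then merge and deduplicate by URL."""
    merged: List[dict] = []
    with ThreadPoolExecutor(max_workers=max(len(engines), 1)) as pool:
        futures = {pool.submit(engine.search, query): engine for engine in engines}
        for future in as_completed(futures):  # real code would also set timeouts here
            try:
                merged.extend(future.result())
            except Exception as exc:  # one failing engine should not sink the batch
                print(f"{futures[future].name} failed: {exc}")
    seen, deduped = set(), []
    for result in merged:
        if result["url"] not in seen:
            seen.add(result["url"])
            deduped.append(result)
    return deduped
```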
- Python 3.8+
- Ollama (optional, for local models)
- API keys for desired services (OpenAI, Google, Bing, etc.)
```bash
git clone <repository-url>
cd ollama-search-agent
```

Using pip:

```bash
pip install -r requirements.txt
```

Using uv (recommended for speed):

```bash
uv pip install -r requirements.txt
```

- Copy the environment template:

```bash
cp .env.example .env
```

- Edit the `.env` file with your API keys and settings:
```bash
# Ollama Configuration
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama3

# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_MODEL=gpt-4

# Search Engine API Keys
GOOGLE_API_KEY=your_google_api_key
GOOGLE_CSE_ID=your_custom_search_engine_id
CUSTOM_GOOGLE_API_KEY=your_custom_google_api_key # Note: Currently uses GOOGLE_API_KEY
CUSTOM_GOOGLE_CSE_ID=your_custom_google_cse_id # Note: Currently uses GOOGLE_CSE_ID
BING_API_KEY=your_bing_api_key
BRAVE_API_KEY=your_brave_api_key
```

Run a one-shot query, pick an LLM backend explicitly, or start interactive mode:

```bash
python main.py "your search query"
python main.py --llm openai "your search query"
python main.py
# Then enter your query when prompted
```
```
ollama-search-agent/
├── agent.py                  # Core agent orchestration logic
├── config.py                 # Centralized configuration management
├── main.py                   # CLI entry point and argument parsing
├── requirements.txt          # Python dependencies
├── .env.example              # Environment variables template
├── .gitignore                # Git ignore rules
├── README.md                 # Project documentation
├── TECHNICAL_OVERVIEW.md     # Detailed technical documentation
├── langchain_tools/          # LangChain tool implementations (e.g., SearchTool)
│   ├── __init__.py
│   └── search_tools.py       # LangChain wrapper for search engines
├── llm_clients/              # LLM provider implementations
│   ├── __init__.py
│   ├── ollama_client.py      # Ollama local model client
│   └── openai_client.py      # OpenAI-compatible API client
└── search_engines/           # Search engine implementations
    ├── __init__.py
    ├── base_search.py        # Abstract base class for search engines
    ├── placeholder_search.py # Mock search for development
    ├── bing_search.py        # Bing Search API
    ├── brave_search.py       # Brave Search API
    ├── custom_google_search.py # Prioritized Google Custom Search API implementation
    └── google_search.py      # Standard Google Custom Search API implementation
```
- Orchestrates the search workflow: Plan → Search → Synthesize (sketched below)
- Uses LLM to generate search plans from user queries
- Combines search results with original query for final synthesis
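In pseudocode-level Python, the loop looks roughly like this (prompts and method names are illustrative assumptions, not the actual `agent.py`):

```python
def run_agent(llm, search_engine, query: str) -> str:
    """Plan -> Search -> Synthesize with any LLM client exposing generate(prompt)."""
    # 1. Plan: turn the user query into concrete web searches
    plan = llm.generate(f"List up to 3 web search queries, one per line, for: {query}")
    # 2. Search: execute each planned query
    results = []
    for search_query in filter(None, (line.strip() for line in plan.splitlines())):
        results.extend(search_engine.search(search_query))
    # 3. Synthesize: combine the evidence with the original question
    context = "\n".join(f"- {r['title']}: {r['snippet']} ({r['url']})" for r in results)
    return llm.generate(f"Using the sources below, answer: {query}\n\nSources:\n{context}")
```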
- OllamaClient: Communicates with local Ollama instance via REST API
- OpenAIClient: Uses OpenAI SDK v1.0.0+ for OpenAI-compatible APIs
- Both implement a consistent `generate(prompt)` interface (sketched below)
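A condensed sketch of what that shared interface looks like; endpoints and defaults mirror `.env.example`, but treat the code as illustrative rather than the actual client modules:

```python
import requests
from openai import OpenAI


class OllamaClient:
    def __init__(self, host: str = "http://localhost:11434", model: str = "llama3"):
        self.host, self.model = host, model

    def generate(self, prompt: str) -> str:
        # Ollama's REST API: POST /api/generate, with streaming disabled
        resp = requests.post(
            f"{self.host}/api/generate",
            json={"model": self.model, "prompt": prompt, "stream": False},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["response"]


class OpenAIClient:
    def __init__(self, api_key: str, base_url: str = "https://api.openai.com/v1",
                 model: str = "gpt-4"):
        self.client = OpenAI(api_key=api_key, base_url=base_url)
        self.model = model

    def generate(self, prompt: str) -> str:
        # SDK v1.x pattern, as noted under Troubleshooting
        completion = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return completion.choices[0].message.content
```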
- BaseSearch: Abstract class defining search engine interface
- PlaceholderSearch: Mock implementation for testing
- GoogleSearch: Google Custom Search API integration
- BingSearch: Bing Search API integration
- BraveSearch: Brave Search API integration
- Query Analysis: LLM analyzes user query and generates search plan
- Search Execution: Search engine executes plan and retrieves results
- Information Synthesis: LLM synthesizes search results into coherent answer
- Response Delivery: Final answer returned to user
- Create a new client in `llm_clients/` (e.g., `anthropic_client.py`; see the sketch after these steps)
- Implement the `generate(prompt)` method
- Add configuration to `config.py` and `.env.example`
- Update CLI arguments in `main.py`
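For example, an Anthropic client might look like the following. This is a sketch assuming the official `anthropic` SDK and an `ANTHROPIC_API_KEY` environment variable; the module path and model name are illustrative:

```python
# llm_clients/anthropic_client.py (hypothetical)
import os

from anthropic import Anthropic


class AnthropicClient:
    def __init__(self, model: str = "claude-3-5-sonnet-latest"):
        self.client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
        self.model = model

    def generate(self, prompt: str) -> str:
        # Same generate(prompt) contract as the Ollama and OpenAI clients
        message = self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return message.content[0].text
```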
- Create a new class in `search_engines/` inheriting from `BaseSearch` (see the example after these steps)
- Implement the `search(query)` method returning structured results
- Add API configuration to `config.py` and `.env.example`
- Update CLI arguments in `main.py`
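As a concrete illustration of these steps, here is a hypothetical engine built on Google's Custom Search JSON API. The import path follows the project layout, but the class is a sketch, not the shipped `google_search.py`:

```python
# search_engines/my_google_search.py (hypothetical)
import os

import requests

from search_engines.base_search import BaseSearch  # assumed import path


class MyGoogleSearch(BaseSearch):
    name = "my_google"

    def is_configured(self) -> bool:
        return bool(os.getenv("GOOGLE_API_KEY") and os.getenv("GOOGLE_CSE_ID"))

    def search(self, query: str) -> list:
        resp = requests.get(
            "https://www.googleapis.com/customsearch/v1",
            params={
                "key": os.environ["GOOGLE_API_KEY"],
                "cx": os.environ["GOOGLE_CSE_ID"],
                "q": query,
            },
            timeout=10,
        )
        resp.raise_for_status()
        return [
            {"title": item["title"], "url": item["link"], "snippet": item.get("snippet", "")}
            for item in resp.json().get("items", [])
        ]
```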
```
ModuleNotFoundError: No module named 'openai'
```

Fix by installing the dependencies:

```bash
pip install -r requirements.txt
```

OpenAI API Compatibility Issues
- The project uses OpenAI SDK v1.0.0+ with modern API patterns
- Legacy `openai.Completion` calls have been updated to `client.chat.completions.create()`, as shown below
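For reference, the difference between the two patterns (the legacy call is shown only as a comment, since it no longer works with SDK v1.x):

```python
# Legacy (SDK < 1.0): openai.Completion.create(engine="text-davinci-003", prompt=...)
# Modern (SDK >= 1.0):
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
reply = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)
print(reply.choices[0].message.content)
```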
Ollama Connection Issues
- Ensure Ollama is running: `ollama serve`
- Verify that `OLLAMA_HOST` in `.env` matches your Ollama instance (a quick connectivity check follows)
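A quick way to verify connectivity from Python (assumes the default host from `.env.example`; `/api/tags` is Ollama's model-listing endpoint):

```python
import requests

try:
    tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
    print("Ollama is up; models:", [m["name"] for m in tags.get("models", [])])
except requests.ConnectionError:
    print("Ollama is not reachable - run `ollama serve` and check OLLAMA_HOST")
```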
- TECHNICAL_OVERVIEW.md: Detailed technical architecture and component breakdown
- Code is well-documented with docstrings and type hints
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
[Add your license here]
- Intelligent Agent Selection: Automatically select the best agent based on user input and task complexity.
- Dynamic LLM Selection: Allow the agent to dynamically choose the most suitable LLM (Ollama, OpenAI, etc.) based on the query or user preferences.
- Enhanced Search Engine Prioritization: Further refine the logic for selecting search engines, potentially incorporating factors like query type, result quality, or user-defined preferences.