> [!IMPORTANT]
> I picked Hermes because he isn't just another polite autocomplete model. He thinks, he reasons, he grabs tools without asking, and he'll absolutely fire back when I deserve it. Most models fold when I push; Hermes accelerates. So this repo is my unapologetically chaotic testbed: no roadmap, no safety rails, just an agent let off the leash to see what it does.
>
> Underneath the sparkles and smoke it's a CLI-driven conversational setup running NousResearch's Hermes-4-14B locally through Ollama, with persistent semantic memory stitched together from Tiger's TimescaleDB and OpenAI embeddings. It uses dual memory (short-term context plus long-term recall), manages its own state, and talks the only way I tolerate: direct, sharp, and not here to babysit anyone.
>
> Basically, I handed a capable reasoning model a sandbox, a blowtorch, and too much autonomy. Now I'm watching to see what grows teeth.
```bash
# Install dependencies
make install

# Setup environment
make setup
# Edit .env with your OPENAI_API_KEY

# Initialize database (optional, for semantic memory)
make setup-db

# Start chatting
make run
```

Full setup guide: QUICKSTART.md
- Local LLM - Runs Ollama models locally, no cloud dependency
- Dual Memory - Short-term conversation history + long-term semantic memory
- Semantic Search - Find relevant memories by meaning, not just keywords
- Smart Context - Auto-trims conversations to stay within token limits
- Persistent - Conversations auto-save and resume where you left off
- Fast - Connection pooling, embedding caching, optimized queries
- Robust - Comprehensive error handling, atomic file writes, graceful degradation
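The "Smart Context" trimming behavior can be sketched in a few lines. This is a hypothetical reconstruction, not the repo's actual code: the `trim_to_budget` name and the rough 4-characters-per-token estimate are my assumptions.

```python
def trim_to_budget(messages, max_tokens, count_tokens=lambda m: len(m["content"]) // 4):
    """Drop the oldest non-system messages until the estimated token count fits.

    Sketch only: the real agent may count tokens differently, but the shape
    (keep the system prompt, discard oldest turns first) is the usual pattern.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(count_tokens(m) for m in system + rest) > max_tokens:
        rest.pop(0)  # oldest turn goes first
    return system + rest
```

Dropping the oldest non-system turns keeps the system prompt and the most recent exchange intact, which is the usual trade-off for staying under the model's `num_ctx` window.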
```mermaid
%%{init: {'theme':'base', 'lightTheme':'default', 'darkTheme':'dark', 'securityLevel':'strict', 'accessibility': {'label': 'Architecture diagram', 'description': 'System architecture and data flow; supports light/dark modes.'}}}%%
graph TB
    CLI[CLI Interface] --> Chat[Chat Manager]
    Chat --> Ollama[Ollama LLM]
    Chat --> Memory[Memory Store]
    Chat --> JSON[Conversation JSON]
    Memory --> OpenAI[OpenAI Embeddings]
    Memory --> TimescaleDB[(TimescaleDB + pgvector)]
```
Detailed diagrams: docs/ARCHITECTURE.md
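The diagram's data flow can be sketched as a tiny chat manager: the CLI hands user input to it, it pulls relevant long-term memories, asks the LLM, and keeps short-term history for persistence. Every name below is illustrative, not the repo's actual API; the stub classes stand in for the Ollama client and the TimescaleDB-backed store.

```python
class ChatManager:
    """Minimal sketch of the architecture above; not the repo's real code."""

    def __init__(self, llm, memory):
        self.llm = llm        # stands in for the Ollama client
        self.memory = memory  # stands in for the semantic memory store
        self.history = []     # short-term memory, persisted to JSON on exit

    def ask(self, user_text):
        recalled = self.memory.search(user_text)  # long-term recall
        reply = self.llm.generate(self.history, recalled, user_text)
        self.history += [("user", user_text), ("assistant", reply)]
        return reply


# Tiny stand-ins so the sketch runs without Ollama or a database:
class EchoLLM:
    def generate(self, history, recalled, user_text):
        return f"echo: {user_text}"

class NullMemory:
    def search(self, query):
        return []
```

Usage: `ChatManager(EchoLLM(), NullMemory()).ask("hi")` returns the stubbed reply and records both turns in short-term history.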
- User Guide - How to use the agent effectively
- Architecture - System design and data flow
- Model Parameters - Hermes-4 configuration guide
- Memory API - Semantic memory reference
- Quick Start - 5-minute setup guide
```bash
make run
```

```text
You: What's the capital of France?
Assistant: Paris.
You: quit
Memory saved to data/memory.json
Goodbye!
```

```text
# Store a memory
/remember I prefer Python over JavaScript
  Type: preference
  Context: coding
Memory stored with ID 1

# Search memories
/recall programming preferences
Found 1 relevant memories:
  [1] PREFERENCE | coding
      I prefer Python over JavaScript
      Similarity: 0.923

# View statistics
/stats
Total memories: 42 | Contexts: 3 | Types: 4
```

Full command reference: docs/USER_GUIDE.md
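The `Similarity: 0.923` score is cosine similarity between embedding vectors; pgvector computes this server-side (its `<=>` operator is cosine *distance*, i.e. 1 minus similarity). Here is a toy version for intuition only; in the real system the ranking runs inside TimescaleDB over 1536-dim OpenAI embeddings.

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|): 1.0 = same direction, 0.0 = unrelated
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
```

Scores near 1.0 mean the stored memory and the query are close in meaning, which is why `/recall programming preferences` surfaces a memory that never contains the word "preferences".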
- Python 3.12+
- Hugging Face - Any model with Ollama support (huggingface.co)
- Ollama - Local LLM runtime (ollama.ai)
- OpenAI API Key - For embeddings (platform.openai.com)
- TimescaleDB - Optional, for semantic memory (timescale.cloud)
```bash
make install
```

```bash
cp .env.example .env
# Edit .env and add:
# - OPENAI_API_KEY (required)
# - MEMORY_DB_URL (optional, for semantic memory)
```

> [!WARNING]
> **This Is Hermes, Not a Hall Monitor**
>
> Hermes ships without the usual corporate-grade guardrails, seatbelts, bumpers, or soft edges. He's a hybrid reasoning model with tool access and an attitude, and he will absolutely follow your instructions even when you probably shouldn't have written them. Before you grab this code and run, go read the docs on what Hermes actually is and what he is not. If you treat him like a safe, shrink-wrapped assistant, that's on you. This project is an experiment, not a babysitter.
```bash
# Pull the Hermes-4 model
ollama pull hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0

# Start Ollama service
ollama serve
```

```bash
make setup-db
```

```text
hermes-agent/
├── config/
│   └── template.yaml    # Model configuration
├── src/agent/
│   ├── chat.py          # Chat interface
│   └── memory.py        # Semantic memory
├── docs/                # Documentation
├── schema/              # Database schema
├── tests/               # Unit tests
├── main.py              # Entry point
└── .env                 # Environment variables
```
Edit config/template.yaml to customize:
```yaml
model: hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0

system: |
  You are Hermes, a personal assistant...

parameters:
  temperature: 0.85
  num_ctx: 8192
  # ... more parameters
```

Parameter guide: docs/MODEL_PARAMETERS.md
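Those template fields map naturally onto an Ollama chat request: `temperature` and `num_ctx` are genuine Ollama `options` keys, but the helper below and its exact shape are my sketch of how the agent might assemble the call, not the repo's actual code.

```python
def build_ollama_request(template, messages):
    """Assemble an Ollama /api/chat-style body from a parsed template dict.

    Hypothetical helper: field layout assumed from config/template.yaml above.
    """
    system = [{"role": "system", "content": template["system"]}]
    return {
        "model": template["model"],
        "messages": system + messages,
        "options": dict(template.get("parameters", {})),  # temperature, num_ctx, ...
        "stream": False,
    }
```

The key point is that everything under `parameters:` passes through untouched as Ollama `options`, so tuning the template never requires touching code.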
```bash
# Required
OPENAI_API_KEY=sk-...

# Optional
MEMORY_DB_URL=postgresql://...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_EMBEDDING_DIM=1536
```

You can also override which template file the agent loads by setting TEMPLATE_CONFIG in your .env file. By default the app uses config/template.yaml.
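The resolution order for that override is simple: use TEMPLATE_CONFIG when set, otherwise fall back to the default path. A sketch under that assumption (the helper name is hypothetical, not from the repo):

```python
import os

DEFAULT_TEMPLATE = "config/template.yaml"

def resolve_template_path(env=None):
    # TEMPLATE_CONFIG wins when set; otherwise fall back to the default.
    env = os.environ if env is None else env
    return env.get("TEMPLATE_CONFIG", DEFAULT_TEMPLATE)
```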
```bash
# TEMPLATE_CONFIG=config/template.yaml
```

```bash
# Run tests
make test

# View logs
make logs

# Clean artifacts
make clean

# See all commands
make help
```

- Short-term - Full conversation history in data/memory.json
  - Auto-saves on exit
  - Auto-loads on startup
  - Smart context trimming
- Long-term - Semantic memories in TimescaleDB
  - Vector embeddings for similarity search
  - Organized by type and context
  - Persistent across conversations
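"Auto-saves on exit" combined with the atomic file writes mentioned in the features list usually means write-to-temp-then-rename, so a crash mid-write never leaves a truncated data/memory.json. A sketch of that pattern; the function name and layout are assumptions, not the repo's implementation:

```python
import json
import os
import tempfile

def save_memory_atomic(path, payload):
    """Write JSON state without ever exposing a half-written file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f)
        os.replace(tmp, path)  # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)  # clean up the temp file on failure
        raise
```

`os.replace` is atomic on POSIX when source and destination sit on the same filesystem, which is why the temp file is created in the target's own directory rather than in /tmp.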
- preference - User likes/dislikes
- fact - Factual information
- task - Todos and action items
- insight - Observations and patterns
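Those four types suggest a small, validated record shape for each long-term memory row. An illustrative sketch only; the field names and validation are my assumptions, not the actual schema under schema/:

```python
from dataclasses import dataclass

ALLOWED_TYPES = {"preference", "fact", "task", "insight"}

@dataclass
class Memory:
    """Hypothetical shape of one long-term memory record."""
    text: str
    type: str
    context: str = "general"

    def __post_init__(self):
        # Reject anything outside the four documented memory types.
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown memory type: {self.type}")
```

Constraining the type field up front keeps `/stats` groupings and `/recall` filters meaningful instead of accumulating free-form tags.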
- Runtime: Python 3.12+ with uv package manager
- Model: NousResearch/Hermes-4-14B
- Ollama: ollama.ai
- TigerData Agentic Postgres: tigerdata.com
- Embeddings: OpenAI API (text-embedding-3-small)
- Storage: JSON for conversations, PostgreSQL for semantic memory
This project is released under the Polyform Shield License 1.0.0.

In plain language: use it, study it, fork it, remix it, build weird things with it; just don't make money from it or wrap it into anything commercially sold without getting my permission first. No loopholes, no weird "but technically," no marketplace shenanigans.

Bottom line? I build free and open software for fun, but with a caveat: if anybody is getting paid for it, then I'm first in line! Otherwise, help yourself.

The full legal text lives in the LICENSE file if you need the exact wording.
