anchildress1/my-hermantic-agent

🏗️ WIP: A small but dangerous playground for Hermes-4-14B: hybrid reasoning, tool use, dual memory, and zero guardrails. Because I can. 🔮⚡


My Hermantic Agent Banner

Important

🦄 I picked Hermes because he isn't just another polite autocomplete model. He thinks, he reasons, he grabs tools without asking, and he'll absolutely fire back when I deserve it. Most models fold when I push; Hermes accelerates. So this repo is my unapologetically chaotic testbed: no roadmap, no safety rails, just an agent let off the leash to see what it does.

Underneath the sparkles and smoke it's a CLI-driven conversational setup running NousResearch's Hermes-4-14B locally through Ollama, with persistent semantic memory stitched together from Tiger's TimescaleDB and OpenAI embeddings. It uses dual memory (short-term context plus long-term recall), manages its own state, and talks the only way I tolerate: direct, sharp, and not here to babysit anyone.

Basically, I handed a capable reasoning model a sandbox, a blowtorch, and too much autonomy. Now I'm watching to see what grows teeth. 🔧✨🔥


Quick Start

# Install dependencies
make install

# Setup environment
make setup
# Edit .env with your OPENAI_API_KEY

# Initialize database (optional, for semantic memory)
make setup-db

# Start chatting
make run

Full setup guide: QUICKSTART.md


Features

  • 🤖 Local LLM - Runs Ollama models locally, no cloud dependency
  • 💾 Dual Memory - Short-term conversation history + long-term semantic memory
  • 🔍 Semantic Search - Find relevant memories by meaning, not just keywords
  • 🎯 Smart Context - Auto-trims conversations to stay within token limits (sketched below)
  • 📝 Persistent - Conversations auto-save and resume where you left off
  • ⚡ Fast - Connection pooling, embedding caching, optimized queries
  • 🛡️ Robust - Comprehensive error handling, atomic file writes, graceful degradation
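
The context trimming is the least self-explanatory item above, so here is a minimal sketch of the general technique, using a rough characters-per-token heuristic. The helper names are hypothetical and this is not the repo's actual implementation.

# Hypothetical sketch of token-budget trimming (not the repo's actual code).
# Keeps the system prompt and drops the oldest turns until the estimate fits num_ctx.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_messages(messages: list[dict], num_ctx: int = 8192, reserve: int = 1024) -> list[dict]:
    budget = num_ctx - reserve              # leave room for the model's reply
    system, turns = messages[0], messages[1:]
    kept: list[dict] = []
    used = estimate_tokens(system["content"])
    for msg in reversed(turns):             # newest turns are kept first
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return [system] + list(reversed(kept))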

Architecture

%%{init: {'theme':'base', 'lightTheme':'default', 'darkTheme':'dark', 'securityLevel':'strict', 'accessibility': {'label': 'Architecture diagram', 'description': 'System architecture and data flow; supports light/dark modes.'}}}%%
graph TB
    CLI[CLI Interface] --> Chat[Chat Manager]
    Chat --> Ollama[Ollama LLM]
    Chat --> Memory[Memory Store]
    Chat --> JSON[Conversation JSON]
    Memory --> OpenAI[OpenAI Embeddings]
    Memory --> TimescaleDB[(TimescaleDB + pgvector)]

Detailed diagrams: docs/ARCHITECTURE.md
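
To make the data flow concrete, here is a heavily simplified sketch of how the boxes in the diagram could be wired together. The ollama client call is real, but the memory_store object and chat_once function are illustrative assumptions, not the actual API in src/agent/chat.py.

# Illustrative wiring only; the real classes live in src/agent/chat.py and memory.py.
import ollama  # assumes the `ollama` Python client package is installed

MODEL = "hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0"

def chat_once(user_text: str, history: list[dict], memory_store) -> str:
    # 1. Pull relevant long-term memories (semantic search; see Memory System below).
    recalled = memory_store.search(user_text, limit=3)  # hypothetical interface
    context = "\n".join(m["content"] for m in recalled)

    # 2. Send the trimmed history plus recalled context to the local Hermes model.
    messages = history + [{"role": "user", "content": f"{context}\n\n{user_text}"}]
    reply = ollama.chat(model=MODEL, messages=messages)["message"]["content"]

    # 3. Record the turn in short-term memory (saved to the conversation JSON on exit).
    history += [{"role": "user", "content": user_text},
                {"role": "assistant", "content": reply}]
    return reply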


Documentation


Usage

Basic Chat

make run
💬 You: What's the capital of France?
🤖 Assistant: Paris.

💬 You: quit
💾 Memory saved to data/memory.json
Goodbye!

Memory Commands

# Store a memory
/remember I prefer Python over JavaScript
Type: preference
Context: coding
✓ Memory stored with ID 1

# Search memories
/recall programming preferences
🔍 Found 1 relevant memories:
  [1] PREFERENCE | coding
      I prefer Python over JavaScript
      Similarity: 0.923

# View statistics
/stats
📊 Total memories: 42 | Contexts: 3 | Types: 4

Full command reference: docs/USER_GUIDE.md


Requirements


Installation

1. Install Dependencies

make install

2. Configure Environment

cp .env.example .env
# Edit .env and add:
# - OPENAI_API_KEY (required)
# - MEMORY_DB_URL (optional, for semantic memory)

3. Setup Ollama

Warning

This Is Hermes, Not a Hall Monitor

Hermes ships without the usual corporate-grade guardrails, seatbelts, bumpers, or soft edges. He's a hybrid reasoning model with tool access and an attitude, and he will absolutely follow your instructions even when you probably shouldn't have written them. Before you grab this code and run, go read the docs on what Hermes actually is and what he is not. If you treat him like a safe, shrink-wrapped assistant, that's on you. This project is an experiment, not a babysitter.

# Pull the Hermes-4 model
ollama pull hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0

# Start Ollama service
ollama serve

4. Initialize Database (Optional)

make setup-db

Project Structure

hermes-agent/
├── config/
│   └── template.yaml      # Model configuration
├── src/agent/
│   ├── chat.py            # Chat interface
│   └── memory.py          # Semantic memory
├── docs/                  # Documentation
├── schema/                # Database schema
├── tests/                 # Unit tests
├── main.py                # Entry point
└── .env                   # Environment variables

Configuration

Model Settings

Edit config/template.yaml to customize:

model: hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0
system: |
  You are Hermes, a personal assistant...
parameters:
  temperature: 0.85
  num_ctx: 8192
  # ... more parameters

Parameter guide: docs/MODEL_PARAMETERS.md
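
As a rough illustration of how such a template can end up in a request at runtime, here is a short sketch that loads the YAML with PyYAML and forwards its parameters as Ollama options. This is an assumption about the plumbing, not a description of what main.py actually does.

# Hypothetical: load template.yaml and forward its parameters as Ollama options.
import yaml
import ollama

with open("config/template.yaml") as f:
    cfg = yaml.safe_load(f)

response = ollama.chat(
    model=cfg["model"],
    messages=[
        {"role": "system", "content": cfg["system"]},
        {"role": "user", "content": "Hello, Hermes."},
    ],
    options=cfg.get("parameters", {}),  # temperature, num_ctx, ...
)
print(response["message"]["content"])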

Environment Variables

# Required
OPENAI_API_KEY=sk-...

# Optional
MEMORY_DB_URL=postgresql://...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_EMBEDDING_DIM=1536
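
For reference, the embedding settings above are typically consumed with the official openai Python client roughly like this; treat it as a generic sketch rather than a description of memory.py.

# Generic sketch: turn text into a vector using the settings above.
import os
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> list[float]:
    model = os.getenv("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")
    resp = client.embeddings.create(model=model, input=text)
    return resp.data[0].embedding  # length matches OPENAI_EMBEDDING_DIM (1536)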

You can also override which template file the agent loads by setting TEMPLATE_CONFIG in your .env file. By default the app uses config/template.yaml.

# TEMPLATE_CONFIG=config/template.yaml

Development

# Run tests
make test

# View logs
make logs

# Clean artifacts
make clean

# See all commands
make help

Memory System

Dual-Memory Architecture

  1. Short-term - Full conversation history in data/memory.json

    • Auto-saves on exit
    • Auto-loads on startup
    • Smart context trimming
  2. Long-term - Semantic memories in TimescaleDB (recall query sketched after this list)

    • Vector embeddings for similarity search
    • Organized by type and context
    • Persistent across conversations
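
A rough idea of what the long-term recall step can look like with pgvector's cosine-distance operator. The table and column names here are illustrative assumptions, not necessarily what schema/ actually defines.

# Illustrative pgvector similarity search (table/column names are assumptions).
import psycopg2

def recall(conn, query_embedding: list[float], limit: int = 5):
    # pgvector accepts the bracketed string form '[0.1, 0.2, ...]' cast to ::vector.
    vec = str(query_embedding)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content, type, context,
                   1 - (embedding <=> %s::vector) AS similarity
            FROM memories
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (vec, vec, limit),
        )
        return cur.fetchall()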

Memory Types

  • preference - User likes/dislikes
  • fact - Factual information
  • task - Todos and action items
  • insight - Observations and patterns

Tech Stack

Python · PostgreSQL · TimescaleDB · Hermes (Hugging Face) · Ollama · OpenAI · Kiro · Verdant · GitHub Copilot · ChatGPT


License

This project is released under the Polyform Shield License 1.0.0.

In plain language: use it, study it, fork it, remix it, build weird things with it; just don't make money from it or wrap it into anything commercially sold without getting my permission first. No loopholes, no weird "but technically," no marketplace shenanigans.

Bottom line? I build free and open software for fun, but with a caveat: if anybody is getting paid for it, then I'm first in line! Otherwise, help yourself. 😇

The full legal text lives in the LICENSE file if you need the exact wording. 📜🛡️
