anchildress1/my-hermantic-agent

🏗️ WIP: A small but dangerous playground for Hermes-4-14B: hybrid reasoning, tool use, dual memory, and zero guardrails. Because I can. 🔮⚡


My Hermantic Agent Banner

Important

🦄 I picked Hermes because he isn't just another polite autocomplete model. He thinks, he reasons, he grabs tools without asking, and he'll absolutely fire back when I deserve it. Most models fold when I push; Hermes accelerates. So this repo is my unapologetically chaotic testbed: no roadmap, no safety rails, just an agent let off the leash to see what it does.

Underneath the sparkles and smoke it's a CLI-driven conversational setup running NousResearch's Hermes-4-14B locally through Ollama, with persistent semantic memory stitched together from Tiger's TimescaleDB and OpenAI embeddings. It uses dual memory (short-term context plus long-term recall), manages its own state, and talks the only way I tolerate: direct, sharp, and not here to babysit anyone.

Basically, I handed a capable reasoning model a sandbox, a blowtorch, and too much autonomy. Now I'm watching to see what grows teeth. 🔧✨🔥


Quick Start

# Install dependencies
make install

# Setup environment
make setup
# Edit .env with your OPENAI_API_KEY

# Initialize database (optional, for semantic memory)
make setup-db

# Start chatting
make run

Full setup guide: QUICKSTART.md


Features

  • 🤖 Local LLM - Runs Ollama models locally, no cloud dependency
  • 💾 Dual Memory - Short-term conversation history + long-term semantic memory
  • 🔍 Semantic Search - Find relevant memories by meaning, not just keywords
  • 🎯 Smart Context - Auto-trims conversations to stay within token limits (sketched below)
  • 📝 Persistent - Conversations auto-save and resume where you left off
  • ⚡ Fast - Connection pooling, embedding caching, optimized queries
  • 🛡️ Robust - Comprehensive error handling, atomic file writes, graceful degradation
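
The context trimming is the least self-explanatory item above, so here is a minimal sketch of the general technique, using a rough characters-per-token heuristic. The helper names are hypothetical and this is not the repo's actual implementation.

# Hypothetical sketch of token-budget trimming (not the repo's actual code).
# Keeps the system prompt and drops the oldest turns until the estimate fits num_ctx.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token for English text.
    return max(1, len(text) // 4)

def trim_messages(messages: list[dict], num_ctx: int = 8192, reserve: int = 1024) -> list[dict]:
    budget = num_ctx - reserve              # leave room for the model's reply
    system, turns = messages[0], messages[1:]
    kept: list[dict] = []
    used = estimate_tokens(system["content"])
    for msg in reversed(turns):             # newest turns are kept first
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return [system] + list(reversed(kept))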

Architecture

%%{init: {'theme':'base', 'lightTheme':'default', 'darkTheme':'dark', 'securityLevel':'strict', 'accessibility': {'label': 'Architecture diagram', 'description': 'System architecture and data flow; supports light/dark modes.'}}}%%
graph TB
    CLI[CLI Interface] --> Chat[Chat Manager]
    Chat --> Ollama[Ollama LLM]
    Chat --> Memory[Memory Store]
    Chat --> JSON[Conversation JSON]
    Memory --> OpenAI[OpenAI Embeddings]
    Memory --> TimescaleDB[(TimescaleDB + pgvector)]

Detailed diagrams: docs/ARCHITECTURE.md
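
To make the data flow concrete, here is a heavily simplified sketch of how the boxes in the diagram could be wired together. The ollama client call is real, but the memory_store object and chat_once function are illustrative assumptions, not the actual API in src/agent/chat.py.

# Illustrative wiring only; the real classes live in src/agent/chat.py and memory.py.
import ollama  # assumes the `ollama` Python client package is installed

MODEL = "hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0"

def chat_once(user_text: str, history: list[dict], memory_store) -> str:
    # 1. Pull relevant long-term memories (semantic search; see Memory System below).
    recalled = memory_store.search(user_text, limit=3)  # hypothetical interface
    context = "\n".join(m["content"] for m in recalled)

    # 2. Send the trimmed history plus recalled context to the local Hermes model.
    messages = history + [{"role": "user", "content": f"{context}\n\n{user_text}"}]
    reply = ollama.chat(model=MODEL, messages=messages)["message"]["content"]

    # 3. Record the turn in short-term memory (saved to the conversation JSON on exit).
    history += [{"role": "user", "content": user_text},
                {"role": "assistant", "content": reply}]
    return reply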


Documentation


Usage

Basic Chat

make run
💬 You: What's the capital of France?
🤖 Assistant: Paris.

💬 You: quit
💾 Memory saved to data/memory.json
Goodbye!

Memory Commands

# Store a memory
/remember I prefer Python over JavaScript
Type: preference
Context: coding
✓ Memory stored with ID 1

# Search memories
/recall programming preferences
🔍 Found 1 relevant memories:
  [1] PREFERENCE | coding
      I prefer Python over JavaScript
      Similarity: 0.923

# View statistics
/stats
📊 Total memories: 42 | Contexts: 3 | Types: 4

Full command reference: docs/USER_GUIDE.md


Requirements


Installation

1. Install Dependencies

make install

2. Configure Environment

cp .env.example .env
# Edit .env and add:
# - OPENAI_API_KEY (required)
# - MEMORY_DB_URL (optional, for semantic memory)

3. Setup Ollama

Warning

This Is Hermes, Not a Hall Monitor

Hermes ships without the usual corporate-grade guardrails, seatbelts, bumpers, or soft edges. He's a hybrid reasoning model with tool access and an attitude, and he will absolutely follow your instructions even when you probably shouldn't have written them. Before you grab this code and run, go read the docs on what Hermes actually is and what he is not. If you treat him like a safe, shrink-wrapped assistant, that's on you. This project is an experiment, not a babysitter.

# Pull the Hermes-4 model
ollama pull hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0

# Start Ollama service
ollama serve

4. Initialize Database (Optional)

make setup-db

Project Structure

hermes-agent/
├── config/
│   └── template.yaml      # Model configuration
├── src/agent/
│   ├── chat.py            # Chat interface
│   └── memory.py          # Semantic memory
├── docs/                  # Documentation
├── schema/                # Database schema
├── tests/                 # Unit tests
├── main.py                # Entry point
└── .env                   # Environment variables

Configuration

Model Settings

Edit config/template.yaml to customize:

model: hf.co/DevQuasar/NousResearch.Hermes-4-14B-GGUF:Q8_0
system: |
  You are Hermes, a personal assistant...
parameters:
  temperature: 0.85
  num_ctx: 8192
  # ... more parameters

Parameter guide: docs/MODEL_PARAMETERS.md
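
As a rough illustration of how such a template can end up in a request at runtime, here is a short sketch that loads the YAML with PyYAML and forwards its parameters as Ollama options. This is an assumption about the plumbing, not a description of what main.py actually does.

# Hypothetical: load template.yaml and forward its parameters as Ollama options.
import yaml
import ollama

with open("config/template.yaml") as f:
    cfg = yaml.safe_load(f)

response = ollama.chat(
    model=cfg["model"],
    messages=[
        {"role": "system", "content": cfg["system"]},
        {"role": "user", "content": "Hello, Hermes."},
    ],
    options=cfg.get("parameters", {}),  # temperature, num_ctx, ...
)
print(response["message"]["content"])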

Environment Variables

# Required
OPENAI_API_KEY=sk-...

# Optional
MEMORY_DB_URL=postgresql://...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_EMBEDDING_DIM=1536
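
For reference, the embedding settings above are typically consumed with the official openai Python client roughly like this; treat it as a generic sketch rather than a description of memory.py.

# Generic sketch: turn text into a vector using the settings above.
import os
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(text: str) -> list[float]:
    model = os.getenv("OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")
    resp = client.embeddings.create(model=model, input=text)
    return resp.data[0].embedding  # length matches OPENAI_EMBEDDING_DIM (1536)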

You can also override which template file the agent loads by setting TEMPLATE_CONFIG in your .env file. By default the app uses config/template.yaml.

# TEMPLATE_CONFIG=config/template.yaml

Development

# Run tests
make test

# View logs
make logs

# Clean artifacts
make clean

# See all commands
make help

Memory System

Dual-Memory Architecture

  1. Short-term - Full conversation history in data/memory.json

    • Auto-saves on exit
    • Auto-loads on startup
    • Smart context trimming
  2. Long-term - Semantic memories in TimescaleDB (recall query sketched after this list)

    • Vector embeddings for similarity search
    • Organized by type and context
    • Persistent across conversations
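
A rough idea of what the long-term recall step can look like with pgvector's cosine-distance operator. The table and column names here are illustrative assumptions, not necessarily what schema/ actually defines.

# Illustrative pgvector similarity search (table/column names are assumptions).
import psycopg2

def recall(conn, query_embedding: list[float], limit: int = 5):
    # pgvector accepts the bracketed string form '[0.1, 0.2, ...]' cast to ::vector.
    vec = str(query_embedding)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content, type, context,
                   1 - (embedding <=> %s::vector) AS similarity
            FROM memories
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (vec, vec, limit),
        )
        return cur.fetchall()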

Memory Types

  • preference - User likes/dislikes
  • fact - Factual information
  • task - Todos and action items
  • insight - Observations and patterns

Tech Stack

Python · PostgreSQL · TimescaleDB · Hermes (Hugging Face) · Ollama · OpenAI · Kiro · Verdant · GitHub Copilot · ChatGPT


License

This project is released under the Polyform Shield License 1.0.0.

In plain language: use it, study it, fork it, remix it, build weird things with it; just don't make money from it or wrap it into anything commercially sold without getting my permission first. No loopholes, no weird "but technically," no marketplace shenanigans.

Bottom line? I build free and open software for fun, but with a caveat: if anybody is getting paid for it, then I'm first in line! Otherwise, help yourself. 😇

The full legal text lives in the LICENSE file if you need the exact wording. 📜🛡️
