Idea: Fine-tuning Gemma 3 270M to Offload Information Extraction
Goal: use a tiny local model (Gemma 3 270M) to extract memory-worthy facts/preferences/commitments from conversations and emit structured JSON for save_memory, without relying on large LLMs.
- Model: Gemma 3 270M (text-only) – runs on CPU; easy to quantize and deploy locally
- Framework: Unsloth fine-tuning (supports Gemma 3; works on consumer GPUs/CPUs)
- Reference: Unsloth Gemma 3 guide link
Dataset and Labels (minimal JSONL)
- Each record contains a conversation window and expected extraction payload
- Focus labels: `facts`, `preferences`, `promises`, `entities`, `projects`, each with spans and confidence
Example (one JSONL record):

```json
{"input": "User: I prefer dark theme. Also, remind me to ship v1 next Friday.\nAssistant: noted.", "output": {"preferences": [{"key": "ui_theme", "value": "dark"}], "promises": [{"who": "user", "what": "ship v1", "when": "next Friday"}], "entities": ["v1"], "facts": [], "projects": ["v1-release"]}}
```
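Before training, it's worth validating every JSONL record against the schema above so malformed examples don't silently degrade fine-tuning. A minimal sketch (the `validate_record` helper and the key set are illustrative, not part of any existing tooling):

```python
import json

# The five output keys the extraction contract requires.
EXPECTED_KEYS = {"preferences", "promises", "entities", "facts", "projects"}

def validate_record(line: str) -> dict:
    """Parse one JSONL line and check it matches the extraction schema."""
    record = json.loads(line)
    assert "input" in record and "output" in record, "record needs input/output"
    missing = EXPECTED_KEYS - record["output"].keys()
    assert not missing, f"output is missing keys: {missing}"
    return record

# Example: round-trip a record like the one above.
line = json.dumps({
    "input": "User: I prefer dark theme.\nAssistant: noted.",
    "output": {"preferences": [{"key": "ui_theme", "value": "dark"}],
               "promises": [], "entities": [], "facts": [], "projects": []},
})
record = validate_record(line)
```

Running this over the whole dataset file (one `validate_record` call per line) catches schema drift before it reaches the trainer.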
Prompt/Output Contract
- Prompt template (few-shot) encourages strict JSON only
- Require the model to return a single JSON object with the above keys
Inference prompt (example):

```
You are an extractor. From the conversation below, extract memory-worthy items.
Return STRICT JSON with keys: preferences, promises, entities, facts, projects.

CONVERSATION:
{conversation}
```
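Even with a strict-JSON instruction, a 270M model will occasionally wrap its answer in prose or code fences, so the caller should parse defensively. A sketch of the contract on the consumer side (`parse_extraction` is a hypothetical helper, not an existing API):

```python
import json
from typing import Optional

PROMPT_TEMPLATE = (
    "You are an extractor. From the conversation below, extract memory-worthy items.\n"
    "Return STRICT JSON with keys: preferences, promises, entities, facts, projects.\n\n"
    "CONVERSATION:\n{conversation}"
)

def parse_extraction(raw: str) -> Optional[dict]:
    """Pull the first JSON object out of the model's reply.

    Tolerates surrounding prose or markdown fences by slicing from the
    first '{' to the last '}'; returns None if no valid object is found."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return None

prompt = PROMPT_TEMPLATE.format(conversation="User: I prefer dark theme.")
# Simulated model reply that ignores the STRICT JSON instruction.
reply = ('Sure! ```json\n{"preferences": [{"key": "ui_theme", "value": "dark"}], '
         '"promises": [], "entities": [], "facts": [], "projects": []}\n```')
parsed = parse_extraction(reply)
```

If `parse_extraction` returns `None`, the caller can simply skip the `save_memory` call for that window rather than crash.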
Training Notes (Unsloth)
- Use LoRA fine-tuning with small batch sizes
- Recommended inference defaults for Gemma 3: temperature 1.0, top_k 64, top_p 0.95 (per Unsloth guide)
- Export to GGUF for local runtimes (Ollama/llama.cpp) if desired
- See Unsloth Gemma 3 guide for detailed steps and notebooks: link
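For the Ollama route, the GGUF export plus the recommended sampling defaults can be wired up in a Modelfile. A sketch, assuming a hypothetical export filename and model name:

```
# Modelfile (hypothetical path) for serving the fine-tuned extractor via Ollama
FROM ./gemma3-270m-extractor.gguf
PARAMETER temperature 1.0
PARAMETER top_k 64
PARAMETER top_p 0.95
SYSTEM You are an extractor. Return STRICT JSON with keys: preferences, promises, entities, facts, projects.
```

Registered with `ollama create memory-extractor -f Modelfile`, this bakes the sampling defaults and system prompt into the served model so callers only pass the conversation window.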