An LLM-powered agent that analyzes top Kaggle competition solutions and accumulates ML expert knowledge over time.
Phase 1 (current): Find and analyze the highest-voted solution notebook for any competition — extracting task type, data characteristics, evaluation metric rationale, model architecture, and key insights.
Phase 2 (planned): Auto-research + automated solution building.
- Human-in-the-loop competition selection — search by rough keyword, pick from a list before the agent starts
- Smart kernel filtering — automatically skips tutorial/EDA-only notebooks and finds the first real ML solution
- Structured analysis — task type framing, data characteristics, evaluation metric alignment, model details, key insights
- Knowledge accumulation — distilled takeaways are saved to memory/knowledge/ after each analysis
- Expert chat mode — chat with an agent that draws on all accumulated competition knowledge
- Token usage tracking — displayed after every run
- Python 3.12+
- uv package manager
- A Kaggle account with API token
- An LLM API key (Azure OpenAI, Anthropic, or OpenAI)
git clone https://github.com/your-username/llm-kaggle-agent.git
cd llm-kaggle-agent
uv sync

Go to kaggle.com/settings → API → Create New Token.
This downloads a kaggle.json file. You have two options:
Add the token to your .env file:
KAGGLE_API_TOKEN=KGAT_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Copy .env.example to .env and fill in your API key:
cp .env.example .env

Azure OpenAI:
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-12-01-preview
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_API_KEY=your_key_here
KAGGLE_API_TOKEN=KGAT_your_token_here

Anthropic:
ANTHROPIC_API_KEY=sk-ant-your_key_here
KAGGLE_API_TOKEN=KGAT_your_token_here

OpenAI:
OPENAI_API_KEY=sk-your_key_here
KAGGLE_API_TOKEN=KGAT_your_token_here

Search by keyword. The agent shows matching competitions for you to confirm before starting.
uv run kaggle-agent analyze "digit recognizer" --provider azure

Example session:
╭─ Kaggle Agent — competition solution analyzer ─╮
Searching Kaggle for 'digit recognizer'...
Found 20 competition(s):
1. Digit Recognizer
slug: digit-recognizer | deadline: 2030-01-01 | teams: 1299
2. MNIST Digit Recognizer
slug: mnist-digit-recognizer | deadline: 2023-06-25 | teams: 149
...
Select competition 1–10 (or 'q' to quit): 1
✓ Selected: Digit Recognizer (digit-recognizer)
Running analysis for Digit Recognizer...
→ list_competition_kernels(competition='digit-recognizer', sort_by='voteCount')
→ pull_kernel(kernel_ref='yassineghouzam/introduction-to-cnn-keras-0-997-top-6')
→ read_file(...)
→ think_tool(...)
→ save_analysis_report(...)
→ save_knowledge(...)
# Digit Recognizer Competition — Solution Analysis
## Task Type
Multi-class classification — recognizing which digit (0–9) is in a 28×28 grayscale image.
## Data Characteristics
- 42,000 training images, balanced across 10 classes
- No missing values; all features are pixel intensities (0–255)
...
╭───── Token Usage ──────╮
│ Input tokens: 125,791 │
│ Output tokens: 2,891 │
│ Total: 128,682 │
╰────────────────────────╯
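The token panel that closes the session is a running sum of input and output tokens. A minimal sketch of such a tracker follows; the `TokenUsage` class and its method names are hypothetical and are not taken from the project's code, and the figures are the ones shown in the example session.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Running totals of input and output tokens (hypothetical helper)."""

    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, input_tokens: int, output_tokens: int) -> None:
        # Accumulate the counts reported for one LLM call.
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def total(self) -> int:
        return self.input_tokens + self.output_tokens

    def summary(self) -> str:
        # Thousands separators match the style of the panel above.
        return (
            f"Input tokens: {self.input_tokens:,}\n"
            f"Output tokens: {self.output_tokens:,}\n"
            f"Total: {self.total:,}"
        )


usage = TokenUsage()
usage.add(125_791, 2_891)  # figures taken from the example session
print(usage.summary())
```

Calling `add` once per LLM response would keep the same object usable for a per-run total in analyze mode and a per-session total in chat mode.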
The report is saved to reports/<competition-slug>/analysis.md.
Distilled knowledge is saved to memory/knowledge/<competition-slug>.md.
Options:
| Flag | Default | Description |
|---|---|---|
| --provider | anthropic | LLM provider: anthropic, openai, or azure |
| --model | provider default | Override model name (e.g. gpt-4.1, claude-opus-4-6) |
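For illustration, the two flags in the table could be declared as below. This is a sketch using the standard-library `argparse`; the project's actual CLI in main.py may be built with a different library, and `build_parser` and the `query` argument name are assumptions.

```python
import argparse

PROVIDERS = ("anthropic", "openai", "azure")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="kaggle-agent")
    sub = parser.add_subparsers(dest="command", required=True)

    # Options shared by both subcommands, with the defaults from the table.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--provider", choices=PROVIDERS, default="anthropic")
    common.add_argument("--model", default=None,
                        help="model name; None means the provider default")

    analyze = sub.add_parser("analyze", parents=[common])
    analyze.add_argument("query", help="rough keyword used to search competitions")

    sub.add_parser("chat", parents=[common])
    return parser


args = build_parser().parse_args(
    ["analyze", "digit recognizer", "--provider", "azure"]
)
print(args.command, args.query, args.provider, args.model)
```

Restricting `--provider` with `choices` makes a misspelled provider fail at parse time instead of after the competition search has already run.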
After analyzing one or more competitions, start a chat session where the agent answers questions as an ML expert drawing on all saved knowledge.
uv run kaggle-agent chat --provider azure

The agent loads all memory/knowledge/*.md files into its context and answers as an ML expert grounded in real competition solutions.
Example session:
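You: Can you tell me how to get started with binary classification, and what to optimize afterwards?
Agent:
Of course! Here is a practical workflow and a set of optimization directions for binary classification,
combined with concrete lessons I learned from Kaggle competition solutions:
1. Problem understanding and data exploration
• Define the task: confirm that the goal is to predict a 0/1 or True/False label
• Exploratory data analysis (EDA): check missing values, class distribution (balanced or not), feature distributions
• Example: in the Spaceship Titanic competition, the goal is to predict whether a passenger was transported (Transported: 0/1)
2. Baseline modeling
• Missing-value handling: median for numeric features, 0/1 for boolean features, "missing" for categorical features
• Choose a baseline model: tree models (RandomForest, LightGBM, TFDF) are strong baselines for tabular data
• Evaluation metric: Accuracy for balanced classes; AUC, F1-score, Recall for imbalanced classes
3. Optimization directions
Feature engineering
• Feature splitting/combination: e.g. Spaceship Titanic splits Cabin into several columns, which improves performance
• Create new features from domain knowledge
Model optimization
• K-fold cross-validation to assess generalization
• Grid Search / Bayesian Optimization for hyperparameter tuning
• Multi-model Stacking / Ensembling
Handling imbalanced data (if present)
• Oversample the minority class (SMOTE) or undersample the majority class
• Give the minority class a higher class weight
Lessons from Kaggle:
• Spaceship Titanic: TFDF + simple preprocessing + feature splitting → a strong baseline, built quickly
• Housing Prices: build a baseline on clean numeric features first, then add complex features step by step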
tokens this turn: 2,357 (in 2,337 / out 20) │ session total: 2,357
The more competitions you analyze, the richer the agent's answers become — each analysis adds a new knowledge file that gets referenced in future chat sessions.
Each analyze run produces two outputs:
reports/
└── digit-recognizer/
    └── analysis.md           ← full structured report
memory/
└── knowledge/
    └── digit-recognizer.md   ← distilled ML takeaways (loaded in chat)
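The layout above follows directly from the competition slug. As a rough sketch, the two paths could be derived and written as below; `output_paths` and `save_outputs` are hypothetical helpers, not the project's `save_analysis_report` or `save_knowledge` tools, and the demo writes to a temporary directory instead of the repository.

```python
import tempfile
from pathlib import Path


def output_paths(slug: str, root: Path) -> tuple[Path, Path]:
    """Return (report path, knowledge path) for one competition slug."""
    report = root / "reports" / slug / "analysis.md"
    knowledge = root / "memory" / "knowledge" / f"{slug}.md"
    return report, knowledge


def save_outputs(slug: str, report_md: str, knowledge_md: str,
                 root: Path) -> tuple[Path, Path]:
    report, knowledge = output_paths(slug, root)
    for path, text in ((report, report_md), (knowledge, knowledge_md)):
        # Create reports/<slug>/ and memory/knowledge/ on first use.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
    return report, knowledge


with tempfile.TemporaryDirectory() as tmp:
    report, knowledge = save_outputs(
        "digit-recognizer", "# report\n", "# takeaways\n", Path(tmp)
    )
    relative = [p.relative_to(tmp).as_posix() for p in (report, knowledge)]
    written = report.read_text(encoding="utf-8")

print(relative)
```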
The knowledge file captures generalizable insights:
# Competition: Digit Recognizer
## Task Type
Multi-class image classification (10 classes)...
## Key Data Characteristics
- Balanced dataset — accuracy is a valid metric
- Pixel normalization to [0,1] is essential before feeding to CNN
...
## Generalizable Takeaways
- Data augmentation (rotation, zoom, shift) is the single biggest accuracy booster for image tasks
- For digit recognition, avoid horizontal/vertical flips — they create label noise (6↔9)
- ReduceLROnPlateau is a reliable, low-effort way to squeeze out extra accuracy

On the next chat session, all memory/knowledge/*.md files are loaded into the agent's context automatically.
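Loading the knowledge base amounts to reading every Markdown file in one directory and joining the results. The sketch below shows one way to do that; `load_knowledge` and the `<!-- source: ... -->` separator are assumptions for illustration, and the real loader in middleware/memory.py may work differently.

```python
import tempfile
from pathlib import Path


def load_knowledge(knowledge_dir: Path) -> str:
    """Concatenate every *.md file in the directory into one context string."""
    sections = []
    # Sort so the context is stable from one chat session to the next.
    for path in sorted(knowledge_dir.glob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"<!-- source: {path.name} -->\n{body}")
    return "\n\n".join(sections)


# Demo against a throwaway directory rather than the real memory/knowledge/.
with tempfile.TemporaryDirectory() as tmp:
    kdir = Path(tmp)
    (kdir / "spaceship-titanic.md").write_text(
        "# Competition: Spaceship Titanic\n", encoding="utf-8"
    )
    (kdir / "digit-recognizer.md").write_text(
        "# Competition: Digit Recognizer\n", encoding="utf-8"
    )
    context = load_knowledge(kdir)

print(context)
```

Because every file is placed in the prompt, input tokens per chat turn grow with the number of analyzed competitions.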
llm-kaggle-agent/
├── kaggle_agent/
│   ├── agent.py                  # LangGraph graph (analyze / chat modes)
│   ├── main.py                   # CLI entry point
│   ├── tools/
│   │   ├── kaggle_tools.py       # Kaggle API: list competitions, leaderboard, kernels, pull
│   │   └── analysis_tools.py     # think, read_file, save_report, save_knowledge, write_todos
│   ├── middleware/
│   │   ├── memory.py             # Loads competition history + knowledge base
│   │   ├── todo.py               # Task list tracking via write_todos tool
│   │   └── skills.py             # Loads skills/ directory into system prompt
│   └── prompts/
│       ├── system.md             # Analyze mode system prompt
│       └── chat_system.md        # Chat mode system prompt
├── skills/
│   └── kaggle-solution-analysis/
│       └── SKILL.md              # Structured checklist for analyzing a kernel
├── reports/                      # Analysis reports (gitignored)
├── memory/
│   └── knowledge/                # Accumulated ML expert knowledge (gitignored)
└── pyproject.toml
- memory/ and reports/ are gitignored — they contain your personal analysis data
- The agent filters out tutorial/EDA-only notebooks automatically; if the top-voted kernel has no ML model, it moves to the next one
- Kernel source is downloaded to a temp directory (/tmp/kaggle_kernels/) and read locally — no external LLM calls on the raw notebook
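The "move to the next kernel" behavior can be pictured as a loop over kernels in vote order that stops at the first one containing model-training code. The sketch below uses a plain substring heuristic; the marker list, the function names, and the kernel refs are all hypothetical, and the agent's actual filtering criteria are not documented here.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical markers that suggest a notebook trains a model.
TRAINING_MARKERS = (".fit(", ".train(", "model.compile(", "lgb.", "xgb.")


def looks_like_ml_solution(source: str) -> bool:
    """True if the notebook source contains at least one training marker."""
    return any(marker in source for marker in TRAINING_MARKERS)


def first_real_solution(kernels: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the ref of the first kernel with training code.

    `kernels` yields (ref, source) pairs already sorted by vote count.
    """
    for ref, source in kernels:
        if looks_like_ml_solution(source):
            return ref
    return None  # every kernel was tutorial/EDA-only


# Made-up kernels: the top-voted one only plots data, the second trains a model.
kernels = [
    ("user-a/eda-only", "import seaborn as sns\nsns.countplot(x=labels)"),
    ("user-b/cnn-baseline", "model.compile(optimizer='adam')\nmodel.fit(X, y)"),
]
chosen = first_real_solution(kernels)
print(chosen)
```

A substring check is cheap but easy to fool (a tutorial that calls `.fit(` on a scaler would pass), so it is best read as an illustration of the control flow only.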