jason8745/llm-kaggle-agent

llm-kaggle-agent

An LLM-powered agent that analyzes top Kaggle competition solutions and accumulates ML expert knowledge over time.

Phase 1 (current): Find and analyze the highest-voted solution notebook for any competition — extracting task type, data characteristics, evaluation metric rationale, model architecture, and key insights.

Phase 2 (planned): Auto-research + automated solution building.


Features

  • Human-in-the-loop competition selection — search by rough keyword, pick from a list before the agent starts
  • Smart kernel filtering — automatically skips tutorial/EDA-only notebooks and finds the first real ML solution
  • Structured analysis — task type framing, data characteristics, evaluation metric alignment, model details, key insights
  • Knowledge accumulation — distilled takeaways are saved to memory/knowledge/ after each analysis
  • Expert chat mode — chat with an agent that draws on all accumulated competition knowledge
  • Token usage tracking — displayed after every run

Requirements

  • Python 3.12+
  • uv package manager
  • A Kaggle account with API token
  • An LLM API key (Azure OpenAI, Anthropic, or OpenAI)

Installation

git clone https://github.com/your-username/llm-kaggle-agent.git
cd llm-kaggle-agent
uv sync

Configuration

1. Kaggle API Token

Go to kaggle.com/settings → API → Create New Token.

This downloads a kaggle.json file. Add the token to your .env file:

KAGGLE_API_TOKEN=KGAT_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

2. LLM Provider

Copy .env.example to .env and fill in your API key:

cp .env.example .env

Azure OpenAI:

AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-12-01-preview
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_API_KEY=your_key_here
KAGGLE_API_TOKEN=KGAT_your_token_here

Anthropic:

ANTHROPIC_API_KEY=sk-ant-your_key_here
KAGGLE_API_TOKEN=KGAT_your_token_here

OpenAI:

OPENAI_API_KEY=sk-your_key_here
KAGGLE_API_TOKEN=KGAT_your_token_here

Usage

analyze — Analyze a competition's top solution

Search by keyword. The agent shows matching competitions for you to confirm before starting.

uv run kaggle-agent analyze "digit recognizer" --provider azure

Example session:

╭─ Kaggle Agent — competition solution analyzer ─╮

  Searching Kaggle for 'digit recognizer'...

  Found 20 competition(s):

  1. Digit Recognizer
     slug: digit-recognizer  |  deadline: 2030-01-01  |  teams: 1299
  2. MNIST Digit Recognizer
     slug: mnist-digit-recognizer  |  deadline: 2023-06-25  |  teams: 149
  ...

  Select competition 1–10 (or 'q' to quit): 1

  ✓ Selected: Digit Recognizer (digit-recognizer)

Running analysis for Digit Recognizer...

→ list_competition_kernels(competition='digit-recognizer', sort_by='voteCount')
→ pull_kernel(kernel_ref='yassineghouzam/introduction-to-cnn-keras-0-997-top-6')
→ read_file(...)
→ think_tool(...)
→ save_analysis_report(...)
→ save_knowledge(...)

# Digit Recognizer Competition — Solution Analysis

## Task Type
Multi-class classification — recognizing which digit (0–9) is in a 28×28 grayscale image.

## Data Characteristics
- 42,000 training images, balanced across 10 classes
- No missing values; all features are pixel intensities (0–255)
...

╭───── Token Usage ──────╮
│ Input tokens:  125,791 │
│ Output tokens:  2,891  │
│ Total:         128,682 │
╰────────────────────────╯

The report is saved to reports/<competition-slug>/analysis.md. Distilled knowledge is saved to memory/knowledge/<competition-slug>.md.
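
The two output locations follow directly from the competition slug. As a rough sketch of that layout (the function names `output_paths` and `save_outputs` are hypothetical and not taken from the repository):

```python
from pathlib import Path


def output_paths(slug: str, root: Path = Path(".")) -> tuple[Path, Path]:
    """Return (report_path, knowledge_path) for a competition slug."""
    report = root / "reports" / slug / "analysis.md"
    knowledge = root / "memory" / "knowledge" / f"{slug}.md"
    return report, knowledge


def save_outputs(slug: str, report_md: str, knowledge_md: str, root: Path = Path(".")) -> None:
    """Write both files, creating parent directories as needed."""
    for path, text in zip(output_paths(slug, root), (report_md, knowledge_md)):
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
```

For the session above, `output_paths("digit-recognizer")` yields `reports/digit-recognizer/analysis.md` and `memory/knowledge/digit-recognizer.md`.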

Options:

Flag         Default            Description
--provider   anthropic          LLM provider: anthropic, openai, or azure
--model      provider default   Override model name (e.g. gpt-4.1, claude-opus-4-6)
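
To make the flag semantics concrete, here is an illustrative sketch of an equivalent command-line interface built with the standard library's argparse. The real entry point in main.py may use a different library and structure; `build_parser` is a hypothetical name, and sharing both flags with the chat subcommand is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="kaggle-agent")
    sub = parser.add_subparsers(dest="command", required=True)

    # Flags shared by both subcommands, mirroring the options table.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument(
        "--provider",
        choices=["anthropic", "openai", "azure"],
        default="anthropic",
    )
    # None stands for "use the provider's default model".
    common.add_argument("--model", default=None)

    analyze = sub.add_parser("analyze", parents=[common])
    analyze.add_argument("keyword", help="rough search keyword")
    sub.add_parser("chat", parents=[common])
    return parser
```

With this sketch, `build_parser().parse_args(["analyze", "digit recognizer", "--provider", "azure"])` produces a namespace whose `provider` is `azure` and whose `model` is `None`.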

chat — Chat with your accumulated ML knowledge

After analyzing one or more competitions, start a chat session where the agent answers questions as an ML expert drawing on all saved knowledge.

uv run kaggle-agent chat --provider azure

The agent loads all memory/knowledge/*.md files into its context and answers as an ML expert grounded in real competition solutions.

Example session:

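You: Can you tell me how to get started with binary classification, and what the follow-up optimization directions are?

Agent:
Of course! Here is a practical workflow and a set of optimization directions for binary classification problems,
combined with specific lessons I have learned from Kaggle competition solutions:

1. Problem understanding and data exploration
   • Define the task: confirm that the goal is to predict a 0/1 or True/False label
   • Exploratory data analysis (EDA): check missing values, class distribution (balanced or not), and feature distributions
   • Example: in the Spaceship Titanic competition, the goal is to predict whether a passenger was transported (Transported: 0/1)

2. Baseline modeling
   • Missing-value handling: median for numeric features, 0/1 for boolean features, "missing" for categorical features
   • Choose a baseline model: tree models (RandomForest, LightGBM, TFDF) are strong baselines for tabular data
   • Evaluation metrics: Accuracy when classes are balanced; AUC, F1-score, or Recall when they are imbalanced

3. Optimization directions
   Feature engineering
   • Split or combine features: e.g. Spaceship Titanic splits Cabin into multiple columns, which improves performance
   • Create new features from domain knowledge

   Model optimization
   • K-fold cross-validation to assess generalization
   • Grid Search / Bayesian Optimization for hyperparameter tuning
   • Multi-model Stacking / Ensembling

   Handling imbalanced data (if any)
   • Oversample the minority class (SMOTE) or undersample the majority class
   • Give the minority class a higher class weight

Kaggle lessons for reference:
• Spaceship Titanic: TFDF + simple preprocessing + feature splitting → a strong baseline, fast
• Housing Prices: build a baseline from clean numeric features first, then add complex features step by step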

  tokens this turn: 2,357  (in 2,337 / out 20)  │  session total: 2,357

The more competitions you analyze, the richer the agent's answers become — each analysis adds a new knowledge file that gets referenced in future chat sessions.
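
The per-turn and session token totals shown in the example sessions can be tracked with a small accumulator. The sketch below is illustrative only: `TokenUsage` and `format_turn` are hypothetical names, and how the project actually collects counts from the LLM provider is not shown here.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total(self) -> int:
        return self.input_tokens + self.output_tokens

    def add(self, input_tokens: int, output_tokens: int) -> None:
        """Accumulate the counts reported for one LLM call."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens


def format_turn(turn: TokenUsage, session: TokenUsage) -> str:
    """Render a status line in the style of the chat example above."""
    return (
        f"tokens this turn: {turn.total:,}  "
        f"(in {turn.input_tokens:,} / out {turn.output_tokens:,})  "
        f"│  session total: {session.total:,}"
    )
```

A chat loop would keep one `TokenUsage` for the whole session, create a fresh one for each turn, and call `add` on both after every model response.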


How Knowledge Accumulates

Each analyze run produces two outputs:

reports/
└── digit-recognizer/
    └── analysis.md          ← full structured report

memory/
└── knowledge/
    └── digit-recognizer.md  ← distilled ML takeaways (loaded in chat)

The knowledge file captures generalizable insights:

# Competition: Digit Recognizer

## Task Type
Multi-class image classification (10 classes)...

## Key Data Characteristics
- Balanced dataset — accuracy is a valid metric
- Pixel normalization to [0,1] is essential before feeding to CNN
...

## Generalizable Takeaways
- Data augmentation (rotation, zoom, shift) is the single biggest accuracy booster for image tasks
- For digit recognition, avoid horizontal/vertical flips — they create label noise (6↔9)
- ReduceLROnPlateau is a reliable, low-effort way to squeeze out extra accuracy

On the next chat session, all memory/knowledge/*.md files are loaded into the agent's context automatically.
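
Loading the knowledge base amounts to reading every Markdown file in the directory and joining the contents into one block of prompt context. A minimal sketch, assuming a hypothetical `load_knowledge` helper and a comment-style source marker that the real middleware in memory.py may not use:

```python
from pathlib import Path


def load_knowledge(knowledge_dir: Path = Path("memory/knowledge")) -> str:
    """Concatenate every knowledge file into one block of prompt context."""
    sections = []
    # Sort so the context is built in a stable, reproducible order.
    for path in sorted(knowledge_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files
            sections.append(f"<!-- source: {path.name} -->\n{text}")
    return "\n\n".join(sections)
```

Because every file is loaded on each session, the context grows with the number of analyzed competitions, which is worth keeping in mind alongside the token usage figures.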


Project Structure

llm-kaggle-agent/
├── kaggle_agent/
│   ├── agent.py              # LangGraph graph (analyze / chat modes)
│   ├── main.py               # CLI entry point
│   ├── tools/
│   │   ├── kaggle_tools.py   # Kaggle API: list competitions, leaderboard, kernels, pull
│   │   └── analysis_tools.py # think, read_file, save_report, save_knowledge, write_todos
│   ├── middleware/
│   │   ├── memory.py         # Loads competition history + knowledge base
│   │   ├── todo.py           # Task list tracking via write_todos tool
│   │   └── skills.py         # Loads skills/ directory into system prompt
│   └── prompts/
│       ├── system.md         # Analyze mode system prompt
│       └── chat_system.md    # Chat mode system prompt
├── skills/
│   └── kaggle-solution-analysis/
│       └── SKILL.md          # Structured checklist for analyzing a kernel
├── reports/                  # Analysis reports (gitignored)
├── memory/
│   └── knowledge/            # Accumulated ML expert knowledge (gitignored)
└── pyproject.toml

Notes

  • memory/ and reports/ are gitignored — they contain your personal analysis data
  • The agent filters out tutorial/EDA-only notebooks automatically; if the top-voted kernel has no ML model, it moves to the next one
  • Kernel source is downloaded to a temp directory (/tmp/kaggle_kernels/) and read locally — no external LLM calls on the raw notebook
