feat: subset tool #2

vintrocode · 2025-06-17T01:32:13Z

Looking for a way to iterate faster and be more targeted about the data we ingest, so a run doesn't take 10 hours. One solid session with Claude 4 Sonnet in Cursor produced this `locomo_tool.py` script, which helps us do that. Usage instructions are in the README; some example outputs:

$ (locomo) ➜  locomo git:(vince/subset-tool) python3 locomo_tool.py explore --list-conversations

Available Conversations:
==================================================
 0: Caroline ↔ Melanie
 1: Jon ↔ Gina
 2: John ↔ Maria
 3: Joanna ↔ Nate
 4: Tim ↔ John
 5: Audrey ↔ Andrew
 6: James ↔ John
 7: Deborah ↔ Jolene
 8: Evan ↔ Sam
 9: Calvin ↔ Dave

$ (locomo) ➜  locomo git:(vince/subset-tool) python3 locomo_tool.py explore --conversation 0 --category 1 --n 5 --preview

Subset Preview: Caroline ↔ Melanie | Category 1 | Top 5 questions
================================================================================
✓ Found 5 questions (requested 5)

Selected Questions:
   1. "What did Caroline research?"
      Evidence: D2:8
   2. "What is Caroline's identity?"
      Evidence: D1:5
   3. "What is Caroline's relationship status?"
      Evidence: D3:13, D2:14
   4. "Where did Caroline move from 4 years ago?"
      Evidence: D3:13, D4:3
   5. "What career path has Caroline decided to persue?"
      Evidence: D4:13, D1:11

Latest Evidence: D4:13
Sessions to include: 1 to 4
  Session 1: All 18 messages
  Session 2: All 17 messages
  Session 3: All 23 messages
  Session 4: First 13 messages

Total messages in subset: 71
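For anyone curious how the preview gets from the evidence references to "Sessions to include: 1 to 4", here is a minimal sketch of that reasoning. It is not the code in `locomo_tool.py`; the function names are made up for illustration, and it assumes evidence references have the form `D<session>:<message>` with 1-based message numbers. The full length of session 4 is not shown in the preview, so the `20` below is a placeholder.

```python
import re

# Assumed format: "D4:13" means session 4, message 13 (1-based).
EVIDENCE_RE = re.compile(r"^D(\d+):(\d+)$")


def parse_evidence(ref):
    """Turn an evidence reference such as 'D4:13' into (session, message)."""
    match = EVIDENCE_RE.match(ref.strip())
    if match is None:
        raise ValueError(f"unrecognized evidence reference: {ref!r}")
    return int(match.group(1)), int(match.group(2))


def plan_subset(evidence_refs, session_lengths):
    """Return {session: number_of_messages_to_keep}.

    Every session before the one holding the latest evidence is kept whole;
    the last session is cut off at the latest evidence message.
    """
    latest_session, latest_message = max(parse_evidence(r) for r in evidence_refs)
    plan = {}
    for session in range(1, latest_session + 1):
        full_length = session_lengths[session]
        if session == latest_session:
            plan[session] = min(latest_message, full_length)
        else:
            plan[session] = full_length
    return plan


# Evidence references from the five questions in the preview above.
refs = ["D2:8", "D1:5", "D3:13", "D2:14", "D3:13", "D4:3", "D4:13", "D1:11"]
# Lengths of sessions 1-3 come from the preview; 20 for session 4 is a placeholder.
lengths = {1: 18, 2: 17, 3: 23, 4: 20}

plan = plan_subset(refs, lengths)
print(plan)
print("Total messages in subset:", sum(plan.values()))
```

With these inputs the plan keeps 18 + 17 + 23 + 13 messages, which matches the total of 71 reported in the preview.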

Pretty neat. To create a subset, you'd run something like `python3 locomo_tool.py subset --conversation 0 --category 1 --n 10 --output experiment_cat1.json`. The output follows the (super messy) existing data structure, so it should work downstream in all the evaluate scripts: just change the data path in, e.g., the `evaluate_honcho.sh` script to point to your newly subsetted data file.
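If you want one subset file per conversation, a small helper can build the command lines for you. This is only a sketch: `subset_command` and the `experiment_conv{N}_cat1.json` naming are hypothetical, and it uses only the `subset` flags shown in this description. It prints the commands rather than running them.

```python
import shlex


def subset_command(conversation, category, n, output):
    """Build the argv list for one `locomo_tool.py subset` invocation."""
    return [
        "python3", "locomo_tool.py", "subset",
        "--conversation", str(conversation),
        "--category", str(category),
        "--n", str(n),
        "--output", output,
    ]


# The single command from the description above.
cmd = subset_command(0, 1, 10, "experiment_cat1.json")
print(shlex.join(cmd))

# One command per conversation listed by `explore --list-conversations` (0-9).
# The output file naming here is a made-up convention.
commands = [
    subset_command(conv, 1, 10, f"experiment_conv{conv}_cat1.json")
    for conv in range(10)
]
for argv in commands:
    print(shlex.join(argv))
```

To actually execute a command, pass the argv list to `subprocess.run(argv, check=True)` from the repository directory.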

feat: subset tool

48aa220

vintrocode assigned danibalcells Jun 17, 2025
