docs: add pipeline concepts page explaining build, reduce, and score#152

Open
luciaquirke wants to merge 3 commits into main from claude/issue-142-20260217-0914

@luciaquirke
Collaborator

Adds docs/pipeline.rst with explanations of what each CLI command does, what it produces, when to use each one, and how they fit together. Also includes a note on preconditioner ordering and a worked LESS-style example.

Closes #142

Generated with Claude Code

Co-authored-by: Lucia Quirke <luciaquirke@users.noreply.github.com>
@luciaquirke
Collaborator Author

@claude can you specify that reduce is on-disk too?

@claude

claude bot commented Feb 17, 2026

Claude finished @luciaquirke's task in 27s.


Tasks

  • Gather context
  • Update reduce section to specify it's on-disk
  • Commit and push

Updated the overview bullet in docs/pipeline.rst (line 21) from:

reduce aggregates all gradients from a dataset into a single vector.

to:

reduce aggregates all gradients from a dataset into a single vector and writes it to an on-disk file.
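
As a rough illustration of what "aggregates all gradients into a single vector and writes it to an on-disk file" can mean, here is a minimal NumPy sketch. It is not the project's implementation: the helper name `reduce_to_disk`, the choice of a mean as the aggregation, the float32 dtype, and the `.npy` output format are all assumptions made for illustration.

```python
import numpy as np


def reduce_to_disk(grad_batches, dim, path):
    """Hypothetical helper: average per-example gradients into one on-disk vector.

    grad_batches: iterable of arrays shaped [batch_size, dim] (assumed layout).
    """
    total = np.zeros(dim, dtype=np.float64)  # accumulate in float64 for stability
    count = 0
    for batch in grad_batches:
        batch = np.asarray(batch)
        total += batch.sum(axis=0)
        count += batch.shape[0]
    if count == 0:
        raise ValueError("no gradients to aggregate")

    # Create the output file up front and write the aggregated vector into it.
    out = np.lib.format.open_memmap(path, mode="w+", dtype=np.float32, shape=(dim,))
    out[:] = (total / count).astype(np.float32)
    out.flush()
    return path
```

The point of the sketch is only that the result lives in a file rather than in memory, so a later step can load it (for example with `np.load(path, mmap_mode="r")`) without recomputing the gradients.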

Co-authored-by: Lucia Quirke <luciaquirke@users.noreply.github.com>

Do you want to search a gradient index interactively (e.g. per-prompt)?
├── Yes → use build + query
└── No → Do you want to rank a dataset by influence against a fixed query?
Collaborator Author

@luciaquirke luciaquirke Feb 17, 2026

This question should be: "Do you want to use full gradients without random projection or preconditioning?"

--unit_normalize \
--projection_dim 0

The resulting ``runs/scores/scores.npy`` contains one score per training example.
Collaborator Author

This should be ``scores.bin``.

@luciaquirke
Collaborator Author

@claude could you please address my two comments?

@claude

claude bot commented Feb 17, 2026

Claude finished @luciaquirke's task in 40s.


Tasks

  • Gather context
  • Fix question text at line 197
  • Fix filename at line 259
  • Commit and push

Both review comments addressed:

  • Line 197: Updated decision tree question to "Do you want to use full gradients without random projection or preconditioning?"
  • Line 259: Changed scores.npy to scores.bin

Co-authored-by: Lucia Quirke <luciaquirke@users.noreply.github.com>

A directory at ``run_path`` containing:

- ``data.npy`` — a memory-mapped array of shape ``[num_examples, projection_dim]``
Collaborator Author

The shape is wrong; it's a structured array.
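
To make the distinction concrete, here is a small NumPy sketch of how a memory-mapped structured array differs from a plain 2-D array. The field names (`layer_a`, `layer_b`), their widths, the float16 dtype, and the file name are hypothetical and chosen only for illustration; they are not taken from the project.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: one named field per module, each with its own width.
grad_dtype = np.dtype([
    ("layer_a", np.float16, (8,)),
    ("layer_b", np.float16, (4,)),
])
num_examples = 3

path = os.path.join(tempfile.mkdtemp(), "gradients.bin")
grads = np.memmap(path, dtype=grad_dtype, mode="w+", shape=(num_examples,))

# Selecting a field returns a view, so this writes through to the file.
grads["layer_a"][1] = 1.0
grads.flush()

print(grads.shape)             # (3,)
print(grads["layer_a"].shape)  # (3, 8)
```

The array itself is one-dimensional, with one record per example; the per-module widths live in the dtype, so a single ``[num_examples, projection_dim]`` shape does not describe it.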


- ``data.npy`` — a memory-mapped array of shape ``[num_examples, projection_dim]``
(or ``[num_examples, param_dim]`` if no projection).
- ``indices.jsonl`` — per-example metadata.
Collaborator Author

I think this is wrong.


A directory at ``run_path`` containing:

- ``data.npy`` — a single aggregated gradient vector of shape ``[1, param_dim]``.
Collaborator Author

wrong


A directory at ``run_path`` containing:

- ``scores.npy`` — a memory-mapped array of shape ``[num_examples]`` (or
Collaborator Author

wrong

Development

Successfully merging this pull request may close these issues.

Trackstar does not work correctly with preconditioners