Add onboarding reproduction logs for @KhanShaheb34 #3166

Merged
lintool merged 3 commits into castorini:master from KhanShaheb34:onboarding-repro-logs
Apr 5, 2026

@KhanShaheb34
Contributor

KhanShaheb34 commented Mar 20, 2026

Reproduction log entries for onboarding steps 1-3 (start-here, BM25 baselines, dense retrieval).

Setup:

  • OS: macOS (Tahoe 26.4)
  • Machine: Mac (M1 Pro) with 16 GB RAM
  • Java: JDK 21
  • Python: 3.11.4 (via pyenv)

Notes:

  • Had to change -Xmx192G to -Xmx8G in bin/run.sh; with the default 192 GB max heap, the OS killed the indexing process on a 16 GB machine.
  • All three steps completed successfully:
    • start-here: Data prep and exploration, all file counts and checksums matched.
    • experiments-msmarco-passage: MRR@10 = 0.1874, MAP = 0.1957, Recall@1000 = 0.8573
    • experiments-msmarco-passage2: BM25 prebuilt MRR@10 = 0.1875, BGE-base dense MRR@10 = 0.3521
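
The heap change noted above could be scripted rather than edited by hand. Below is a minimal sketch, assuming the flag appears literally as `-Xmx192G` in `bin/run.sh`, as the note describes. The function name `set_max_heap` and the sample command line are hypothetical and are not taken from the repository.

```python
import re


def set_max_heap(script_text: str, new_heap: str = "8G") -> str:
    """Replace every -Xmx<size> JVM flag in script_text with -Xmx<new_heap>.

    Illustrative helper only; it is not part of the project.
    """
    # Match -Xmx followed by digits and an optional k/m/g unit suffix.
    return re.sub(r"-Xmx\d+[kKmMgG]?", f"-Xmx{new_heap}", script_text)


# Hypothetical line standing in for the real contents of bin/run.sh.
sample = "java -Xmx192G -cp example.jar Main"
patched = set_max_heap(sample)
print(patched)
```

To apply it to the real script, one could read `bin/run.sh`, pass its text through `set_max_heap`, and write the result back. The 8G value is the one that worked on the 16 GB machine above; other machines may need a different setting.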

@lintool
Member

lintool commented Mar 25, 2026

Please provide more details on OS, system setup, etc. Did everything run okay?

@KhanShaheb34
Contributor Author

Please provide more details on OS, system setup, etc. Did everything run okay?

I've added the machine details and a few notes from my reproduction experience. Please review now, @lintool.

@lintool
Member

lintool commented Apr 5, 2026

@KhanShaheb34 please resolve conflicts.

@KhanShaheb34
Contributor Author

KhanShaheb34 commented Apr 5, 2026

I've resolved the conflicts, @lintool.

@lintool lintool merged commit 295fdbf into castorini:master Apr 5, 2026
1 check passed