Conversation
fa7e7c9 to
bb3eac0
Compare
Code Review - Found 5 IssuesCritical Bugs1. Missing required parameters for InMemoryCollector 2. Incorrect import in kronfluence_benchmark.py 3. Incorrect import path for assert_type CLAUDE.md Violations4. Generic cfg variable name 5. Arguments using dashes instead of underscores See CLAUDE.md: https://github.com/EleutherAI/bergson/blob/82351ff2bfd44e8bbbbb43251adb0f0acafa5809/CLAUDE.md |
Code reviewI've reviewed this PR and found several issues that need to be addressed: Critical Bugs1. F-string syntax error in bergson/distributed.pyFile: bergson/distributed.py:82-83 bergson/bergson/distributed.py Lines 79 to 84 in 82351ff This f-string expression uses a backslash escape sequence which is a syntax error in Python 3.10 and 3.11. The project requires Python >=3.10, so this will fail for users on Python 3.10 or 3.11. The issue: Backslashes inside f-string expressions were not allowed until Python 3.12 (PEP 701). Suggested fix: Assign the newline to a variable first: 2. Missing required kwargs for InMemoryCollectorThe InMemoryCollector.init method requires data and cfg kwargs, but several instantiations are missing these required arguments. This will cause KeyError at runtime. Affected locations:
Links: bergson/benchmarks/benchmark_bergson.py Lines 333 to 338 in 82351ff bergson/benchmarks/benchmark_bergson.py Lines 381 to 386 in 82351ff bergson/bergson/utils/auto_batch_size.py Lines 289 to 294 in 82351ff Suggested fix: Add the required data and cfg kwargs to each instantiation. CLAUDE.md ViolationsPer CLAUDE.md: Never call a config class cfg, always something specific like foo_cfg. Arguments should use underscores and not dashes. 3. Config variables named cfg instead of specific namesAffected locations:
Links: bergson/benchmarks/benchmark_bergson.py Lines 155 to 160 in 82351ff bergson/benchmarks/benchmark_bergson.py Lines 455 to 460 in 82351ff Note: The same PR correctly uses run_cfg in benchmarks/benchmark_bergson_cli.py:183 4. CLI arguments using dashes instead of underscoresMultiple benchmark files use standard argparse with dashed arguments instead of the required simple_parsing with dataclasses pattern. Affected files:
Links: bergson/benchmarks/benchmark_dattri.py Lines 273 to 283 in 82351ff bergson/benchmarks/kronfluence_benchmark.py Lines 475 to 504 in 82351ff bergson/benchmarks/run_full_benchmark.py Lines 402 to 410 in 82351ff Suggested fix: Convert to use simple_parsing with dataclasses, following benchmarks/benchmark_bergson.py pattern. Summary
The bugs should be fixed before merging. |
82351ff to
d194cbf
Compare
|
@claude great review, can you fix the problems you found? |
|
Claude finished @luciaquirke's task in 2m 17s —— View job Fixed identified issues ✓All critical bugs have been fixed and pushed to the Completed fixes
Not addressed (requires extensive refactoring)
Changes pushedCommit: All critical runtime bugs that would cause errors have been resolved. The CLAUDE.md style violations that remain are in benchmark-specific scripts and don't affect the core library functionality. |
|
@claude give converting all the dashes to underscores a go, I'll revert if necessary. |
|
Claude finished @luciaquirke's task in 5m 44s —— View job Conversion complete ✓Successfully converted all benchmark CLI arguments from dashes to underscores per CLAUDE.md requirements. Completed tasks
Changes madebenchmark_dattri.py
kronfluence_benchmark.py
run_full_benchmark.py
Shell scripts
Changes pushedCommit: All benchmark files now follow CLAUDE.md conventions with simple_parsing and underscored arguments. The conversion maintains backward compatibility by preserving all existing functionality while modernizing the CLI argument parsing approach. --- | Branch |
|
Sounds good, benchmarking seems pretty isolated so seems fine. |
|
also would love to see the benchmarking results! |
Yeah, we just had a user getting OOM and being confused because they were just ramping up with the library and using the documented commands + wanted preconditioners |
|
@claude extract out the library code (not in benchmarks or docs) into a standalone PR. |
|
Claude finished @luciaquirke's task in 1m 19s —— View job Extracted library code into standalone PR ✓Tasks
SummarySuccessfully extracted library improvements from this PR into a standalone PR: #126 Library Changes Included
Files Extracted (20 total)Modified:
New files:
New PR: #126 |
8592362 to
4fd4652
Compare
4fd4652 to
f321734
Compare
1722636 to
165e3ab
Compare
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
|
On further thought I'm going to leave this as a separate branch and extract out the plots for now so I have a convenient spot to keep data |
The benchmarking code is in a subpar state but I am going to merge it anyway because:
Related library changes (merged in previous PR):