-
Notifications
You must be signed in to change notification settings - Fork 8
Feat/ci #14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add GitHub Actions workflows for CI and Docker publishing - CI workflow runs tests with pytest and lints with ruff - Docker publish workflow builds on releases and manual triggers - Multi-architecture support (amd64/arm64) for Docker images - Organize Docker files in dedicated docker/ directory - Move all Docker-related files to docker/ folder - Add docker-compose.validator.yml for production deployment - Include Watchtower for automatic container updates - Configure ruff linter in pyproject.toml - Add ruff configuration with sensible defaults - Include in dev dependencies - Add comprehensive validator deployment documentation - Document Watchtower auto-update setup - Provide environment configuration examples - Include troubleshooting and best practices 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
- Remove GitHub, HF, and Axon configuration options - Update Watchtower interval to 30 seconds - Simplify environment variable documentation - Reflect actual docker-compose.validator.yml state
|
Caution Review failedThe pull request is closed. WalkthroughAdds CI and Docker publish workflows, Docker validator compose and deployment docs; updates docker paths in scripts; adds Ruff config; modernizes type hints across many modules; expands package exports; refactors monitoring manager and WandB backend; introduces an emission-burn config and test for MonitoringConfig. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant CLI as grail.cli (Typer)
participant Env as .env loader
participant MonCfg as MonitoringConfig
participant MonMgr as MonitoringManager
participant Backend as MonitoringBackend (WandB/Null)
User->>CLI: invoke command
CLI->>Env: load_dotenv(override=True)
Env-->>CLI: env vars (best-effort)
CLI->>MonCfg: from_environment()
MonCfg-->>CLI: config dict
CLI->>MonMgr: initialize(config)
alt monitoring enabled
MonMgr->>Backend: initialize(config)
Backend-->>MonMgr: ok / fallback
else disabled/failed
MonMgr->>MonMgr: install NullBackend
end
CLI-->>User: run subcommand
sequenceDiagram
autonumber
participant GH as GitHub (release/dispatch)
participant WF as docker-publish.yml
participant Test as job:test
participant Build as job:build-and-push
participant Meta as docker/metadata-action
participant Reg as ghcr.io
GH->>WF: trigger (release/workflow_dispatch or release)
alt skip_tests != 'true'
WF->>Test: run pytest via uv
Test-->>WF: success/fail
else skip
WF-->>WF: tests skipped
end
WF->>Build: proceed
Build->>Meta: extract tags (release or dispatch)
Build->>Reg: docker login
Build->>Reg: buildx build+push (amd64) with cache & provenance
Reg-->>Build: image digest & tags
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (6)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Changed --dev-dependencies to --dev to match current uv CLI
- Add proper exception chaining with 'from e' for B904 errors - Remove unused Optional import from grail/cli/__init__.py - All 269 ruff errors now fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/local-subnet-testing.md (1)
106-113: Make compose path consistent throughout the doc.Normalize all occurrences to docker/docker-compose.local-subnet.yml.
- docker compose -f docker-compose.local-subnet.yml logs -f miner-1 + docker compose -f docker/docker-compose.local-subnet.yml logs -f miner-1 - docker compose -f docker-compose.local-subnet.yml logs -f validator + docker compose -f docker/docker-compose.local-subnet.yml logs -f validator - docker compose -f docker-compose.local-subnet.yml logs -f alice + docker compose -f docker/docker-compose.local-subnet.yml logs -f alice - docker compose -f docker-compose.local-subnet.yml down + docker compose -f docker/docker-compose.local-subnet.yml down - docker compose -f docker-compose.local-subnet.yml down -v + docker compose -f docker/docker-compose.local-subnet.yml down -vAlso applies to: 177-183
🧹 Nitpick comments (17)
scripts/setup-local-subnet.sh (2)
73-75: Minor typo in user-facing text.-# No need to fund wallets - Alice already has 2funds +# No need to fund wallets — Alice already has funds
4-4: Harden script error handling for pipelines.Piped commands bypass set -e; add pipefail.
-set -e +set -euo pipefailpyproject.toml (2)
189-191: Ruff added only to dependency-groups; dev extra lacks it and flake8 versions diverge.Keep a single source of truth for “dev” deps to avoid drift.
[project.optional-dependencies] dev = [ - "pytest>=7.0.0", + "pytest>=7.0.0", "pytest-asyncio>=0.21.0", "black>=22.0.0", - "flake8>=5.0.0", + "flake8==7.3.0", + "ruff>=0.8.0", "mypy>=1.0.0", "pre-commit>=2.20.0", "bittensor-cli>=9.10.2", "zstandard>=0.21.0", ]Optionally drop the duplicate [dependency-groups].dev or keep both in sync.
11-14: Author name typo.- {name = "distriburedstatemachine"}, + {name = "distributedstatemachine"},.github/workflows/ci.yml (1)
1-58: Trim trailing spaces and add EOF newline to satisfy yamllint.Also consider a concurrency group to auto-cancel superseded runs on PRs:
name: CI +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: truedocker/.env.validator.example (1)
9-10: Optional ordering and newline per dotenv-linter.-WALLET_NAME=your_wallet_name -WALLET_HOTKEY=your_hotkey_name +WALLET_HOTKEY=your_hotkey_name +WALLET_NAME=your_wallet_nameAnd add a trailing newline at file end.
docs/VALIDATOR_DEPLOYMENT.md (2)
49-50: Preferdocker composeoverdocker-composefor consistency with other docs.-docker-compose -f docker/docker-compose.validator.yml --env-file docker/.env.validator up -d +docker compose -f docker/docker-compose.validator.yml --env-file docker/.env.validator up -d ... -docker-compose -f docker/docker-compose.validator.yml restart validator +docker compose -f docker/docker-compose.validator.yml restart validatorAlso applies to: 143-144
212-213: Wrap bare URLs in Markdown links (mdlint).-- GitHub Issues: https://github.com/tplr-ai/grail/issues -- Documentation: https://github.com/tplr-ai/grail/blob/main/SPEC.md +- GitHub Issues: [https://github.com/tplr-ai/grail/issues](https://github.com/tplr-ai/grail/issues) +- Documentation: [https://github.com/tplr-ai/grail/blob/main/SPEC.md](https://github.com/tplr-ai/grail/blob/main/SPEC.md).github/workflows/docker-publish.yml (1)
1-127: Solid multi-arch build/push with provenance; clean tag strategy.Minor: remove trailing spaces, add EOF newline to satisfy yamllint. Optionally add concurrency to avoid duplicate manual runs.
name: Docker Publish +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: truedocker/docker-compose.validator.yml (2)
55-60: Healthcheck is simplistic but acceptable. Consider a first-party health endpoint if available.If grail exposes a health command, prefer:
healthcheck: test: ["CMD", "uv", "run", "grail", "health"]
1-86: Trim trailing spaces and add EOF newline.docker/README.md (6)
25-27: Use modern Compose CLI (“docker compose”) and note the minimum version.The legacy
docker-composeis deprecated. Preferdocker composeand document the version requirement.-# Start validator with auto-updates -docker-compose -f docker/docker-compose.validator.yml --env-file docker/.env.validator up -d +# Start validator with auto-updates (Docker Engine 20.10+ and Compose v2) +docker compose -f docker/docker-compose.validator.yml --env-file docker/.env.validator up -d
12-13: Auto-updates with Watchtower: pin tags and scope updates.Unpinned tags can pull breaking changes. Recommend pinned image tags and selective Watchtower enablement via labels.
You can add to README:
- - **`docker-compose.validator.yml`** - Production deployment with Watchtower for automatic updates + - **`docker-compose.validator.yml`** - Production deployment; enable Watchtower for selective, pinned updates. + - Pin image tags (e.g., `grail:<major>.<minor>`), and use `com.centurylinklabs.watchtower.enable="true"` labels to opt-in containers.
16-20: Add prerequisites before Quick Start.Clarifies required tooling and optional GPU setup.
## Quick Start + +### Prerequisites +- Docker Engine 20.10+ and Docker Compose v2.x +- Optional: NVIDIA Container Toolkit if running with GPUs + ### Deploy a Validator
32-37: Add teardown/cleanup commands for local runs.Helps avoid orphaned volumes/networks.
docker-compose -f docker/docker-compose.local-subnet.yml up -d + +# Teardown and remove volumes when done +# (use 'docker compose' if on Compose v2) +docker-compose -f docker/docker-compose.local-subnet.yml down -v
47-48: Clarify build context and enable BuildKit for speed.Minor doc tweak for reliability/perf when building from repo root.
-docker build -f docker/Dockerfile -t grail:local . +DOCKER_BUILDKIT=1 docker build -f docker/Dockerfile -t grail:local . +# Run from the repo root so build context (.) includes the source tree.
7-9: Clarify image scope (miners vs validators).If miners and validators diverge, consider separate targets/images or multi-stage targets documented here to avoid bloated images.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
.github/workflows/ci.yml(1 hunks).github/workflows/docker-publish.yml(1 hunks)docker/.env.validator.example(1 hunks)docker/README.md(1 hunks)docker/docker-compose.validator.yml(1 hunks)docs/VALIDATOR_DEPLOYMENT.md(1 hunks)docs/local-subnet-testing.md(2 hunks)pyproject.toml(1 hunks)scripts/setup-local-subnet.sh(3 hunks)
🧰 Additional context used
🪛 dotenv-linter (3.3.0)
docker/.env.validator.example
[warning] 10-10: [UnorderedKey] The WALLET_HOTKEY key should go before the WALLET_NAME key
(UnorderedKey)
[warning] 22-22: [EndingBlankLine] No blank line at the end of the file
(EndingBlankLine)
🪛 YAMLlint (1.37.1)
.github/workflows/ci.yml
[error] 15-15: trailing spaces
(trailing-spaces)
[error] 21-21: trailing spaces
(trailing-spaces)
[error] 24-24: trailing spaces
(trailing-spaces)
[error] 27-27: trailing spaces
(trailing-spaces)
[error] 30-30: trailing spaces
(trailing-spaces)
[error] 42-42: trailing spaces
(trailing-spaces)
[error] 48-48: trailing spaces
(trailing-spaces)
[error] 51-51: trailing spaces
(trailing-spaces)
[error] 54-54: trailing spaces
(trailing-spaces)
[error] 58-58: no new line character at the end of file
(new-line-at-end-of-file)
docker/docker-compose.validator.yml
[error] 16-16: trailing spaces
(trailing-spaces)
[error] 21-21: trailing spaces
(trailing-spaces)
[error] 25-25: trailing spaces
(trailing-spaces)
[error] 29-29: trailing spaces
(trailing-spaces)
[error] 35-35: trailing spaces
(trailing-spaces)
[error] 41-41: trailing spaces
(trailing-spaces)
[error] 49-49: trailing spaces
(trailing-spaces)
[error] 52-52: trailing spaces
(trailing-spaces)
[error] 60-60: trailing spaces
(trailing-spaces)
[error] 86-86: no new line character at the end of file
(new-line-at-end-of-file)
.github/workflows/docker-publish.yml
[error] 33-33: trailing spaces
(trailing-spaces)
[error] 39-39: trailing spaces
(trailing-spaces)
[error] 42-42: trailing spaces
(trailing-spaces)
[error] 45-45: trailing spaces
(trailing-spaces)
[error] 59-59: trailing spaces
(trailing-spaces)
[error] 63-63: trailing spaces
(trailing-spaces)
[error] 70-70: trailing spaces
(trailing-spaces)
[error] 83-83: trailing spaces
(trailing-spaces)
[error] 94-94: trailing spaces
(trailing-spaces)
[error] 97-97: trailing spaces
(trailing-spaces)
[error] 114-114: trailing spaces
(trailing-spaces)
[error] 121-121: trailing spaces
(trailing-spaces)
[error] 127-127: no new line character at the end of file
(new-line-at-end-of-file)
🪛 markdownlint-cli2 (0.17.2)
docs/VALIDATOR_DEPLOYMENT.md
212-212: Bare URL used
(MD034, no-bare-urls)
213-213: Bare URL used
(MD034, no-bare-urls)
🔇 Additional comments (10)
scripts/setup-local-subnet.sh (2)
161-161: Compose path update looks good.
95-101: Registration success check is brittle (emoji/string match).btcli output may change; prefer exit code or a clearer grep token if available.
Do we have a stable success string or a --json flag to parse? If yes, I can patch this block accordingly.
pyproject.toml (1)
192-217: Ruff config LGTM and matches CI usage..github/workflows/ci.yml (2)
9-33: Lint + format checks are correct and align with pyproject.
34-58: Matrix tests look good; set PYTHONPATH is helpful.docker/.env.validator.example (1)
15-22: Sane defaults; clear REQUIRED/OPTIONAL split..github/workflows/docker-publish.yml (1)
71-114: Tag selection with fallback is fine; verify BUILD_DATE source on workflow_dispatch.On workflow_dispatch, github.event.release.* is undefined. Confirm BUILD_DATE fallback works, or set it explicitly via:
- BUILD_DATE=${{ github.event.release.created_at || github.event.repository.updated_at }} + BUILD_DATE=${{ github.event.release.created_at || github.event.repository.updated_at || github.run_id }}docker/docker-compose.validator.yml (2)
42-49: GPU reservation may be ignored by non‑Swarm docker compose.Compose often ignores deploy.resources.* unless in Swarm. Ensure GPUs are actually visible inside the container.
If not, switch to the compose-supported GPUs flag:
services: validator: + # Request one GPU in non‑Swarm compose + deploy: {} + gpus: 1 - deploy: - resources: - reservations: - devices: - - driver: nvidia - count: 1 - capabilities: [gpu]Or keep deploy.resources and add gpus: 1 for broad compatibility.
36-41: Mount points and persistence look right; wallet is read-only.docker/README.md (1)
52-54: Verified — linked docs and paths are correct.
File: docker/README.md (lines 52–54): docs/VALIDATOR_DEPLOYMENT.md, docs/local-subnet-testing.md and the referenced docker files exist at the stated paths; no broken links.
|
|
||
| ```bash | ||
| docker compose -f docker-compose.local-subnet.yml up -d alice bob s3 s3-setup | ||
| docker compose -f docker/docker-compose.local-subnet.yml up -d alice bob s3 s3-setup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Path fix applied in two places; remaining references still point to old location.
Several compose commands below still use docker-compose.local-subnet.yml without the docker/ prefix; this will confuse users.
Apply:
- docker compose -f docker-compose.local-subnet.yml logs -f
+ docker compose -f docker/docker-compose.local-subnet.yml logs -fAnd similarly update Lines 109–113, 177, 182 to use docker/docker-compose.local-subnet.yml.
Also applies to: 98-98
🤖 Prompt for AI Agents
In docs/local-subnet-testing.md around lines 46, 98, 109-113, 177 and 182 update
any docker compose commands that reference docker-compose.local-subnet.yml to
include the docker/ prefix (i.e. change occurrences of
"docker-compose.local-subnet.yml" to "docker/docker-compose.local-subnet.yml")
so all references point to the new path; scan the file for any remaining uses of
the old filename and replace them consistently.
| echo " All services: docker compose -f docker/docker-compose.local-subnet.yml logs -f" | ||
| echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml logs -f miner-1" | ||
| echo " Validator: docker compose -f docker/docker-compose.local-subnet.yml logs -f validator" | ||
| echo "" | ||
| echo "Rebuild and restart services (after code changes):" | ||
| echo " All GRAIL: docker compose -f docker-compose.local-subnet.yml up -d --build miner-1 miner-2 validator" | ||
| echo " Miner 1 only: docker compose -f docker-compose.local-subnet.yml up -d --build miner-1" | ||
| echo " Validator: docker compose -f docker-compose.local-subnet.yml up -d --build validator" | ||
| echo " All GRAIL: docker compose -f docker/docker-compose.local-subnet.yml up -d --build miner-1 miner-2 validator" | ||
| echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml up -d --build miner-1" | ||
| echo " Validator: docker compose -f docker/docker-compose.local-subnet.yml up -d --build validator" | ||
| echo ""docker compose -nf docker-compose.local-subnet.yml logs -f miner-1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One malformed echo and stray flags; also one old compose path.
Line 187 has a broken quote and -nf; should be -f with the new docker/ path.
-echo ""docker compose -nf docker-compose.local-subnet.yml logs -f miner-1
+echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml logs -f miner-1"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| echo " All services: docker compose -f docker/docker-compose.local-subnet.yml logs -f" | |
| echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml logs -f miner-1" | |
| echo " Validator: docker compose -f docker/docker-compose.local-subnet.yml logs -f validator" | |
| echo "" | |
| echo "Rebuild and restart services (after code changes):" | |
| echo " All GRAIL: docker compose -f docker-compose.local-subnet.yml up -d --build miner-1 miner-2 validator" | |
| echo " Miner 1 only: docker compose -f docker-compose.local-subnet.yml up -d --build miner-1" | |
| echo " Validator: docker compose -f docker-compose.local-subnet.yml up -d --build validator" | |
| echo " All GRAIL: docker compose -f docker/docker-compose.local-subnet.yml up -d --build miner-1 miner-2 validator" | |
| echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml up -d --build miner-1" | |
| echo " Validator: docker compose -f docker/docker-compose.local-subnet.yml up -d --build validator" | |
| echo ""docker compose -nf docker-compose.local-subnet.yml logs -f miner-1 | |
| echo " All services: docker compose -f docker/docker-compose.local-subnet.yml logs -f" | |
| echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml logs -f miner-1" | |
| echo " Validator: docker compose -f docker/docker-compose.local-subnet.yml logs -f validator" | |
| echo "" | |
| echo "Rebuild and restart services (after code changes):" | |
| echo " All GRAIL: docker compose -f docker/docker-compose.local-subnet.yml up -d --build miner-1 miner-2 validator" | |
| echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml up -d --build miner-1" | |
| echo " Validator: docker compose -f docker/docker-compose.local-subnet.yml up -d --build validator" | |
| echo " Miner 1 only: docker compose -f docker/docker-compose.local-subnet.yml logs -f miner-1" |
🤖 Prompt for AI Agents
In scripts/setup-local-subnet.sh around lines 179 to 187, there is a malformed
echo on line 187 containing a stray quote and the wrong flag "-nf" and an
outdated compose path; replace that broken echo with a corrected echo that uses
the -f flag and the current docker/docker-compose.local-subnet.yml path and
remove the stray quote so the line consistently prints the "Miner 1 only" logs
command using the new path and proper -f flag.
Changed 'uv sync --dev' to 'uv sync --extra dev' to properly install pytest and other dev dependencies from project.optional-dependencies
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…r docker config - Added all required R2/S3 storage credentials to docker-compose.validator.yml - Fixed environment variable naming (BT_WALLET_COLD/HOT, BT_NETWORK) - Added model configuration variables - Enhanced monitoring configuration with all WandB options - Updated .env.validator.example with required R2 credentials - Updated documentation to reflect all required variables Validators now have proper configuration parity with integration setup
- Remove linux/arm64 platform from docker build - Add QEMU setup step (though only building amd64 now) - Fixes buildx QEMU emulator error in CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
tests/test_basics.py (1)
19-26: Test has a logic inconsistencyThe test expects
is_monitoring_enabled()to returnFalseon line 21, but then expectsbackend_typeto be"null"on line 24. However, when monitoring is disabled, the mode could still be any of the valid options. The test should either:
- Check that monitoring is disabled OR backend_type is "null"
- Verify the consistency between the two conditions
Apply this fix to make the test assertions consistent:
def test_monitoring_config_disabled_env() -> None: # With defaults from conftest, monitoring should be effectively disabled - assert MonitoringConfig.is_monitoring_enabled() is False cfg = MonitoringConfig.from_environment() - # Ensure the parsed configuration reflects disabled/NULL backend - assert cfg["backend_type"] == "null" - assert cfg["mode"] in {"disabled", "online", "offline"} + # Either monitoring is disabled OR using null backend + is_disabled = not MonitoringConfig.is_monitoring_enabled() + is_null = cfg["backend_type"] == "null" + assert is_disabled or is_null, "Monitoring should be disabled or using null backend" + assert cfg["mode"] in {"disabled", "online", "offline"}grail/grail.py (1)
123-126: Potential issue with dynamic cache attribute initializationThe function is adding a cache attribute to itself dynamically, but the docstring appears after this initialization. This is unconventional and could cause issues with documentation tools or linters. The cache initialization should be moved after the docstring or implemented differently.
Consider moving the cache initialization after the docstring or using a module-level cache:
def r_vec_from_randomness(rand_hex: str, d_model: int) -> torch.Tensor: - # Add cache attribute to function - if not hasattr(r_vec_from_randomness, "_cache"): - # Initialize a simple dict cache; attribute added dynamically - r_vec_from_randomness._cache = {} # type: ignore[attr-defined] """ Generate random projection vector from drand randomness. Takes drand randomness (32 bytes hex) and expands it deterministically into a d_model-dimensional vector using a PRF. This ensures everyone with the same drand value generates the same projection vector. Args: rand_hex: Hex string of drand randomness (typically from drand beacon) d_model: Model hidden dimension size Returns: Random projection vector of shape (d_model,) with int32 values Raises: ValueError: If rand_hex is invalid or d_model is invalid Note: Uses big-endian byte order for cross-platform consistency """ + # Add cache attribute to function + if not hasattr(r_vec_from_randomness, "_cache"): + # Initialize a simple dict cache; attribute added dynamically + r_vec_from_randomness._cache = {} # type: ignore[attr-defined]grail/monitoring/manager.py (1)
74-84: Potential race condition in async component initializationThe
_ensure_async_componentsmethod checks and creates the shutdown event and flush task, but there's a potential race condition if this method is called concurrently from multiple coroutines. Consider using a lock to ensure thread-safe initialization.Apply this diff to add thread-safe initialization:
def __init__(self, backend: MonitoringBackend | None = None): """Initialize the monitoring manager. Args: backend: The monitoring backend to use. If None, uses NullBackend. """ self.backend = backend or NullBackend() self._metric_buffer: list[MetricData] = [] self._buffer_size = 100 self._flush_interval = 30.0 # seconds self._flush_task: asyncio.Task | None = None self._shutdown_event: asyncio.Event | None = None self._initialized = False self._config: dict[str, Any] = {} self._current_block: int | None = None self._current_window: int | None = None self._start_time: float | None = None + self._init_lock = asyncio.Lock() -def _ensure_async_components(self) -> None: - """Ensure async components are initialized (lazy initialization).""" - if self._shutdown_event is None: - self._shutdown_event = asyncio.Event() - if self._flush_task is None or self._flush_task.done(): - self._flush_task = asyncio.create_task(self._periodic_flush()) +async def _ensure_async_components(self) -> None: + """Ensure async components are initialized (lazy initialization).""" + async with self._init_lock: + if self._shutdown_event is None: + self._shutdown_event = asyncio.Event() + if self._flush_task is None or self._flush_task.done(): + self._flush_task = asyncio.create_task(self._periodic_flush())Then update the call site at line 130:
- self._ensure_async_components() + await self._ensure_async_components()chain.py (4)
136-141: Commit format must be 128 chars; enforce pad/truncate before on-chain write.Current concat may produce wrong length and fail validation.
- concatenated = bucket.account_id + bucket.access_key_id + bucket.secret_access_key + # 32 account_id + 32 access_key_id + 64 secret_access_key + acc = (bucket.account_id or "")[:32].ljust(32) + key = (bucket.access_key_id or "")[:32].ljust(32) + sec = (bucket.secret_access_key or "")[:64].ljust(64) + concatenated = acc + key + sec + if len(concatenated) != 128: + raise ValueError("Commitment must be exactly 128 characters.")
162-175: Redact secrets in logs.Both comparison and commit logs expose access keys; mask or hash instead.
- logger.debug( - "Comparing current commitment to bucket details from the environment:\n" - f"Commitment: {commitment_str}\n" - f"Current: {bucket_details_from_env}" - ) + def _mask(s: str) -> str: + return (s[:4] + "…" + s[-4:]) if s else "" + logger.debug( + "Comparing current commitment to bucket details from the environment:\n" + f"Commitment: {_mask(commitment_str)} (len={len(commitment_str)})\n" + f"Current: {_mask(bucket_details_from_env)} (len={len(bucket_details_from_env)})" + )
226-235: Do not log full on-chain commitment; keep exception chaining.Good use of “raise … from e”, but the info log leaks secrets. Downgrade and redact.
- concatenated = self.subtensor.get_commitment(self.netuid, uid) - logger.info(f"Commitment fetched: {concatenated}") + concatenated = self.subtensor.get_commitment(self.netuid, uid) + logger.debug("Commitment fetched (len=%d)", len(concatenated))
304-309: Return empty dict on error to match annotation and callers.Currently returns None; keep type stable.
- except Exception: + except Exception: self.subtensor.substrate.close() self.subtensor.substrate.initialize() - return + return {}grail/cli/train.py (1)
293-296: grad_norm may be undefined; fix and correct avg_loss denominator.If training is skipped early (health check), grad_norm is never set; avg_loss divisor is also off by +1.
@@ - for epoch in range(2): # Two epochs for better learning + for epoch in range(2): # Two epochs for better learning total_loss = 0 batch_size = min(4, len(texts)) # Small batch size + last_grad_norm = 0.0 @@ - grad_norm = torch.nn.utils.clip_grad_norm_( + grad_norm = torch.nn.utils.clip_grad_norm_( self.model.parameters(), max_norm=0.5 ) # Reduced from 1.0 + last_grad_norm = float(grad_norm) @@ - avg_loss = total_loss / (len(texts) // batch_size + 1) + num_batches = (len(sorted_indices) + batch_size - 1) // batch_size + avg_loss = total_loss / max(1, num_batches) logger.info(f"Epoch {epoch + 1} completed - avg loss: {avg_loss:.4f}") @@ if monitor: await monitor.log_gauge("training.epoch_loss", avg_loss) await monitor.log_counter("training.epochs_completed") - await monitor.log_gauge("training.gradient_norm", float(grad_norm)) + await monitor.log_gauge("training.gradient_norm", float(last_grad_norm))Also applies to: 349-357, 383-392
🧹 Nitpick comments (18)
.github/workflows/docker-publish.yml (3)
33-33: Clean up trailing whitespace in YAML file.YAMLlint detected trailing spaces on multiple lines. While not critical, removing them improves consistency.
Apply this diff to remove trailing spaces:
- - uses: actions/checkout@v4 - + - uses: actions/checkout@v4 + - - - name: Set up Python + + - name: Set up PythonAnd similar changes for lines 39, 42, 45, 59, 63, 70, 83, 94, 97, 100, 117, 124.
Also applies to: 39-39, 42-42, 45-45, 59-59, 63-63, 70-70, 83-83, 94-94, 97-97, 100-100, 117-117, 124-124, 130-130
130-130: Add newline at end of file.YAMLlint recommends adding a newline character at the end of the file.
- echo "Digest: ${{ steps.push.outputs.digest }}" + echo "Digest: ${{ steps.push.outputs.digest }}" +
107-107: Consider adding support for ARM architecture.The workflow currently builds only for
linux/amd64. Consider addinglinux/arm64to support ARM-based deployments, which are becoming increasingly common.- platforms: linux/amd64 + platforms: linux/amd64,linux/arm64docker/docker-compose.validator.yml (3)
16-16: Remove trailing spaces from YAML file.YAMLlint identified trailing spaces that should be removed for consistency.
Apply similar cleanup as suggested for the GitHub workflow file - remove trailing spaces from the indicated lines.
Also applies to: 21-21, 31-31, 41-41, 46-46, 52-52, 58-58, 64-64, 72-72, 75-75, 83-83
109-109: Add newline at end of file.- driver: local + driver: local +
94-94: Consider increasing Watchtower interval for production.The current 30-second interval for checking updates is quite aggressive and may cause unnecessary load. Consider increasing this to 300 seconds (5 minutes) or more for production deployments.
- command: --interval 30 --cleanup --label-enable + command: --interval 300 --cleanup --label-enabledocs/VALIDATOR_DEPLOYMENT.md (2)
230-231: Format bare URLs as proper markdown links.Markdownlint detected bare URLs that should be formatted as proper links.
-- GitHub Issues: https://github.com/tplr-ai/grail/issues -- Documentation: https://github.com/tplr-ai/grail/blob/main/SPEC.md +- GitHub Issues: [https://github.com/tplr-ai/grail/issues](https://github.com/tplr-ai/grail/issues) +- Documentation: [https://github.com/tplr-ai/grail/blob/main/SPEC.md](https://github.com/tplr-ai/grail/blob/main/SPEC.md)
94-96: Document the aggressive update interval implications.The 30-second update check interval is very aggressive. Consider documenting the potential implications (increased network traffic, API rate limits) and suggesting production-appropriate values.
### Update Schedule - **Current Setting**: Checks every 30 seconds for rapid updates - This aggressive polling ensures validators stay up-to-date with the latest releases +- **Note**: For production deployments, consider increasing the interval to reduce load: + - Development/Testing: 30 seconds + - Staging: 5 minutes (300 seconds) + - Production: 30 minutes (1800 seconds) - The image is public on GitHub Container Registry, no authentication neededgrail/cli/__init__.py (1)
113-115: Redundant monitoring checkIn silent mode (verbosity == 0), the code checks
MonitoringConfig.is_monitoring_enabled()again on line 114, even though it was already checked on line 100. This seems redundant.Consider simplifying the logic:
elif verbosity == 0: # Silent mode - but still allow monitoring if explicitly enabled - if not MonitoringConfig.is_monitoring_enabled(): - config["backend_type"] = "null" + # Force null backend in silent mode unless explicitly configured + config["backend_type"] = "null"grail/infrastructure/comms.py (1)
76-77: Unused variable assignments can be removedLines 77 and 83 contain standalone expressions that appear to be extracting the bucket name but not using the result. These were likely left over from refactoring.
Remove these unused expressions:
- credentials.name.strip() elif isinstance(credentials, dict): # Handle dict format (from chain commitments or legacy) account_id = credentials.get("account_id", "").strip() access_key = credentials.get("access_key_id", "").strip() secret_key = credentials.get("secret_access_key", "").strip() - credentials.get("name", credentials.get("bucket_name", "")).strip()Also applies to: 82-83
grail/mining/rollout_generator.py (1)
266-266: Import placement could be improvedThe import statement for GRAIL functions is placed inside the method at runtime. While this works, it's generally better practice to have imports at the module level unless there's a specific reason for lazy loading (e.g., circular dependencies).
Consider moving this import to the top of the file with other imports:
import bittensor as bt import torch +from ..grail import dot_mod_q, r_vec_from_randomness, sign_s_vals from ..shared.constants import MAX_NEW_TOKENS from ..shared.hf_compat import resolve_hidden_size ... # Generate GRAIL proof components (using existing grail.py logic) - from ..grail import dot_mod_q, r_vec_from_randomness, sign_s_valsgrail/monitoring/manager.py (1)
202-202: Minor issue: Inconsistent return type annotationThe
timermethod's return type is annotated asAny, but it should be a context manager type. Consider usingGenerator[None, None, None]fromcollections.abcfor consistency with the base class.+from collections.abc import Generator @contextmanager -def timer(self, name: str, tags: dict[str, str] | None = None) -> Any: +def timer(self, name: str, tags: dict[str, str] | None = None) -> Generator[None, None, None]:grail/shared/hf_compat.py (2)
31-37: Tighten attribute access (simplify and avoid hasattr+try).Use getattr with a default to drop hasattr plus surrounding try.
- text_cfg = getattr(cfg, "text_config", None) - if text_cfg is not None and hasattr(text_cfg, "hidden_size"): - val = text_cfg.hidden_size + text_cfg = getattr(cfg, "text_config", None) + if text_cfg is not None: + val = getattr(text_cfg, "hidden_size", None) if isinstance(val, int) and val > 0: return val
74-96: Consider rope-scaled context and tokenizer fallback.Optional: also inspect config.rope_scaling.max_position_embeddings (if dict-like) and, if available in the calling context, tokenizer.model_max_length before defaulting to 16384.
chain.py (2)
207-214: Docstring order vs implementation.Docs say “bucket name, account ID, …” but code uses account_id as both name and id. Clarify to “account_id (also used as name), access_key_id, secret_access_key”.
- **Note:** The order of fields (bucket name, account ID, access key ID, + **Note:** The order of fields (account ID (also used as name), access key ID,
377-385: Comment threshold mismatch.Code filters stake <= 20000 but comment says 1000. Align the comment.
- # for peers still active with stake <= 1000. + # for peers still active with stake <= 20000..github/workflows/ci.yml (1)
14-33: Trim trailing spaces and add newline at EOF.YAMLlint errors; also add minimal hardening.
name: CI on: push: branches: [ '**' ] pull_request: branches: [ '**' ] jobs: lint: name: Lint with Ruff runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - + - name: Install uv uses: astral-sh/setup-uv@v4 with: enable-cache: true cache-dependency-glob: "uv.lock" - + - name: Set up Python run: uv python install 3.11 - + - name: Install dependencies run: uv sync --extra dev - + - name: Run Ruff linter run: uv run ruff check . - + - name: Run Ruff formatter check run: uv run ruff format --check . test: name: Run Tests runs-on: ubuntu-latest strategy: matrix: python-version: ["3.9", "3.10", "3.11"] steps: - uses: actions/checkout@v4 - + - name: Install uv uses: astral-sh/setup-uv@v4 with: enable-cache: true cache-dependency-glob: "uv.lock" - + - name: Set up Python ${{ matrix.python-version }} run: uv python install ${{ matrix.python-version }} - + - name: Install dependencies run: uv sync --extra dev - + - name: Run tests with pytest run: uv run pytest tests/ -v --tb=short env: PYTHONPATH: ${{ github.workspace }} +``` +```diff +--- a/.github/workflows/ci.yml ++++ b/.github/workflows/ci.yml +@@ + name: CI + +permissions: + + contents: read + +concurrency: + + group: ci-${{ github.ref }} + + cancel-in-progress: trueIf desired, I can push a follow-up PR to run yamllint in CI to catch these automatically.
Also applies to: 42-58
docker/.env.validator.example (1)
8-21: Reorder keys for readability and dotenv linters; add EOF newline.No functional change; improves DX.
-# Bittensor Wallet Configuration -WALLET_NAME=your_wallet_name -WALLET_HOTKEY=your_hotkey_name +# Bittensor Wallet Configuration +WALLET_HOTKEY=your_hotkey_name +WALLET_NAME=your_wallet_name @@ -# For Cloudflare R2: -R2_ENDPOINT_URL=https://your-account-id.r2.cloudflarestorage.com -R2_BUCKET_ID=your-bucket-name -R2_ACCOUNT_ID=your-account-id -R2_WRITE_ACCESS_KEY_ID=your-write-access-key -R2_WRITE_SECRET_ACCESS_KEY=your-write-secret-key -R2_READ_ACCESS_KEY_ID=your-read-access-key -R2_READ_SECRET_ACCESS_KEY=your-read-secret-key +# For Cloudflare R2: +R2_ACCOUNT_ID=your-account-id +R2_BUCKET_ID=your-bucket-name +R2_ENDPOINT_URL=https://your-account-id.r2.cloudflarestorage.com +R2_READ_ACCESS_KEY_ID=your-read-access-key +R2_READ_SECRET_ACCESS_KEY=your-read-secret-key +R2_WRITE_ACCESS_KEY_ID=your-write-access-key +R2_WRITE_SECRET_ACCESS_KEY=your-write-secret-keyAlso add a trailing newline at the end of the file.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (37)
.github/workflows/ci.yml(1 hunks).github/workflows/docker-publish.yml(1 hunks)chain.py(3 hunks)docker/.env.validator.example(1 hunks)docker/README.md(1 hunks)docker/docker-compose.validator.yml(1 hunks)docs/VALIDATOR_DEPLOYMENT.md(1 hunks)experiments/run_all.py(0 hunks)grail/__init__.py(1 hunks)grail/cli/__init__.py(3 hunks)grail/cli/mine.py(12 hunks)grail/cli/train.py(6 hunks)grail/cli/validate.py(22 hunks)grail/drand.py(11 hunks)grail/environments/__init__.py(1 hunks)grail/environments/base.py(4 hunks)grail/environments/sat.py(13 hunks)grail/grail.py(8 hunks)grail/infrastructure/chain.py(4 hunks)grail/infrastructure/comms.py(13 hunks)grail/infrastructure/credentials.py(1 hunks)grail/infrastructure/drand.py(3 hunks)grail/infrastructure/network.py(1 hunks)grail/mining/rollout_generator.py(11 hunks)grail/monitoring/__init__.py(1 hunks)grail/monitoring/backends/__init__.py(1 hunks)grail/monitoring/backends/null_backend.py(5 hunks)grail/monitoring/backends/wandb_backend.py(10 hunks)grail/monitoring/base.py(6 hunks)grail/monitoring/config.py(7 hunks)grail/monitoring/manager.py(9 hunks)grail/shared/hf_compat.py(3 hunks)grail/shared/schemas.py(1 hunks)scripts/run_integration_experiment.py(10 hunks)scripts/test-s3-credentials.py(6 hunks)tests/conftest.py(1 hunks)tests/test_basics.py(1 hunks)
💤 Files with no reviewable changes (1)
- experiments/run_all.py
✅ Files skipped from review due to trivial changes (3)
- grail/shared/schemas.py
- grail/infrastructure/credentials.py
- tests/conftest.py
🚧 Files skipped from review as they are similar to previous changes (1)
- docker/README.md
🧰 Additional context used
🧬 Code graph analysis (17)
grail/monitoring/__init__.py (2)
grail/monitoring/base.py (3)
MetricData(29-43)MetricType(19-25)MonitoringBackend(46-165)grail/monitoring/config.py (1)
MonitoringConfig(14-236)
grail/monitoring/backends/__init__.py (1)
grail/monitoring/backends/wandb_backend.py (1)
WandBBackend(24-473)
grail/environments/__init__.py (1)
grail/environments/sat.py (3)
SATProblem(24-60)SATRolloutGenerator(337-523)create_sat_reward_vector(281-334)
grail/infrastructure/chain.py (1)
grail/shared/schemas.py (2)
Bucket(63-88)BucketCredentials(6-60)
grail/monitoring/base.py (3)
grail/monitoring/backends/null_backend.py (4)
initialize(26-32)log_metrics(42-48)timer(51-61)start_run(73-83)grail/monitoring/backends/wandb_backend.py (4)
initialize(43-81)log_metrics(180-218)timer(324-346)start_run(427-445)grail/monitoring/manager.py (3)
initialize(49-72)timer(202-218)start_run(268-285)
grail/infrastructure/drand.py (1)
grail/drand.py (1)
get_current_chain(249-263)
grail/monitoring/backends/wandb_backend.py (2)
grail/monitoring/base.py (7)
MetricData(29-43)MetricType(19-25)MonitoringBackend(46-165)initialize(55-70)log_metrics(86-96)timer(100-114)start_run(128-138)grail/monitoring/manager.py (3)
initialize(49-72)timer(202-218)start_run(268-285)
grail/monitoring/manager.py (3)
grail/monitoring/backends/null_backend.py (4)
NullBackend(18-103)initialize(26-32)timer(51-61)start_run(73-83)grail/monitoring/backends/wandb_backend.py (4)
WandBBackend(24-473)initialize(43-81)timer(324-346)start_run(427-445)grail/monitoring/base.py (6)
MetricData(29-43)MetricType(19-25)MonitoringBackend(46-165)initialize(55-70)timer(100-114)start_run(128-138)
grail/monitoring/backends/null_backend.py (3)
grail/monitoring/base.py (6)
MetricData(29-43)MonitoringBackend(46-165)initialize(55-70)log_metrics(86-96)timer(100-114)start_run(128-138)grail/monitoring/backends/wandb_backend.py (4)
initialize(43-81)log_metrics(180-218)timer(324-346)start_run(427-445)grail/monitoring/manager.py (3)
initialize(49-72)timer(202-218)start_run(268-285)
grail/mining/rollout_generator.py (2)
grail/grail.py (3)
dot_mod_q(293-303)r_vec_from_randomness(122-214)sign_s_vals(306-327)grail/environments/sat.py (2)
create_prompt(381-395)get_final_info(487-518)
tests/test_basics.py (1)
grail/monitoring/config.py (1)
MonitoringConfig(14-236)
grail/drand.py (1)
grail/infrastructure/drand.py (3)
get_current_chain(82-84)get_drand_beacon(87-137)get_mock_beacon(140-146)
grail/cli/__init__.py (3)
grail/cli/mine.py (1)
register(751-752)grail/cli/train.py (1)
register(419-420)grail/cli/validate.py (1)
register(232-233)
grail/environments/sat.py (2)
grail/mining/rollout_generator.py (3)
RolloutGenerator(102-379)reset_environment(334-336)parse_action(344-346)grail/environments/base.py (3)
Parser(7-22)RewardVector(25-129)parse(11-22)
grail/grail.py (2)
grail/environments/sat.py (1)
generate_sat_problem(63-93)grail/monitoring/manager.py (1)
get_monitoring_manager(338-347)
grail/cli/validate.py (3)
grail/cli/mine.py (2)
parse_filename(82-94)parse_window_filename(97-109)grail/cli/train.py (2)
parse_filename(65-77)parse_window_filename(80-92)grail/infrastructure/chain.py (1)
GrailChainManager(17-298)
grail/__init__.py (4)
grail/environments/sat.py (2)
create_sat_reward_vector(281-334)generate_sat_problem(63-93)grail/infrastructure/comms.py (10)
download_file_chunked(371-483)download_from_huggingface(996-1051)file_exists(525-547)get_file(568-586)get_valid_rollouts(754-785)list_bucket_files(550-565)login_huggingface(793-820)sink_window_inferences(594-622)upload_file_chunked(204-297)upload_to_huggingface(823-993)grail/infrastructure/drand.py (2)
get_drand_beacon(87-137)get_round_at_time(170-173)grail/mining/rollout_generator.py (1)
RolloutGenerator(102-379)
🪛 YAMLlint (1.37.1)
.github/workflows/docker-publish.yml
[error] 33-33: trailing spaces
(trailing-spaces)
[error] 39-39: trailing spaces
(trailing-spaces)
[error] 42-42: trailing spaces
(trailing-spaces)
[error] 45-45: trailing spaces
(trailing-spaces)
[error] 59-59: trailing spaces
(trailing-spaces)
[error] 63-63: trailing spaces
(trailing-spaces)
[error] 70-70: trailing spaces
(trailing-spaces)
[error] 83-83: trailing spaces
(trailing-spaces)
[error] 94-94: trailing spaces
(trailing-spaces)
[error] 97-97: trailing spaces
(trailing-spaces)
[error] 100-100: trailing spaces
(trailing-spaces)
[error] 117-117: trailing spaces
(trailing-spaces)
[error] 124-124: trailing spaces
(trailing-spaces)
[error] 130-130: no new line character at the end of file
(new-line-at-end-of-file)
docker/docker-compose.validator.yml
[error] 16-16: trailing spaces
(trailing-spaces)
[error] 21-21: trailing spaces
(trailing-spaces)
[error] 31-31: trailing spaces
(trailing-spaces)
[error] 41-41: trailing spaces
(trailing-spaces)
[error] 46-46: trailing spaces
(trailing-spaces)
[error] 52-52: trailing spaces
(trailing-spaces)
[error] 58-58: trailing spaces
(trailing-spaces)
[error] 64-64: trailing spaces
(trailing-spaces)
[error] 72-72: trailing spaces
(trailing-spaces)
[error] 75-75: trailing spaces
(trailing-spaces)
[error] 83-83: trailing spaces
(trailing-spaces)
[error] 109-109: no new line character at the end of file
(new-line-at-end-of-file)
.github/workflows/ci.yml
[error] 15-15: trailing spaces
(trailing-spaces)
[error] 21-21: trailing spaces
(trailing-spaces)
[error] 24-24: trailing spaces
(trailing-spaces)
[error] 27-27: trailing spaces
(trailing-spaces)
[error] 30-30: trailing spaces
(trailing-spaces)
[error] 42-42: trailing spaces
(trailing-spaces)
[error] 48-48: trailing spaces
(trailing-spaces)
[error] 51-51: trailing spaces
(trailing-spaces)
[error] 54-54: trailing spaces
(trailing-spaces)
[error] 58-58: no new line character at the end of file
(new-line-at-end-of-file)
🪛 dotenv-linter (3.3.0)
docker/.env.validator.example
[warning] 10-10: [UnorderedKey] The WALLET_HOTKEY key should go before the WALLET_NAME key
(UnorderedKey)
[warning] 15-15: [UnorderedKey] The R2_BUCKET_ID key should go before the R2_ENDPOINT_URL key
(UnorderedKey)
[warning] 16-16: [UnorderedKey] The R2_ACCOUNT_ID key should go before the R2_BUCKET_ID key
(UnorderedKey)
[warning] 19-19: [UnorderedKey] The R2_READ_ACCESS_KEY_ID key should go before the R2_WRITE_ACCESS_KEY_ID key
(UnorderedKey)
[warning] 20-20: [UnorderedKey] The R2_READ_SECRET_ACCESS_KEY key should go before the R2_WRITE_ACCESS_KEY_ID key
(UnorderedKey)
[warning] 53-53: [EndingBlankLine] No blank line at the end of the file
(EndingBlankLine)
🪛 markdownlint-cli2 (0.17.2)
docs/VALIDATOR_DEPLOYMENT.md
230-230: Bare URL used
(MD034, no-bare-urls)
231-231: Bare URL used
(MD034, no-bare-urls)
🔇 Additional comments (116)
scripts/test-s3-credentials.py (2)
5-5: Formatting improvement with blank lineAdding a blank line after the imports section improves code readability and follows Python style conventions.
21-21: Code formatting improvements maintain functionalityThe formatting changes (single-line environment variable access, consolidated print statements, and keyword argument formatting) improve readability while maintaining identical functionality. These changes align with the broader code hygiene improvements in this PR.
Also applies to: 37-37, 53-53, 61-61, 76-76, 79-79, 104-104
grail/environments/base.py (3)
4-4: LGTM! Type hint modernization is consistent.The update from
ListandTupleto built-in generics (list,tuple) aligns with Python 3.9+ capabilities and the broader PR modernization effort.
30-33: Type hints correctly updated to built-in generics.The parameter type annotations properly use
list[Callable[[Any, Any], float]],list[float], andOptional[list[tuple[float, float]]]following PEP 585.
77-77: Return type annotations consistently modernized.Both
compute_individual_rewardsandreward_boundsmethods now use built-in generics for their return types (list[float]andtuple[float, float]respectively).Also applies to: 104-104
.github/workflows/docker-publish.yml (1)
30-30: Verify the condition logic for skipping tests.The condition
if: github.event.inputs.skip_tests != 'true'works correctly for workflow dispatch events. However, for release events wheregithub.event.inputsis undefined, this will evaluate totrueand run tests (which is the desired behavior).grail/monitoring/config.py (1)
23-23: Type annotations successfully modernized to built-in generics.All method signatures have been updated to use
dict[str, Any]andstr | Noneconsistently, following PEP 585 and PEP 604 standards.Also applies to: 56-56, 87-87, 118-118, 171-171, 193-193
scripts/run_integration_experiment.py (4)
25-34: Import organization improved.The imports are now better organized with standard library imports (
argparse,logging,subprocess,sys) properly grouped.
48-56: Type annotations modernized to built-in generics.Instance variables and method return types correctly updated to use
dict,list, and union syntax (str | None).Also applies to: 62-62
270-272: Parameter type hints properly updated.The
runmethod signature now uses built-in generics:list[str]forminer_modelsandOptional[list[str]]forminer_hotkeys.
402-402: Help text improvements are clearer.The updated help strings for
--hotkeysand--validator-delayare more concise and properly formatted.Also applies to: 410-410
grail/drand.py (3)
12-20: Import cleanup and type modernization complete.Successfully removed unused typing imports (
Dict,Optional,Tuple) and retained onlyAnyas needed. Addedfrom __future__ import annotationsfor forward compatibility.
55-55: Module-level type annotations updated consistently.Global variables and cache now use built-in generics:
dict[str, dict[str, Any]]forDRAND_CHAINS,str | Nonefor chain parameters, and proper dict typing for the cache.Also applies to: 75-77, 80-80
95-95: All function signatures modernized to built-in generics.Public and private function signatures consistently updated to use
dict[str, Any],tuple[...], andint | Nonethroughout the module.Also applies to: 101-103, 130-130, 156-156, 249-249, 266-266, 320-320, 345-345, 382-382
docker/docker-compose.validator.yml (1)
23-29: R2 credential validation present — no action requiredValidator loads/checks dual and single R2 credential sets and logs an explicit, actionable error when missing. See grail/infrastructure/credentials.py (checks around lines 55–89; error handling/log message around lines 111–114).
grail/cli/mine.py (3)
15-15: Type imports properly modernized.Successfully removed
ListandTuplefrom typing imports, keeping onlyAnyandOptionalas needed.
84-84: Function signatures consistently updated to built-in generics.All function return types and parameter types now use
tuple[...]andlist[...]instead of the typing module equivalents.Also applies to: 99-99, 252-252, 288-288, 363-363, 370-370, 489-489, 534-534, 558-558, 271-271
335-335: Log message formatting improvements.The updated log messages with improved formatting (using parentheses for block ranges and consistent structure) enhance readability.
Also applies to: 600-600, 658-658
grail/monitoring/backends/__init__.py (1)
9-10: Import order change is reasonableThe reordering of imports (moving
WandBBackendafterNullBackend) doesn't affect functionality and maintains consistency with the__all__declaration order.tests/test_basics.py (1)
3-8: Verify import path and organizationThe new import of
MonitoringConfigand the reorganization of constant imports look good. The placement ofWINDOW_LENGTHwithin the constants import is consistent.grail/monitoring/__init__.py (1)
8-9: Good modernization of importsThe reordering of base imports and addition of
MonitoringConfigto the public API improves the module interface. This makesMonitoringConfigproperly accessible fromgrail.monitoring, which is needed by the new test.grail/environments/__init__.py (1)
7-14: Proper consolidation of SAT-related exportsThe addition of
SATProblem,SATRolloutGenerator, andcreate_sat_reward_vectorto the imports properly exposes these components through the environments module, making the API more complete and consistent.grail/infrastructure/drand.py (2)
4-6: Appropriate type hint modernizationThe typing imports are correctly updated to use modern Python typing with built-in generics. The addition of
osimport is necessary for theos.urandom()call on line 144.
20-20: Consistent use of built-in genericsThe changes from
Dict[str, ChainConfig]todict[str, ChainConfig]and similar for the return type ofget_current_chain()follow Python 3.9+ typing conventions consistently.Also applies to: 82-82
grail/monitoring/backends/null_backend.py (2)
11-16: Proper typing modernization with correct importsThe addition of
Generatorfromcollections.abcand the modernization to use built-in generics (dict,list) is correct and follows Python 3.9+ conventions.
26-26: All method signatures properly updatedThe type hints have been consistently updated across all methods to use built-in generics (
dict[str, Any],list[MetricData],dict[str, str] | None), maintaining API compatibility while modernizing the typing.Also applies to: 42-42, 51-51, 73-73
grail/cli/__init__.py (3)
14-14: Good defensive loading of dotenvMoving
load_dotenvto the top and wrapping it in a try-except block ensures environment variables are loaded early without risking startup failures if dotenv is unavailable or encounters errors.Also applies to: 24-28
92-133: Robust monitoring initialization with proper error handlingThe new
_initialize_monitoringfunction provides excellent separation of concerns and proper error handling:
- Checks if monitoring is enabled before initialization
- Adjusts configuration based on verbosity level
- Validates configuration and falls back to null backend on errors
- Ensures monitoring failures don't crash the CLI
This is a well-structured implementation that follows defensive programming principles.
183-191: Clean type annotation modernizationThe updated type annotation using the union operator (
Callable[[typer.Typer], None] | None) is cleaner and more readable than theOptionalwrapper.grail/__init__.py (2)
18-20: LGTM! The imports are correctly organized.The two imports
create_sat_reward_vectorandgenerate_sat_problemhave been properly added from the environments module, maintaining consistency with the public API exports declared in__all__.
21-43: LGTM! Module imports are well organized.The reorganization of imports from
grail,infrastructure.comms,infrastructure.drand, andmining.rollout_generatormodules is clean and follows a logical structure. The imports align with the module's public API as specified in__all__.grail/monitoring/base.py (3)
12-16: LGTM! Import modernization looks good.The migration from
typingmodule imports to usingcollections.abc.Generatorand built-in generics aligns with PEP 585 recommendations for Python 3.9+.
35-38: LGTM! Type annotations modernized correctly.The update from
Optional[Dict[str, str]]todict[str, str] | Nonefollows Python 3.10+ conventions for union types using the pipe operator.
99-100:@contextmanagerdecorator removed but method still yieldsThe
timermethod signature has been updated to use built-in generics, but the@contextmanagerdecorator on line 99 has been removed. This decorator is essential for the method to function correctly as a context manager when used withwithstatements.Apply this fix to restore the decorator:
@abstractmethod - @contextmanager + @contextmanager def timer(self, name: str, tags: dict[str, str] | None = None) -> Generator[None, None, None]:Likely an incorrect or invalid review comment.
grail/infrastructure/comms.py (3)
120-129: LGTM! Type annotation modernization.The update from
Dict[str, Any]todict[str, Any]fors3_configfollows the modern Python typing conventions.
220-221: LGTM! Improved formatting of log messages.The consolidation of multi-line f-strings into single lines improves readability while maintaining the same functionality.
Also applies to: 270-271
339-339: LGTM! Consistent type annotation updates across function signatures.All function signatures have been properly updated to use built-in generics (
list[dict],dict[str, Any]) instead of typing module imports. This modernization is consistent throughout the file.Also applies to: 554-554, 572-572, 597-597, 728-728, 758-758, 824-824, 1000-1000
grail/grail.py (4)
167-170: LGTM! Type annotation improvements for cache handling.The type annotations for the cache have been properly updated to use built-in generics
dict[tuple[str, int], torch.Tensor], and the cache update logic is correct.Also applies to: 208-209
261-263: LGTM! Improved exception chaining.The addition of
from eproperly chains the exception, preserving the original error context for better debugging.
650-651: LGTM! Log message formatting improvement.The consolidation of the log message into a single f-string improves readability.
577-577: LGTM! Consistent type annotation updates.All method signatures have been properly updated to use built-in generics (
tuple[bool, dict[str, bool]],dict[str, bool],list[int]) following modern Python conventions.Also applies to: 596-597, 1130-1130
grail/mining/rollout_generator.py (4)
6-6: LGTM! Import organization improvements.The imports have been properly reorganized with
typing.Anyfor type hints andtorchimport moved to its appropriate location.Also applies to: 10-10
81-82: LGTM! Type annotations modernized in GRPORollout dataclass.All type annotations have been properly updated to use built-in generics (
list[int],list[float],list[tuple[Any, Any, float]],dict) instead of typing module imports.Also applies to: 93-93, 97-97, 99-99
254-254: LGTM! Log message formatting improvement.The warning message has been consolidated into a single line for better readability.
143-143: LGTM! Consistent type annotation updates across all methods.All method signatures have been properly updated to use built-in generics (
list[GRPORollout],list[torch.Tensor],list[int],list[float],dict) following modern Python conventions.Also applies to: 300-300, 314-314, 339-339, 354-354, 367-367
grail/monitoring/manager.py (10)
31-31: LGTM: Type modernization applied correctlyThe union type syntax modernization from
Optional[MonitoringBackend]toMonitoringBackend | Nonealigns with PEP 604 and Python 3.10+ standards.
38-38: LGTM: Proper type annotation for metric bufferThe buffer is correctly typed as
list[MetricData]using built-in generics.
41-46: LGTM: Consistent type modernization for optional attributesAll optional attributes are properly typed using the modern union syntax (
X | None).
49-49: LGTM: Correct parameter type annotationThe
configparameter is properly typed asdict[str, Any]using built-in generics.
103-103: LGTM: Proper type annotation for window_number parameterThe optional
window_numberparameter is correctly typed asint | None.
116-117: LGTM: Consistent type modernization for numeric parametersThe
valueandtagsparameters are properly typed using modern union syntax.
148-149: LGTM: Consistent type annotations for gauge methodParameters are properly typed using built-in generics and union syntax.
176-176: LGTM: Proper type annotation for histogram methodThe
tagsparameter is correctly typed asdict[str, str] | None.
268-268: LGTM: Proper type annotation for start_run methodThe
configparameter is correctly typed as optionaldict[str, Any] | None.
335-335: LGTM: Consistent type modernization for global variableThe global
_monitoring_manageris properly typed asMonitoringManager | None.grail/monitoring/backends/wandb_backend.py (14)
13-13: LGTM: Proper import of Generator typeThe
Generatortype is correctly imported fromcollections.abcfor proper typing of context managers.
19-19: LGTM: Import reordering for consistencyThe imports are properly reorganized with base imports following a logical order.
35-35: LGTM: Type modernization for config attributeThe
configattribute is properly typed asdict[str, Any]using built-in generics.
39-41: LGTM: Consistent type modernization for optional attributesThe
_start_timeand_tablesattributes are properly typed using modern syntax.
43-43: LGTM: Proper parameter type annotationThe
configparameter is correctly typed asdict[str, Any].
180-180: LGTM: Proper type annotation for metrics listThe
metricsparameter is correctly typed aslist[MetricData].
197-197: LGTM: Explicit type annotation for data dictionaryThe
datavariable is explicitly typed asdict[str, Any]for clarity.
220-220: LGTM: Consistent return type annotationThe return type is properly typed as
dict[str, Any].
260-260: LGTM: Proper type annotation for array_like variableThe
array_likevariable is correctly typed asnp.ndarray | None.
280-280: LGTM: Proper return type annotationThe return type is correctly typed as
float | None.
296-296: LGTM: Consistent parameter type annotationsBoth
dataandmetricparameters are properly typed using built-in generics.
324-324: LGTM: Proper return type for context managerThe
timermethod correctly returnsGenerator[None, None, None]to match the context manager protocol.
371-416: Well-designed text artifact handlingThe implementation for handling text artifacts with persistent wandb Tables is well thought out. Using a defined schema with stable columns and MUTABLE mode for accumulation is a good design choice.
427-427: LGTM: Proper parameter type annotationThe
configparameter is correctly typed asdict[str, Any].grail/environments/sat.py (15)
5-7: LGTM: Proper imports for built-in typesThe imports are correctly updated to use built-in types directly rather than importing from
typing.
27-27: LGTM: Type modernization for SATProblem constructorThe
clausesparameter is properly typed aslist[list[int]]using built-in generics.
38-38: Minor: Unused loop variableThe loop variable
_iis prefixed with underscore to indicate it's unused, which is a good practice.
43-43: LGTM: Proper parameter type annotationThe
assignmentparameter is correctly typed aslist[bool].
120-120: LGTM: Consistent return type annotationThe
parsemethod return type is properly typed asdict[str, Any].
179-179: LGTM: Explicit type annotation for parsed dictionaryThe
parsedvariable is explicitly typed asdict[str, Any]for clarity.
191-191: Minor formatting improvement in loggingThe f-string formatting is cleaner than the previous style with explicit string conversion.
201-201: LGTM: Consistent type annotationsBoth the parameter and return types are properly typed using built-in generics.
294-294: LGTM: Proper type casting for reward functionsThe cast to
list[Callable[[Any, Any], float]]is correctly typed using built-in generics.
373-373: LGTM: Consistent return type annotationThe return type is properly typed as
dict[str, Any].
385-386: LGTM: Proper parameter type annotationsBoth
stateandtrajectoryparameters are correctly typed using built-in generics.
397-397: LGTM: Consistent parameter typesThe
stateparameter is properly typed asdict[str, Any].
404-404: Good practice: Direct attribute assignmentUsing direct attribute assignment (
self._last_completion_text = text) is cleaner than usingsetattr.
419-420: LGTM: Proper type annotations for step_environmentBoth parameters and return type are correctly typed using built-in generics.
488-489: LGTM: Consistent type annotationsThe parameters and return type are properly typed using built-in generics.
grail/cli/validate.py (24)
16-16: LGTM: Simplified importsThe import statement is simplified by removing unnecessary typing imports since built-in types are now used directly.
162-162: LGTM: Proper return type annotationThe return type is correctly typed as
tuple[Optional[str], Optional[int], Optional[int]].
175-175: LGTM: Consistent return type annotationThe return type is properly typed as
tuple[Optional[str], Optional[int]].
222-223: LGTM: Type modernization for global variableThe
miner_inference_countsglobal is properly typed asdefaultdict[str, list].
303-303: LGTM: Proper return type annotationThe return type is correctly typed as
tuple[float, float].
341-343: LGTM: Complex nested type properly annotatedThe
inference_countsvariable is correctly typed asdefaultdict[str, defaultdict[int, dict[str, int]]].
357-357: LGTM: Simplified dictionary creationUsing
dict(zip(...))is cleaner than the previous list comprehension approach.
438-441: LGTM: Cleaner f-string formattingThe f-string formatting is more readable than the previous style.
629-629: LGTM: Proper return type annotationThe return type is correctly typed as
tuple[Any, GrailChainManager].
689-689: LGTM: Proper return type annotationThe return type is correctly typed as
list[str].
692-692: LGTM: Cleaner f-string formattingThe f-string formatting improves readability.
702-704: LGTM: Proper type annotations for function signatureAll parameters and return type are correctly typed using built-in generics.
723-726: Verify the weighting logic changeThe base_score calculation currently only uses unique_score (coefficient 1.0) while successful_score and estimated_valid_score have zero coefficients. This appears intentional based on the comment at line 725, but please confirm this is the desired behavior for the current release.
737-738: LGTM: Proper type annotationsAll parameters and return type are correctly typed using built-in generics.
758-765: LGTM: Complex return type properly annotatedThe complex tuple return type is correctly typed with all nested generic types.
835-840: LGTM: Proper type annotations for function parametersAll parameters are correctly typed using built-in generics including
defaultdict.
868-874: LGTM: Cleaner f-string formattingThe f-string formatting improves readability in warning messages.
927-927: LGTM: Simplified sortingThe direct use of
sorted()with a lambda key function is clean and readable.
937-937: LGTM: Improved formatting precisionUsing
{SAMPLE_RATE * 100:.0f}%provides cleaner percentage display.
1236-1236: LGTM: Cleaner set display in loggingUsing
{set(base_seeds)}provides a cleaner display of unique values.
1286-1290: LGTM: Proper type annotations for weights computationAll parameters and return type are correctly typed using built-in generics.
1310-1310: LGTM: Explicit type annotation for empty metricsThe
EMPTY_METRICSvariable is explicitly typed asdict[str, int]for clarity.
1368-1368: LGTM: Proper type annotation for upload functionThe
all_valid_rolloutsparameter is correctly typed aslist[dict].
1383-1383: LGTM: Cleaner f-string formattingThe f-string formatting improves readability in the log message.
grail/shared/hf_compat.py (1)
3-3: Correct HF typing import.PretrainedConfig casing is correct; import looks good.
grail/infrastructure/network.py (1)
10-10: Typing modernization LGTM.tuple[...] return annotation change is correct and non-breaking.
grail/cli/train.py (3)
65-68: Type-hint modernization LGTM.tuple[...] annotations are correct and non-breaking.
80-83: Type-hint modernization LGTM.Consistent with PEP 585.
271-273: Reward logging OK; ensure non-empty rewards.Given earlier guards, this is fine.
grail/infrastructure/chain.py (3)
55-57: Built-in generics for commitments.Annotation update to dict[int, Bucket] is correct and consistent with the repo.
167-176: Return typing modernization LGTM.Signature and docstring align; function returns dict on all paths.
267-275: get_all_buckets typing/return LGTM.Consistent use of built-in generics.
- Add GRAIL_BURN_UID and GRAIL_BURN_PERCENTAGE environment variables - Implement burn logic in weight computation - When configured, allocates specified percentage of emissions to burn UID - Scales down all other weights proportionally - Updates docker-compose and documentation Example: GRAIL_BURN_UID=0 GRAIL_BURN_PERCENTAGE=10 burns 10% of emissions
…e as a single source of truth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (9)
docker/docker-compose.validator.yml (4)
69-75: Healthcheck may fail: pgrep might be missing and exact match is brittle.Use CMD-SHELL and a more forgiving pattern; or a grail health subcommand if available. Verify pgrep exists in the image.
- healthcheck: - test: ["CMD", "pgrep", "-f", "grail validate"] + healthcheck: + test: ["CMD-SHELL", "pgrep -f 'grail.*validate' >/dev/null 2>&1 || exit 1"] interval: 60s timeout: 10s retries: 3 start_period: 30s
87-92: 30s Watchtower polling is aggressive.Consider 300–900s to reduce registry load and rate-limit risk. Functionality unchanged.
- command: --interval 30 --cleanup --label-enable + command: --interval 600 --cleanup --label-enable
11-49: Trailing spaces and missing EOF newline.Strip trailing spaces (see YAMLlint at Lines 16, 21, 29, 39, 44, 51, 57, 65, 68, 76) and add a newline at end of file to keep linters green.
Also applies to: 51-57, 65-68, 76-76, 98-102
52-57: Nice: wallet mount is read-only. Consider extra hardening.Optional hardening that usually works with CUDA containers: drop caps and set no-new-privileges.
validator: @@ volumes: - ${HOME}/.bittensor:/root/.bittensor:ro - validator-cache:/root/.cache - validator-data:/app/data + security_opt: + - no-new-privileges:true + cap_drop: + - ALLdocs/VALIDATOR_DEPLOYMENT.md (5)
65-73: Docs and compose disagree on GPU config.Compose file used with docker compose needs gpus: all (Swarm-only deploy.devices won’t apply). Add a note and/or snippet so users on docker compose get GPU access.
Also applies to: 138-145
165-166: CLI inconsistency: docker compose vs docker-compose.Standardize on docker compose (plugin) throughout.
-docker-compose -f docker/docker-compose.validator.yml restart validator +docker compose -f docker/docker-compose.validator.yml restart validator
171-175: ps template may not expose .State consistently.Consider showing only Status, or explicitly document expected output.
-docker ps --format "table {{.Names}}\t{{.Status}}\t{{.State}}" +docker ps --format "table {{.Names}}\t{{.Status}}"
234-235: Replace bare URLs with Markdown links.Keeps markdownlint happy.
-- GitHub Issues: https://github.com/tplr-ai/grail/issues -- Documentation: https://github.com/tplr-ai/grail/blob/main/SPEC.md +- GitHub Issues: [tplr-ai/grail issues](https://github.com/tplr-ai/grail/issues) +- Documentation: [SPEC.md](https://github.com/tplr-ai/grail/blob/main/SPEC.md)
198-209: Optional: provide port examples if “external axon” requires exposure.Section mentions ports but compose doesn’t expose any. Add a snippet for users who need it.
# Example (only if your validator exposes an external service) services: validator: ports: - "8080:8080" # replace with actual health/axon port
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docker/README.md(1 hunks)docker/docker-compose.validator.yml(1 hunks)docs/VALIDATOR_DEPLOYMENT.md(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- docker/README.md
🧰 Additional context used
🪛 YAMLlint (1.37.1)
docker/docker-compose.validator.yml
[error] 16-16: trailing spaces
(trailing-spaces)
[error] 21-21: trailing spaces
(trailing-spaces)
[error] 29-29: trailing spaces
(trailing-spaces)
[error] 39-39: trailing spaces
(trailing-spaces)
[error] 44-44: trailing spaces
(trailing-spaces)
[error] 51-51: trailing spaces
(trailing-spaces)
[error] 57-57: trailing spaces
(trailing-spaces)
[error] 65-65: trailing spaces
(trailing-spaces)
[error] 68-68: trailing spaces
(trailing-spaces)
[error] 76-76: trailing spaces
(trailing-spaces)
[error] 102-102: no new line character at the end of file
(new-line-at-end-of-file)
🪛 markdownlint-cli2 (0.17.2)
docs/VALIDATOR_DEPLOYMENT.md
234-234: Bare URL used
(MD034, no-bare-urls)
235-235: Bare URL used
(MD034, no-bare-urls)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Tests (3.11)
| deploy: | ||
| resources: | ||
| reservations: | ||
| devices: | ||
| - driver: nvidia | ||
| count: 1 | ||
| capabilities: [gpu] | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
GPU reservation won’t be honored by docker compose; add gpus: all (or pin to Swarm).
The deploy.resources.devices block is only effective with Swarm (docker stack). Your docs use docker compose, so the container likely won’t get a GPU. Prefer the compose-native gpus field; keep deploy only if you also support Swarm.
Apply:
validator:
image: ghcr.io/tplr-ai/grail:latest
+ gpus: "all"
@@
- deploy:
- resources:
- reservations:
- devices:
- - driver: nvidia
- count: 1
- capabilities: [gpu]
+ # Optional: keep the Swarm block if you also deploy via `docker stack`
+ # deploy:
+ # resources:
+ # reservations:
+ # devices:
+ # - driver: nvidia
+ # count: 1
+ # capabilities: [gpu]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| deploy: | |
| resources: | |
| reservations: | |
| devices: | |
| - driver: nvidia | |
| count: 1 | |
| capabilities: [gpu] | |
| validator: | |
| image: ghcr.io/tplr-ai/grail:latest | |
| gpus: "all" | |
| # Optional: keep the Swarm block if you also deploy via `docker stack` | |
| # deploy: | |
| # resources: | |
| # reservations: | |
| # devices: | |
| # - driver: nvidia | |
| # count: 1 | |
| # capabilities: [gpu] |
🧰 Tools
🪛 YAMLlint (1.37.1)
[error] 65-65: trailing spaces
(trailing-spaces)
🤖 Prompt for AI Agents
In docker/docker-compose.validator.yml around lines 58 to 65, the compose file
uses deploy.resources.devices which only works with Swarm and will not grant a
GPU when using docker compose; replace or complement that block with the
compose-native gpus setting (e.g., add gpus: all or gpus: '"device=0"' under the
service definition) so the GPU is available in docker compose, and keep
deploy.resources.devices only if you also support Swarm deployments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
grail/__init__.py (2)
23-35: Public API: imports not reflected in all (PROTOCOL_VERSION, RolloutGenerator).You import these at module scope but don’t list them in all. If they’re intended as public symbols, add them; otherwise drop the imports to avoid accidental exposure.
Apply to include both:
__all__ = [ # Core classes "Prover", "Verifier", + "PROTOCOL_VERSION", # New reward system "Parser", "RewardVector", "SATParser", "create_sat_reward_vector", # Existing SAT classes "SATProblem", "generate_sat_problem", "SATRolloutGenerator", + "RolloutGenerator", # Entry points "main", ]Alternative: if they’re not public, remove the imports:
-from .infrastructure.comms import ( - PROTOCOL_VERSION, - ... -) ... -from .mining.rollout_generator import RolloutGenerator # noqa: F401, E402, E501, F403, F405 +from .infrastructure.comms import (... without PROTOCOL_VERSION ...) +# (and drop the RolloutGenerator import)Also applies to: 43-43, 47-62
10-20: Nit: narrow noqa disables.You’re not using import-star, so F403/F405 are unnecessary. Prefer minimal, targeted noqa to keep lints meaningful.
For example:
-from .environments import ( # noqa: F401, E402, E501, F403, F405 +from .environments import ( # noqa: F401, E402, E501 ... -from .grail import Prover, Verifier # noqa: F401, E402, E501, F403, F405 +from .grail import Prover, Verifier # noqa: F401, E402, E501 ... -from .infrastructure.comms import ( # noqa: F401, E402, E501, F403, F405 +from .infrastructure.comms import ( # noqa: F401, E402, E501 ... -from .infrastructure.drand import ( # noqa: F401, E402, E501, F403, F405 +from .infrastructure.drand import ( # noqa: F401, E402, E501 ... -from .mining.rollout_generator import RolloutGenerator # noqa: F401, E402, E501, F403, F405 +from .mining.rollout_generator import RolloutGenerator # noqa: F401, E402, E501Also applies to: 21-21, 23-35, 39-42
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
grail/__init__.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
grail/__init__.py (6)
grail/environments/sat.py (5)
SATParser(96-198)SATProblem(24-60)SATRolloutGenerator(337-523)create_sat_reward_vector(281-334)generate_sat_problem(63-93)grail/grail.py (2)
Prover(505-541)Verifier(547-1188)grail/infrastructure/comms.py (11)
download_file_chunked(371-483)download_from_huggingface(996-1051)file_exists(525-547)get_file(568-586)get_valid_rollouts(754-785)list_bucket_files(550-565)login_huggingface(793-820)sink_window_inferences(594-622)upload_file_chunked(204-297)upload_to_huggingface(823-993)upload_valid_rollouts(726-751)grail/drand.py (2)
get_drand_beacon(266-317)get_round_at_time(364-379)grail/infrastructure/drand.py (2)
get_drand_beacon(87-137)get_round_at_time(170-173)grail/mining/rollout_generator.py (1)
RolloutGenerator(102-379)
🔇 Additional comments (3)
grail/__init__.py (3)
18-20: LGTM: exposing SAT helpers in top-level API.Good call surfacing create_sat_reward_vector and generate_sat_problem here.
21-21: LGTM: Prover/Verifier re-export.This makes the common entry points easy to discover.
2-2: Confirm version reset; prefer forward-only bump per PEP 440.pyproject.toml uses hatch to read version from grail/init.py; version is currently "0.0.1" (grail/init.py:2) and there are no git tags — don't decrease the package version; bump to the next dev release.
-__version__ = "0.0.1" +__version__ = "0.1.0.dev1"
| from .infrastructure.drand import ( | ||
| get_drand_beacon, | ||
| get_round_at_time, | ||
| ) # noqa: F401, E402, E501, F403, F405 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Unify DRAND API source to avoid inconsistent semantics.
Both grail/drand.py and grail/infrastructure/drand.py define get_drand_beacon/get_round_at_time with different edge-case handling (e.g., timestamp < genesis). Pick one canonical module for the public surface and ensure behavior is consistent.
Apply if grail/drand.py is the canonical API:
-from .infrastructure.drand import (
- get_drand_beacon,
- get_round_at_time,
-) # noqa: F401, E402, E501, F403, F405
+from .drand import get_drand_beacon, get_round_at_time # noqa: F401, E402Quick repo check for duplicate definitions and differing logic:
🏁 Script executed:
#!/bin/bash
rg -nC2 -g 'grail/**/drand*.py' -e 'def get_drand_beacon\(|def get_round_at_time\('
rg -nC3 -g 'grail/**/drand*.py' -e 'genesis|GENESIS|period|PERIOD|timestamp'Length of output: 13865
Unify DRAND public API to grail.drand and fix inconsistent round calculation.
- Replace public export in grail/init.py with the canonical module:
-from .infrastructure.drand import (
- get_drand_beacon,
- get_round_at_time,
-) # noqa: F401, E402, E501, F403, F405
+from .drand import get_drand_beacon, get_round_at_time # noqa: F401, E402- Fix grail/infrastructure/drand.py::get_round_at_time to match canonical semantics (t == genesis -> 1; t < genesis -> 0). Current implementation can produce negative rounds. Example replacement:
def get_round_at_time(timestamp: int) -> int:
if timestamp < DRAND_GENESIS_TIME:
return 0
return 1 + (timestamp - DRAND_GENESIS_TIME) // DRAND_PERIOD🤖 Prompt for AI Agents
In grail/__init__.py around lines 39 to 42, the package currently exports DRAND
symbols from grail.infrastructure.drand which creates an inconsistent public
API; change the public export to re-export the canonical grail.drand module
(i.e. expose grail.drand as the public DRAND API), and in
grail/infrastructure/drand.py update get_round_at_time so it never returns
negative rounds by returning 0 when timestamp < DRAND_GENESIS_TIME and otherwise
computing 1 + (timestamp - DRAND_GENESIS_TIME) // DRAND_PERIOD to ensure t ==
genesis yields round 1.
…e as a single source of truth.
- Add GRAIL_BURN_UID and GRAIL_BURN_PERCENTAGE environment variables - Implement burn logic in weight computation - When configured, allocates specified percentage of emissions to burn UID - Scales down all other weights proportionally - Updates docker-compose and documentation Example: GRAIL_BURN_UID=0 GRAIL_BURN_PERCENTAGE=10 burns 10% of emissions
feat: add emission burn mechanism for validators
….py; passing ci now
Summary by CodeRabbit
New Features
Documentation
Chores
Tests