Releases: gonka-ai/gonka
Release v0.2.13
This document covers the v0.2.13 upgrade proposal.
The release fixes confirmation PoC reward accounting, devshard escrow params,
complaint-response authz grants, upstream response parsing, participant
reactivation, node-manager gRPC defaults, and devshard storage growth.
It also adds a guardian-controlled emergency switch for disabling devshard
inference requests.
The upgrade also disables confirmation PoC for the rest of the upgrade epoch
so the new snapshot logic starts cleanly from the next epoch.
Upgrade Plan
The node binary is upgraded through an on-chain software upgrade proposal.
The PR also updates api and node container versions in
deploy/join/docker-compose.yml for hosts joining after the on-chain upgrade.
Existing hosts are not required to manually update their api or node
containers as part of the chain upgrade.
Proposed Process
- Active hosts review this proposal on GitHub.
- If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.
Migration
The on-chain migration logic is defined in upgrades.go.
Migrations:
- Sets DevshardEscrowParams.MaxEscrowsPerEpoch to 500_000.
- Sets DevshardEscrowParams.MaxNonce to 1_000_000. The previous settlement path used a hardcoded 20_000 nonce limit.
- Adds addresses of several early miners and known brokers to DevshardEscrowParams.AllowedCreatorAddresses.
- Sets GenesisOnlyParams.GenesisGuardianMultiplier to 0.33334, reducing genesis guardian power from about 34% to about 25% of adjusted voting power while early-network protection applies.
- Sets the chain-wide governance quorum to 0.25. Quorum is computed against total chain voting power; with genesis guardians (25%) not voting, this gives an effective 1/3 quorum among the remaining 75% of voting power (0.25 / 0.75 ≈ 0.333).
- Backfills EpochGroupData.ConfirmationWeightScales for the current epoch and clamps existing confirmation weights down to the new expected value.
- Backfills MsgRespondDealerComplaints authz grants on existing cold-to-warm ML ops pairs. v0.2.12 added this message to the permission list but did not migrate existing grants, so DAPIs that joined before v0.2.12 could not respond to dealer complaints.
- Disables confirmation PoC triggers for the rest of the upgrade epoch via a grace-epoch UpgradeProtectionWindow of 10000 blocks. The new snapshot logic starts from the next epoch.
- Adds MiniMax-M2.7 (MiniMaxAI/MiniMax-M2.7) as a governance model and PoC model config with PenaltyStartEpoch = 278 (bootstrap activation epoch).
- Updates PocParams.Models[*].WeightScaleFactor to recalibrate against the Qwen-on-B200 reference after the vLLM 0.20.1 release. Kimi was too high on B* GPUs. Kimi = Qwen-on-B200 + 10% (top-tier premium), MiniMax = Qwen-on-B200:
  - Kimi (moonshotai/Kimi-K2.6): 0.78
  - MiniMax (MiniMaxAI/MiniMax-M2.7): 0.3024
- Updates Model.ValidationThreshold from cross-version vLLM results:
  - Qwen (Qwen/Qwen3-235B-A22B-Instruct-2507-FP8): 0.940
  - Kimi (moonshotai/Kimi-K2.6): 0.900
  - MiniMax (MiniMaxAI/MiniMax-M2.7): 0.922
- Adds --enable-auto-tool-choice to Kimi ModelArgs if missing.
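The quorum arithmetic in the migration list can be checked with a few lines. This is a minimal sketch, not chain code: `effective_quorum` and `guardian_share` are hypothetical helper names, and the relation "guardian power = multiplier × all other adjusted power" is an assumption that happens to reproduce the stated 25% figure.

```python
from fractions import Fraction


def effective_quorum(quorum: Fraction, abstaining_share: Fraction) -> Fraction:
    """Turnout required among the voting power that is still voting."""
    if not 0 <= abstaining_share < 1:
        raise ValueError("abstaining_share must be in [0, 1)")
    return quorum / (1 - abstaining_share)


def guardian_share(multiplier: float) -> float:
    """ASSUMPTION: guardian power = multiplier * all other adjusted power."""
    return multiplier / (1 + multiplier)


# Quorum 0.25 with 25% of the power (the guardians) not voting.
needed = effective_quorum(Fraction(1, 4), Fraction(1, 4))
print(needed, float(needed))

# Guardian share implied by the new multiplier under the assumption above.
print(round(guardian_share(0.33334), 4))
```

Exact fractions are used for the quorum so the 1/3 result is not blurred by floating-point rounding.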
Changes
inference-chain
- Confirmation PoC used different model sets for measured weight, preserved weight, and reward rescaling. During new-model bootstrap, this could reduce confirmation weight for honest miners serving both an eligible model and a not-yet-eligible model. v0.2.13 stores one epoch snapshot of confirmable models and weight-scale factors, then uses it for confirmation and reward calculations.
- ConsecutiveInvalidInferences was not reset when a participant became ACTIVE again. A host could return from invalid state and be invalidated again after one new failure. v0.2.13 resets the counter on reactivation and upcoming promotion.
- Devshard settlement now reads the nonce limit from DevshardEscrowParams.MaxNonce instead of a hardcoded constant.
- The upgrade adds addresses of several early miners and known brokers to the devshard creator allowlist without removing existing allowed creator addresses.
- Genesis guardians held about 34% of adjusted voting power, which made quorum hard to reach when they did not vote. The upgrade reduces guardian power to about 25% via GenesisOnlyParams.GenesisGuardianMultiplier = 0.33334 and sets the chain-wide governance quorum to 0.25. Quorum is computed against total bonded power; with guardians not voting, this gives an effective 1/3 quorum among the remaining 75% of voting power (0.25 / 0.75 ≈ 0.333).
- Adds MsgSetDevshardRequestsEnabled, a guardian-signed transaction for emergency disabling and re-enabling devshard inference requests.
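The ConsecutiveInvalidInferences fix above can be illustrated with a toy model. This is a sketch under assumptions: the `Participant` class, the `INVALID_LIMIT` value, and the function names are hypothetical and do not mirror the chain's actual types or thresholds.

```python
from dataclasses import dataclass

INVALID_LIMIT = 3  # hypothetical threshold, for illustration only


@dataclass
class Participant:
    status: str = "ACTIVE"
    consecutive_invalid_inferences: int = 0


def record_invalid(p: Participant) -> None:
    """Count one invalid inference and invalidate once the limit is hit."""
    p.consecutive_invalid_inferences += 1
    if p.consecutive_invalid_inferences >= INVALID_LIMIT:
        p.status = "INVALID"


def reactivate(p: Participant) -> None:
    """Return to ACTIVE and clear the streak, as v0.2.13 is described to do."""
    p.status = "ACTIVE"
    p.consecutive_invalid_inferences = 0
```

Without the reset in `reactivate`, the counter would still sit at the limit, so the first new failure would invalidate the host again.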
decentralized-api
- Some OpenAI-compatible upstreams return numeric stop_reason values. Choice.StopReason now accepts any JSON type, so those responses no longer fail unmarshalling.
- The node-manager gRPC server did not start by default when NodeManagerGrpcPort was unset. The port now defaults to 9400, and join compose uses the same default so devshard can reach the API without manual config.
- The internal devshard service inside dapi uses the same devshard storage changes listed below, including pruning and Postgres support.
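The stop_reason change can be pictured as moving from a string-only field to one that accepts any JSON value. The sketch below is illustrative only: the function names are hypothetical and the real fix lives in the Go Choice.StopReason type.

```python
import json
from typing import Any, Optional


def strict_stop_reason(choice: dict) -> Optional[str]:
    """Old behaviour, modelled loosely: only strings (or null) are accepted."""
    value = choice.get("stop_reason")
    if value is not None and not isinstance(value, str):
        raise TypeError(f"stop_reason must be a string, got {type(value).__name__}")
    return value


def lenient_stop_reason(choice: dict) -> Any:
    """New behaviour, modelled loosely: any JSON value passes through."""
    return choice.get("stop_reason")


numeric = json.loads('{"index": 0, "stop_reason": 2}')
textual = json.loads('{"index": 0, "stop_reason": "stop"}')
print(lenient_stop_reason(numeric), lenient_stop_reason(textual))
```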
devshard
- Devshard storage could grow forever because old escrow data stayed in one SQLite store. Storage is now epoch-scoped and prunes old epochs in the background, keeping the latest 3 epochs.
- Devshard can use Postgres as the primary store for larger deployments, with SQLite kept as a local fallback.
- Postgres data is partitioned by epoch_id for sessions, diffs, and signatures, so pruning can drop old epoch data cleanly.
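Epoch-scoped storage with a fixed retention window can be sketched as follows. This is an in-memory illustration, not the SQLite or Postgres implementation; `EpochStore` and its methods are hypothetical names, and only the retention count of 3 comes from the notes above.

```python
from typing import Dict, List

KEEP_EPOCHS = 3  # retention stated in the release notes


class EpochStore:
    """Toy store that groups records by epoch so whole epochs can be dropped."""

    def __init__(self) -> None:
        self._by_epoch: Dict[int, List[str]] = {}

    def put(self, epoch_id: int, record: str) -> None:
        self._by_epoch.setdefault(epoch_id, []).append(record)

    def prune(self, current_epoch: int) -> List[int]:
        """Drop every epoch older than the latest KEEP_EPOCHS; return what was dropped."""
        cutoff = current_epoch - KEEP_EPOCHS + 1
        stale = sorted(e for e in self._by_epoch if e < cutoff)
        for epoch_id in stale:
            del self._by_epoch[epoch_id]
        return stale

    def epochs(self) -> List[int]:
        return sorted(self._by_epoch)
```

Keying everything by epoch is what makes pruning cheap: a partitioned database can drop one partition instead of deleting rows one by one.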
Release v0.2.13: devshard binary
Release to save devshard binary
Upgrade Proposal: v0.2.12
This document outlines the proposed changes for on-chain software upgrade v0.2.12.
The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.
Upgrade Plan
This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml and introduces a new versiond service in the join stack.
The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.
Existing hosts are not required to upgrade their api and node containers as part of the on-chain upgrade itself. After the upgrade, hosts must deploy the new versiond service and update and redeploy proxy with VERSIOND_SERVICE_NAME=versiond and GONKA_API_EXEMPT_ROUTES=chat inference poc/proofs devshard so /devshard/<version>/* traffic is routed through proxy -> versiond -> devshardd. New hosts joining after the upgrade should use the updated container versions from this compose file.
Proposed Process
- Active hosts review this proposal on GitHub.
- If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.
Testing
The on-chain upgrade from version v0.2.11 to v0.2.12 has been successfully deployed and verified on the testnet. No regression in core functionality or performance has been observed during testing. More testing will be executed leading up to the upgrade.
Reviewers are encouraged to request access to testnet environments to validate both node behavior and the on-chain upgrade process, or to replay the upgrade on private testnets.
Migration
The on-chain migration logic is defined in upgrades.go.
Migrations:
- Auto-creates x/feegrant allowances for every existing cold-to-warm ML ops authz grant in case transaction fees are later turned on.
- Initializes FeeParams with min_gas_price_ngonka = 0 (fees are effectively disabled at upgrade time, see Changes).
- Migrates singular PoC model parameters into the new multi-model PocParams.Models list and initializes DelegationParams.
- Adds the moonshotai/Kimi-K2.6 governance model and its PoC model config (seq_len=1024, scaled weight coefficient, penalty start at effective epoch + 2).
- Seeds DevshardEscrowParams.ApprovedVersions with the initial v1 devshard binary (sha256 15f72244...d36d4715) so versiond has an approved version to download and run immediately after the upgrade.
- Sets EpochParams.ConfirmationPocSafetyWindow to 500 blocks and DelegationParams.DeployWindow to 500 blocks.
- Clears legacy PoC v2 data (which used old key layouts) and seeds new pruning state markers for the new multi-model collections.
- Backfills ActiveParticipant.VotingPowers and EpochGroupData subgroup voting power for the current epoch to ensure seamless PoC validation post-upgrade.
- Removes unused TopMiners and training states (training will be moved to an off-chain architecture similar to devshards).
Changes
Multi-model PoC
Historically, PoC has been tied to a single base model. While the network aims to support multi-model inference, relying on a single-model PoC is not secure enough.
If the network served several models but only checked one during PoC, an attacker could spin up hardware just for the check and shut it down afterward. To prevent this, PoC must start immediately on the exact model being validated, proving the hardware is present and running that model right now with no window to swap deployments.
To support multiple models, this upgrade runs PoC for each model independently in separate model groups. The core mechanics:
- Each governance-approved model gets its own PoC group. PoC runs for all eligible groups in parallel.
- Weight is split into two layers. PoC weight is model-local and drives inference routing and inference rewards inside that specific group. Consensus weight is the total weight aggregated across all eligible model groups (using model-specific coefficients) that determines block signing power, voting power, and bitcoin-style rewards.
- Because not every Host can run every model, a Host not serving a model can delegate its consensus weight to a group member for PoC validation only (this does not affect block signing or governance voting power). This preserves the existing security model: a model group must reach a 2/3 validation threshold of the total network consensus weight, not just the group-local weight, even if its direct members hold less than that total amount.
- For each active model, Hosts must explicitly choose their participation mode (DIRECT, DELEGATE, REFUSE). Hosts who do nothing receive a penalty. Penalties are skipped during a model's initial grace period.
The current base model remains the starting group for bootstrapping additional models. The exact model coefficients and final parameter values are not yet part of this PR.
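The two weight layers and the delegation rule can be sketched numerically. Everything here is an assumption made for illustration: the function names, the example coefficients, and the choice of an inclusive (>=) 2/3 comparison are not taken from the PR.

```python
from typing import Dict


def consensus_weight(poc_weights: Dict[str, int], coefficients: Dict[str, float]) -> float:
    """Aggregate a host's model-local PoC weights using per-model coefficients."""
    return sum(weight * coefficients[model] for model, weight in poc_weights.items())


def group_reaches_threshold(direct_weight: int, delegated_weight: int,
                            total_network_weight: int) -> bool:
    """2/3 of TOTAL network consensus weight, counting weight delegated to the group.

    ASSUMPTION: the comparison is inclusive; the notes only say "2/3 threshold".
    """
    return 3 * (direct_weight + delegated_weight) >= 2 * total_network_weight


# Hypothetical coefficients and weights.
coefficients = {"model-a": 1.0, "model-b": 0.5}
print(consensus_weight({"model-a": 100, "model-b": 40}, coefficients))

# Direct members hold 40 of 100; delegation of another 30 lifts the group over 2/3.
print(group_reaches_threshold(40, 0, 100), group_reaches_threshold(40, 30, 100))
```

The second call shows why delegation matters: a small group cannot pass validation on its own members' weight alone.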
Transaction fees for spam prevention (#937, #981, #1120)
v0.2.12 lays the groundwork for consensus-level transaction fees. Before this upgrade, any funded account could broadcast an unlimited number of transactions at zero cost, because the chain relied only on per-validator minimum-gas-prices configuration, which is mempool-only and trivially bypassed by a malicious block proposer. This left governance proposals, bank sends, staking operations, collateral management, reward claims, bridge operations, and CosmWasm calls without any economic friction against abuse.
v0.2.12 introduces a governance-controlled FeeParams.min_gas_price_ngonka enforced during both CheckTx and DeliverTx. The full machinery is in place: a NetworkDutyFeeBypassDecorator that exempts protocol-obligation messages (PoC submissions, validation messages, inference start/finish, BLS DKG rounds), and a two-component fee on MsgPoCV2StoreCommit for Host sybil resistance (a base validation cost per participant per epoch plus a count-proportional cost per count delta).
Fees are effectively disabled at upgrade time. min_gas_price_ngonka is initialized to 0 due to remaining issues in client-side gas estimation. Once those are resolved, governance can flip on a non-zero value without a chain upgrade. No host action is required to support fees in this release; the upgrade still installs x/feegrant allowances from cold to warm keys so the switch can be flipped without a follow-up migration.
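The fee check described in this section can be approximated as below. This is a simplified sketch, not the ante handler: the integer gas price, the rule that a transaction is exempt only when every message in it is exempt, and the contents of `BYPASS_TYPES` are assumptions for illustration.

```python
from typing import FrozenSet, Sequence

# Illustrative subset of protocol-obligation messages; the real list is in the decorator.
BYPASS_TYPES: FrozenSet[str] = frozenset({"MsgStartInference", "MsgFinishInference"})


def required_fee(msg_types: Sequence[str], gas_limit: int, min_gas_price_ngonka: int) -> int:
    """Minimum fee in ngonka; zero when every message is a network-duty message."""
    if msg_types and all(t in BYPASS_TYPES for t in msg_types):
        return 0
    return gas_limit * min_gas_price_ngonka


def accept_tx(msg_types: Sequence[str], gas_limit: int, fee_paid: int,
              min_gas_price_ngonka: int) -> bool:
    return fee_paid >= required_fee(msg_types, gas_limit, min_gas_price_ngonka)
```

With `min_gas_price_ngonka = 0`, as set at upgrade time, `required_fee` is zero for every transaction, which matches the statement that fees are effectively disabled.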
Devshards (formerly "subnets") — standalone, versioned runtime (#1045)
Previously, the devshard runtime lived inside the main DAPI process. Upgrading devshards meant rebuilding, redeploying, and restarting the entire DAPI, which slowed down development and added risk to all Host work (including inference, PoC, and Confirmation PoC).
To solve this, v0.2.12 decouples devshards into a standalone, versioned runtime managed by a new service called versiond.
- versiond automatically downloads and runs devshard binaries approved by on-chain governance.
- Multiple devshard versions can run side-by-side. Traffic to /devshard/<version>/* is routed to the corresponding binary, while the legacy /v1/devshard/* route remains active during the transition.
- The standalone devshard directly communicates with MLNodes during inference but does not manage their lifecycle, cleanly separating the roles of MLNode manager (DAPI) and client.
- Each session is cryptographically bound to the specific binary version that served it. The settlement payload now includes a cleartext version field, ensuring a session cannot mix responses from different versions.
- The term "subnet" is entirely replaced by "devshard" across the codebase. Additionally, float math in devshard settlement has been replaced with deterministic integer arithmetic to eliminate consensus-failure risks.
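Version-based routing of devshard traffic can be sketched as a small path parser. This is not the proxy or versiond code: `route_devshard`, the "legacy" label, and the `sessions` sub-path used in the example are hypothetical.

```python
from typing import Optional, Set, Tuple


def route_devshard(path: str, running_versions: Set[str]) -> Optional[Tuple[str, str]]:
    """Return (target, remaining_path), or None when nothing should handle the path."""
    parts = [p for p in path.split("/") if p]
    # Legacy route kept alive during the transition: /v1/devshard/*
    if parts[:2] == ["v1", "devshard"]:
        return ("legacy", "/".join(parts[2:]))
    # Versioned route: /devshard/<version>/*
    if len(parts) >= 2 and parts[0] == "devshard" and parts[1] in running_versions:
        return (parts[1], "/".join(parts[2:]))
    return None


print(route_devshard("/devshard/v1/sessions", {"v1"}))
print(route_devshard("/v1/devshard/sessions", {"v1"}))
```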
Random selection of preserved MLNodes (#1089)
Previously, "preserved" nodes (the ones that stay on inference instead of running PoC) were chosen once per epoch via the static MLNodeInfo.timeslot_allocation[POC_SLOT] flag. Because the flag was visible at epoch start and held for the entire epoch, an operator knew well in advance which boxes would skip both the epoch-start PoC and every confirmation PoC event in that epoch. That made hardware downgrade or partial-capacity substitution easy to plan around.
v0.2.12 replaces epoch-long preservation with episode-scoped preservation. An episode is a single PoC execution window: either the epoch-start regular PoC, or one confirmation PoC event during the inference phase. At each PoC anchor (upcomingEpoch.PocStartBlockHeight for regular PoC, event.TriggerHeight for confirmation), the chain materializes a fresh preserved snapshot for that single episode and overwrites a singleton state slot. The next episode gets a new sample.
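Episode-scoped preservation can be pictured as drawing a fresh sample at every PoC anchor. The sketch below is an illustration only: how the chain actually derives its randomness is not described here, so the SHA-256 seed built from an `entropy` value and the anchor height is an assumption, and Python's `random` module stands in for whatever deterministic sampler the chain uses.

```python
import hashlib
import random
from typing import Iterable, List


def preserved_for_episode(candidates: Iterable[str], anchor_height: int,
                          entropy: bytes, count: int) -> List[str]:
    """Pick `count` preserved nodes for one episode, deterministically per anchor."""
    pool = sorted(candidates)  # canonical order so every caller samples identically
    seed = hashlib.sha256(entropy + anchor_height.to_bytes(8, "big")).digest()
    rng = random.Random(seed)
    return sorted(rng.sample(pool, min(count, len(pool))))


nodes = [f"node-{i}" for i in range(10)]
# One sample per episode; the result would overwrite the previous snapshot.
print(preserved_for_episode(nodes, 1000, b"example-entropy", 3))
print(preserved_for_episode(nodes, 1450, b"example-entropy", 3))
```

Because the seed depends on the anchor, the selection for a later episode cannot be read off an earlier one, which is the late-binding property the release notes aim for.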
Key properties:
- Late-binding: an operator cannot predict far in advance whether a given node will be preserved for the next PoC window.
- The candidate pool is the current model subgroup EpochGroupData.ValidationWeights/MlNodes, applying existing protocol exclusions.
- ActiveParticipants stays stable for the whole epoch; timeslot_allocation[POC_SLOT] is deprecated for scheduling.
- The broker reads the current episode snapshot instead of the static epoch-long flag.
- Reward weight collapses from the old "preserved + measured...
Release v0.2.11
Upgrade Proposal: v0.2.11
This document outlines the proposed changes for on-chain software upgrade v0.2.11.
The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.
Upgrade Plan
This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.
The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.
Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.
It also updates CosmWasm contract artifacts for the community sale and liquidity pool, adds bridge/testnet operational scripts for IBC trading support, and introduces a new subnet/ package used by the new inference architecture.
Proposed Process
- Active hosts review this proposal on GitHub.
- If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.
Testing
The on-chain upgrade from version v0.2.10 to v0.2.11 has been successfully deployed and verified on the testnet. No regression in core functionality or performance has been observed during testing. More testing will be executed leading up to the upgrade.
Reviewers are encouraged to request access to testnet environments to validate both node behavior and the on-chain upgrade process, or to replay the upgrade on private testnets.
Migration
The on-chain migration logic is defined in upgrades.go.
Migrations:
- Sets ValidationParams.ClaimValidationEnabled = false.
- Rebuilds active participant caches for the current and previous epoch.
- Migrates epoch-group validations into the new entry-based format.
- Community-sale CosmWasm contract migration.
Changes
PR #877 Inference shards (Experimental)
- Introduces subnet-based inference flow, moving per-inference coordination off-chain.
- The chain now handles only session setup, escrow, and settlement.
- Adds support for subnet state, transport, signing, storage, settlement, and API integration.
- Note: This feature is currently experimental and under limited access. For reference design and architecture, see proposals/inference/.
PR #812 StartInference and FinishInference performance improvements
- Reduces unnecessary state writes and query overhead for MsgStartInference and MsgFinishInference.
- Simplifies stats handling and cuts work done during the inference lifecycle for better block execution stability.
PR #760 Unified Permissions
- Consolidates message-permission checks across the inference module.
- Removes duplicated authorization logic to make permission behavior more explicit and testable.
PR #779 Inference msgs optimization: optimize key verification
- Reduces cryptographic verification overhead in the inference message path.
- Avoids repeating signature checks where protocol guarantees make them redundant.
PR #874 MsgValidation and MsgClaimRewards performance optimization
- Reduces hot-path lookups and adds transient caching in validation and reward-claiming paths.
- Restructures validation/reward logic and introduces state pruning support.
PR #822 BLS related fixes based on Certik audit
- Applies Certik audit fixes to the BLS module.
- Fixes threshold-validation and duplicate-slot handling issues for distributed key generation and threshold-signing flows.
PR #814 IBC Trade Support
- Introduces governance-controlled support for trading approved IBC-denominated assets.
- Includes chain message/query changes and contract updates for the community sale and liquidity pool.
PR #868 Required-collateral aware slashing flow
- Bases slashing penalties on required collateral rather than the full deposited amount.
- Makes the slashing model more proportional for participants who over-deposit relative to the minimum.
PR #888 Fix: collateral
- Fixes reward calculation for undercollateralized miners, ensuring actual collateral accurately reduces effective earning power.
PR #775 fix: redirect slashed coins to gov
- Redirects slashed collateral to governance-controlled destinations.
Other changes
- PR #867 Fix the application.db bloat issue.
- PR #835 Add Batch Transfer With Vesting.
- PR #773 feat: delete governance model.
- PR #675 security: update CometBFT to v0.38.21 (CSA-2026-001).
- PR #543 fix: data race conditions.
- PR #815 Update CONTRIBUTING.md.
- PR #807 Update issue templates.
Proposed Bounties
| Bounty ID | Sum GNK | Bounty Explanation | GitHub ID |
|---|---|---|---|
| PR #543 | 2500 | extra bounty for a comprehensive review of all cases where the data race conditions fix was needed | @x0152 |
| Issue #628 | 25000 | PoC integration into vllm v0.11.1 report | Axel-t, @Red-Caesar |
| -- | 10000 | report of series of prompts resulting in vllm HTTP 502 response, significant impact, was already used for intentional griefing | @blizko |
| -- | 1000 | report of dust transaction vulnerability extending blocks | @blizko |
| -- | 5000 | report of Remote DoS of Validator PoC Software via dist Assertion | @ouicate |
| -- | 5000 | report of State Bloat PoC and End-Block DoS via Unbounded Batch / Validation Payloads | @ouicate |
| -- | 750 | report of Bridge Ethereum Address Parsing Silently Falls Back to Zero Bytes (Loss/Misdirection of Funds) | @ouicate |
| PR #775 | 1000 | planned task | @x0152 |
| PR #773 | 1250 | planned task | @x0152 |
| qdanik/vllm/pull/5 | 12000 | vLLM 0.15.1 Compatibility Experiments - basis for next ML node version | @qdanik |
| qdanik/vllm/pull/6 | 15000 | vLLM 0.15.1 Compatibility Experiments - basis for next ML node version, covering simultaneous PoC and inference | @qdanik |
| -- | 5000 | report of wind down window vulnerability fixed in PR #767 | @qdanik |
| Issue #797 | 1000 | collective solving of nodes unable to join from snapshots - proposed valuable hypothesis | @akup |
| Issue #797 | 3000 | collective solving of nodes unable to join from snapshots - found source problem | @x0152 |
| Issue #780 | 750 | collective solving StartInference and FinishInference issue | @hleb-albau |
| Issue #781 | 5000 | collective solving StartInference and FinishInference issue | @x0152 |
| Issue #782 | 5000 | collective solving StartInference and FinishInference issue | @akup |
| PR #867 | 7500 | important issue that affected many participants, not a vulnerability, fairly easy fix; adding extra payment for fully testing and providing results of the test together with the fix | @Lelouch33 |
| Issue #730 | 22500 | vLLM 0.15.1 Compatibility Experiments - basis for next ML node version | @clanster, @baychak |
| PR #835 | 5000 | Batch Transfer With Vesting implementation, huge kudos for figuring out how to use testnet | @huxuxuya |
| PR #868 | 5000 | collateral slashing vulnerability and fix; low severity: low risk, medium likelihood, organic | @qdanik |
| v0.2.11 | 7500 | release management | @akup |
| v0.2.11 | 7500 | release management | @x0152 |
| v0.2.10 | 2500 | upgrade review | @Yapion |
| v0.2.10 | 2500 | upgrade review | @blizko |
| v0.2.10 | 2500 | upgrade review | @x0152 |
Release v0.2.10-post7
Full Changelog: release/v0.2.10-post6...release/v0.2.10-post7
Release v0.2.10-post6: Pruning
Full Changelog: release/v0.2.10-post5...release/v0.2.10-post6
Release v0.2.10
Upgrade Proposal: v0.2.10
This document outlines the proposed changes for on-chain software upgrade v0.2.10. The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.
Upgrade Plan
This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.
The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.
Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.
To apply the new vLLM model parameters, mlnode must be restarted after the on-chain upgrade. The safest approach is:
docker restart join-mlnode-1
Proposed Process
- Active hosts review this proposal on GitHub.
- Once the PR is reviewed by the community, a v0.2.10 release will be created from this branch, and an on-chain upgrade proposal for this version will be submitted.
- If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.
Creating the release from this branch (instead of main) minimizes the time that the /deploy/join/ directory on the main branch contains container versions that do not match the on-chain binary versions, ensuring a smoother onboarding experience for new hosts.
Testing
The on-chain upgrade from version v0.2.9 to v0.2.10 has been successfully deployed and verified on the testnet. PoC time-based weight normalization has been validated in the testnet environment. No regression in core functionality or performance has been observed during testing.
Reviewers are encouraged to request access to testnet environments to validate both node behavior and the on-chain upgrade process, or to replay the upgrade on private testnets.
Migration
The on-chain migration logic is defined in upgrades.go.
Migrations:
- Validation slots default: explicitly sets PocParams.ValidationSlots=0 during migration. This keeps existing O(N^2) validation behavior after upgrade until sampling is enabled by governance parameter update.
- PoC normalization default: explicitly sets PocParams.PocNormalizationEnabled=true during migration to enable time-based weight normalization.
- Model parameter update: Updates Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 with tool calling args (--enable-auto-tool-choice, --tool-call-parser hermes) and validation threshold 0.958.
PoC Validation Sampling Optimization
This upgrade introduces a new PoC validation mechanism that reduces complexity from O(N^2) to O(N x N_SLOTS) by assigning each participant a fixed sampled set of validators.
Reference design and analysis: proposals/poc/optimize.md
Key points:
- Only assigned validators validate each participant when sampling is enabled.
- Sampling is deterministic on both chain and API sides (based on validation snapshot + app_hash).
- Decision threshold is strict supermajority of assigned slots (>66.7%).
- The feature is shipped in this release but disabled by default (ValidationSlots=0) and can be enabled via a governance proposal that changes the ValidationSlots parameter to a non-zero value once rollout conditions are met.
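The sampling idea can be sketched as a deterministic ranking derived from a shared hash. This is not the algorithm from proposals/poc/optimize.md: the unweighted ranking, the exclusion of self-validation, and the reading of "strict supermajority" as more than 2/3 of assigned slots are all assumptions made for illustration.

```python
import hashlib
from typing import List, Sequence


def assign_validators(participant: str, validators: Sequence[str],
                      app_hash: bytes, n_slots: int) -> List[str]:
    """Deterministically pick up to n_slots validators for one participant."""
    def rank(validator: str) -> bytes:
        data = app_hash + participant.encode() + b"|" + validator.encode()
        return hashlib.sha256(data).digest()

    # ASSUMPTION: a participant never validates itself.
    pool = [v for v in validators if v != participant]
    return sorted(pool, key=rank)[:n_slots]


def is_valid(valid_votes: int, assigned_slots: int) -> bool:
    """Strict supermajority: more than 2/3 of the assigned slots voted valid."""
    return 3 * valid_votes > 2 * assigned_slots


validators = [f"host-{i}" for i in range(20)]
print(assign_validators("host-3", validators, b"\x01\x02", 5))
```

Each participant is checked by a fixed number of validators, so total work grows as N x N_SLOTS instead of N^2, and both chain and API can recompute the same assignment from the same inputs.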
PoC Weight Normalization by Real PoC Time
This upgrade normalizes PoC participant weights by actual PoC elapsed time to reduce block-time drift effects and keep weight outcomes consistent with real execution duration.
Key points:
- Adds PocParams.PocNormalizationEnabled parameter for time-based normalization control.
- Captures generation start and exchange end timestamps in PoCValidationSnapshot.
- Applies a normalization factor derived from expected stage duration vs actual elapsed time.
- Applies to both regular PoC and confirmation PoC weight calculations.
- Enabled by default in this upgrade (PocNormalizationEnabled=true).
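A time-based normalization factor can be sketched as a ratio of durations. The direction of the ratio (expected divided by actual, so a stage that ran long is scaled down) and the function name are assumptions; the notes only say the factor is derived from expected versus actual elapsed time.

```python
def normalized_weight(raw_weight: float, expected_seconds: float,
                      actual_seconds: float, enabled: bool = True) -> float:
    """Scale a PoC weight by expected/actual stage duration (assumed direction)."""
    if not enabled:
        return raw_weight
    if actual_seconds <= 0:
        raise ValueError("actual_seconds must be positive")
    return raw_weight * expected_seconds / actual_seconds


# A stage planned for 300 s that actually ran 330 s because blocks were slow.
print(round(normalized_weight(1000, 300, 330), 2))
```

Under this reading, slow blocks no longer inflate weights, since the extra wall-clock time spent generating is divided out.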
Upgrade Grace Period
To ensure a smooth upgrade transition:
- Confirmation PoC will not be triggered for the first 3000 blocks (~5 hours) after upgrade.
- Miss/invalid punishment rates are relaxed for the entire grace epoch (binom_test_p0 set to 0.5).
- Regular PoC operates normally during the grace period.
Changes
PR #710 PoC Validation Sampling Optimization
- Reduces validation complexity from quadratic to slot-based sampling.
- Adds deterministic slot assignment shared by chain and API, with snapshot-backed weight synchronization.
- Keeps backward-compatible fallback path when ValidationSlots=0 and includes upgrade-time default of ValidationSlots=0 for safe rollout.
PR #725 PoC weight normalization on real PoC time
- Adds time-based PoC weight normalization to reduce sensitivity to block-time variance.
- Introduces PocNormalizationEnabled in PoC params and uses validation snapshot timestamps to compute normalization factor.
- Integrates normalization into both regular PoC and confirmation PoC weight calculations.
- Upgrade handler enables normalization by default for v0.2.10.
PR #767 Upgrade grace period, tool calling, and PoC timing fix
- Adds grace epoch protection for the upgrade epoch: extended CPoC window (3000 blocks) and relaxed miss/invalid thresholds.
- Updates Qwen model with tool calling support (--enable-auto-tool-choice, --tool-call-parser hermes).
- Adjusts validation threshold from 0.970917 to 0.958.
- Deprecates poc_exchange_duration parameter (set to 0 in upgrade). API artifact acceptance now aligns with chain exchange windows using explicit block height checks instead of relying on phase alone. Fixes a gap where chain accepted nonces longer than API.
PR #708 IBC Upgrade to v8.7.0
- Upgrades IBC stack to v8.7.0.
- Aligns chain interoperability components with current IBC release line.
PR #723 Testnet bridge setup scripts
- Adds bridge setup scripts for testnet operations.
- Improves reproducibility of bridge deployment and validation workflows.
PR #666 Artifact storage throughput optimization
- Improves PoC artifact storage throughput.
PR #688 Punishment statistics from on-chain data
- Uses on-chain data for punishment statistics with dynamic table selection.
PR #697 Portable BLST build for macOS test builds
- Uses a portable BLST build path for macOS test binaries.
- Improves reliability of local/test build pipeline on macOS hosts.
PR #712 Require proto-go generation matches committed code
- Enforces proto-go generation consistency in development flow.
- Prevents accidental drift between generated and committed protobuf code.
PR #711 PoC test params from chain state
- Replaces hardcoded PoC test defaults with chain state parameters.
PR #641 Streamvesting transfer with vesting
- Adds MsgTransferWithVesting RPC and message type in the streamvesting module. Enables sender-to-recipient token transfers with vesting over N epochs (default: 180 epochs when not specified).
- Adds safety limits to prevent abusive requests: max 3650 vesting epochs and max 10 coin denoms per transfer.
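The limits on MsgTransferWithVesting can be expressed as a small validation routine. This is a sketch, not the module's code: the function name is hypothetical, and modelling "not specified" as `None` is an assumption about how the default is triggered.

```python
from typing import Optional, Sequence

DEFAULT_VESTING_EPOCHS = 180
MAX_VESTING_EPOCHS = 3650
MAX_DENOMS = 10


def effective_vesting_epochs(denoms: Sequence[str],
                             vesting_epochs: Optional[int] = None) -> int:
    """Apply the default and enforce the two documented safety limits."""
    epochs = DEFAULT_VESTING_EPOCHS if vesting_epochs is None else vesting_epochs
    if epochs > MAX_VESTING_EPOCHS:
        raise ValueError(f"vesting epochs {epochs} exceed {MAX_VESTING_EPOCHS}")
    if len(set(denoms)) > MAX_DENOMS:
        raise ValueError(f"more than {MAX_DENOMS} coin denoms in one transfer")
    return epochs


print(effective_vesting_epochs(["ngonka"]))
```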
API hardening and reliability fixes
- PR #634: add request body size limits to reduce DoS risk.
- PR #727: follow-up for #634, pass response writer to http.MaxBytesReader and align tests.
- PR #638: fix unsafe type assertions in request processing.
- PR #644: avoid rewriting static config on each startup.
- PR #661: prevent API crash on short network drops.
- PR #640: add unit tests for node version endpoint behavior.
- PR #622: propagate refund errors in InvalidateInference.
- PR #639: add missing return after error in task claiming path.
- PR #643: sanitize nil participants in executor selection.
- PR #545: minor bug fixes in API flow.
Other fixes
- PR #659: model assignment checks previous-epoch rewards.
- PR #716: rename PoC weight function for clarity and correctness.
Proposed Bounties
| PR/Issue | Sum GNK | Bounty Explanation |
|---|---|---|
| PR #661 | 500 | Valid fix for minor vulnerability that was previously reported in issue #422 |
| PR #644 | 700 | Planned task, not a vulnerability, important for the network. |
| PR #659 | 10,000 | Detailed report and fix for a Medium risk vulnerability. |
| Report | 5,000 | First report of the vulnerability fixed in #659 |
| PR #545 | 1,000 | Report and fix of low risk vulnerability. Extra appreciation for d... |
Upgrade v0.2.9
Full PoC V2 activation with model consolidation and access controls.
Summary
- PoC V2 fully enabled (tracking mode from v0.2.8 -> full enforcement)
- Network consolidated to single model: Qwen/Qwen3-235B-A22B-Instruct-2507-FP8
- Transfer Agent whitelist for request gating
- Guardian tiebreaker for undecided PoC V2 votes
- 24 suspicious participants removed from allowlist
Chain Changes
PoC V2 Activation
- PocV2Enabled=true, ConfirmationPocV2Enabled=true
- WeightScaleFactor=0.262, InferenceValidationCutoff=2, PocValidationDuration=480
- All nodes' POC_SLOT allocations reset (both ActiveParticipants and EpochGroupData)
- First V2 epoch runs in grace mode, full enforcement begins the following epoch
Guardian Tiebreaker
- When neither valid nor invalid votes reach majority, guardians can break the tie
- Requires: no majority exists, guardians enabled, at least one guardian voted, all voting guardians agree
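The tiebreaker conditions listed above translate into a short decision function. This is a sketch under assumptions: "majority" is read as more than half of total voting weight, and the function name and vote labels are hypothetical.

```python
from typing import Sequence


def resolve_poc_vote(valid_weight: int, invalid_weight: int, total_weight: int,
                     guardian_votes: Sequence[str], guardians_enabled: bool) -> str:
    """Return "valid", "invalid", or "undecided" for one PoC V2 vote."""
    # ASSUMPTION: majority means strictly more than half of total weight.
    if 2 * valid_weight > total_weight:
        return "valid"
    if 2 * invalid_weight > total_weight:
        return "invalid"
    # No majority: guardians decide only if enabled, at least one voted, and all agree.
    if guardians_enabled and guardian_votes and len(set(guardian_votes)) == 1:
        return guardian_votes[0]
    return "undecided"


print(resolve_poc_vote(40, 40, 100, ["invalid", "invalid"], True))
```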
Transfer Agent Whitelist
- New TransferAgentAccessParams.AllowedTransferAddresses in params
- Validation in StartInference and FinishInference messages
- Empty whitelist = all TAs allowed; non-empty = only listed TAs
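The whitelist semantics fit in one expression. The sketch below is illustrative: the function name and the example addresses are hypothetical, and a `frozenset` is used so membership checks are constant time.

```python
from typing import FrozenSet


def transfer_agent_allowed(address: str, allowed: FrozenSet[str]) -> bool:
    """Empty whitelist admits every transfer agent; otherwise only listed ones."""
    return not allowed or address in allowed


# Hypothetical addresses, for illustration only.
whitelist = frozenset({"gonka1exampleagenta"})
print(transfer_agent_allowed("gonka1exampleagentb", frozenset()))
print(transfer_agent_allowed("gonka1exampleagentb", whitelist))
```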
Model Consolidation
- All governance models except Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 deleted during migration
Participant Access Update
- Registration and allowlist heights set to 2475000
- 24 addresses removed from allowlist (completed POC generation but did not vote at validation)
API Changes
Enforced Model Auto-Switch
- Auto-switches nodes to Qwen235B with --tensor-parallel-size 4 --max-model-len 240000 if model is not configured
- Three-layer enforcement: config load, state sync, runtime verification via vLLM /v1/models
Transfer Agent Whitelist
- Early enforcement in /chat/completion before expensive operations
- Cache synced from chain on every new block for O(1) lookups
Bug Fixes
#674 Missed inferences fix
- Don't punish for missed inferences of non-preserved nodes during PoC
- Don't punish if participant doesn't support the model
#678 CPoC downtime penalty redistribution
- Penalty transferred to community pool instead of being lost
Inference expiry during PoC/CPoC
- New InferenceExpiryContext tracks latest PoC or CPoC range
- Inferences started during PoC/CPoC not punished for expiry
Bitcoin rewards weight calculation
- Full weights used for denominator (prevents redistribution of invalidated shares)
- CPoC reductions and invalidated shares go to governance pool
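A worked example may help show why the denominator matters. This is a sketch under stated assumptions, not the chain's accounting code: the `participant` struct, integer floor division, and the small numbers are illustrative only, and a real implementation would also need overflow-safe arithmetic.

```go
package main

import "fmt"

// participant is a hypothetical record of one host's weights for an epoch.
type participant struct {
	fullWeight int64 // weight before any reduction
	paidWeight int64 // weight after CPoC reduction; 0 if the share was invalidated
}

// distribute divides reward using the sum of FULL weights as the denominator.
// Whatever is not paid out (reductions, invalidated shares, rounding) is
// returned as the governance amount instead of being spread over other hosts.
func distribute(reward int64, ps []participant) (payouts []int64, governance int64) {
	var totalFull int64
	for _, p := range ps {
		totalFull += p.fullWeight
	}
	payouts = make([]int64, len(ps))
	if totalFull == 0 {
		return payouts, reward
	}
	var paid int64
	for i, p := range ps {
		payouts[i] = reward * p.paidWeight / totalFull // floor division (assumption)
		paid += payouts[i]
	}
	return payouts, reward - paid
}

func main() {
	ps := []participant{
		{fullWeight: 50, paidWeight: 50}, // unaffected
		{fullWeight: 30, paidWeight: 15}, // CPoC reduction
		{fullWeight: 20, paidWeight: 0},  // invalidated
	}
	payouts, gov := distribute(1000, ps)
	fmt.Println(payouts, gov) // prints "[500 150 0] 350"
}
```

Had the denominator been the sum of paid weights (65), the first host would have received about 769 instead of 500, which is the redistribution the fix prevents.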
Release v0.2.8
Upgrade Proposal: v0.2.8
This document outlines the proposed changes for on-chain software upgrade v0.2.8. The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.
Upgrade Plan
This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.
The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.
Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.
Proposed Process
- Active hosts review this proposal on GitHub.
- Once the PR is approved by a majority, a v0.2.8 release will be created from this branch, and an on-chain upgrade proposal for this version will be submitted.
- If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.
Creating the release from this branch (instead of main) minimizes the time that the /deploy/join/ directory on the main branch contains container versions that do not match the on-chain binary versions, ensuring a smoother onboarding experience for new hosts.
Testing
The on-chain upgrade from version v0.2.7-post1 to v0.2.8 has been successfully deployed and verified on the testnet, including the PoC V2 parameter migration.
Reviewers are encouraged to request access to the testnet environment to validate the upgrade or test the on-chain upgrade process on their own private testnets.
Migration
The on-chain migration logic is defined in upgrades.go.
Migration tasks:
- Burn extra community coins: Burns all coins from the pre_programmed_sale module account (gonka1rmac644w5hjsyxfggz6e4empxf02vegkt3ppec) which were inadvertently created during genesis.
- Precompute BLS slot keys: Generates and stores precomputed BLS slot public keys for the current epoch to enable the new optimized verification logic (see PR #609).
- Set PoC V2 migration parameters: Configures dual-mode migration with ConfirmationPocV2Enabled=true and PocV2Enabled=false, sets model ID to Qwen/Qwen3-235B-A22B-Instruct-2507-FP8, sequence length to 1024, and statistical test thresholds for V2 validation.
PoC V2 Migration
For a smooth transition from PoC V1 to PoC V2, the chain must ensure that the majority of participants have switched to the new MLNode build supporting the Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 model before PoC V2 becomes the main PoC engine. This upgrade enables tracking mode to measure adoption without affecting weights.
After this upgrade:
- Regular PoC continues using V1 (on-chain batches, weight enforcement).
- First Confirmation PoC per epoch uses V2 for tracking only (no weight/slashing impact).
- V2 tracking results allow monitoring adoption before full activation.
MLNode upgrade:
- New versions: ghcr.io/product-science/mlnode:3.0.12 (or 3.0.12-blackwell for Blackwell GPUs).
- Backward compatible with 3.0.11; can be upgraded before or after this on-chain upgrade.
- Must be upgraded before PoC V2 is fully enabled.
Enabling full PoC V2:
- PoC V2 will not activate automatically.
- Once adoption is sufficient, a separate governance proposal will set poc_v2_enabled=true.
- The epoch when V2 is enabled runs in grace mode (no punishment).
- Full V2 enforcement begins the following epoch.
Changes
PR #505 Security Fixes for v0.2.7
Addresses multiple security vulnerabilities:
- SSRF & DoS: Validates InferenceUrl to reject internal IPs and adds timeouts to prevent request hangs.
- Vote Flipping: Prevents overwriting of PoC validations by rejecting duplicates.
- Batch Size Limits: Enforces bounds on PoC batch sizes to prevent state bloat.
- PoC Exclusion: Fixes getInferenceServingNodeIds to correctly exclude inference-serving nodes.
- Auth Bypass & Replay: Binds epochId to signatures and validates authorization against the correct epoch.
- Thanks to: @ouicate
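To illustrate the SSRF item, the sketch below shows one way a URL could be screened for internal addresses using only the Go standard library. The function name and the exact set of rejected ranges are assumptions, not the checks from PR #505. It does not resolve hostnames, so a production check would also need to validate the resolved IPs.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// validateInferenceURL is a hypothetical validator that rejects URLs
// pointing at loopback, private, link-local, or unspecified addresses.
func validateInferenceURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("unsupported scheme")
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("missing host")
	}
	if host == "localhost" {
		return errors.New("internal host")
	}
	ip := net.ParseIP(host)
	if ip == nil {
		// A DNS name: this sketch accepts it without resolving. A fuller
		// check would resolve it and apply the same IP rules to each result.
		return nil
	}
	if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() || ip.IsUnspecified() {
		return errors.New("internal IP address")
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://node.example.com:8443/v1",
		"http://10.0.0.5/admin",
		"http://[::1]:9000",
	} {
		fmt.Printf("%s -> %v\n", raw, validateInferenceURL(raw))
	}
}
```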
PR #609 BLS optimized
- Significantly optimizes BLS signature verification (from ~2s down to <10ms) by using the blst library and precomputing slot public keys.
PR #540 Remove ALL panic and Must from chain code
- Removes panic and Must calls from chain code to prevent consensus failures.
- Implements linting (forbidigo) and CI checks to enforce this rule.
PR #534 Security: prevent SSRF via executor redirect
- Prevents SSRF attacks where a malicious executor redirects Transfer Agent requests to internal services (e.g., admin API).
- Implements a custom HTTP client that disables following redirects.
- Thanks to: @x0152
PR #544 Inference: defense-in-depth against int overflow
- Fixes integer overflow vulnerabilities in escrow and cost calculations using checked arithmetic.
- Adds hard caps for token counts and improves error handling to fail closed on overflows.
- Thanks to: @ouicate
PR #506 Standardize floating point math
- Replaces dangerous floating-point math with shopspring/decimal for deterministic calculations (e.g., Dynamic Pricing).
- Updates reward exponent calculation to use a table-based approach for decay rates.
PR #536 Perf: optimize participants endpoint with single balance query
- Optimizes the /v1/participants endpoint by replacing N gRPC calls with a single blockchain query.
- Achieves ~500x speedup for large sets of participants.
- Thanks to: @x0152
PR #553 Membership for correct epoch for Validation requests
- Ensures validation rights are checked against the active participants of the target epoch, not the current one.
- Fixes logic for sharing work coins and refunds during validation/invalidation.
PR #607 Fix(inference): update totalDistributed after debt deduction
- Fixes a bug where totalDistributed was not updated after deducting debt, causing tokens to be lost instead of returned to governance.
- Thanks to: @0xMayoor
PR #549 Disable future timestamp check for EA
- Temporarily disables the future timestamp check in the External Adapter (EA) to prevent rejecting requests when the EA is behind the chain during high load.
PR #550 Negative coin balance for settle
- Handles edge cases with negative coin balances by subtracting the negative amount from rewards instead of erroring.
PR #541 PoC validation, retry getting nodes
- Adds retry logic for retrieving nodes during Proof of Compute (PoC) validation to improve robustness.
PR #551 Fix(bls): reject duplicate slot indices in partial signatures
- Rejects partial signatures with duplicate slot indices to prevent verification failures during aggregation.
- Thanks to: @0xMayoor
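A duplicate check of this kind is a single pass with a set. The sketch below assumes slot indices arrive as a `[]uint32`; the function name and the index type are illustrative and not taken from PR #551.

```go
package main

import "fmt"

// validateSlotIndices is a hypothetical pre-aggregation check that rejects
// a partial signature whose slot index list contains a repeated index.
func validateSlotIndices(indices []uint32) error {
	seen := make(map[uint32]struct{}, len(indices))
	for _, idx := range indices {
		if _, dup := seen[idx]; dup {
			return fmt.Errorf("duplicate slot index %d", idx)
		}
		seen[idx] = struct{}{}
	}
	return nil
}

func main() {
	fmt.Println(validateSlotIndices([]uint32{0, 1, 2}))
	fmt.Println(validateSlotIndices([]uint32{0, 1, 1}))
}
```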
PR #563 Fix(inference): variable shadowing in direct payment path
- Fixes a variable shadowing bug that caused errors (like SendCoins failures) to be swallowed during refunds.
- Thanks to: @0xMayoor
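For readers unfamiliar with this class of bug, the sketch below reproduces the general pattern in isolation. It is not the code from PR #563: `sendCoins` is a hypothetical stand-in that always fails, and the two refund functions only show how `:=` inside a nested scope can hide the outer error variable.

```go
package main

import (
	"errors"
	"fmt"
)

var errSend = errors.New("send coins failed")

// sendCoins is a hypothetical stand-in for a transfer that always fails.
func sendCoins() (int64, error) { return 0, errSend }

// refundShadowed contains the bug: ":=" in the if statement declares a NEW
// err that lives only inside the if. The named result err is never set,
// so the caller sees a nil error even though the transfer failed.
func refundShadowed() (sent int64, err error) {
	if amount, err := sendCoins(); err == nil {
		sent = amount
	}
	return sent, err
}

// refundFixed assigns to the outer err with "=", so the failure propagates.
func refundFixed() (sent int64, err error) {
	var amount int64
	amount, err = sendCoins()
	if err != nil {
		return 0, err
	}
	return amount, nil
}

func main() {
	_, err := refundShadowed()
	fmt.Println("shadowed:", err)
	_, err = refundFixed()
	fmt.Println("fixed:", err)
}
```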
PR #559 Burn extra pool coins, fix ValueDecimal validation
- Burns coins from an inadvertently created account.
- Fixes validation for ValueDecimal to correctly handle nil values.
PR #616 Integration test database, debugging assistance
- Adds functionality to upload integration test results to BigQuery.
- Includes fuzz testing and improved debugging logs.
PR #547 Updated script snippets and MacOS Tahoe 26.1 Docker settings
- Updates documentation and adds Docker settings for running testermint locally on MacOS Tahoe.
PR #618 PoC v2 & Offchain PoC data
- Integrates PoC directly into vLLM, enabling immediate switch from inference to PoC without offloading the model or loading a separate PoC model.
- Migrates artifact storage off-chain using MMR (Merkle Mountain Range) commitments - only root_hash and count are recorded on-chain.
- Adds statistical test-based validation with L2-distance mismatch rule and calibrated thresholds.
- New chain messages: SubmitPocValidationsV2, PoCV2StoreCommit, MLNodeWeightDistribution.
- Includes dual-mode migration strategy: V1 for regular PoC, V2 tracking for Confirmation PoC during rollout.
Release v0.2.7-post1
Upgrade Proposal: v0.2.7
This document outlines the proposed changes for on-chain software upgrade v0.2.7. The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.
Upgrade Plan
This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.
The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.
Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.
Proposed Process
- Active hosts review this proposal on GitHub.
- Once the PR is approved by a majority, a v0.2.7 release will be created from this branch, and an on-chain upgrade proposal for this version will be submitted.
- If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.
Creating the release from this branch (instead of main) minimizes the time that the /deploy/join/ directory on the main branch contains container versions that do not match the on-chain binary versions, ensuring a smoother onboarding experience for new hosts.
Start after upgrade:
git pull
source config.env && docker compose -f docker-compose.postgres.yml up -d
Testing
The on-chain upgrade from version v0.2.6 to v0.2.7 has been successfully deployed and verified on the testnet.
Reviewers are encouraged to request access to the testnet environment to validate the upgrade or test the on-chain upgrade process on their own private testnets.
Migration
The on-chain migration logic is defined in upgrades.go.
Migration sets new parameters:
- GenesisGuardianParams.NetworkMaturityThreshold = 15,000,000
- GenesisGuardianParams.NetworkMaturityMinHeight = 3,000,000
- Guardian addresses migrated from legacy GenesisOnlyParams into governance-controlled params (only if not already set)
- DeveloperAccessParams.UntilBlockHeight = 2,294,222 (inference gating for non-allowlisted developers)
- DeveloperAccessParams.AllowedDeveloperAddresses = predefined allowlist (governance-updatable)
- ParticipantAccessParams.NewParticipantRegistrationStartHeight = 2,222,222 (new host registration blocked until this height)
- ParticipantAccessParams.BlockedParticipantAddresses = placeholder blocklist (governance-updatable)
- ParticipantAccessParams.UseParticipantAllowlist = false (epoch allowlist disabled by default)
Migration also distributes rewards from the community pool:
- Epoch 117 rewards for nodes that didn't receive them (but successfully recovered) plus additional reward for all active nodes proportional to the chain halt duration
- Bounty program rewards for bug reports
Changes
Genesis Guardian Enhancement (Temporary)
Commits: 3c004c6dd, 0e5094ca0, da1413498
Temporary reactivation of the Genesis Guardian Enhancement, a previously used defensive mechanism.
- Genesis Guardian parameters moved from genesis-only config to governance-controlled params
- Network maturity thresholds set: total power >= 15,000,000 AND block height >= 3,000,000
- Guardian addresses migrated from legacy params into governance-updatable GenesisGuardianParams
- Enhancement automatically deactivates when both maturity conditions are satisfied
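The deactivation rule is a conjunction: the enhancement stays on until both thresholds are met. A minimal sketch, with a hypothetical function name and the two threshold values taken from the list above:

```go
package main

import "fmt"

const (
	networkMaturityThreshold int64 = 15_000_000 // total power
	networkMaturityMinHeight int64 = 3_000_000  // block height
)

// guardianEnhancementActive reports whether the enhancement still applies.
// It turns off only when BOTH maturity conditions are satisfied.
func guardianEnhancementActive(totalPower, blockHeight int64) bool {
	mature := totalPower >= networkMaturityThreshold &&
		blockHeight >= networkMaturityMinHeight
	return !mature
}

func main() {
	fmt.Println(guardianEnhancementActive(20_000_000, 2_500_000)) // power met, height not
	fmt.Println(guardianEnhancementActive(15_000_000, 3_000_000)) // both met
}
```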
Developer Access Restriction
Commits: 3c004c6dd, ca4b5f92f, fc3d13fb9, d5fae6671
Temporary restriction of inference execution to an allowlisted set of developer addresses.
- Inference requests (MsgStartInference, MsgFinishInference) gated by requested_by address
- Restriction active until block height 2,294,222
- Allowlist is governance-updatable via DeveloperAccessParams
- Non-allowlisted developers receive ErrDeveloperNotAllowlisted
Participant Access Gating
New participant registration pause and PoC blocklist enforcement.
- New host registration (SubmitNewParticipant, SubmitNewUnfundedParticipant) blocked until height 2,222,222
- PoC blocklist enforced in MsgSubmitPocBatch and MsgSubmitPocValidation
- Adds MsgAddParticipantsToAllowList, MsgRemoveParticipantsFromAllowList, and QueryParticipantAllowList for future governance-controlled epoch allowlist (disabled by default)
PoC Transaction Filtering
Protocol-level filtering of stale PoC transactions and improved tx-manager reliability.
- Ante handler rejects too-late MsgSubmitPocBatch and MsgSubmitPocValidation during CheckTx
- API tx-manager adds block-based deadlines per message type (PoC: 240 blocks, inference: 150 blocks)
- Business logic errors (e.g., duplicate validation, participant not found) fail immediately instead of retrying
- Batching for MsgSubmitPocBatch and MsgSubmitPocValidation transactions
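The per-type block deadlines can be pictured as follows. This is a sketch under assumptions: the types and names are hypothetical, the deadline is assumed to be counted from the height at which the transaction was enqueued, and the deadline block itself is assumed to still be valid. Only the two window sizes (240 and 150 blocks) come from the list above.

```go
package main

import "fmt"

// msgKind is a hypothetical classification of queued transactions.
type msgKind int

const (
	msgPoC msgKind = iota
	msgInference
)

// deadlineBlocks holds the per-type retry window in blocks.
var deadlineBlocks = map[msgKind]int64{
	msgPoC:       240,
	msgInference: 150,
}

// pendingTx is a hypothetical queue entry in the tx-manager.
type pendingTx struct {
	kind           msgKind
	enqueuedHeight int64
}

// expired reports whether the transaction is past its deadline and should
// be dropped rather than retried.
func (t pendingTx) expired(currentHeight int64) bool {
	return currentHeight > t.enqueuedHeight+deadlineBlocks[t.kind]
}

func main() {
	poc := pendingTx{kind: msgPoC, enqueuedHeight: 1000}
	inf := pendingTx{kind: msgInference, enqueuedHeight: 1000}
	fmt.Println(poc.expired(1200), inf.expired(1200))
}
```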
Inference Completion Handling
Commit: 2c05788d5
Fixes incorrect accounting of failed inference requests.
- Malformed or broken payloads no longer cause inferences to be marked as missed
- Improves resilience around failed inference handling in the API
Governance-Owned Leftovers
Commit: cf483b34e
Settlement and bitcoin reward remainder accounting.
- Expired/unclaimed SettleAmount transferred to governance module account instead of burned
- Bitcoin rewards: missed-share and rounding remainder transferred to governance and tracked via BitcoinResult.GovernanceAmount
Epoch 117 + Bounty Rewards Distribution
Reward distribution executed during upgrade.
- Nodes active during Epoch 117 that didn't receive their epoch reward get the recovered amount
- All nodes active during Epoch 117 receive an additional payout proportional to the chain halt duration
- Bounty program rewards distributed for reported bugs
fix(bls): prevent consensus panic from decimal precision overflow
shopspring/decimal divisions can produce >18 decimal places, causing
LegacyMustNewDecFromStr to panic in ApplyBLSGuardianSlotReservation.
Changes:
- Add decimalToLegacyDec() helper using StringFixed(18) to truncate
- Replace LegacyMustNewDecFromStr with error-handled conversion
- Skip reservation entirely on parse failure (fallback to raw weights)
- Add isolated regression test demonstrating panic vs fix