Releases: gonka-ai/gonka

Release v0.2.13

20 May 20:38

Pre-release

This document covers the v0.2.13 upgrade proposal.

The release fixes confirmation PoC reward accounting, devshard escrow params,
complaint-response authz grants, upstream response parsing, participant
reactivation, node-manager gRPC defaults, and devshard storage growth.

It also adds a guardian-controlled emergency switch for disabling devshard
inference requests.

The upgrade also disables confirmation PoC for the rest of the upgrade epoch
so the new snapshot logic starts cleanly from the next epoch.

Upgrade Plan

The node binary is upgraded through an on-chain software upgrade proposal.

The PR also updates api and node container versions in
deploy/join/docker-compose.yml for hosts joining after the on-chain upgrade.

Existing hosts are not required to manually update their api or node
containers as part of the chain upgrade.

Proposed Process

  1. Active hosts review this proposal on GitHub.
  2. If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.

Migration

The on-chain migration logic is defined in upgrades.go.

Migrations:

  • Sets DevshardEscrowParams.MaxEscrowsPerEpoch to 500_000.
  • Sets DevshardEscrowParams.MaxNonce to 1_000_000. The previous settlement
    path used a hardcoded 20_000 nonce limit.
  • Adds addresses of several early miners and known brokers to
    DevshardEscrowParams.AllowedCreatorAddresses.
  • Sets GenesisOnlyParams.GenesisGuardianMultiplier to 0.33334, reducing
    genesis guardian power from about 34% to about 25% of adjusted voting power
    while early-network protection applies.
  • Sets the chain-wide governance quorum to 0.25. Quorum is computed against
    total chain voting power; with genesis guardians (25%) not voting, this gives
    an effective 1/3 quorum among the remaining 75% of voting power
    (0.25 / 0.75 ≈ 0.333).
  • Backfills EpochGroupData.ConfirmationWeightScales for the current epoch and
    clamps existing confirmation weights down to the new expected value.
  • Backfills MsgRespondDealerComplaints authz grants on existing cold-to-warm
    ML ops pairs. v0.2.12 added this message to the permission list but did not
    migrate existing grants, so DAPIs that joined before v0.2.12 could not respond
    to dealer complaints.
  • Disables confirmation PoC triggers for the rest of the upgrade epoch via a
    grace-epoch UpgradeProtectionWindow of 10000 blocks. The new snapshot logic
    starts from the next epoch.
  • Adds MiniMax-M2.7 (MiniMaxAI/MiniMax-M2.7) as a governance model and PoC
    model config with PenaltyStartEpoch = 278 (bootstrap activation epoch).
  • Updates PocParams.Models[*].WeightScaleFactor to recalibrate against the
    Qwen-on-B200 reference after the vLLM 0.20.1 release. The previous Kimi
    factor was too high on B* GPUs. The new targets are Kimi = Qwen-on-B200 + 10%
    (top-tier premium) and MiniMax = Qwen-on-B200:
    • Kimi (moonshotai/Kimi-K2.6): 0.78
    • MiniMax (MiniMaxAI/MiniMax-M2.7): 0.3024
  • Updates Model.ValidationThreshold from cross-version vLLM results:
    • Qwen (Qwen/Qwen3-235B-A22B-Instruct-2507-FP8): 0.940
    • Kimi (moonshotai/Kimi-K2.6): 0.900
    • MiniMax (MiniMaxAI/MiniMax-M2.7): 0.922
  • Adds --enable-auto-tool-choice to Kimi ModelArgs if missing.
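
The guardian and quorum figures above can be checked with a short calculation. This is a sketch under one assumption that the notes do not state explicitly: that GenesisGuardianMultiplier gives guardians, collectively, that multiple of the combined power of all other participants. The function names are illustrative and are not taken from upgrades.go.

```python
def guardian_share(multiplier: float) -> float:
    """Share of adjusted voting power held by guardians.

    Assumption: guardians collectively receive `multiplier` times the
    combined power of all other participants.
    """
    return multiplier / (1.0 + multiplier)


def effective_quorum(quorum: float, abstaining_share: float) -> float:
    """Fraction of the non-abstaining power that must vote to reach quorum."""
    return quorum / (1.0 - abstaining_share)


share = guardian_share(0.33334)
print(f"guardian share:   {share:.4f}")
print(f"effective quorum: {effective_quorum(0.25, share):.4f}")
```

Under that assumption the guardian share comes out at about 25%, and a 0.25 quorum measured against total power corresponds to about one third of the remaining power when guardians abstain.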

Changes

inference-chain

  • Confirmation PoC used different model sets for measured weight, preserved
    weight, and reward rescaling. During new-model bootstrap, this could reduce
    confirmation weight for honest miners serving both an eligible model and a
    not-yet-eligible model. v0.2.13 stores one epoch snapshot of confirmable
    models and weight-scale factors, then uses it for confirmation and reward
    calculations.
  • ConsecutiveInvalidInferences was not reset when a participant became ACTIVE
    again. A host could return from invalid state and be invalidated again after
    one new failure. v0.2.13 resets the counter on reactivation and upcoming
    promotion.
  • Devshard settlement now reads the nonce limit from
    DevshardEscrowParams.MaxNonce instead of a hardcoded constant.
  • The upgrade adds addresses of several early miners and known brokers to the
    devshard creator allowlist without removing existing allowed creator addresses.
  • Genesis guardians held about 34% of adjusted voting power, which made quorum
    hard to reach when they did not vote. The upgrade reduces guardian power to
    about 25% via GenesisOnlyParams.GenesisGuardianMultiplier = 0.33334 and
    sets the chain-wide governance quorum to 0.25. Quorum is computed against
    total bonded power; with guardians not voting, this gives an effective 1/3
    quorum among the remaining 75% of voting power (0.25 / 0.75 ≈ 0.333).
  • Adds MsgSetDevshardRequestsEnabled, a guardian-signed transaction for
    emergency disabling and re-enabling devshard inference requests.
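
The ConsecutiveInvalidInferences fix described above can be illustrated with a toy state machine. The threshold of 3, the status strings, and every name here are hypothetical and are only meant to show why a stale counter re-invalidates a host after a single new failure.

```python
from dataclasses import dataclass

INVALID_LIMIT = 3  # hypothetical threshold, not taken from the release notes


@dataclass
class Participant:
    status: str = "ACTIVE"
    consecutive_invalid_inferences: int = 0


def record_invalid_inference(p: Participant) -> None:
    """Count one invalid inference and invalidate the host at the limit."""
    p.consecutive_invalid_inferences += 1
    if p.consecutive_invalid_inferences >= INVALID_LIMIT:
        p.status = "INVALID"


def reactivate(p: Participant, reset_counter: bool) -> None:
    """Return the host to ACTIVE; reset_counter=True models the v0.2.13 behavior."""
    p.status = "ACTIVE"
    if reset_counter:
        p.consecutive_invalid_inferences = 0


def failures_until_invalid(reset_counter: bool) -> int:
    """Invalidate a host, reactivate it, and count failures until it is invalid again."""
    p = Participant()
    while p.status == "ACTIVE":
        record_invalid_inference(p)
    reactivate(p, reset_counter)
    failures = 0
    while p.status == "ACTIVE":
        record_invalid_inference(p)
        failures += 1
    return failures


print(failures_until_invalid(reset_counter=False))  # pre-fix behavior
print(failures_until_invalid(reset_counter=True))   # post-fix behavior
```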

decentralized-api

  • Some OpenAI-compatible upstreams return numeric stop_reason values.
    Choice.StopReason now accepts any JSON type, so those responses no longer
    fail unmarshalling.
  • The node-manager gRPC server did not start when NodeManagerGrpcPort was
    unset. The port now defaults to 9400, and join compose uses the same default
    so devshard can reach the API without manual config.
  • The internal devshard service inside dapi uses the same devshard storage
    changes listed below, including pruning and Postgres support.

devshard

  • Devshard storage could grow forever because old escrow data stayed in one
    SQLite store. Storage is now epoch-scoped and prunes old epochs in the
    background, keeping the latest 3 epochs.
  • Devshard can use Postgres as the primary store for larger deployments, with
    SQLite kept as a local fallback.
  • Postgres data is partitioned by epoch_id for sessions, diffs, and
    signatures, so pruning can drop old epoch data cleanly.
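
A minimal sketch of epoch-scoped pruning, assuming a single sessions table keyed by epoch_id. The table layout and function name are illustrative, not the devshard schema. With Postgres partitions, the equivalent step would presumably be dropping whole partitions rather than deleting rows.

```python
import sqlite3

KEEP_EPOCHS = 3  # the notes say the latest 3 epochs are kept


def prune_old_epochs(conn: sqlite3.Connection, current_epoch: int) -> int:
    """Delete rows older than the latest KEEP_EPOCHS epochs; return rows removed."""
    cutoff = current_epoch - KEEP_EPOCHS + 1  # oldest epoch that is kept
    cur = conn.execute("DELETE FROM sessions WHERE epoch_id < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (epoch_id INTEGER NOT NULL, escrow_id TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO sessions VALUES (?, ?)",
    [(epoch, f"escrow-{epoch}") for epoch in range(1, 8)],
)

deleted = prune_old_epochs(conn, current_epoch=7)
remaining = [row[0] for row in conn.execute(
    "SELECT DISTINCT epoch_id FROM sessions ORDER BY epoch_id")]
print(deleted, remaining)
```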

Release v0.2.13: devshard binary

14 May 19:36

Release to save devshard binary

Upgrade Proposal: v0.2.12

27 Apr 22:29

This document outlines the proposed changes for on-chain software upgrade v0.2.12.
The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.

Upgrade Plan

This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml and introduces a new versiond service in the join stack.

The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.

Existing hosts are not required to upgrade their api and node containers as part of the on-chain upgrade itself. After the upgrade, hosts must deploy the new versiond service and update and redeploy proxy with VERSIOND_SERVICE_NAME=versiond and GONKA_API_EXEMPT_ROUTES=chat inference poc/proofs devshard so /devshard/<version>/* traffic is routed through proxy -> versiond -> devshardd. New hosts joining after the upgrade should use the updated container versions from this compose file.

Proposed Process

  1. Active hosts review this proposal on GitHub.
  2. If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.

Testing

The on-chain upgrade from version v0.2.11 to v0.2.12 has been successfully deployed and verified on the testnet. No regression in core functionality or performance has been observed during testing. More testing will be executed leading up to the upgrade.

Reviewers are encouraged to request access to testnet environments to validate both node behavior and the on-chain upgrade process, or to replay the upgrade on private testnets.

Migration

The on-chain migration logic is defined in upgrades.go.

Migrations:

  • Auto-creates x/feegrant allowances for every existing cold-to-warm ML ops authz grant in case transaction fees are later turned on.
  • Initializes FeeParams with min_gas_price_ngonka = 0 (fees are effectively disabled at upgrade time, see Changes).
  • Migrates singular PoC model parameters into the new multi-model PocParams.Models list and initializes DelegationParams.
  • Adds the moonshotai/Kimi-K2.6 governance model and its PoC model config (seq_len=1024, scaled weight coefficient, penalty start at effective epoch + 2).
  • Seeds DevshardEscrowParams.ApprovedVersions with the initial v1 devshard binary (sha256 15f72244...d36d4715) so versiond has an approved version to download and run immediately after the upgrade.
  • Sets EpochParams.ConfirmationPocSafetyWindow to 500 blocks and DelegationParams.DeployWindow to 500 blocks.
  • Clears legacy PoC v2 data (which used old key layouts) and seeds new pruning state markers for the new multi-model collections.
  • Backfills ActiveParticipant.VotingPowers and EpochGroupData subgroup voting power for the current epoch to ensure seamless PoC validation post-upgrade.
  • Removes unused TopMiners and training states (training will be moved to an off-chain architecture similar to devshards).

Changes

Multi-model PoC

Historically, PoC has been tied to a single base model. While the network aims to support multi-model inference, relying on a single-model PoC is not secure enough.

If the network served several models but only checked one during PoC, an attacker could spin up hardware just for the check and shut it down afterward. To prevent this, PoC must start immediately on the exact model being validated, proving the hardware is present and running that model right now with no window to swap deployments.

To support multiple models, this upgrade runs PoC for each model independently in separate model groups. The core mechanics:

  • Each governance-approved model gets its own PoC group. PoC runs for all eligible groups in parallel.
  • Weight is split into two layers. PoC weight is model-local and drives inference routing and inference rewards inside that specific group. Consensus weight is the total weight aggregated across all eligible model groups (using model-specific coefficients) that determines block signing power, voting power, and bitcoin-style rewards.
  • Because not every Host can run every model, a Host not serving a model can delegate its consensus weight to a group member for PoC validation only (this does not affect block signing or governance voting power). This preserves the existing security model: a model group must reach a 2/3 validation threshold of the total network consensus weight, not just the group-local weight, even if its direct members hold less than that total amount.
  • For each active model, Hosts must explicitly choose their participation mode (DIRECT, DELEGATE, REFUSE). Hosts who do nothing receive a penalty. Penalties are skipped during a model's initial grace period.
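
The two weight layers and the delegation rule can be sketched as follows. The coefficients, host names, and weights are made up (the notes say the real coefficients are not yet part of the PR), and treating the 2/3 threshold as "at least 2/3" is an assumption.

```python
from fractions import Fraction

# Hypothetical per-model coefficients, for illustration only.
COEFFICIENTS = {"model-a": Fraction(1), "model-b": Fraction(1, 2)}


def consensus_weight(poc_weights: dict) -> Fraction:
    """Aggregate a host's model-local PoC weights into one consensus weight."""
    return sum((COEFFICIENTS[m] * w for m, w in poc_weights.items()), Fraction(0))


def group_reaches_threshold(backing: Fraction, network_total: Fraction) -> bool:
    """True when direct plus delegated weight is at least 2/3 of the network total."""
    return 3 * backing >= 2 * network_total


hosts = {
    "host-1": {"model-a": 100, "model-b": 40},  # serves both models
    "host-2": {"model-a": 120},                 # does not serve model-b
    "host-3": {"model-b": 80},
}
weights = {name: consensus_weight(w) for name, w in hosts.items()}
total = sum(weights.values(), Fraction(0))

direct = weights["host-1"] + weights["host-3"]  # model-b members only
with_delegation = direct + weights["host-2"]    # host-2 delegates to a member

print(group_reaches_threshold(direct, total))
print(group_reaches_threshold(with_delegation, total))
```

In this toy network the model-b group cannot reach the threshold on its members' weight alone, which is the case delegation is meant to cover.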

The current base model remains the starting group for bootstrapping additional models. The exact model coefficients and final parameter values are not yet part of this PR.

Transaction fees for spam prevention (#937, #981, #1120)

v0.2.12 lays the groundwork for consensus-level transaction fees. Before this upgrade, any funded account could broadcast an unlimited number of transactions at zero cost, because the chain relied only on per-validator minimum-gas-prices configuration, which is mempool-only and trivially bypassed by a malicious block proposer. This left governance proposals, bank sends, staking operations, collateral management, reward claims, bridge operations, and CosmWasm calls without any economic friction against abuse.

v0.2.12 introduces a governance-controlled FeeParams.min_gas_price_ngonka enforced during both CheckTx and DeliverTx. The full machinery is in place: a NetworkDutyFeeBypassDecorator that exempts protocol-obligation messages (PoC submissions, validation messages, inference start/finish, BLS DKG rounds), and a two-component fee on MsgPoCV2StoreCommit for Host sybil resistance (a base validation cost per participant per epoch plus a count-proportional cost per count delta).

Fees are effectively disabled at upgrade time. min_gas_price_ngonka is initialized to 0 due to remaining issues in client-side gas estimation. Once those are resolved, governance can flip on a non-zero value without a chain upgrade. No host action is required to support fees in this release; the upgrade still installs x/feegrant allowances from cold to warm keys so the switch can be flipped without a follow-up migration.

Devshards (formerly "subnets") — standalone, versioned runtime (#1045)

Previously, the devshard runtime lived inside the main DAPI process. Upgrading devshards meant rebuilding, redeploying, and restarting the entire DAPI, which slowed down development and added risk to all Host work (including inference, PoC, and Confirmation PoC).

To solve this, v0.2.12 decouples devshards into a standalone, versioned runtime managed by a new service called versiond.

  • versiond automatically downloads and runs devshard binaries approved by on-chain governance.
  • Multiple devshard versions can run side-by-side. Traffic to /devshard/<version>/* is routed to the corresponding binary, while the legacy /v1/devshard/* route remains active during the transition.
  • The standalone devshard directly communicates with MLNodes during inference but does not manage their lifecycle, cleanly separating the roles of MLNode manager (DAPI) and client.
  • Each session is cryptographically bound to the specific binary version that served it. The settlement payload now includes a cleartext version field, ensuring a session cannot mix responses from different versions.
  • The term "subnet" is entirely replaced by "devshard" across the codebase. Additionally, float math in devshard settlement has been replaced with deterministic integer arithmetic to eliminate consensus-failure risks.
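
The routing rule for versioned and legacy devshard paths might look roughly like this. It is a sketch of the path matching only: the function name, the "legacy" label, and the set of running versions are assumptions, and the real proxy and versiond configuration is not shown in these notes.

```python
import re

LEGACY_PREFIX = "/v1/devshard/"
_VERSIONED = re.compile(r"^/devshard/(?P<version>[^/]+)/(?P<rest>.*)$")


def route(path: str, running_versions: set):
    """Return (target, remaining_path), or None when no binary should serve the path."""
    if path.startswith(LEGACY_PREFIX):
        # The legacy route stays active during the transition.
        return ("legacy", path[len(LEGACY_PREFIX):])
    match = _VERSIONED.match(path)
    if match and match.group("version") in running_versions:
        return (match.group("version"), match.group("rest"))
    return None


print(route("/devshard/v1/sessions/42", {"v1", "v2"}))
print(route("/v1/devshard/sessions", {"v1"}))
print(route("/devshard/v9/sessions", {"v1"}))
```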

Random selection of preserved MLNodes (#1089)

Previously, "preserved" nodes (the ones that stay on inference instead of running PoC) were chosen once per epoch via the static MLNodeInfo.timeslot_allocation[POC_SLOT] flag. Because the flag was visible at epoch start and held for the entire epoch, an operator knew well in advance which boxes would skip both the epoch-start PoC and every confirmation PoC event in that epoch. That made hardware downgrade or partial-capacity substitution easy to plan around.

v0.2.12 replaces epoch-long preservation with episode-scoped preservation. An episode is a single PoC execution window: either the epoch-start regular PoC, or one confirmation PoC event during the inference phase. At each PoC anchor (upcomingEpoch.PocStartBlockHeight for regular PoC, event.TriggerHeight for confirmation), the chain materializes a fresh preserved snapshot for that single episode and overwrites a singleton state slot. The next episode gets a new sample.

Key properties:

  • Late-binding: an operator cannot predict far in advance whether a given node will be preserved for the next PoC window.
  • The candidate pool is the current model subgroup EpochGroupData.ValidationWeights / MlNodes, applying existing protocol exclusions.
  • ActiveParticipants stays stable for the whole epoch; timeslot_allocation[POC_SLOT] is deprecated for scheduling.
  • The broker reads the current episode snapshot instead of the static epoch-long flag.
  • Reward weight collapses from the old "preserved + measured...

Release v0.2.11

19 Mar 05:10

Upgrade Proposal: v0.2.11

This document outlines the proposed changes for on-chain software upgrade v0.2.11.
The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.

Upgrade Plan

This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.

The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.

Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.

It also updates CosmWasm contract artifacts for the community sale and liquidity pool, adds bridge/testnet operational scripts for IBC trading support, and introduces a new subnet/ package used by the new inference architecture.

Proposed Process

  1. Active hosts review this proposal on GitHub.
  2. If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.

Testing

The on-chain upgrade from version v0.2.10 to v0.2.11 has been successfully deployed and verified on the testnet. No regression in core functionality or performance has been observed during testing. More testing will be executed leading up to the upgrade.

Reviewers are encouraged to request access to testnet environments to validate both node behavior and the on-chain upgrade process, or to replay the upgrade on private testnets.

Migration

The on-chain migration logic is defined in upgrades.go.

Migrations:

  • Sets ValidationParams.ClaimValidationEnabled = false.
  • Rebuilds active participant caches for the current and previous epoch.
  • Migrates epoch-group validations into the new entry-based format.
  • Community-sale CosmWasm contract migration.

Changes

PR #877 Inference shards (Experimental)

  • Introduces subnet-based inference flow, moving per-inference coordination off-chain.
  • The chain now handles only session setup, escrow, and settlement.
  • Adds support for subnet state, transport, signing, storage, settlement, and API integration.
  • Note: This feature is currently experimental and under limited access. For reference design and architecture, see proposals/inference/.

PR #812 StartInference and FinishInference performance improvements

  • Reduces unnecessary state writes and query overhead for MsgStartInference and MsgFinishInference.
  • Simplifies stats handling and cuts work done during the inference lifecycle for better block execution stability.

PR #760 Unified Permissions

  • Consolidates message-permission checks across the inference module.
  • Removes duplicated authorization logic to make permission behavior more explicit and testable.

PR #779 Inference msgs optimization: optimize key verification

  • Reduces cryptographic verification overhead in the inference message path.
  • Avoids repeating signature checks where protocol guarantees make them redundant.

PR #874 MsgValidation and MsgClaimRewards performance optimization

  • Reduces hot-path lookups and adds transient caching in validation and reward-claiming paths.
  • Restructures validation/reward logic and introduces state pruning support.

PR #822 BLS related fixes based on Certik audit

  • Applies Certik audit fixes to the BLS module.
  • Fixes threshold-validation and duplicate-slot handling issues for distributed key generation and threshold-signing flows.

PR #814 IBC Trade Support

  • Introduces governance-controlled support for trading approved IBC-denominated assets.
  • Includes chain message/query changes and contract updates for the community sale and liquidity pool.

PR #868 Required-collateral aware slashing flow

  • Bases slashing penalties on required collateral rather than the full deposited amount.
  • Makes the slashing model more proportional for participants who over-deposit relative to the minimum.

PR #888 Fix: collateral

  • Fixes reward calculation for undercollateralized miners, ensuring actual collateral accurately reduces effective earning power.

PR #775 fix: redirect slashed coins to gov

  • Redirects slashed collateral to governance-controlled destinations.

Other changes

  • PR #867 Fix the application.db bloat issue.
  • PR #835 Add Batch Transfer With Vesting.
  • PR #773 feat: delete governance model.
  • PR #675 security: update CometBFT to v0.38.21 (CSA-2026-001).
  • PR #543 fix: data race conditions.
  • PR #815 Update CONTRIBUTING.md.
  • PR #807 Update issue templates.

Proposed Bounties

Bounty ID | Sum GNK | Bounty Explanation | GitHub ID
PR #543 | 2500 | extra bounty for a comprehensive review of all cases where the data race conditions fix was needed | @x0152
Issue #628 | 25000 | PoC integration into vllm v0.11.1 report | Axel-t, @Red-Caesar
-- | 10000 | report of series of prompts resulting in vllm HTTP 502 response, significant impact, was already used for intentional griefing | @blizko
-- | 1000 | report of dust transaction vulnerability extending blocks | @blizko
-- | 5000 | report of Remote DoS of Validator PoC Software via dist Assertion | @ouicate
-- | 5000 | report of State Bloat PoC and End-Block DoS via Unbounded Batch / Validation Payloads | @ouicate
-- | 750 | report of Bridge Ethereum Address Parsing Silently Falls Back to Zero Bytes (Loss/Misdirection of Funds) | @ouicate
PR #775 | 1000 | planned task | @x0152
PR #773 | 1250 | planned task | @x0152
qdanik/vllm/pull/5 | 12000 | vLLM 0.15.1 Compatibility Experiments - basis for next ML node version | @qdanik
qdanik/vllm/pull/6 | 15000 | vLLM 0.15.1 Compatibility Experiments - basis for next ML node version, covering simultaneous PoC and inference | @qdanik
-- | 5000 | report of wind down window vulnerability fixed in PR #767 | @qdanik
Issue #797 | 1000 | collective solving of nodes unable to join from snapshots - proposed valuable hypothesis | @akup
Issue #797 | 3000 | collective solving of nodes unable to join from snapshots - found source problem | @x0152
Issue #780 | 750 | collective solving StartInference and FinishInference issue | @hleb-albau
Issue #781 | 5000 | collective solving StartInference and FinishInference issue | @x0152
Issue #782 | 5000 | collective solving StartInference and FinishInference issue | @akup
PR #867 | 7500 | important issue that affected many participants, not a vulnerability, fairly easy fix; adding extra payment for fully testing and providing results of the test together with the fix | @Lelouch33
Issue #730 | 22500 | vLLM 0.15.1 Compatibility Experiments - basis for next ML node version | @clanster, @baychak
PR #835 | 5000 | Batch Transfer With Vesting implementation, huge kudos for figuring out how to use testnet | @huxuxuya
PR #868 | 5000 | collateral slashing vulnerability and fix; low severity: low risk, medium likelihood, organic | @qdanik
v0.2.11 | 7500 | release management | @akup
v0.2.11 | 7500 | release management | @x0152
v0.2.10 | 2500 | upgrade review | @Yapion
v0.2.10 | 2500 | upgrade review | @blizko
v0.2.10 | 2500 | upgrade review | @x0152

Release v0.2.10-post7

17 Mar 05:45

Release v0.2.10-post6: Pruning

17 Mar 05:02

Release v0.2.10

17 Feb 09:19
faa358d

Upgrade Proposal: v0.2.10

This document outlines the proposed changes for on-chain software upgrade v0.2.10. The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.

Upgrade Plan

This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.

The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.

Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.

To apply the new vLLM model parameters, mlnode must be restarted after the on-chain upgrade. The safest approach is:

docker restart join-mlnode-1

Proposed Process

  1. Active hosts review this proposal on GitHub.
  2. Once the PR is reviewed by the community, a v0.2.10 release will be created from this branch, and an on-chain upgrade proposal for this version will be submitted.
  3. If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.

Creating the release from this branch (instead of main) minimizes the time that the /deploy/join/ directory on the main branch contains container versions that do not match the on-chain binary versions, ensuring a smoother onboarding experience for new hosts.

Testing

The on-chain upgrade from version v0.2.9 to v0.2.10 has been successfully deployed and verified on the testnet. PoC time-based weight normalization has been validated in the testnet environment. No regression in core functionality or performance has been observed during testing.

Reviewers are encouraged to request access to testnet environments to validate both node behavior and the on-chain upgrade process, or to replay the upgrade on private testnets.

Migration

The on-chain migration logic is defined in upgrades.go.

Migrations:

  • Validation slots default: explicitly sets PocParams.ValidationSlots=0 during migration. This keeps existing O(N^2) validation behavior after upgrade until sampling is enabled by governance parameter update.
  • PoC normalization default: explicitly sets PocParams.PocNormalizationEnabled=true during migration to enable time-based weight normalization.
  • Model parameter update: Updates Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 with tool calling args (--enable-auto-tool-choice, --tool-call-parser hermes) and validation threshold 0.958.

PoC Validation Sampling Optimization

This upgrade introduces a new PoC validation mechanism that reduces complexity from O(N^2) to O(N x N_SLOTS) by assigning each participant a fixed sampled set of validators.

Reference design and analysis: proposals/poc/optimize.md

Key points:

  • Only assigned validators validate each participant when sampling is enabled.
  • Sampling is deterministic on both chain and API sides (based on validation snapshot + app_hash).
  • Decision threshold is strict supermajority of assigned slots (>66.7%).
  • The feature is shipped in this release but disabled by default (ValidationSlots=0) and can be enabled via a governance proposal that changes the ValidationSlots parameter to a non-zero value once rollout conditions are met.
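
A rough sketch of deterministic slot assignment and the strict supermajority check. The hash-ranking scheme below is a stand-in chosen for illustration and is not the algorithm from proposals/poc/optimize.md. Excluding a participant from its own validator set is also an assumption.

```python
import hashlib


def assign_validators(participant: str, validators: list, app_hash: bytes, n_slots: int) -> list:
    """Pick up to n_slots validators for one participant, deterministically."""
    def score(validator: str) -> bytes:
        material = app_hash + b"|" + participant.encode() + b"|" + validator.encode()
        return hashlib.sha256(material).digest()

    candidates = [v for v in validators if v != participant]
    # Ranking by hash gives the same result for any input order.
    return sorted(candidates, key=score)[:n_slots]


def is_valid(valid_votes: int, assigned_slots: int) -> bool:
    """Strict supermajority: more than 2/3 of assigned slots, in integer math."""
    return 3 * valid_votes > 2 * assigned_slots


validators = [f"val-{i}" for i in range(10)]
assigned = assign_validators("val-0", validators, b"\x01" * 32, n_slots=4)
print(assigned)
print(is_valid(3, 4), is_valid(2, 3))
```

Because chain and API derive the set from the same inputs, both sides arrive at the same assignment without exchanging it, and each participant is checked by N_SLOTS validators instead of all N.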

PoC Weight Normalization by Real PoC Time

This upgrade normalizes PoC participant weights by actual PoC elapsed time to reduce block-time drift effects and keep weight outcomes consistent with real execution duration.

Key points:

  • Adds PocParams.PocNormalizationEnabled parameter for time-based normalization control.
  • Captures generation start and exchange end timestamps in PoCValidationSnapshot.
  • Applies a normalization factor derived from expected stage duration vs actual elapsed time.
  • Applies to both regular PoC and confirmation PoC weight calculations.
  • Enabled by default in this upgrade (PocNormalizationEnabled=true).
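
As an illustration of the idea, the normalization might scale a raw weight by the ratio of expected to actual stage duration. The direction of the ratio, the millisecond units, and the integer rounding are assumptions, and the function name is hypothetical.

```python
def normalized_weight(raw_weight: int, expected_ms: int, actual_ms: int) -> int:
    """Scale a PoC weight by expected / actual stage duration.

    Assumption: a stage that ran longer than expected let hosts produce
    more work, so the raw weight is scaled down proportionally.
    """
    if actual_ms <= 0:
        return raw_weight  # no usable timestamps: leave the weight unchanged
    return raw_weight * expected_ms // actual_ms


# A stage planned for 300 s that actually lasted 360 s.
print(normalized_weight(1200, expected_ms=300_000, actual_ms=360_000))
```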

Upgrade Grace Period

To ensure a smooth upgrade transition:

  • Confirmation PoC will not be triggered for the first 3000 blocks (~5 hours) after upgrade.
  • Miss/invalid punishment rates are relaxed for the entire grace epoch (binom_test_p0 set to 0.5).
  • Regular PoC operates normally during the grace period.

Changes

PR #710 PoC Validation Sampling Optimization

  • Reduces validation complexity from quadratic to slot-based sampling.
  • Adds deterministic slot assignment shared by chain and API, with snapshot-backed weight synchronization.
  • Keeps backward-compatible fallback path when ValidationSlots=0 and includes upgrade-time default of ValidationSlots=0 for safe rollout.

PR #725 PoC weight normalization on real PoC time

  • Adds time-based PoC weight normalization to reduce sensitivity to block-time variance.
  • Introduces PocNormalizationEnabled in PoC params and uses validation snapshot timestamps to compute normalization factor.
  • Integrates normalization into both regular PoC and confirmation PoC weight calculations.
  • Upgrade handler enables normalization by default for v0.2.10.

PR #767 Upgrade grace period, tool calling, and PoC timing fix

  • Adds grace epoch protection for the upgrade epoch: extended CPoC window (3000 blocks) and relaxed miss/invalid thresholds.
  • Updates Qwen model with tool calling support (--enable-auto-tool-choice, --tool-call-parser hermes).
  • Adjusts validation threshold from 0.970917 to 0.958.
  • Deprecates the poc_exchange_duration parameter (set to 0 in the upgrade). API artifact acceptance now aligns with chain exchange windows using explicit block height checks instead of relying on phase alone. This fixes a gap where the chain accepted nonces for longer than the API did.

PR #708 IBC Upgrade to v8.7.0

  • Upgrades IBC stack to v8.7.0.
  • Aligns chain interoperability components with current IBC release line.

PR #723 Testnet bridge setup scripts

  • Adds bridge setup scripts for testnet operations.
  • Improves reproducibility of bridge deployment and validation workflows.

PR #666 Artifact storage throughput optimization

  • Improves PoC artifact storage throughput.

PR #688 Punishment statistics from on-chain data

  • Uses on-chain data for punishment statistics with dynamic table selection.

PR #697 Portable BLST build for macOS test builds

  • Uses a portable BLST build path for macOS test binaries.
  • Improves reliability of local/test build pipeline on macOS hosts.

PR #712 Require proto-go generation matches committed code

  • Enforces proto-go generation consistency in development flow.
  • Prevents accidental drift between generated and committed protobuf code.

PR #711 PoC test params from chain state

  • Replaces hardcoded PoC test defaults with chain state parameters.

PR #641 Streamvesting transfer with vesting

  • Adds MsgTransferWithVesting RPC and message type in the streamvesting module. Enables sender-to-recipient token transfers with vesting over N epochs (default: 180 epochs when not specified).
  • Adds safety limits to prevent abusive requests: max 3650 vesting epochs and max 10 coin denoms per transfer.
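
The limits above could be enforced with checks along these lines. This is a sketch only: the function name, the dict representation of coins, and the use of None for "not specified" are assumptions, not the streamvesting module's API.

```python
from typing import Optional

DEFAULT_VESTING_EPOCHS = 180
MAX_VESTING_EPOCHS = 3650
MAX_DENOMS = 10


def validate_transfer(coins: dict, vesting_epochs: Optional[int] = None) -> int:
    """Validate a transfer request and return the vesting length to use."""
    if not coins:
        raise ValueError("at least one coin is required")
    if len(coins) > MAX_DENOMS:
        raise ValueError(f"too many denoms: {len(coins)} > {MAX_DENOMS}")
    if any(amount <= 0 for amount in coins.values()):
        raise ValueError("coin amounts must be positive")
    epochs = DEFAULT_VESTING_EPOCHS if vesting_epochs is None else vesting_epochs
    if not 1 <= epochs <= MAX_VESTING_EPOCHS:
        raise ValueError(f"vesting epochs must be in 1..{MAX_VESTING_EPOCHS}")
    return epochs


print(validate_transfer({"ngonka": 1_000}))        # default vesting length
print(validate_transfer({"ngonka": 1_000}, 3650))  # maximum allowed
```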

API hardening and reliability fixes

  • PR #634: add request body size limits to reduce DoS risk.
  • PR #727: follow-up for #634, pass response writer to http.MaxBytesReader and align tests.
  • PR #638: fix unsafe type assertions in request processing.
  • PR #644: avoid rewriting static config on each startup.
  • PR #661: prevent API crash on short network drops.
  • PR #640: add unit tests for node version endpoint behavior.
  • PR #622: propagate refund errors in InvalidateInference.
  • PR #639: add missing return after error in task claiming path.
  • PR #643: sanitize nil participants in executor selection.
  • PR #545: minor bug fixes in API flow.

Other fixes

  • PR #659: model assignment checks previous-epoch rewards.
  • PR #716: rename PoC weight function for clarity and correctness.

Proposed Bounties

PR/Issue | Sum GNK | Bounty Explanation
PR #661 | 500 | Valid fix for minor vulnerability that was previously reported in issue #422
PR #644 | 700 | Planned task, not a vulnerability, important for the network.
PR #659 | 10,000 | Detailed report and fix for a Medium risk vulnerability.
Report | 5,000 | First report of the vulnerability fixed in #659
PR #545 | 1,000 | Report and fix of low risk vulnerability. Extra appreciation for d...

Upgrade v0.2.9

31 Jan 21:48
808247e

Full PoC V2 activation with model consolidation and access controls.

Summary

  • PoC V2 fully enabled (tracking mode from v0.2.8 -> full enforcement)
  • Network consolidated to single model: Qwen/Qwen3-235B-A22B-Instruct-2507-FP8
  • Transfer Agent whitelist for request gating
  • Guardian tiebreaker for undecided PoC V2 votes
  • 24 suspicious participants removed from allowlist

Chain Changes

PoC V2 Activation

  • PocV2Enabled=true, ConfirmationPocV2Enabled=true
  • WeightScaleFactor=0.262, InferenceValidationCutoff=2, PocValidationDuration=480
  • All nodes' POC_SLOT allocations reset (both ActiveParticipants and EpochGroupData)
  • First V2 epoch runs in grace mode, full enforcement begins the following epoch

Guardian Tiebreaker

  • When neither valid nor invalid votes reach majority, guardians can break the tie
  • Requires: no majority exists, guardians enabled, at least one guardian voted, all voting guardians agree
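
The tiebreaker conditions can be written down as a small decision function. Treating "majority" as more than half of the total weight is an assumption, and the names are illustrative.

```python
def resolve(valid_weight: int, invalid_weight: int, total_weight: int,
            guardian_votes: dict, guardians_enabled: bool) -> str:
    """Decide a PoC V2 vote; guardian_votes maps each guardian that voted to True (valid) or False."""
    if 2 * valid_weight > total_weight:
        return "valid"
    if 2 * invalid_weight > total_weight:
        return "invalid"
    # No majority: guardians decide only if at least one voted and all agree.
    distinct = set(guardian_votes.values())
    if guardians_enabled and len(distinct) == 1:
        return "valid" if distinct.pop() else "invalid"
    return "undecided"


print(resolve(40, 40, 100, {"g1": True, "g2": True}, guardians_enabled=True))
print(resolve(40, 40, 100, {"g1": True, "g2": False}, guardians_enabled=True))
```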

Transfer Agent Whitelist

  • New TransferAgentAccessParams.AllowedTransferAddresses in params
  • Validation in StartInference and FinishInference messages
  • Empty whitelist = all TAs allowed; non-empty = only listed TAs
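
The empty-versus-non-empty rule can be illustrated with a set lookup. This is a sketch under assumptions: `buildAllowSet`, `isTransferAgentAllowed`, and the example addresses are hypothetical, and the real validation lives in the chain's message handlers.

```go
package main

import "fmt"

// buildAllowSet turns the configured address list into a set for O(1) lookups.
func buildAllowSet(addrs []string) map[string]struct{} {
	set := make(map[string]struct{}, len(addrs))
	for _, a := range addrs {
		set[a] = struct{}{}
	}
	return set
}

// isTransferAgentAllowed applies the rule from the list above:
// an empty whitelist admits every Transfer Agent, a non-empty one
// admits only the listed addresses.
func isTransferAgentAllowed(allowed map[string]struct{}, addr string) bool {
	if len(allowed) == 0 {
		return true
	}
	_, ok := allowed[addr]
	return ok
}

func main() {
	// The addresses below are placeholders, not real accounts.
	open := buildAllowSet(nil)
	gated := buildAllowSet([]string{"gonka1exampleta"})

	fmt.Println(isTransferAgentAllowed(open, "gonka1anyone"))
	fmt.Println(isTransferAgentAllowed(gated, "gonka1exampleta"))
	fmt.Println(isTransferAgentAllowed(gated, "gonka1anyone"))
}
```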

Model Consolidation

  • All governance models except Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 deleted during migration

Participant Access Update

  • Registration and allowlist heights set to 2475000
  • 24 addresses removed from allowlist (completed PoC generation but did not vote at validation)

API Changes

Enforced Model Auto-Switch

  • Auto-switches nodes to Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 with --tensor-parallel-size 4 --max-model-len 240000 if the model is not configured
  • Three-layer enforcement: config load, state sync, runtime verification via vLLM /v1/models

Transfer Agent Whitelist

  • Early enforcement in /chat/completion before expensive operations
  • Cache synced from chain on every new block for O(1) lookups

Bug Fixes

#674 Missed inferences fix

  • Don't punish for missed inferences of non-preserved nodes during PoC
  • Don't punish if participant doesn't support the model

#678 CPoC downtime penalty redistribution

  • Penalty transferred to community pool instead of being lost

Inference expiry during PoC/CPoC

  • New InferenceExpiryContext tracks latest PoC or CPoC range
  • Inferences started during PoC/CPoC not punished for expiry

Bitcoin rewards weight calculation

  • Full weights used for denominator (prevents redistribution of invalidated shares)
  • CPoC reductions and invalidated shares go to governance pool
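
One way to read the two bullets above is sketched below. The `participant` type, the `distribute` function, and the integer arithmetic are assumptions made for illustration; the point is only that dividing by the sum of full weights leaves reduced and invalidated shares undistributed, so they can be routed to governance instead of inflating other participants' rewards.

```go
package main

import "fmt"

// participant carries two hypothetical weights: the full weight before any
// adjustment, and the effective weight left after CPoC reductions or
// invalidation (zero when the share is invalidated).
type participant struct {
	fullWeight      int64
	effectiveWeight int64
}

// distribute pays each participant pool * effective / sum(full).
// Whatever is not paid out (reductions, invalidated shares, rounding)
// is returned as the governance amount.
func distribute(pool int64, ps []participant) (rewards []int64, governance int64) {
	var totalFull int64
	for _, p := range ps {
		totalFull += p.fullWeight
	}
	rewards = make([]int64, len(ps))
	if totalFull == 0 {
		return rewards, pool
	}
	var paid int64
	for i, p := range ps {
		rewards[i] = pool * p.effectiveWeight / totalFull
		paid += rewards[i]
	}
	return rewards, pool - paid
}

func main() {
	ps := []participant{
		{fullWeight: 50, effectiveWeight: 50}, // untouched
		{fullWeight: 30, effectiveWeight: 15}, // reduced by CPoC
		{fullWeight: 20, effectiveWeight: 0},  // invalidated
	}
	rewards, governance := distribute(1000, ps)
	fmt.Println(rewards, governance)
}
```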

Release v0.2.8

28 Jan 01:48

Release v0.2.8 Pre-release

Upgrade Proposal: v0.2.8

This document outlines the proposed changes for on-chain software upgrade v0.2.8. The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.

Upgrade Plan

This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.

The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.

Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.

Proposed Process

  1. Active hosts review this proposal on GitHub.
  2. Once the PR is approved by a majority, a v0.2.8 release will be created from this branch, and an on-chain upgrade proposal for this version will be submitted.
  3. If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.

Creating the release from this branch (instead of main) minimizes the time that the /deploy/join/ directory on the main branch contains container versions that do not match the on-chain binary versions, ensuring a smoother onboarding experience for new hosts.

Testing

The on-chain upgrade from version v0.2.7-post1 to v0.2.8 has been successfully deployed and verified on the testnet, including the PoC V2 parameter migration.

Reviewers are encouraged to request access to the testnet environment to validate the upgrade or test the on-chain upgrade process on their own private testnets.

Migration

The on-chain migration logic is defined in upgrades.go.

Migration tasks:

  • Burn extra community coins: Burns all coins from the pre_programmed_sale module account (gonka1rmac644w5hjsyxfggz6e4empxf02vegkt3ppec) which were inadvertently created during genesis.
  • Precompute BLS slot keys: Generates and stores precomputed BLS slot public keys for the current epoch to enable the new optimized verification logic (see PR #609).
  • Set PoC V2 migration parameters: Configures dual-mode migration with ConfirmationPocV2Enabled=true and PocV2Enabled=false, sets model ID to Qwen/Qwen3-235B-A22B-Instruct-2507-FP8, sequence length to 1024, and statistical test thresholds for V2 validation.

PoC V2 Migration

For a smooth transition from PoC V1 to PoC V2, the chain must ensure that the majority of participants have switched to the new MLNode build supporting the Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 model before PoC V2 becomes the main PoC engine. This upgrade enables tracking mode to measure adoption without affecting weights.

After this upgrade:

  • Regular PoC continues using V1 (on-chain batches, weight enforcement).
  • First Confirmation PoC per epoch uses V2 for tracking only (no weight/slashing impact).
  • V2 tracking results allow monitoring adoption before full activation.

MLNode upgrade:

  • New versions: ghcr.io/product-science/mlnode:3.0.12 (or 3.0.12-blackwell for Blackwell GPUs).
  • Backward compatible with 3.0.11, so it can be upgraded before or after this on-chain upgrade.
  • Must be upgraded before PoC V2 is fully enabled.

Enabling full PoC V2:

  • PoC V2 will not activate automatically.
  • Once adoption is sufficient, a separate governance proposal will set poc_v2_enabled=true.
  • The epoch when V2 is enabled runs in grace mode (no punishment).
  • Full V2 enforcement begins the following epoch.

Changes

PR #505 Security Fixes for v0.2.7

Addresses multiple security vulnerabilities:

  • SSRF & DoS: Validates InferenceUrl to reject internal IPs and adds timeouts to prevent request hangs.
  • Vote Flipping: Prevents overwriting of PoC validations by rejecting duplicates.
  • Batch Size Limits: Enforces bounds on PoC batch sizes to prevent state bloat.
  • PoC Exclusion: Fixes getInferenceServingNodeIds to correctly exclude inference-serving nodes.
  • Auth Bypass & Replay: Binds epochId to signatures and validates authorization against the correct epoch.
  • Thanks to: @ouicate
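
As an illustration of the SSRF item, a URL check that rejects internal addresses could look like the sketch below. It is not the code from PR #505: `validateInferenceURL` is a hypothetical name, and the sketch only inspects literal IP hosts. A production check would also need to resolve hostnames and validate every resolved address.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
)

// validateInferenceURL rejects URLs that point at loopback, private,
// link-local, or unspecified addresses.
func validateInferenceURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse url: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("unsupported scheme")
	}
	host := u.Hostname()
	if host == "" {
		return errors.New("missing host")
	}
	if host == "localhost" {
		return errors.New("internal address not allowed")
	}
	ip := net.ParseIP(host)
	if ip == nil {
		// Hostname, not an IP literal. This sketch accepts it; a real check
		// would resolve the name and validate each resulting address.
		return nil
	}
	if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() || ip.IsUnspecified() {
		return errors.New("internal address not allowed")
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"http://127.0.0.1:9000/admin",
		"http://10.1.2.3/",
		"http://169.254.169.254/latest",
		"https://8.8.8.8/v1",
	} {
		fmt.Println(raw, validateInferenceURL(raw))
	}
}
```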

PR #609 BLS optimized

  • Significantly optimizes BLS signature verification (from ~2s down to <10ms) by using the blst library and precomputing slot public keys.

PR #540 Remove ALL panic and Must from chain code

  • Removes panic and Must calls from chain code to prevent consensus failures.
  • Implements linting (forbidigo) and CI checks to enforce this rule.

PR #534 Security: prevent SSRF via executor redirect

  • Prevents SSRF attacks where a malicious executor redirects Transfer Agent requests to internal services (e.g., admin API).
  • Implements a custom HTTP client that disables following redirects.
  • Thanks to: @x0152

PR #544 Inference: defense-in-depth against int overflow

  • Fixes integer overflow vulnerabilities in escrow and cost calculations using checked arithmetic.
  • Adds hard caps for token counts and improves error handling to fail closed on overflows.
  • Thanks to: @ouicate

PR #506 Standardize floating point math

  • Replaces non-deterministic floating-point math with shopspring/decimal for deterministic calculations (e.g., Dynamic Pricing).
  • Updates reward exponent calculation to use a table-based approach for decay rates.

PR #536 Perf: optimize participants endpoint with single balance query

  • Optimizes the /v1/participants endpoint by replacing N gRPC calls with a single blockchain query.
  • Achieves ~500x speedup for large sets of participants.
  • Thanks to: @x0152

PR #553 Membership for correct epoch for Validation requests

  • Ensures validation rights are checked against the active participants of the target epoch, not the current one.
  • Fixes logic for sharing work coins and refunds during validation/invalidation.

PR #607 Fix(inference): update totalDistributed after debt deduction

  • Fixes a bug where totalDistributed was not updated after deducting debt, causing tokens to be lost instead of returned to governance.
  • Thanks to: @0xMayoor

PR #549 Disable future timestamp check for EA

  • Temporarily disables the future timestamp check in the External Adapter (EA) to prevent rejecting requests when the EA is behind the chain during high load.

PR #550 Negative coin balance for settle

  • Handles edge cases with negative coin balances by subtracting the negative amount from rewards instead of erroring.

PR #541 PoC validation, retry getting nodes

  • Adds retry logic for retrieving nodes during Proof of Compute (PoC) validation to improve robustness.

PR #551 Fix(bls): reject duplicate slot indices in partial signatures

  • Rejects partial signatures with duplicate slot indices to prevent verification failures during aggregation.
  • Thanks to: @0xMayoor
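
A duplicate check of this kind can be written as a single pass over the indices. The function name and the `uint32` index type are assumptions for the example, not the signature used in PR #551.

```go
package main

import "fmt"

// validateSlotIndices rejects a partial signature whose slot indices
// contain the same index more than once.
func validateSlotIndices(indices []uint32) error {
	seen := make(map[uint32]struct{}, len(indices))
	for _, idx := range indices {
		if _, dup := seen[idx]; dup {
			return fmt.Errorf("duplicate slot index %d", idx)
		}
		seen[idx] = struct{}{}
	}
	return nil
}

func main() {
	fmt.Println(validateSlotIndices([]uint32{0, 1, 2}))
	fmt.Println(validateSlotIndices([]uint32{0, 1, 1}))
}
```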

PR #563 Fix(inference): variable shadowing in direct payment path

  • Fixes a variable shadowing bug that caused errors (like SendCoins failures) to be swallowed during refunds.
  • Thanks to: @0xMayoor

PR #559 Burn extra pool coins, fix ValueDecimal validation

  • Burns coins from an inadvertently created account.
  • Fixes validation for ValueDecimal to correctly handle nil values.

PR #616 Integration test database, debugging assistance

  • Adds functionality to upload integration test results to BigQuery.
  • Includes fuzz testing and improved debugging logs.

PR #547 Updated script snippets and MacOS Tahoe 26.1 Docker settings

  • Updates documentation and adds Docker settings for running testermint locally on MacOS Tahoe.

PR #618 PoC v2 & Offchain PoC data

  • Integrates PoC directly into vLLM, enabling immediate switch from inference to PoC without offloading the model or loading a separate PoC model.
  • Migrates artifact storage off-chain using MMR (Merkle Mountain Range) commitments - only root_hash and count are recorded on-chain.
  • Adds statistical test-based validation with L2-distance mismatch rule and calibrated thresholds.
  • New chain messages: SubmitPocValidationsV2, PoCV2StoreCommit, MLNodeWeightDistribution.
  • Includes dual-mode migration strategy: V1 for regular PoC, V2 tracking for Confirmation PoC during rollout.

Release v0.2.7-post1

08 Jan 18:02
6a34c52

Upgrade Proposal: v0.2.7

This document outlines the proposed changes for on-chain software upgrade v0.2.7. The Changes section details the major modifications, and the Upgrade Plan section describes the process for applying these changes.

Upgrade Plan

This PR updates the code for the api and node services. The PR modifies the container versions in deploy/join/docker-compose.yml.

The binary versions will be updated via an on-chain upgrade proposal. For more information on the upgrade process, refer to /docs/upgrades.md.

Existing hosts are not required to upgrade their api and node containers. The updated container versions are intended for new hosts who join after the on-chain upgrade is complete.

Proposed Process

  1. Active hosts review this proposal on GitHub.
  2. Once the PR is approved by a majority, a v0.2.7 release will be created from this branch, and an on-chain upgrade proposal for this version will be submitted.
  3. If the on-chain proposal is approved, this PR will be merged immediately after the upgrade is executed on-chain.

Creating the release from this branch (instead of main) minimizes the time that the /deploy/join/ directory on the main branch contains container versions that do not match the on-chain binary versions, ensuring a smoother onboarding experience for new hosts.

Start after upgrade:

git pull
source config.env && docker compose -f docker-compose.postgres.yml up -d

Testing

The on-chain upgrade from version v0.2.6 to v0.2.7 has been successfully deployed and verified on the testnet.

Reviewers are encouraged to request access to the testnet environment to validate the upgrade or test the on-chain upgrade process on their own private testnets.

Migration

The on-chain migration logic is defined in upgrades.go.

Migration sets new parameters:

  • GenesisGuardianParams.NetworkMaturityThreshold = 15,000,000
  • GenesisGuardianParams.NetworkMaturityMinHeight = 3,000,000
  • Guardian addresses migrated from legacy GenesisOnlyParams into governance-controlled params (only if not already set)
  • DeveloperAccessParams.UntilBlockHeight = 2,294,222 (inference gating for non-allowlisted developers)
  • DeveloperAccessParams.AllowedDeveloperAddresses = predefined allowlist (governance-updatable)
  • ParticipantAccessParams.NewParticipantRegistrationStartHeight = 2,222,222 (new host registration blocked until this height)
  • ParticipantAccessParams.BlockedParticipantAddresses = placeholder blocklist (governance-updatable)
  • ParticipantAccessParams.UseParticipantAllowlist = false (epoch allowlist disabled by default)

Migration also distributes rewards from the community pool:

  • Epoch 117 rewards for nodes that did not receive them (but successfully recovered), plus an additional reward for all active nodes proportional to the chain halt duration
  • Bounty program rewards for bug reports

Changes


Genesis Guardian Enhancement (Temporary)

Commits: 3c004c6dd, 0e5094ca0, da1413498

Temporary reactivation of the Genesis Guardian Enhancement, a previously used defensive mechanism.

  • Genesis Guardian parameters moved from genesis-only config to governance-controlled params
  • Network maturity thresholds set: total power >= 15,000,000 AND block height >= 3,000,000
  • Guardian addresses migrated from legacy params into governance-updatable GenesisGuardianParams
  • Enhancement automatically deactivates when both maturity conditions are satisfied
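
The deactivation rule can be expressed as a small predicate using the two thresholds listed above. The constant and function names are illustrative; the chain reads these values from GenesisGuardianParams rather than from constants.

```go
package main

import "fmt"

// Thresholds from this upgrade, hard-coded here only for illustration.
const (
	networkMaturityThreshold int64 = 15_000_000
	networkMaturityMinHeight int64 = 3_000_000
)

// guardianEnhancementActive reports whether the enhancement still applies.
// It switches off only when BOTH maturity conditions are satisfied.
func guardianEnhancementActive(totalPower, height int64) bool {
	mature := totalPower >= networkMaturityThreshold && height >= networkMaturityMinHeight
	return !mature
}

func main() {
	// Enough power but too early: still active.
	fmt.Println(guardianEnhancementActive(20_000_000, 2_500_000))
	// Both conditions met: deactivated.
	fmt.Println(guardianEnhancementActive(15_000_000, 3_000_000))
}
```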

Developer Access Restriction

Commits: 3c004c6dd, ca4b5f92f, fc3d13fb9, d5fae6671

Temporary restriction of inference execution to an allowlisted set of developer addresses.

  • Inference requests (MsgStartInference, MsgFinishInference) gated by requested_by address
  • Restriction active until block height 2,294,222
  • Allowlist is governance-updatable via DeveloperAccessParams
  • Non-allowlisted developers receive ErrDeveloperNotAllowlisted

Participant Access Gating

Commits: 1d309fe27, d1523d1ca

New participant registration pause and PoC blocklist enforcement.

  • New host registration (SubmitNewParticipant, SubmitNewUnfundedParticipant) blocked until height 2,222,222
  • PoC blocklist enforced in MsgSubmitPocBatch and MsgSubmitPocValidation
  • Adds MsgAddParticipantsToAllowList, MsgRemoveParticipantsFromAllowList, and QueryParticipantAllowList for future governance-controlled epoch allowlist (disabled by default)

PoC Transaction Filtering

Commits: 1644047b9, 2dbdcca00

Protocol-level filtering of stale PoC transactions and improved tx-manager reliability.

  • Ante handler rejects too-late MsgSubmitPocBatch and MsgSubmitPocValidation during CheckTx
  • API tx-manager adds block-based deadlines per message type (PoC: 240 blocks, inference: 150 blocks)
  • Business logic errors (e.g., duplicate validation, participant not found) fail immediately instead of retrying
  • Batching for MsgSubmitPocBatch and MsgSubmitPocValidation transactions

Inference Completion Handling

Commit: 2c05788d5

Fixes incorrect accounting of failed inference requests.

  • Malformed or broken payloads no longer cause inferences to be marked as missed
  • Improves resilience around failed inference handling in the API

Governance-Owned Leftovers

Commit: cf483b34e

Settlement and bitcoin reward remainder accounting.

  • Expired/unclaimed SettleAmount transferred to governance module account instead of burned
  • Bitcoin rewards: missed-share and rounding remainder transferred to governance and tracked via BitcoinResult.GovernanceAmount

Epoch 117 + Bounty Rewards Distribution

Commits: 3d8d4caf2, a4828a1d0

Reward distribution executed during upgrade.

  • Nodes active during Epoch 117 that didn't receive their epoch reward get the recovered amount
  • All nodes active during Epoch 117 receive an additional payout proportional to the chain halt duration
  • Bounty program rewards distributed for reported bugs

fix(bls): prevent consensus panic from decimal precision overflow

shopspring/decimal divisions can produce >18 decimal places, causing
LegacyMustNewDecFromStr to panic in ApplyBLSGuardianSlotReservation.

Changes:

  • Add decimalToLegacyDec() helper using StringFixed(18) to truncate
  • Replace LegacyMustNewDecFromStr with error-handled conversion
  • Skip reservation entirely on parse failure (fallback to raw weights)
  • Add isolated regression test demonstrating panic vs fix