generated from opentensor/bittensor-subnet-template
Open
Description
Currently we use a one-hot reward system for miners. The goal of this issue is to build, implement, and test new reward systems that depend more strongly on the distribution of energies.
- Sort energies in ascending order (most negative to 0) and assign rewards linearly, so that the most negative energy gets a reward of 1 and an energy of 0 gets a reward of 0.
- Scale the rewards using min-max normalization, excluding zero-energy values. This can be done by removing the zero-energy values, determining the min and max energies of the filtered set, and applying min-max normalization to scale the rewards between 0 and 1.
- Rewards can be scaled exponentially to emphasize lower energy values more strongly.
- Logarithmic scaling can be used to reduce the impact of high energy values and provide a smoother reward distribution.
- Softmax scaling can convert the energies into a probability distribution, which can be interpreted as scaled rewards.
- others.
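
As a starting point for discussion, the schemes listed above could be sketched roughly as follows. This is a minimal pure-Python sketch under stated assumptions, not the subnet's actual reward code: all function names, the steepness parameter `k`, and the `temperature` parameter are hypothetical. It assumes energies are non-positive, and it reads the first bullet as linear in energy value rather than in rank.

```python
import math
from typing import List


def linear_rewards(energies: List[float]) -> List[float]:
    """Linear in energy value: most negative energy -> 1, energy 0 -> 0."""
    e_min = min(energies, default=0.0)
    if e_min >= 0:  # no negative energy submitted, so nobody is rewarded
        return [0.0 for _ in energies]
    return [e / e_min if e < 0 else 0.0 for e in energies]


def minmax_rewards(energies: List[float]) -> List[float]:
    """Min-max normalization over nonzero energies; zero energies get 0."""
    nonzero = [e for e in energies if e != 0]
    if not nonzero:
        return [0.0 for _ in energies]
    lo, hi = min(nonzero), max(nonzero)
    span = hi - lo
    rewards = []
    for e in energies:
        if e == 0:
            rewards.append(0.0)
        elif span == 0:
            rewards.append(1.0)  # all nonzero energies are tied
        else:
            rewards.append((hi - e) / span)  # lowest energy -> 1
    return rewards


def exponential_rewards(energies: List[float], k: float = 3.0) -> List[float]:
    """Convex remap of the linear rewards (k > 0, hypothetical steepness)."""
    return [math.expm1(k * r) / math.expm1(k) for r in linear_rewards(energies)]


def logarithmic_rewards(energies: List[float], k: float = 3.0) -> List[float]:
    """Concave remap of the linear rewards (k > 0), giving a flatter spread."""
    return [math.log1p(k * r) / math.log1p(k) for r in linear_rewards(energies)]


def softmax_rewards(energies: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax over negated energies, so lower energy means higher reward."""
    if not energies:
        return []
    logits = [-e / temperature for e in energies]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    sample = [-10.0, -5.0, -2.0, 0.0]  # made-up energies for illustration
    for fn in (linear_rewards, minmax_rewards, exponential_rewards,
               logarithmic_rewards, softmax_rewards):
        print(fn.__name__, [round(r, 3) for r in fn(sample)])
```

Two behaviors worth weighing when comparing these: in this sketch, min-max normalization gives the least negative nonzero energy a reward of 0, and softmax gives zero-energy miners a small nonzero reward unless they are masked out separately.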