# Vision & Roadmap

At the moment, many subnets have tasks for which SOTA models are already built into the miner code, achieving high quality out of the box. For such tasks, implementing a better solution gives miners only basis points of improvement: there is almost no room to grow.

Our subnet is different. AI detection is a hard task in which high quality does not come for free. That's why we aimed not to build yet another "marketplace for SOTA model inference", as other subnets did, but to create a constantly evolving environment where miners have to improve over time rather than run the same models for months.

To build such an environment, we need to do the following.

## Validators

Currently, validators use one large dataset of human-written texts and two models (Mistral and Vicuna) to generate AI texts. What can be done to improve that:

  1. Apply softmax to miners' scores to strengthen the incentive to improve (see the sketch after this list)
  2. Add more models: increasing the number and diversity of generators will improve the overall quality of detection
  3. Add more languages
  4. Paraphrase AI texts to make detection harder
  5. Make validation resilient to tricks and attacks
  6. Cover various types of text: differentiate articles, comments, posts, etc., to improve quality on each distinct type
  7. Save all data that validators generate to the cloud to build an open-source dataset in the future
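
As a minimal sketch of item 1, assuming raw per-miner quality scores in [0, 1] (the function name and the temperature parameter are illustrative, not the subnet's actual implementation):

```python
import numpy as np

def softmax_weights(scores: np.ndarray, temperature: float = 0.1) -> np.ndarray:
    """Turn raw miner scores into reward weights.

    A lower temperature sharpens the distribution, so small quality
    gains translate into noticeably larger rewards -- which is the
    motivation for using softmax instead of raw scores.
    """
    z = (scores - scores.max()) / temperature  # shift for numerical stability
    exp = np.exp(z)
    return exp / exp.sum()

# Example: a 0.05 lead in raw score becomes a large lead in reward weight.
print(softmax_weights(np.array([0.90, 0.95, 0.80])))
```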

## Miners

Generally speaking, improving miners is not our task. Miners should get better themselves. But there are a few things we can do to help them:

  1. Host testnet validators so miners can start without wasting TAO.
  2. Make a leaderboard and a local dataset: we will list miners' metrics and allow prospective miners to evaluate their solutions on a local dataset (see the sketch after this list) and compare them with existing ones before going to the mainnet.
  3. Create a Kaggle competition to introduce some of the best ML engineers to our subnet and encourage them to run their top solutions on-chain.
  4. Although solving LLM detection is the miners' problem, we are going to continue our own research in this field to improve the baseline solution and increase the overall quality of the subnet.
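
As a minimal sketch of item 2, assuming the local dataset is a CSV of labeled texts and a candidate detector exposes a `predict` method returning the probability that a text is AI-generated (the file name, column names, and interface are assumptions, not a published format):

```python
import csv

def evaluate(detector, dataset_path: str = "local_dataset.csv") -> float:
    """Score a candidate detector on a labeled local dataset.

    Expects rows with a `text` column and a `label` column
    (1 = AI-generated, 0 = human-written); returns accuracy.
    """
    correct = total = 0
    with open(dataset_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            prob_ai = detector.predict(row["text"])  # probability in [0, 1]
            predicted = 1 if prob_ai >= 0.5 else 0
            correct += predicted == int(row["label"])
            total += 1
    return correct / total
```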

## Applications

One of our important tasks as subnet owners is to bring the subnet into real-world use. Given the relevance of the problem, there is clearly demand for such solutions. Here is what we're going to do:

### Web service

We've already developed an MVP version of a website for our subnet, where you can submit a text and get miners' predictions of the probability that it is AI-generated. We're going to develop a full version of the web service, which will let users even outside the Bittensor community detect AI-generated texts.

### Twitter extension

Today, X/Twitter is among the top six social networking apps in the United States and boasts over 500 million users worldwide. With the rapid growth of Large Language Models like ChatGPT, more and more content on the internet is generated by them. We're going to build an extension for Twitter that marks the tweets and comments you're reading with AI-generated/human-written tags based on miners' predictions from the subnet, so that people can tell which content is genuine and which texts are just auto-generated.

### Browser extension

We also find it very useful to be able to instantly check whether a piece of text you're reading is AI-generated or human-written, so another application we want to develop is a browser extension: users can simply highlight some text and see the probability that it is AI-generated.

### API

As mentioned above, we're going to develop several applications on top of our subnet, but there are of course many more use cases for LLM detection in particular situations and businesses. So we are also going to provide an API service that developers can use for their own integrations or for making predictions on large volumes of text (for example, AI engineers cleaning up their datasets).
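
A minimal sketch of how such an API might be consumed; the endpoint URL, request schema, and the `probability_ai` response field are assumptions for illustration, since the service is not yet published:

```python
import requests

API_URL = "https://api.example.com/v1/detect"  # hypothetical endpoint

def detect_batch(texts: list[str], api_key: str) -> list[float]:
    """Send a batch of texts; return per-text probabilities of being AI-generated."""
    response = requests.post(
        API_URL,
        json={"texts": texts},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    response.raise_for_status()
    # Assumed response shape: {"results": [{"probability_ai": 0.93}, ...]}
    return [item["probability_ai"] for item in response.json()["results"]]

# Example: filter suspected AI-generated rows out of a dataset.
texts = ["First sample...", "Second sample..."]
human_like = [t for t, p in zip(texts, detect_batch(texts, "YOUR_KEY")) if p < 0.5]
```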

## Commerce

All of the above services will have their own subscription plans to commercialize SN32. They will be built on an API run by validators, which provides access to miners' predictions and lets validators earn additional money.

By commercializing our product, we will become less reliant on emissions and start gaining real usage. By the time dynamic TAO is introduced and validators' emission drops to zero, our token will already have real utility, and validators will be earning from the services mentioned above.