Decentralized Data Annotation Platform (DAP) #2504

Closed
wants to merge 12 commits into from

Conversation

BlockchainViper

@BlockchainViper BlockchainViper commented Feb 6, 2025

Project Abstract

The Decentralized Data Annotation Platform (DAP) is a blockchain-powered solution designed to enhance AI model training by enabling fair and transparent data annotation. Built on Polkadot, DAP connects AI companies and researchers with annotators who label images, text, and audio while ensuring privacy and data security. Annotators are fairly compensated via smart contracts, and data providers receive high-quality, ethically sourced datasets. DAP leverages Substrate for blockchain logic, IPFS for decentralized storage, and zero-knowledge proofs (ZKPs) for privacy-preserving verification.

Our mission is to create a decentralized, equitable ecosystem that empowers annotators, safeguards data privacy, and contributes to the ethical development of AI.

Grant level

  • Level 1: Up to $10,000, 2 approvals
  • Level 2: Up to $30,000, 3 approvals
  • Level 3: Unlimited, 5 approvals (for >$100k: Web3 Foundation Council approval)

Application Checklist

  • The application template has been copied and aptly renamed (project_name.md).
  • I have read the application guidelines.
  • Payment details have been provided (Polkadot AssetHub (USDC & DOT) address in the application and bank details via email, if applicable).
  • I understand that an agreed upon percentage of each milestone will be paid in vested DOT, to the Polkadot address listed in the application.
  • I am aware that, in order to receive a grant, I (and the entity I represent) have to successfully complete a KYC/KYB check.
  • The software delivered for this grant will be released under an open-source license specified in the application.
  • The initial PR contains only one commit (squash and force-push if needed).
  • The grant will only be announced once the first milestone has been accepted (see the announcement guidelines).
  • I prefer the discussion of this application to take place in a private Element/Matrix channel. My username is: @_______:matrix.org (change the homeserver if you use a different one)

@github-actions github-actions bot added the admin-review This application requires a review from an admin. label Feb 6, 2025
Contributor

github-actions bot commented Feb 6, 2025

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@BlockchainViper
Author

BlockchainViper commented Feb 6, 2025

I have read and hereby sign the Contributor License Agreement.

Member

@semuelle semuelle left a comment

Thank you for the application, @BlockchainViper. Please give us a few days to catch up; someone will look into your application soon. In the meantime, I left some change requests to make the doc a bit more readable.

Can you explain how the quality of an annotation is verified, and how ZKPs come in there? Also, why do you split the functionality between pallets and smart contracts?

@BlockchainViper
Author

BlockchainViper commented Feb 6, 2025

Thanks for your questions, @semuelle. We are very excited about this.

1. How is the quality of an annotation verified, and how do ZKPs come into play?

Quality Verification Process:
Annotations go through a multi-step verification process to ensure accuracy and fairness:

Initial Annotation:

  • An annotator completes a task (e.g., labeling an image, tagging text) and submits it.

Cross-Verification:

  • Multiple annotators review the same task independently.
  • If discrepancies arise, the task is flagged for review.
  • Annotators build a reputation score based on accuracy over time.
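
The cross-verification step above can be sketched as a simple majority vote. This is only an illustration under stated assumptions: the two-thirds agreement threshold, the moving-average reputation update, and all function names are hypothetical and do not come from the application.

```python
from collections import Counter

AGREEMENT_THRESHOLD = 2 / 3  # assumed: minimum share of annotators that must agree
REPUTATION_STEP = 0.1        # assumed: weight of the newest result in the moving average


def cross_verify(labels):
    """Return (consensus_label, flagged) for independent labels of one task.

    labels maps annotator id -> submitted label.
    """
    if not labels:
        raise ValueError("at least one label is required")
    top_label, votes = Counter(labels.values()).most_common(1)[0]
    agreement = votes / len(labels)
    # Too little agreement means a discrepancy: no consensus, flag for review.
    flagged = agreement < AGREEMENT_THRESHOLD
    return (None if flagged else top_label), flagged


def update_reputation(reputation, agreed):
    """Move a reputation score in [0, 1] toward 1 on agreement, toward 0 otherwise."""
    target = 1.0 if agreed else 0.0
    return (1 - REPUTATION_STEP) * reputation + REPUTATION_STEP * target
```

With three annotators, two matching labels are enough for consensus under this threshold, while a one-to-one split between two annotators is flagged for review.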

AI-Assisted Quality Check:

  • Lightweight machine learning models analyze annotations for consistency and compliance with guidelines.
  • Tasks flagged as low-confidence undergo additional human verification.
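
One way the routing described above might look is a threshold check on the model's confidence. The sketch is hypothetical: the 0.8 cut-off, the `Check` record, and the choice to also escalate when the model disagrees with the annotator are assumptions, not details given in the application.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off; the application does not specify one


@dataclass
class Check:
    task_id: str
    annotator_label: str
    model_label: str         # label predicted by the lightweight model
    model_confidence: float  # model's confidence in its own prediction, in [0, 1]


def needs_human_review(check: Check) -> bool:
    """Flag a task when the model is unsure or disagrees with the annotator."""
    if check.model_confidence < CONFIDENCE_THRESHOLD:
        return True
    # Assumed rule: a confident model that disagrees also triggers review.
    return check.model_label != check.annotator_label
```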

Reward Distribution:

  • Smart contracts distribute rewards in DOT, ensuring transparency.
  • Annotators receive payments only for validated annotations.
  • Disputed tasks are temporarily withheld from payment until resolved.
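
The payout rules above can be expressed as a small settlement function. This is a plain-Python sketch of the logic only, not pallet or contract code; the `Status` values, the flat per-task reward, and the function name are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    VALIDATED = "validated"
    DISPUTED = "disputed"
    REJECTED = "rejected"


def settle(tasks, reward_per_task):
    """Split tasks into immediate payouts and withheld amounts.

    tasks is an iterable of (annotator, Status) pairs. Amounts are integers
    in the smallest token unit to avoid floating-point rounding.
    """
    payouts, withheld = {}, {}
    for annotator, status in tasks:
        if status is Status.VALIDATED:
            payouts[annotator] = payouts.get(annotator, 0) + reward_per_task
        elif status is Status.DISPUTED:
            # Held back until the dispute is resolved.
            withheld[annotator] = withheld.get(annotator, 0) + reward_per_task
        # Rejected tasks earn nothing.
    return payouts, withheld
```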

Role of ZKPs (Zero-Knowledge Proofs):

  • Annotators generate ZKPs to demonstrate that their annotation meets quality standards without revealing the underlying data.

Why use ZKPs?

  • Privacy-Preserving Verification: The system can confirm correctness without exposing sensitive annotation data.
  • Trustless Validation: Annotators cannot cheat since ZKPs cryptographically guarantee compliance with predefined rules.
  • Optional Feature: Data owners can choose to enable ZKPs depending on privacy requirements.

These proofs are verified on-chain, ensuring a trustless and decentralized validation process.

2. Why do you split the functionality between pallets and smart contracts?

The hybrid approach optimizes performance, scalability, and modularity.

Pallets (Core Logic, High Performance)

Handles core blockchain functions requiring high security and consensus:

  • Task assignment and annotation tracking
  • Reward distribution (ensuring low-cost, transparent payments)
  • Reputation management and governance

Integrated directly into Substrate runtime, making it:

  • More efficient than smart contracts for frequent operations
  • More secure (runs at the protocol level, reducing attack surfaces)

Smart Contracts (Flexibility & Extensibility)

Manages customizable workflows, such as:

  • Annotation-specific verification mechanisms (e.g., ZKP-based validation)
  • Dispute resolution logic (allowing case-specific adjudication)
  • External integrations (AI-assisted annotation tools, data marketplaces)
  • Enables third-party developers to extend the system with custom logic.

Performance & Scalability Considerations:

  • Frequent, predictable operations (like task assignments and payments) stay in pallets for lower execution costs.
  • Complex, evolving workflows (like annotation verification) run in smart contracts, ensuring flexibility.
  • Enables cross-chain interoperability with other parachains or AI-focused projects.

Future-Proofing the Platform

  • Modular design simplifies upgrades.
  • New annotation types or privacy mechanisms can be introduced without disrupting existing functionality.

@keeganquigley
Contributor

keeganquigley commented Feb 11, 2025

Thanks for the application, @BlockchainViper. A few questions:

  • What kind of smart contracts would you be writing? (Solidity, ink!)
  • Would these be deployed on your own parachain built in M2 or somewhere else?
  • What language & framework would be used for the back-end? Node.js? nvm I see now it says Node.js

@BlockchainViper
Author

BlockchainViper commented Feb 13, 2025

Thank you, @keeganquigley. Yes, you are correct: the backend is in Node.js.

  • We will use ink! for the smart contracts.
  • The smart contracts will be deployed on our own parachain.

Member

@semuelle semuelle left a comment

Hi @BlockchainViper, sorry for the long wait. I had another look and found some things still missing:

  • All the template text is still in the document. Please review.
  • Team code repos
  • Legal structure

One of your main arguments is user control. Can you explain how DAP gives the users more control? As far as I can see, it's still a marketplace. Also, can you explain how a ZKP verifies annotation quality?

@semuelle semuelle added changes requested The team needs to clarify a few things first. and removed stale labels Mar 7, 2025
@BlockchainViper
Author

@semuelle

Thank you for your response. I have fixed the issues with the template. To answer your questions: DAP isn’t just a marketplace; it’s a platform where annotators can actively perform data annotation tasks and earn rewards, while data providers gain high-quality datasets with greater autonomy than centralized alternatives.

1. How does DAP give users more control?

  • Annotators: They choose tasks based on their skills and interests rather than having them assigned by a central authority, with pallets managing this transparently on our parachain (Milestone 2). Rewards in DOT, distributed via ink! smart contracts, are tied to reputation scores earned through quality work, not to arbitrary platform fees or middlemen. On-chain dispute resolution allows annotators to challenge unfair rejections, giving them a voice absent in traditional platforms.

  • Data Providers: They upload datasets to IPFS/Arweave and retain control over access and privacy settings (e.g., enabling ZKPs) instead of surrendering them to a central entity. DAP avoids centralized bottlenecks, letting users shape future features via reputation-weighted feedback, in line with Web3’s user-owned ethos rather than a top-down model.

2. How does a ZKP verify annotation quality?
ZKPs verify quality cryptographically without exposing sensitive data:

  • Process: An annotator labels data (e.g., tags an image as ‘cat’) and generates a zk-SNARK proof using predefined quality rules (e.g., ‘label matches AI prediction at 95% confidence’). The proof, submitted to an ink! smart contract on our parachain, proves correctness without revealing the image or label.

  • Why It Works: Data stays confidential, which is crucial for sensitive datasets like medical images, and the underlying cryptography ensures trustlessness, preventing fakery. Proofs are compact, keeping costs low (Milestone 3).
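
The statement such a proof would attest to can be written as an ordinary predicate. The sketch below is not a zero-knowledge proof and uses no proving library; it only spells out, in plain Python, one possible relation a zk-SNARK circuit could enforce for the example rule above. The SHA-256 commitment, the parameter names, and the treatment of the model output as part of the private witness are all assumptions, and how the model prediction itself would be trusted is left open.

```python
import hashlib
import hmac

MIN_CONFIDENCE = 0.95  # threshold quoted in the example rule


def commit(label: str, salt: bytes) -> bytes:
    """Hash commitment to a label; the salt hides low-entropy labels."""
    return hashlib.sha256(salt + label.encode("utf-8")).digest()


def relation_holds(commitment, model_label, model_confidence, label, salt):
    """Relation the circuit would check (evaluated here in the clear).

    Public input: commitment.
    Private witness (assumed): label, salt, and the model output.
    """
    opens = hmac.compare_digest(commit(label, salt), commitment)
    return opens and label == model_label and model_confidence >= MIN_CONFIDENCE
```

In a real system the verifier would see only the commitment and the proof; here every value is visible, which is exactly what a zk-SNARK would avoid.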

In addition, we recognize that not all annotators may have the skills or hardware to create ZKPs, which could exclude users, a significant concern with the major centralized platforms. Since DAP prioritizes accessibility, ZKPs are optional and apply only to privacy-sensitive tasks where the data provider requires them (Milestone 3). Standard tasks use cross-verification and AI checks, requiring no such expertise. If ZKP generation fails (e.g., on low-end devices) [...]. Post-launch, we’ll also provide tutorials to help annotators build their skills, reinforcing user empowerment and inclusivity.

@PieWol
Member

PieWol commented Mar 26, 2025

Hey @BlockchainViper,

Unfortunately, your application couldn't gather enough approvals from the grants committee, despite the long time it's been up for debate. Please let me know if you have any questions.

All the best for your future endeavours.

@PieWol PieWol closed this Mar 26, 2025
@BlockchainViper
Author

BlockchainViper commented Mar 26, 2025

Hey @PieWol, can you please cross-check? It does not seem like it has been up for a vote yet. Last I saw, it was still in 'admin review'.

@PieWol PieWol added the ready for review The project is ready to be reviewed by the committee members. label Mar 27, 2025
@PieWol
Member

PieWol commented Mar 27, 2025

Hey @BlockchainViper,

I understand where you're coming from, since we didn't add the "ready for review" label here. Instead of the regular asynchronous reviews from the team, we decided to simply discuss it in one of our team meetings. I hope this clarifies things and makes you less concerned about not receiving a proper review from each team member within this GitHub PR directly.

For the sake of completeness, I will now add the missing label, but no additional reviews will take place. Again, we wish you all the best for your future endeavours within the Polkadot ecosystem and appreciate your application.

@BlockchainViper
Author

@PieWol Thanks for the clarification. No need to add the label. I just wanted to understand the process. I appreciate the explanation.
