
Releases: macrocosm-os/data-universe

Into S3 buckets

22 Apr 19:33
3dd11bf
  • Enabled uploads into S3 buckets.
  • We will start with dual uploads (HF + S3) so as not to disrupt the current products.
  • Once everything works correctly and securely, we will disable HF uploads.
  • If miners can upload the data with no issues, we will deprecate the HF rewards and enable S3 validation (without dehydration, so you will upload the data much faster) with **keyword DD**, which will move us away from hashtags/subreddits on data you have already uploaded.
  • Prevent potential media content spoofing in X tweet validation.

Media Universe S3 Infra

18 Apr 15:49
2d48a6d

Release 1.9.0 – Media Universe

This release introduces the infrastructure required for full migration to S3 storage and launches the Media Universe — an extension of our tweet model and validation system with native support for tweet media.

🚧 Infrastructure (S3 Auth, Signatures, Delay)

This release sets up the core S3 upload mechanism using presigned POST policies authenticated with Bittensor commitments and Keypair signatures.

Uploads are paused until after Easter to allow time for testing.

We will begin dual-storage (S3 + HF) temporarily and fully migrate after testing.

Keypair signing integration will be finalized before uploads resume, securing all interactions.
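
As a rough illustration of this flow, here is a minimal sketch of requesting a presigned POST policy with a hotkey-signed payload. The endpoint URL, payload fields, and message format are assumptions for illustration, not the actual SN13 API.

```python
# Hypothetical sketch: obtain a presigned POST policy using a Bittensor
# hotkey signature, then upload with the returned fields.
import time
import requests
import bittensor as bt

wallet = bt.wallet(name="miner", hotkey="default")
hotkey = wallet.hotkey  # a substrate Keypair with .sign()

timestamp = str(int(time.time()))
message = f"s3:upload:{hotkey.ss58_address}:{timestamp}"  # assumed format
signature = hotkey.sign(message.encode()).hex()

# The auth service (placeholder URL) would verify the signature against the
# miner's on-chain commitment before issuing a presigned POST policy.
policy = requests.post(
    "https://s3-auth.example.com/get-upload-policy",  # placeholder
    json={
        "hotkey": hotkey.ss58_address,
        "timestamp": timestamp,
        "signature": signature,
    },
).json()

# Upload a file using the presigned POST URL and fields.
with open("data.parquet", "rb") as f:
    requests.post(policy["url"], data=policy["fields"], files={"file": f})
```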

📸 Media Universe: Native Support for Tweet Media

We’re launching media validation in tweets to support richer content comparison and enable future CV-based models.

  1. Added Media Field to XContent
     media: Optional[List[str]] added to the content schema
     Uses exclude_none=True for full backward compatibility

  2. Introduced MEDIA_REQUIRED_DATE
     New constant MEDIA_REQUIRED_DATE = 2025-05-15
     Media validation only applies to tweets after this date

  3. Scraper Upgrades
     Updated ApiDojoTwitterScraper and MicroworldsTwitterScraper to extract media_urls
     Validators now extract and store media for comparison

  4. Validation Logic
     validate_tweet_content() now checks (see the sketch after this list):
     • If the validator has media, the miner must too
     • If both have media, the media counts must match
     • Older tweets skip media validation

  5. Tests Added
     Media extraction tests
     Media validation tests
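
Below is a minimal sketch of the new media field and the date-gated checks described above, assuming a pydantic-style XContent model; the model's other fields and helper names are illustrative.

```python
# Sketch only: the field and constant names follow the release notes,
# surrounding details are assumptions.
import datetime as dt
from typing import List, Optional
from pydantic import BaseModel

MEDIA_REQUIRED_DATE = dt.datetime(2025, 5, 15, tzinfo=dt.timezone.utc)

class XContent(BaseModel):
    url: str
    text: str
    timestamp: dt.datetime
    media: Optional[List[str]] = None  # new optional field

def media_checks_pass(validator_tweet: XContent, miner_tweet: XContent) -> bool:
    # Older tweets skip media validation entirely.
    if validator_tweet.timestamp < MEDIA_REQUIRED_DATE:
        return True
    # If the validator found media, the miner must report it too.
    if validator_tweet.media and not miner_tweet.media:
        return False
    # If both have media, the counts must match.
    if validator_tweet.media and miner_tweet.media:
        return len(validator_tweet.media) == len(miner_tweet.media)
    return True

# exclude_none=True omits the media field when unset, so records produced
# by older miners serialize and load unchanged.
old_record = XContent(
    url="https://x.com/i/status/1",
    text="hello",
    timestamp=dt.datetime(2025, 4, 1, tzinfo=dt.timezone.utc),
)
print(old_record.json(exclude_none=True))
```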

✅ How This Works With Existing Data
  • No migration required
  • Optional field means old records still load fine
  • Validation is gradual, with enforcement starting May 15

We recommend miners start integrating media support immediately.

Media Universe is live.
S3 infrastructure is ready.
Uploads enabled after Easter.

Update the docs with S3 storage implementation

09 Apr 20:47
12a2694

Update the docs with S3 storage implementation

Release 1.8.1

17 Mar 16:35
443c293

Enhanced On-Demand API Release Announcement
New X/Twitter Data Enrichment Feature
We're excited to announce a significant upgrade to the Data Universe On-Demand API! Starting tomorrow, miners will be able to test this enhanced functionality on testnet using their existing hotkeys.
What's New
The enhanced API now delivers substantially richer X/Twitter content (a payload sketch appears after the list), including:

Comprehensive User Metadata

User display names, verification status, follower counts
Profile details and engagement metrics

Complete Tweet Context

Full engagement metrics (likes, retweets, replies, quotes, views)
Tweet classification (reply, quote, retweet)
Conversation threading information

Rich Media Support

Media URLs and content types
Support for photos and videos

Enhanced Value

More valuable data for validators and users
Better content analysis possibilities
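
As a rough picture of what this enrichment looks like in practice, here is an illustrative payload; the field names are assumptions, not the exact response schema.

```python
# Illustrative shape of an enriched tweet record; not the exact schema.
enriched_tweet = {
    "user": {
        "display_name": "Example User",
        "verified": True,
        "followers_count": 1234,
    },
    "engagement": {
        "likes": 10, "retweets": 2, "replies": 1, "quotes": 0, "views": 500,
    },
    "tweet_type": {"is_reply": False, "is_quote": False, "is_retweet": False},
    "conversation_id": "1790000000000000000",
    "media": [
        {"url": "https://pbs.twimg.com/media/example.jpg", "type": "photo"},
    ],
}
```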

How to Get Started
Miners can test this enhanced functionality on testnet starting tomorrow using their existing hotkeys. Installation is simple and backward compatible - the enhanced scraper will be available for all X/Twitter requests while Reddit functionality remains unchanged.
Implementation Benefits

Higher Quality Data: Deliver richer, more valuable content to validators
Competitive Edge: Enhanced content can lead to better scores in validation
Future-Ready: Positioning for upcoming data quality measurements

Miner Policy

To launch Gravity as a commercial product, we need to adhere to legal guidelines. The miner policy provides SN13 with a legal basis and outlines measures that miners should adhere to when scraping.
The miner policy is now displayed in the SN13 docs. It prohibits the scraping of harmful or illegal content and outlines the legal responsibilities of data collection. We ask that you read it over and, should you need to, make appropriate changes immediately.
Datasets uploaded to Hugging Face now display the Macrocosmos Miner Policy in dataset cards.
API Improvements


New Endpoint: list_hf_repo_names
Returns the list of distinct miner Hugging Face repos currently stored by the Validator.
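
For illustration, a call might look like the following; the base URL and auth header are placeholders, not the documented API surface.

```python
# Hypothetical usage of the new endpoint; URL and auth are assumptions.
import requests

resp = requests.get(
    "http://localhost:8000/list_hf_repo_names",  # placeholder validator address
    headers={"X-API-Key": "your-api-key"},       # assumed auth scheme
)
print(resp.json())  # e.g. ["miner-a/dataset", "miner-b/dataset"]
```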

Release 1.8.0

03 Mar 22:08
e26dd8c

Release 1.8.0

Key Enhancements

API Database Stability Fix
  • Fixed critical issues with the API key database system
  • Implemented more robust database initialization
  • Added improved error handling for database operations

Enhanced On-Demand Data Verification
  • Added validator verification when miners return empty results (sketched below)
  • Penalizes miners who fail to return data that actually exists
  • Provides users with data even when miners fail to deliver
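
A minimal sketch of the empty-result path, with the scraping and penalty plumbing stubbed out as assumptions:

```python
# Sketch only: when a miner returns nothing, the validator checks the
# source itself; helper names are hypothetical.
def handle_empty_result(request, miner, validator_scrape, apply_penalty):
    ground_truth = validator_scrape(request)  # validator queries the source directly
    if ground_truth:
        # Data exists, so the miner failed to deliver it.
        apply_penalty(miner)
        return ground_truth  # still serve real data to the user
    return []  # the query genuinely has no data
```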

Release 1.7.9

27 Feb 17:33
23d5cd6

In this update, we added improvements to on-demand data requests to deliver better results to API users and to support future collaborations with other subnets that will make use of this feature.

On-Demand Request Changes:

  • Now queries 5 miners instead of just 1 for each request (random coldkeys)
  • Added consistency checks between miner responses
  • Implemented occasional validation of returned data (5% of requests)
  • Added small credibility penalties for miners who return bad data
  • Improved handling of empty results and non-existent data queries
  • Better selection logic to return the most reliable data to users

The process is as follows (sketched in code after this list):

  • Select up to 5 diverse miners from the top 60% of performers (by coldkey)
  • Query all selected miners with the same request
  • Check consistency among responses (within 30% of the median)
  • Validate data in 5% of cases or when consistency is poor
  • Apply small credibility penalties for bad data (0.01-0.05)
  • Choose the best data to return from the following:
    a. Validated miners with the highest score
    b. Consistent miners with the most data
    c. The median response when inconsistent
  • Return unique results to the API user
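
```python
# Illustrative sketch of the flow above; miner objects, query(), and
# validate() are stand-ins, thresholds follow the release notes.
import random
import statistics

VALIDATION_RATE = 0.05        # validate 5% of requests
CONSISTENCY_BAND = 0.30       # within 30% of the median response size
PENALTY_RANGE = (0.01, 0.05)  # small credibility penalty for bad data

def handle_on_demand(request, miners, query, validate):
    # Select up to 5 miners with distinct coldkeys from the top 60% performers.
    ranked = sorted(miners, key=lambda m: m.score, reverse=True)
    top = ranked[: max(1, int(len(ranked) * 0.6))]
    by_coldkey = {m.coldkey: m for m in top}
    selected = random.sample(list(by_coldkey.values()), k=min(5, len(by_coldkey)))

    # Query every selected miner with the same request.
    responses = {m: query(m, request) for m in selected}

    # Consistency check: keep responses within 30% of the median size.
    median_size = statistics.median(len(r) for r in responses.values())
    consistent = {
        m: r for m, r in responses.items()
        if median_size and abs(len(r) - median_size) <= CONSISTENCY_BAND * median_size
    }

    # Occasionally validate, or always when responses disagree.
    if random.random() < VALIDATION_RATE or not consistent:
        for m, r in responses.items():
            if not validate(r):
                m.credibility -= random.uniform(*PENALTY_RANGE)

    # Prefer the consistent miner with the most data; otherwise fall back
    # to the response closest to the median.
    if consistent:
        best = max(consistent.values(), key=len)
    else:
        best = min(responses.values(), key=lambda r: abs(len(r) - median_size))

    # Return unique results to the API user.
    return list(dict.fromkeys(best))
```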

Release 1.7.85

25 Feb 19:38
33f5225

In this update:

  • Removed labels longer than 140 characters from Dynamic Desirability uploads and retrieval.
  • Fixed a datetime fromisoformat error when the commit date is more than 19 hours old.
    No action needed from miners.

Release 1.7.84

20 Feb 15:40
11b25be
  • A label weight can have a max value of 5 when incentivized by Dynamic Desirability (see the sketch after this list)
  • Changed the label length limit from 32 to 140 characters
  • Filter out "Unexpected header key encountered" logs
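
As a small illustration of the new limits, a hypothetical intake check might look like this; the function name is an assumption.

```python
# Hypothetical enforcement of the new Dynamic Desirability limits.
MAX_LABEL_WEIGHT = 5    # max incentivized weight per label
MAX_LABEL_LENGTH = 140  # raised from 32 characters

def accept_label(label: str, weight: float):
    if len(label) > MAX_LABEL_LENGTH:
        return None  # over-long labels are dropped
    return min(weight, MAX_LABEL_WEIGHT)
```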

Release 1.7.83

19 Feb 17:06
dfb2fb1

  • Temporarily removed the parquet check.
  • Changed the base miner code to upload data every 17 hours.
  • Increased the max total dynamic desirability value from 100 to 250.

Hotfix of < vs >

17 Feb 22:34
89fe222

Hotfix of < vs >