Add community.lexicon.preference.ai lexicon by ngerakines · Pull Request #72 · lexicon-community/lexicon

ngerakines · 2026-04-04T18:51:47Z

Summary

- Introduces the community.lexicon.preference.ai lexicon for declaring user preferences regarding AI usage of their public data.
- Decomposes AI usage into four distinct categories (training, inference, synthetic content generation, embedding), each with independent allow/deny controls.
- Supports scoped overrides via globalScope, entityScope, and collectionScope so users can set account-wide defaults and carve out exceptions for specific entities or collections.

Design

Each preference is tri-state: allowed, denied, or undefined (omitted). The record at key self with globalScope establishes account-wide defaults. Additional records keyed by TID are scoped overrides that only need to declare the preferences they change — everything else falls through to the default.
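A minimal sketch of the fall-through described above, not taken from the lexicon itself. The category key names (`training`, `inference`, `syntheticContent`, `embedding`) and the plain-dict record shape are assumptions for illustration; the tri-state is modelled as `"allowed"`, `"denied"`, or an omitted key.

```python
# Hypothetical key names for the four usage categories; the real lexicon may differ.
CATEGORIES = ("training", "inference", "syntheticContent", "embedding")


def effective(default: dict, override: dict) -> dict:
    """Overlay a scoped override on the account-wide default, category by category."""
    result = {}
    for category in CATEGORIES:
        # An omitted key means "undefined", so the default's value (if any) applies.
        result[category] = override.get(category, default.get(category))
    return result


# Record at key "self" with globalScope: the account-wide defaults.
global_default = {"training": "denied", "inference": "allowed"}

# Scoped override keyed by a TID: declares only the preference it changes.
scoped_override = {"training": "allowed"}

merged = effective(global_default, scoped_override)
```

Here `merged["training"]` comes from the override, `merged["inference"]` falls through to the default, and the two categories that neither record declares stay undefined (`None`).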

Consumer resolution order:

1. Entity-scoped override matching the consumer's DID or domain
2. Collection-scoped override matching the content's NSID
3. Global default at key self
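The resolution order above could be sketched from a consumer's point of view as follows. This is an illustration under stated assumptions, not the lexicon's actual schema: the `scope`, `type`, `value`, and `prefs` field names, the placeholder record keys, and the example DIDs are all hypothetical. It also assumes that resolution is per category, so a preference left undeclared by an entity-scoped override is next looked up in a matching collection-scoped override before reaching the global default.

```python
def resolve(records, category, consumer, collection):
    """Return "allowed", "denied", or None (undefined) for one usage category."""

    def in_scope(record, scope_type, value=None):
        scope = record.get("scope", {})
        if scope.get("type") != scope_type:
            return False
        return value is None or scope.get("value") == value

    # Order matters: entity override, then collection override, then global default.
    chain = (
        [r for r in records if in_scope(r, "entityScope", consumer)]
        + [r for r in records if in_scope(r, "collectionScope", collection)]
        + [r for r in records if in_scope(r, "globalScope")]
    )
    for record in chain:
        value = record.get("prefs", {}).get(category)
        if value is not None:
            return value  # first record that declares this category wins
    return None  # nobody declared it: the preference is undefined


records = [
    # Global default at key "self".
    {"rkey": "self", "scope": {"type": "globalScope"},
     "prefs": {"training": "denied", "inference": "allowed"}},
    # Entity-scoped override (placeholder key standing in for a TID).
    {"rkey": "tid-1", "scope": {"type": "entityScope", "value": "did:example:lab"},
     "prefs": {"training": "allowed"}},
    # Collection-scoped override (placeholder key standing in for a TID).
    {"rkey": "tid-2", "scope": {"type": "collectionScope", "value": "app.bsky.feed.post"},
     "prefs": {"inference": "denied"}},
]

lab_training = resolve(records, "training", "did:example:lab", "app.bsky.feed.post")
lab_inference = resolve(records, "inference", "did:example:lab", "app.bsky.feed.post")
other_training = resolve(records, "training", "did:example:other", "app.bsky.feed.post")
other_embedding = resolve(records, "embedding", "did:example:other", "app.bsky.feed.post")
```

With these records, `did:example:lab` is allowed to train (entity override), is denied inference on posts (collection override), any other consumer is denied training (global default), and embedding stays undefined for everyone.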

Related work

Complementary to Bluesky Proposal 0008: User Intents for Data Reuse
IETF AI Preferences working group

snarfed · 2026-04-04T22:31:46Z

Exciting! lexicon.community could be a great home for this, esp since it stalled within Bluesky PBC.

I'd have to think a bit more to fully grok the scopes and usage types, but my main first thought is, if we go to all the effort of a working group etc, maybe we should go ahead and include the other two intents in https://github.com/bluesky-social/proposals/blob/main/0008-user-intents/README.md too, bulk datasets and protocol bridging?

rudyfraser · 2026-04-04T23:08:42Z

LGTM besides the matter of default values and maybe how omission should be interpreted; agree with @snarfed on other intents being in scope. Thanks for the quick turnaround.

musicjunkieg · 2026-04-05T06:49:57Z

I'm not certain this language is quite as clear as it should be; as models continue to change in terms of their creation primitives, what seems reasonable now to split between inference and training may not seem that way in 12-18 months.

I think many of the thoughts in "Focus on purpose of use rather than time of ingestion - IETF AIPREF WG #159" as well as "Replace current vocabulary with a display-based preferences vocabulary" have very effective ideas. I'd like to see this concept fleshed out a bit, especially if the goal is to make this legible to regular users.

Things like "scientific use: true" or "generative fiction use: false" may be more relevant than the time of ingestion frames currently used here.

sposth · 2026-04-23T07:10:46Z

As mentioned above, the work at the IETF is highly relevant – not only with regard to the vocabulary itself, but also the attachment mechanisms.

https://datatracker.ietf.org/doc/draft-ietf-aipref-attach/
https://datatracker.ietf.org/doc/draft-ietf-aipref-vocab/05/

Following the IETF meeting in Toronto a couple of days ago, a new editor’s draft is expected soon. It will include several improvements, in particular on the discoverability of content that has been opted out in the context of search.

This is an important point. Many users may wish to opt out of AI training while still remaining discoverable in what the IETF may call “non-generative search” – meaning AI-assisted search that does not provide AI-generated summaries, synthetic answers, or other substitute outputs. In practice, this can be understood as a narrower form of the IETF Internet draft on display-based preferences mentioned already by @musicjunkieg.

The use cases around RAG and inference – where content is used by AI systems after model training – are expected to be discussed at a future IETF meeting, likely in late summer. That discussion should help clarify whether the emerging IETF vocabulary will provide meaningful value for creators and rightsholders, who would like to have a say in how content is used by AI systems post-training.

Adding new AI preference expressions may be desirable; however, it should be considered whether AI model developers, AI system providers, or search engines will actually take them into account. I suggest a realistic approach in this regard.

A second point concerns the attachment mechanism. Should AI preferences be applied only as general account-level settings, or should they rather be attachable to individual posts and media assets? This distinction matters. Content is frequently shared, quoted, or reposted by accounts that are not in a position to decide on rights reservations or permissions. For that reason, attaching such preferences solely at account level may create both practical and legal concerns.

At Liccium, we are working on an asset-level approach in which AI preferences can be bound directly to the individual post and to the underlying media asset (blob) using ISCC fingerprints. This allows preferences to travel with the content itself, rather than depending only on the account or the platform through which it was shared.

feature: introduction of the community.lexicon.preference.ai lexicon

9d5c583

ngerakines merged commit 02044ea into main Apr 25, 2026
1 check failed

ngerakines merged 1 commit into main from ngerakines/community.lexicon.preference.ai
