Heterod0x/oto-open-full-duplex-spontaneous-conversation-datasets

OTO Open Full‑Duplex Spontaneous Conversation Datasets

This repository hosts open discussions and documentation for designing, collecting, evaluating, and (where possible) releasing large‑scale, full‑duplex spontaneous conversation datasets in English. The project is led by oto (our startup), with a long‑term goal of 10 million hours of natural, open‑domain small‑talk.

Overview

  • Purpose: Build shared understanding and community consensus around dataset requirements, collection design, evaluation, and release policies for full‑duplex conversations.
  • Focus: English, spontaneous and natural small‑talk audio conversations.
  • Where discussions happen: Primarily on Hugging Face Discussions (link TBA).

What is full‑duplex?

Full‑duplex means both parties can speak at the same time without waiting for the other to finish. This allows overlaps, backchannels, interruptions, and collaborative repairs, which is closer to real human talk than half‑duplex, turn‑by‑turn systems.
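As a rough illustration of what "speaking at the same time" looks like in data, the sketch below finds overlapping speech between two speakers. It assumes each speaker's speech is already available as time-stamped segments on a separate track. The `Segment` type, the `find_overlaps` helper, and the sample timings are hypothetical and are not part of any oto specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    """A stretch of speech by one speaker (hypothetical structure)."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def find_overlaps(a: List[Segment], b: List[Segment]) -> List[Tuple[float, float]]:
    """Return the (start, end) intervals where a segment from `a`
    and a segment from `b` are active at the same time."""
    overlaps = []
    for seg_a in a:
        for seg_b in b:
            start = max(seg_a.start, seg_b.start)
            end = min(seg_a.end, seg_b.end)
            if start < end:  # non-empty intersection means simultaneous speech
                overlaps.append((start, end))
    return sorted(overlaps)


# Illustrative timings only: B starts talking before A finishes,
# then gives a short backchannel during A's second turn.
speaker_a = [Segment("A", 0.0, 2.5), Segment("A", 4.0, 6.0)]
speaker_b = [Segment("B", 2.0, 3.0), Segment("B", 4.5, 4.8)]

overlaps = find_overlaps(speaker_a, speaker_b)
print(overlaps)
```

In a strictly half‑duplex, turn‑by‑turn recording this list would be empty. In full‑duplex data, such intervals are the overlaps and backchannels the dataset aims to preserve.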

Goals and milestones

  • Short term: Define requirements, collection design, and ethics/privacy standards.
  • Mid term: Establish prototype-scale collection, evaluation procedures, and metadata specs.
  • Long term: Grow toward 10M hours and release datasets in staged, policy‑compliant ways.

Language roadmap

We will begin with English as the first step, then expand to multiple languages in phases. The expansion pace and language priority will be guided by community discussion, ethical considerations, and data availability. Our long‑term goal is a diverse, multilingual, and representative corpus while maintaining strong privacy and consent standards.

Scope

  • Spontaneous small‑talk (task‑free or loose‑goal)
  • Conversation phenomena: overlap, backchannels, repairs, silences
  • Audio plus metadata (recording conditions, anonymized speaker attributes, etc.)
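To make the "audio plus metadata" item more concrete, here is a minimal sketch of what a per-conversation metadata record might look like. The metadata spec has not been defined yet, so every field name and value below is a hypothetical placeholder intended only to seed discussion.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class SpeakerInfo:
    """Anonymized speaker attributes (hypothetical fields)."""
    speaker_id: str  # pseudonymous ID, never a real name
    age_band: str    # coarse band rather than an exact age


@dataclass
class ConversationMetadata:
    """One record per conversation (hypothetical fields)."""
    conversation_id: str
    language: str
    duration_sec: float
    sample_rate_hz: int
    recording_environment: str  # free-text description of recording conditions
    speakers: List[SpeakerInfo] = field(default_factory=list)


record = ConversationMetadata(
    conversation_id="conv_000001",
    language="en",
    duration_sec=612.4,
    sample_rate_hz=16000,
    recording_environment="quiet room, headset microphone",
    speakers=[
        SpeakerInfo(speaker_id="spk_0001", age_band="30-39"),
        SpeakerInfo(speaker_id="spk_0002", age_band="20-29"),
    ],
)

# asdict() converts nested dataclasses too, so the record serializes directly.
encoded = json.dumps(asdict(record), indent=2)
print(encoded)
```

Using coarse bands and pseudonymous IDs instead of exact values is one possible way to reflect the privacy goals stated in this README. The actual attributes and their granularity are left to community discussion.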

Out of scope (for now)

  • Unrestricted distribution of raw, personally identifying data
  • Heavily scripted read‑aloud data as the only source

Ethics and privacy

We prioritize participant privacy and ethics. Collection and releases will follow applicable laws and platform policies, with clear consent, anonymization, and redistribution terms. Detailed policies will be codified through community discussion.

How to participate

  1. Join the discussions on Hugging Face (proposals, questions, debates)
  2. Use GitHub Issues for problem statements and requests
  3. Send Pull Requests to improve documentation and integrate proposals

Links (will be updated)

  • Hugging Face Discussions: TBA
  • oto (service): TBA

Contribution guidelines

  • When opening a new Issue or Discussion, briefly state background, purpose, and expected impact.
  • Keep PRs minimal and link them to related discussions or issues.
  • Contributions in English or Japanese are welcome.

About OTO (for Hugging Face Organization Card)

OTO is a project focused on end‑to‑end, full‑duplex conversational AI and large‑scale spontaneous dialogue datasets. We explore speech processing across overlapping talk, backchannels, interruptions, and collaborative repair, all of which are essential to human‑like interaction. Our work emphasizes:

  • privacy‑first data collection and release policies,
  • reproducible evaluation and metadata specs,
  • open community discussions to shape standards and best practices.

We build on modern deep learning tooling and follow rigorous data processing, feature design, and recipe‑driven workflows to enable comprehensive experimentation in full‑duplex conversation research. If you are interested in contributing datasets, evaluation ideas, or tooling, please join the discussions.

License

Licensing for discussions and design docs is being finalized. Dataset releases may carry source‑specific terms. Once decided, this README and LICENSE will be updated.

Maintainers

  • oto (startup) / Maintainers: TBA

FAQ

Q. When will data be released?
A. Releases will be staged based on ethics review, anonymization, and rights clearance. Timing will be updated as discussions progress.

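Q. Are non‑English languages included?
A. Not yet. Our initial focus is English. As described in the Language roadmap, we plan to expand to other languages in phases, with pace and priority guided by community discussion, ethical considerations, and data availability.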
