tutubo

YouTube and YouTube Music metadata library. Searches videos, channels, playlists, music tracks, albums, and artists — with per-item content-type classification and lazy channel tab iteration. No pytube dependency.

Install

pip install tutubo
# Downloading requires yt-dlp (optional):
pip install yt-dlp
# Stealth transport — browser-fingerprinted TLS via curl_cffi (optional):
pip install tutubo[stealth]

30-second quickstart

from tutubo import YoutubeSearch

# Search and classify
for v in YoutubeSearch("rob zombie").iterate_videos(max_res=5):
    print(v.title, v.length, v.published_time)
    print("  content_type:", v.content_type)
    print("  badges:", v.badges)

# Intent-focused factory — appends "full movie" to the query
for v in YoutubeSearch.for_movies("blade runner").iterate_movies(max_res=3):
    print(v.title, v.length)

High-level shape

Class	What it gives you
`YoutubeSearch`	Searches youtube.com; yields `VideoPreview`, `ChannelPreview`, `PlaylistPreview`, mixes, related queries
`YoutubeMusicSearch`	Searches music.youtube.com; yields `MusicTrack`, `MusicAlbum`, `MusicArtist`, `MusicPlaylist`, `MusicVideo`
`Channel`	Fetches a channel page; exposes `.videos`, `.shorts`, `.streams`, `.live`, `.playlists`, `.podcasts`
`Playlist`	Fetches a playlist page; exposes `.videos` (lazy-paginated generator)
`ContentType`	30-value enum inferred from title, duration, badges, and channel tags

Key invariants

A Video from a channel tab has view_count as a human string (e.g. "31K views") and published_time as a relative string (e.g. "5 hours ago"). It has no length field — duration is not available from channel-page renderers.
A VideoPreview from a search result has length as seconds (int) and view_count as an exact integer.
MusicAlbum is a subclass of MusicPlaylist. Both expose .tracks as a list[MusicTrack], .track_count, .year, and .playlist_url.
Channel.live — fetches /@handle/live (a watch-page redirect); returns one Video or None — the currently on-air stream.
Channel.streams — fetches /@handle/streams (a browse tab); returns a lazy list of all livestream videos (past + current).
VideoPreview.content_type does not use channel tags (search results don't include them). Video.content_type (from Channel.videos) does, giving better accuracy for ambiguous titles.

Factory classmethods — 24 intent-focused search shortcuts

Every factory appends a keyword phrase to improve YouTube's ranking. Pair with the matching typed iterator for content-type enforcement:

from tutubo import YoutubeSearch

for v in YoutubeSearch.for_concerts("black sabbath").iterate_concerts(max_res=5):
    print(v.title, v.length)

for v in YoutubeSearch.for_podcasts("lex fridman").iterate_podcasts():
    print(v.title)

All 24 factories: for_movies, for_short_films, for_trailers, for_documentaries, for_behind_the_scenes, for_anime, for_tv_episodes, for_audiobooks, for_audio_dramas, for_podcasts, for_stand_up, for_interviews, for_lectures, for_concerts, for_news, for_live_news, for_sport, for_gaming, for_tutorials, for_reactions, for_compilations, for_kids, for_music_videos, for_music_audio. See docs/search.md.

YouTube Music search

YoutubeMusicSearch is a separate class from YoutubeSearch. It queries the YouTube Music API and returns structured music objects:

from tutubo import YoutubeMusicSearch

s = YoutubeMusicSearch("black sabbath paranoid")

for track in s.iterate_tracks(max_res=5):
    print(track.title, track.artist, track.length)
    print("  audio_only:", track.is_audio_only, "| music_video:", track.is_music_video)

for album in s.iterate_albums(max_res=3):
    print(album.title, album.artist, album.year, f"({album.track_count} tracks)")
    for t in album.tracks:
        print(f"  {t.track_number}. {t.title} [{t.length}s]")

for artist in s.iterate_artists(max_res=2):
    print(artist.name, artist.subscribers)

Channel metadata and tab iteration

from tutubo import Channel

c = Channel("https://www.youtube.com/@Metallica")
print(c.channel_name, c.subscribers, c.video_count_label)
print("keywords:", c.keywords[:5])
print("rss:", c.rss_url)

# Regular uploads
for video in c.videos:
    print(video.title, video.view_count, video.published_time)
    print("  content_type:", video.content_type)

# Currently on-air stream — one Video or None (reads /@handle/live)
live = c.live
if live:
    print("LIVE:", live.title, live.watch_url)

# Full stream archive — paginated list from /@handle/streams
for stream in c.streams:
    print(stream.title, stream.is_live)

# Podcast shows
c2 = Channel("https://www.youtube.com/@TheDissenterRL")
for pod in c2.podcasts:
    print(pod.title, pod.episode_count)
    pl = pod.get()
    for ep in pl.videos:
        print("  episode:", ep.watch_url)
        break

mediavocab integration

mediavocab is a hard runtime dependency. It provides ContentType, classify_video(), parse_title(), extract_tags(), and the Work / Release / Entity data model.

# ContentType and classifiers re-exported from tutubo for convenience
from tutubo import ContentType, classify_video, parse_title, extract_tags

# Convert a search result to typed mediavocab objects
from tutubo import YoutubeSearch

for v in YoutubeSearch.for_movies("nosferatu").iterate_movies(max_res=3):
    work = v.to_work()       # mediavocab.Work
    release = v.to_release() # mediavocab.Release
    print(work.title, work.year, work.media_type)
    print(release.resolution, release.accessibility)
    print(release.external_ids)   # {"youtube": "<video_id>"}

Badge → resolution mapping: "4K" → "2160p", "8K" → "4320p", "HD" → "1080p". CC badge → AccessibilityTrack(kind="captions"). See docs/mediavocab.md.

Pluggable session and `TUTUBO_TRANSPORT`

Channel and playlist HTML pages are fetched via a pluggable session (requests.Session by default). Set TUTUBO_TRANSPORT=curl_cffi to use browser-fingerprinted TLS:

export TUTUBO_TRANSPORT=curl_cffi   # requires: pip install tutubo[stealth]

Or inject a session object directly:

from curl_cffi import requests as cffi_requests
from tutubo.channel import Channel

ch = Channel("https://www.youtube.com/@LinusTechTips",
             session=cffi_requests.Session(impersonate="chrome"))

Note: tutubo._innertube._post (the search path) uses stdlib urllib.request and is not affected by the transport setting. See docs/transport.md.

Configuration

Env var	Effect
`TUTUBO_TRANSPORT`	Set to `curl_cffi` to enable stealth transport for channel/playlist fetches
`MEDIAVOCAB_LANG`	Default language for classification (e.g. `es-es`, `fr-fr`); default `en-us`

Examples

File	What it shows
`examples/01_quickstart.py`	Search, result types, content_type, dict interface
`examples/02_search_factories.py`	24 intent-focused factory methods + typed iterators
`examples/03_channel.py`	Channel metadata, videos tab, `Channel.live` vs `Channel.streams`
`examples/04_playlist.py`	Channel playlists and direct playlist iteration
`examples/05_podcasts.py`	Podcast shows, episode listing, `is_podcast=True` classification
`examples/06_music_search.py`	`YoutubeMusicSearch` — tracks, artists, community playlists
`examples/07_music_album.py`	`MusicAlbum` with full track listing
`examples/08_fanedits.py`	Fan-edit detection via `parse_title()` + `VariantKind.FANEDIT`
`examples/09_to_mediavocab.py`	All mediavocab fields from `to_work()` / `to_release()`
`examples/10_custom_session.py`	Pluggable session, `TUTUBO_TRANSPORT`, `curl_cffi` injection
`examples/11_pipeline.py`	Full `parse_title` → `classify` → `to_routing()` → `Signals` pipeline

Documentation

docs/index.md — class index and overview
docs/search.md — YoutubeSearch, 24 factories, YoutubeMusicSearch
docs/channel.md — Channel, Playlist, Video, PodcastPreview
docs/models.md — all model types with typed field reference
docs/content_type.md — ContentType enum and classification internals
docs/mediavocab.md — to_work() / to_release() bridge
docs/transport.md — pluggable session and stealth transport
docs/locale.md — locale system and supported languages
docs/downloading.md — download() and download_playlist()
docs/testing.md — fixture-based offline testing

License

Apache 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
.github/workflows		.github/workflows
docs		docs
examples		examples
test		test
tutubo		tutubo
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

tutubo

Install

30-second quickstart

High-level shape

Key invariants

Factory classmethods — 24 intent-focused search shortcuts

YouTube Music search

Channel metadata and tab iteration

mediavocab integration

Pluggable session and `TUTUBO_TRANSPORT`

Configuration

Examples

Documentation

License

About

Uh oh!

Releases 22

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

tutubo

Install

30-second quickstart

High-level shape

Key invariants

Factory classmethods — 24 intent-focused search shortcuts

YouTube Music search

Channel metadata and tab iteration

mediavocab integration

Pluggable session and TUTUBO_TRANSPORT

Configuration

Examples

Documentation

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 22

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Pluggable session and `TUTUBO_TRANSPORT`

Packages