Handle 403 errors with youtube #187

Merged
NotJoeMartinez merged 10 commits into main from handle-bot-blocking-maybe-use-api
Jul 4, 2025

@NotJoeMartinez
Owner

This pull request introduces several improvements and updates to the yt-fts project, focusing on troubleshooting documentation, type hinting, dependency updates, and Python compatibility. The most significant changes include adding a comprehensive troubleshooting guide for 403 errors, updating type hints across multiple functions for better code clarity, upgrading dependencies, and increasing the minimum required Python version.

Randomizing user agent, retry method

Added yt-dlp config options to retry downloads and randomize the user agent.

Documentation Enhancements:

  • Added a new file docs/TROUBLESHOOTING_403.md with detailed explanations of 403 errors, diagnosis tools, common solutions, advanced troubleshooting steps, and prevention tips. This includes example workflows and error message references for user guidance.

Type Hinting Improvements:

  • Updated type hints across functions in src/yt_fts/config.py, src/yt_fts/db_utils.py, and src/yt_fts/export.py to improve code readability and enforce stricter type checking. Examples include specifying return types (str | None, list[tuple[int, str, str, str]]) and parameter types (channel_id: str, limit: int | None). [1] [2] [3]

Dependency Updates:

  • Upgraded dependencies in pyproject.toml:
    • openai updated from 1.35.3 to 1.93.0.
    • chromadb updated from 0.5.2 to 1.0.15.

Python Compatibility:

  • Increased the minimum required Python version from >=3.8 to >=3.10 in pyproject.toml to leverage newer language features and maintain compatibility with updated dependencies.

File Renaming:

  • Renamed yt_fts/config.py, yt_fts/db_utils.py, and yt_fts/export.py to src/yt_fts/config.py, src/yt_fts/db_utils.py, and src/yt_fts/export.py respectively, reflecting a change in project structure. [1] [2] [3]

…tics

- Introduced a new `diagnose` command to identify and troubleshoot 403 errors when accessing YouTube.
- Added detailed troubleshooting steps and recommendations in a new documentation file.
- Updated the download handler to improve error handling and retry logic for 403 and rate limiting issues.
- Enhanced the `.gitignore` file to include additional custom entries.
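
The retry behavior mentioned in this commit could look roughly like the sketch below. It is not the code in the download handler: `retry_with_backoff`, `is_retryable`, and the string-matching rule for 403 and 429 responses are assumptions made for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def is_retryable(exc: Exception) -> bool:
    """Assumption: treat errors mentioning HTTP 403 or 429 as worth retrying."""
    message = str(exc)
    return "403" in message or "429" in message


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying retryable failures with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on the last attempt or on errors that retrying cannot fix.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Delay doubles each attempt; jitter avoids a fixed request rhythm.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
            sleep(delay)
    raise RuntimeError("unreachable")
```

The `sleep` parameter is injectable so the wait logic can be tested without real delays.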
…ity functions

- Changed the minimum required Python version from 3.8 to 3.10 in `pyproject.toml`.
- Added type hints to various functions in `db_utils.py` for improved code clarity and type checking.
- Enhanced type annotations for functions in `config.py`, `export.py`, `get_embeddings.py`, `list.py`, `llm.py`, `search.py`, `summarize.py`, `utils.py`, and `yt_fts.py` to improve code clarity and type checking.
- Updated function signatures to specify return types and parameter types, ensuring better compatibility with type checkers and enhancing overall code quality.
- Moved `SummarizeHandler` to a new `summarize.py` file within the `llm` directory for better organization.
- Introduced `LLMHandler` in `chatbot.py` to manage interactions with the OpenAI API.
- Updated imports in `yt_fts.py` to reflect the new module structure.
- Added functionality for video summarization and transcript handling in the `SummarizeHandler` class.
- Cleaned up import statements across multiple files for consistency and clarity.
- Introduced a new `EmbeddingsHandler` class in `get_embeddings.py` to manage embedding functionalities.
- Updated references in `yt_fts.py`, `search.py`, and `chatbot.py` to align with the new import structure.
- Upgraded `openai` from version 1.35.3 to 1.93.0.
- Updated `chromadb` from version 0.5.2 to 1.0.15.
NotJoeMartinez requested a review from Copilot on July 4, 2025 05:16
NotJoeMartinez merged commit bdaeace into main on Jul 4, 2025
3 checks passed
NotJoeMartinez deleted the handle-bot-blocking-maybe-use-api branch on July 4, 2025 05:16
Copilot AI left a comment

Pull Request Overview

This PR enhances error handling for YouTube downloads, upgrades code clarity with type hints, and updates project dependencies and structure.

  • Introduces a thorough troubleshooting guide and a diagnose CLI command for 403 errors
  • Adds retry logic, user-agent randomization, and improved session handling in DownloadHandler
  • Applies type hints across functions and bumps Python requirement to 3.10, plus dependency upgrades
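
As a rough sketch of how retry settings and a randomized user agent can be expressed as yt-dlp options: `build_ydl_opts` and the `USER_AGENTS` list are hypothetical and the values are illustrative, since the actual options in `DownloadHandler` are not shown here. The keys `retries`, `extractor_retries`, `sleep_interval_requests`, and `http_headers` are documented `yt_dlp.YoutubeDL` parameters.

```python
import random

# Illustrative pool; any realistic browser strings would do.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
]


def build_ydl_opts(retries: int = 5) -> dict:
    """Return a yt-dlp options dict with retries and a randomly chosen User-Agent."""
    return {
        "quiet": True,
        "retries": retries,              # retries for failed downloads
        "extractor_retries": retries,    # retries for known extractor errors
        "sleep_interval_requests": 1,    # seconds to wait between requests
        "http_headers": {"User-Agent": random.choice(USER_AGENTS)},
    }
```

The resulting dict would be passed as `yt_dlp.YoutubeDL(build_ydl_opts())`. Each call picks a fresh user agent, so building the options per download rotates the header.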

Reviewed Changes

Copilot reviewed 14 out of 18 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| docs/TROUBLESHOOTING_403.md | New guide for diagnosing and resolving YouTube 403 errors |
| pyproject.toml | Increased Python minimum to 3.10; updated openai and chromadb |
| src/yt_fts/config.py | Added return type hints |
| src/yt_fts/db_utils.py | Added type hints; improved function signatures |
| src/yt_fts/utils.py | Added type hints for utility functions |
| src/yt_fts/yt_fts.py | Refactored imports, added diagnose command, and type hints |
| src/yt_fts/download/download_handler.py | New download handler with retry, UA options, and 403 diagnostics |
| src/yt_fts/search.py | Added type hints; reorganized imports |
| src/yt_fts/llm/summarize.py | Added type hints; adjusted imports |
| src/yt_fts/llm/get_embeddings.py | Added type hints; reorganized imports |
| src/yt_fts/llm/chatbot.py | Added type hints; reorganized imports |
| src/yt_fts/list.py | Added type hints; cleaned imports |
| src/yt_fts/export.py | Added type hints; refined export messages |
Comments suppressed due to low confidence (6)

src/yt_fts/download/download_handler.py:418

  • The placeholder tag !SLOP GENERATED in the docstring should be removed or replaced with a meaningful description.
        !SLOP GENERATED

src/yt_fts/yt_fts.py:108

  • The function name list shadows the built-in Python list type. Consider renaming this command handler to avoid confusion.
def list(transcript: str | None, channel: str | None, library: bool) -> None:

src/yt_fts/yt_fts.py:196

  • The parameter name format shadows the built-in format function. Consider renaming it (e.g., fmt) to avoid collisions.
def export(channel: str, format: str) -> None:

src/yt_fts/export.py:60

  • Fix spelling: Erorr should be Error in the console message.
            file_name = f"all_{timestamp}.csv"

src/yt_fts/export.py:81

  • Fix spelling: Erorr should be Error in the console message.
                metadata = get_metadata_from_db(video_id)

src/yt_fts/download/download_handler.py:257

  • The user_agent key in the vid_obj dict is never used downstream; consider removing it to keep the payload focused on needed fields.
                        'user_agent': 'random',
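
The two shadowing notes above (the `list` command and the `format` parameter) can be demonstrated with a small sketch. The functions below are simplified, hypothetical versions of `export`; they only show what goes wrong when a parameter reuses a built-in name and how a rename such as `fmt` avoids it.

```python
def export_shadowed(channel: str, format: str) -> str:
    """Inside this function, `format` is the string argument, not the built-in."""
    try:
        # Intended as a call to the built-in format(), but it calls the str argument.
        return format(3.14159, ".2f")
    except TypeError as exc:
        return f"TypeError: {exc}"


def export_renamed(channel: str, fmt: str) -> str:
    """With the parameter renamed, the built-in format() stays reachable."""
    return format(3.14159, ".2f") + f" ({fmt})"
```

The same reasoning applies to a module-level function named `list`: any later `list(...)` call in that module would hit the command handler instead of the built-in type. If the CLI framework allows the command name to be set separately from the function name, the function can be renamed without changing the user-facing command.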

Comment on lines +197 to +201
    ydl_opts = {
        'extract_flat': True,
        'quiet': True,
        'nocheckcertificate': True,
        'user_agent': 'random',
Copilot AI Jul 4, 2025

Ensure that 'user_agent': 'random' is a valid yt_dlp option. If the intent is to randomize the UA header, you may need to supply a list of agents and set http_headers: {'User-Agent': ...} instead.

Suggested change
- ydl_opts = {
-     'extract_flat': True,
-     'quiet': True,
-     'nocheckcertificate': True,
-     'user_agent': 'random',
+ import random
+ user_agents = [
+     "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
+     "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
+     "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
+     "Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Mobile/15E148 Safari/604.1",
+     "Mozilla/5.0 (iPad; CPU OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Mobile/15E148 Safari/604.1",
+ ]
+ selected_user_agent = random.choice(user_agents)
+ ydl_opts = {
+     'extract_flat': True,
+     'quiet': True,
+     'nocheckcertificate': True,
+     'http_headers': {'User-Agent': selected_user_agent},
