
Example CLI Commands


Content and Feed Inputs

Process Single Video URLs

Run on a single YouTube video.

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk"

Process Multiple Videos in YouTube Playlist

Run on multiple YouTube videos in a playlist.

npm run as -- --playlist "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr"

Run on playlist URL and generate JSON info file with markdown metadata of each video in the playlist:

npm run as -- --playlist "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr" --info

Process Multiple Videos Specified in a URLs File

Run on an arbitrary list of URLs in example-urls.md.

npm run as -- --urls "content/example-urls.md"
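
The URLs file is presumably just a plain list with one URL per line, along these lines (the exact contents of content/example-urls.md may differ, and the second entry below is a placeholder rather than a real video ID):

https://www.youtube.com/watch?v=MORMZXEaONk
https://www.youtube.com/watch?v=<another-video-id>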

Run on URLs file and generate JSON info file with markdown metadata of each video:

npm run as -- --urls "content/example-urls.md" --info

Process Single Audio or Video File

Run on audio.mp3 in the content directory:

npm run as -- --file "content/audio.mp3"

Process Podcast RSS Feed

Process RSS feed from newest to oldest (default behavior):

npm run as -- --rss "https://feeds.transistor.fm/fsjam-podcast/"

Process RSS feed from oldest to newest:

npm run as -- \
  --rss "https://feeds.transistor.fm/fsjam-podcast/" \
  --order oldest

Start processing a different episode by selecting a number of episodes to skip:

npm run as -- \
  --rss "https://feeds.transistor.fm/fsjam-podcast/" \
  --skip 1

Process a single specific episode from a podcast RSS feed by providing the episode's audio URL with the --item option:

npm run as -- \
  --rss "https://ajcwebdev.substack.com/feed" \
  --item "https://api.substack.com/feed/podcast/36236609/fd1f1532d9842fe1178de1c920442541.mp3" \
  --whisper tiny \
  --llama \
  --prompt titles summary longChapters takeaways questions

Run on a podcast RSS feed and generate JSON info file with markdown metadata of each item:

npm run as -- --rss "https://ajcwebdev.substack.com/feed" --info

Language Model (LLM) Options

Create a .env file and set the API key, as demonstrated in .env.example, for whichever of these providers you plan to use:

  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • COHERE_API_KEY
  • MISTRAL_API_KEY
  • OCTOAI_API_KEY
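
A minimal .env would look something like the following, with placeholder values standing in for real keys; only the providers you actually use need an entry:

OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="..."
COHERE_API_KEY="..."
MISTRAL_API_KEY="..."
OCTOAI_API_KEY="..."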

For each model available from each provider, I have collected the following details:

  • Context Window, the limit of tokens a model can process at once.
  • Max Output, the upper limit of tokens a model can generate in a response, influencing response length and detail.
  • Cost of input and output tokens per million tokens.
    • Some model providers also offer a Batch API with input/output tokens at half the price.

OpenAI's ChatGPT Models

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --chatgpt

Select ChatGPT model:

# Select GPT-4o mini model - https://platform.openai.com/docs/models/gpt-4o-mini
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --chatgpt GPT_4o_MINI

# Select GPT-4o model - https://platform.openai.com/docs/models/gpt-4o
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --chatgpt GPT_4o

# Select GPT-4 Turbo model - https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --chatgpt GPT_4_TURBO

# Select GPT-4 model - https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --chatgpt GPT_4

| Model | Context Window | Max Output | Input Tokens (per 1M) | Output Tokens (per 1M) | Batch Input (per 1M) | Batch Output (per 1M) |
| --- | --- | --- | --- | --- | --- | --- |
| GPT-4o mini | 128,000 | 16,384 | $0.15 | $0.60 | $0.075 | $0.30 |
| GPT-4o | 128,000 | 4,096 | $5 | $15 | $2.50 | $7.50 |
| GPT-4 Turbo | 128,000 | 4,096 | $10 | $30 | $5 | $15 |
| GPT-4 | 8,192 | 8,192 | $30 | $60 | $15 | $30 |
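
As a rough, hypothetical example of what those rates mean in practice: summarizing a 20,000-token transcript into a 1,000-token response with GPT-4o mini would cost about (20,000 × $0.15 + 1,000 × $0.60) / 1,000,000 ≈ $0.0036, a fraction of a cent, while the same job on GPT-4 would cost (20,000 × $30 + 1,000 × $60) / 1,000,000 = $0.66.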

Anthropic's Claude Models

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude

Select Claude model:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude CLAUDE_3_5_SONNET
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude CLAUDE_3_OPUS
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude CLAUDE_3_SONNET
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude CLAUDE_3_HAIKU

Google's Gemini Models

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --gemini

Select Gemini model:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --gemini GEMINI_1_5_FLASH
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --gemini GEMINI_1_5_PRO

Cohere's Command Models

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --cohere

Select Cohere model:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --cohere COMMAND_R
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --cohere COMMAND_R_PLUS

Mistral's Mistral Models

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --mistral

Select Mistral model:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --mistral MIXTRAL_8x7b
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --mistral MIXTRAL_8x22b
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --mistral MISTRAL_LARGE
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --mistral MISTRAL_NEMO

OctoAI's Models

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo

Select Octo model:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo LLAMA_3_1_8B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo LLAMA_3_1_70B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo LLAMA_3_1_405B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo MISTRAL_7B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo MIXTRAL_8X_7B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo NOUS_HERMES_MIXTRAL_8X_7B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo WIZARD_2_8X_22B

Llama.cpp

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --llama

Transcription Options

Whisper.cpp

If neither the --deepgram nor the --assembly option is included for transcription, autoshow will default to running the largest Whisper.cpp model. To configure the size of the Whisper model, use the --whisper option and select one of the following:

# tiny model
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --whisper tiny

# base model
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --whisper base

# small model
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --whisper small

# medium model
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --whisper medium

# large-v2 model
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --whisper large

Run whisper.cpp in a Docker container with --whisperDocker:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --whisperDocker base

Deepgram

Create a .env file and set API key as demonstrated in .env.example for DEEPGRAM_API_KEY.

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --deepgram

Assembly

Create a .env file and set API key as demonstrated in .env.example for ASSEMBLY_API_KEY.

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --assembly

Include speaker labels and number of speakers:

npm run as -- --video "https://ajc.pics/audio/fsjam-short.mp3" --assembly --speakerLabels

Prompt Options

The default prompt includes a summary and long chapters, equivalent to running this:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt summary longChapters

Create five title ideas:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt titles

Create a one sentence and one paragraph summary:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt summary

Create a short, one sentence description for each chapter that's 25 words or shorter:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt shortChapters

Create a one paragraph description for each chapter that's around 50 words:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt mediumChapters

Create a two paragraph description for each chapter that's over 75 words:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt longChapters

Create three key takeaways about the content:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt takeaways

Create ten questions about the content to check for comprehension:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt questions

Include all prompt options:

npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --prompt titles summary longChapters takeaways questions

Alternative Runtimes

Docker Compose

This will run both whisper.cpp and the AutoShow Commander CLI in their own Docker containers.

docker-compose run autoshow --video "https://www.youtube.com/watch?v=MORMZXEaONk" --whisperDocker base
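
For reference, below is a minimal sketch of the kind of docker-compose.yml this setup implies. The autoshow service name comes from the command above and the whisper Dockerfile path from the repo's .github directory, but the service layout, volumes, and build contexts are assumptions rather than the project's actual file:

# Hypothetical sketch, not the repo's actual docker-compose.yml
services:
  autoshow:
    build: .                      # CLI image built from the root Dockerfile
    volumes:
      - ./content:/app/content    # assumed mount for inputs and outputs
    depends_on:
      - whisper
  whisper:
    build:
      context: .
      dockerfile: .github/whisper.Dockerfile   # whisper.cpp image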

I'm currently working on the llama.cpp Docker integration so that the entire project can be encapsulated in a single local Docker Compose file.

Bun

bun bun-as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk"

Deno

deno task deno-as --video "https://www.youtube.com/watch?v=MORMZXEaONk"

Makeshift Test Suite

Creating a robust and flexible test suite for this project is complex because of the range of network requests, file system operations, build steps, and 3rd party APIs involved. A more thorough test suite will be created at some point, but in the meantime these are hacky but functional ways to test the majority of the project in a single go.

Full Test Suite

  • You'll need API keys for all services to make it through this entire command.
  • Mostly uses transcripts of videos around one minute long and cheaper models when possible, so the total cost of running this for any given service should be at most a few cents.
npm run test-all
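
The contents of test/all.test.js aren't reproduced here, but a makeshift shell-out-and-check test in this spirit could look roughly like the following; the chosen command and the expected output filename are illustrative assumptions, not the real test file's contents:

// Hypothetical sketch in the spirit of test/all.test.js, not its actual contents
import test from 'node:test'
import { strict as assert } from 'node:assert'
import { execSync } from 'node:child_process'
import { existsSync } from 'node:fs'

test('processes a local audio file with the default Whisper model', () => {
  // Shell out to the CLI exactly as a user would
  execSync('npm run as -- --file "content/audio.mp3"', { stdio: 'inherit' })
  // Confirm an output markdown file was written (the filename here is assumed)
  assert.ok(existsSync('content/audio-prompt.md'))
})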

Partial Test Command for Local Services

This version of the test suite only uses Whisper for transcription and Llama.cpp for LLM operations.

npm run test-local

Create Single Markdown File with Entire Project

This can be a useful way of creating a single markdown file of the entire project for giving to an LLM as context to develop new features or debug code. I'll usually start a conversation by including this along with a prompt that explains what I want changed or added. Note that the script's echo '\n...' escape sequences assume a shell like zsh whose built-in echo interprets them (bash would need echo -e), and since every line appends with >>, delete any existing LLM.md before re-running or its contents will be duplicated.

export MD="LLM.md" && export COMMANDS="src/commands" && export UTILS="src/utils" && \
  export LLMS="src/llms" && export TRANSCRIPT="src/transcription" && \
  export OPEN="\n\n\`\`\`js" && export CLOSE="\n\`\`\`\n\n" && cat README.md >> $MD && \
  echo '\n\n### Directory and File Structure\n\n```' >> $MD && tree >> $MD && \
  echo '```\n\n## Example CLI Commands Test Suite'$OPEN'' >> $MD && cat test/all.test.js >> $MD && \
  echo ''$CLOSE'## JSDoc Types'$OPEN'' >> $MD && cat src/types.js >> $MD && \
  echo ''$CLOSE'## AutoShow CLI Entry Point'$OPEN'' >> $MD && cat src/autoshow.js >> $MD && \
  echo ''$CLOSE'## Utility Functions\n\n### Generate Markdown'$OPEN'' >> $MD && cat $UTILS/generateMarkdown.js >> $MD && \
  echo ''$CLOSE'### Download Audio'$OPEN'' >> $MD && cat $UTILS/downloadAudio.js >> $MD && \
  echo ''$CLOSE'### Run Transcription'$OPEN'' >> $MD && cat $UTILS/runTranscription.js >> $MD && \
  echo ''$CLOSE'### Run LLM'$OPEN'' >> $MD && cat $UTILS/runLLM.js >> $MD && \
  echo ''$CLOSE'### Clean Up Files'$OPEN'' >> $MD && cat $UTILS/cleanUpFiles.js >> $MD && \
  echo ''$CLOSE'## Process Commands\n\n### Process Video'$OPEN'' >> $MD && cat $COMMANDS/processVideo.js >> $MD && \
  echo ''$CLOSE'### Process Playlist'$OPEN'' >> $MD && cat $COMMANDS/processPlaylist.js >> $MD && \
  echo ''$CLOSE'### Process URLs'$OPEN'' >> $MD && cat $COMMANDS/processURLs.js >> $MD && \
  echo ''$CLOSE'### Process RSS'$OPEN'' >> $MD && cat $COMMANDS/processRSS.js >> $MD && \
  echo ''$CLOSE'### Process File'$OPEN'' >> $MD && cat $COMMANDS/processFile.js >> $MD && \
  echo ''$CLOSE'## Transcription Functions\n\n### Call Whisper'$OPEN'' >> $MD && cat $TRANSCRIPT/whisper.js >> $MD && \
  echo ''$CLOSE'### Call Deepgram'$OPEN'' >> $MD && cat $TRANSCRIPT/deepgram.js >> $MD && \
  echo ''$CLOSE'### Call Assembly'$OPEN'' >> $MD && cat $TRANSCRIPT/assembly.js >> $MD && \
  echo ''$CLOSE'## LLM Functions\n\n### Prompt Function'$OPEN'' >> $MD && cat $LLMS/prompt.js >> $MD && \
  echo ''$CLOSE'### Call ChatGPT'$OPEN'' >> $MD && cat $LLMS/chatgpt.js >> $MD && \
  echo ''$CLOSE'### Call Claude'$OPEN'' >> $MD && cat $LLMS/claude.js >> $MD && \
  echo ''$CLOSE'### Call Cohere'$OPEN'' >> $MD && cat $LLMS/cohere.js >> $MD && \
  echo ''$CLOSE'### Call Gemini'$OPEN'' >> $MD && cat $LLMS/gemini.js >> $MD && \
  echo ''$CLOSE'### Call Llama.cpp'$OPEN'' >> $MD && cat $LLMS/llama.js >> $MD && \
  echo ''$CLOSE'### Call Mistral'$OPEN'' >> $MD && cat $LLMS/mistral.js >> $MD && \
  echo ''$CLOSE'### Call Octo'$OPEN'' >> $MD && cat $LLMS/octo.js >> $MD && \
  echo ''$CLOSE'## Docker Files\n\n```Dockerfile' >> $MD && cat .github/whisper.Dockerfile >> $MD && \
  echo ''$CLOSE'```Dockerfile' >> $MD && cat .github/llama.Dockerfile >> $MD && \
  echo ''$CLOSE'```Dockerfile' >> $MD && cat Dockerfile >> $MD && \
  echo ''$CLOSE'```yml' >> $MD && cat docker-compose.yml >> $MD && \
  echo ''$CLOSE'```bash' >> $MD && cat docker-entrypoint.sh >> $MD && \
  echo '\n```\n' >> $MD