Added YouTube extraction to enable lyrics extraction. Added file upload section for addition of relevant info. #10

Open
wants to merge 6 commits into
base: main
from

Conversation

cmm25
Collaborator

@cmm25 cmm25 commented Oct 6, 2024

In this pull request, I have addressed the file addition requirements. I also polished prompts.py, because song titles were only being generated in Chinese. One main issue is that the YouTube pipeline so far only retrieves lyrics for songs in English, so many other languages are not yet supported. In addition, lyrics extraction only works for songs with captions enabled, so manual extraction is still needed to cover the rest.

# Try to get the transcript (subtitles) in English
try:
    transcript_list = YouTubeTranscriptApi.list_transcripts(video_id)
    transcript = transcript_list.find_transcript(['en'])
Owner

One thing is that the song chosen by the user is very likely to be in their native language, so they would probably want to see the lyrics in their own language.

If we use subtitles, the user might have to select the language manually. That's why I was thinking of just using Whisper instead.

Owner

Another thing is, if you extract YouTube using subtitles and your MP3 extraction uses Whisper, both methods achieve the same result but may behave differently, which could confuse the user.

Collaborator Author

Regarding YouTube subtitles and Whisper: I initially thought it would be nice to have a failsafe, so that if one fails the other kicks in. On review, when I try to use Whisper alone I run into a lot of errors, which explains why I used the two. I may not know the right way to do it, but I thought it would be nice to have both.
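
A minimal sketch of the failsafe described here. It assumes each lyrics source (for example a YouTube subtitle lookup and a Whisper transcription) is wrapped in a function that takes a video ID and returns text. The function name `get_lyrics_with_fallback` and the source labels are hypothetical and not taken from this PR.

```python
from typing import Callable, Optional, Sequence, Tuple

# A source is a (label, function) pair; each function takes a video ID and
# returns lyrics text. Both the labels and the functions are hypothetical.
LyricsSource = Tuple[str, Callable[[str], str]]


def get_lyrics_with_fallback(
    video_id: str, sources: Sequence[LyricsSource]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each source in order; return (lyrics, label) of the first success."""
    for label, fetch in sources:
        try:
            lyrics = fetch(video_id)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            print(f"{label} failed: {exc}")
            continue
        # Treat empty or whitespace-only text as a failure too.
        if lyrics and lyrics.strip():
            return lyrics, label
    return None, None
```

Returning the label alongside the lyrics would let the UI say which method produced the text, which may help with the concern that two methods behaving differently could confuse the user.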

Owner

When I watch a Japanese video, I’d like to get the Japanese lyrics if I’m Japanese. If you want to provide this option, I think you can add another button to get the English subtitles. Let the user make the selection.
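
One way to express that selection, as a rough sketch: build the caption language list from the user's choice and pass it to `find_transcript`, which accepts language codes in priority order. The helper name `caption_language_order` and the `include_english` flag are assumptions for illustration, and the user's language is assumed to be available as a code such as `ja`.

```python
from typing import List


def caption_language_order(native_language: str, include_english: bool = False) -> List[str]:
    """Build a priority list of caption language codes, native language first."""
    codes = [native_language]
    # Only add English as a second choice when the user asked for it.
    if include_english and native_language != "en":
        codes.append("en")
    return codes
```

For example, `transcript_list.find_transcript(caption_language_order("ja", include_english=True))` would prefer Japanese captions and use English only if Japanese ones are missing.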


# Read and return lyrics
with open("lyrics.txt", "r", encoding="utf-8") as f:
    lyrics = f.read()
Owner

The result from Whisper is the lyrics, so why not use it directly instead of reading the temp file again?
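
A small sketch of what this could look like, assuming the model object behaves like an openai-whisper model, whose `transcribe` method returns a dict with a `"text"` key. The function name `transcribe_lyrics` is hypothetical.

```python
from typing import Any


def transcribe_lyrics(model: Any, audio_path: str) -> str:
    """Return the transcribed text directly, without a temporary lyrics file.

    `model` is assumed to behave like an openai-whisper model, e.g. one
    created with `whisper.load_model("base")`.
    """
    result = model.transcribe(audio_path)
    # The transcription is already in memory, so no lyrics.txt round trip is needed.
    return result["text"].strip()
```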


Your Thought:
{thought}
"""
Owner

User prompts have been moved to prompts.py. Please follow the rule.


Please provide only the generated title in the specified language, without any additional explanation.
If the language is Chinese, use Traditional Chinese characters.
Owner

Traditional Chinese is not the same as Simplified Chinese, so we need to specify which one to use. (Alternatively, we need to add more selection options in app.py.)
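
If the second option is preferred, a sketch along these lines could map each selection to an explicit instruction. The mapping, its keys, and the function name are illustrative assumptions and do not reflect the current contents of prompts.py or app.py.

```python
# Hypothetical mapping from a UI language choice to an explicit prompt
# instruction; the keys and wording are illustrative, not from prompts.py.
LANGUAGE_INSTRUCTIONS = {
    "Traditional Chinese": "Write the title in Chinese using Traditional Chinese characters.",
    "Simplified Chinese": "Write the title in Chinese using Simplified Chinese characters.",
}


def language_instruction(language: str) -> str:
    """Return the prompt line that pins down the output language and script."""
    # Languages without a script ambiguity fall back to a generic instruction.
    return LANGUAGE_INSTRUCTIONS.get(language, f"Write the title in {language}.")
```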

prompts.py Outdated
Do not include any quotation marks or parentheses at the beginning or end of the title.
Your Thought:
{thought}
Please provide only the generated title in the specified language, without any additional explanation. Do not include any quotation marks or parentheses at the beginning or end of the title.
Owner

The change in this prompt doesn’t seem to have a very significant effect. What is the purpose of changing this prompt?

Owner

The song title can be generated in the language we selected.

2 participants