Parakeet text cleanups #1193
7788457 to 29b704b
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-bd2d836.modal.run
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-61c1417.modal.run
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-d4e954b.modal.run
@@ -275,9 +278,7 @@ def main(audio_url: str = AUDIO_URL):
# Below are the three main functions that coordinate streaming audio and receiving transcriptions.
#
# `send_audio` transmits chunks of audio data and then pauses to approximate streaming
# speech at a natural rate. That said, we set it to faster
thought this was a bit too honest 😅
I would say the prose is too casual.
I'll resist defending the tone of this sentence. But it might be worth mentioning that we set it to faster than realtime just so people understand why we divide the wait time by 8 (i.e. wait for 1/8th the duration of the chunk we just sent).
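For readers following the thread, the "divide the wait time by 8" pacing can be sketched roughly as below. This is only an illustration, not the example's actual `send_audio`: the audio format (16 kHz, 16-bit mono PCM), the `send` callback, and the helper names `chunk_duration_seconds`, `pause_seconds`, and `send_audio_paced` are all assumptions made for the sketch.

```python
import asyncio

# Assumed audio format (not stated in this thread): 16 kHz, 16-bit mono PCM.
SAMPLE_RATE = 16_000
BYTES_PER_SAMPLE = 2
SPEEDUP = 8  # from the thread: wait 1/8th of the chunk's duration


def chunk_duration_seconds(chunk: bytes) -> float:
    """Playback time of one chunk at the assumed format."""
    return len(chunk) / (SAMPLE_RATE * BYTES_PER_SAMPLE)


def pause_seconds(chunk: bytes, speedup: int = SPEEDUP) -> float:
    """How long to wait after sending a chunk when streaming faster than realtime."""
    return chunk_duration_seconds(chunk) / speedup


async def send_audio_paced(chunks, send, speedup: int = SPEEDUP) -> None:
    """Send each chunk via the hypothetical `send` coroutine, then pause briefly."""
    for chunk in chunks:
        await send(chunk)
        await asyncio.sleep(pause_seconds(chunk, speedup))
```

Under these assumptions, a chunk holding one second of audio is followed by a 0.125 second pause, so the whole file streams in about an eighth of its playback time.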
Probably okay to not mention it. It's not crucial or the focus here.
942ceb2 to eabfaf1
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-43bff72.modal.run
# ...
# ```
# See [Troubleshooting](https://modal.com/docs/examples/parakeet#client) at the bottom if you run into issues.
I don't think we need the Troubleshooting section anymore.
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-694f621.modal.run
Thanks for taking another pass at this!
# - run the browser/microphone frontend, or
# - Run the browser/microphone frontend. Modal handles the deployment of both the frontend and backend in a single app! You should see a browser window pop up - make sure you allow access to your microphone. The full frontend code can be found [here](https://github.com/modal-labs/modal-examples/tree/main/06_gpu_and_ml/audio-to-text/frontend).
- Do we need the callout to it being a single app?
- For me the browser window does not pop up automatically... I have to click the link in the terminal. Does it automatically open for you?
- I always think it's so cool but defer to you guys
- Oh good catch; will fix
# ```bash
# 🌐 Downloading audio file...
# 🎧 Downloaded 6331478 bytes
# ☀️ Waking up model, this may take a few seconds on cold start...
# 📝 Transcription: A Dream Within A Dream Edgar Allan Poe
# 📝 Transcription:
# 📝 Transcription: take this kiss upon the brow, And in parting from you now, Thus much let me avow You are not wrong who deem That my days have been a dream.
# 📝 Transcription: Take this kiss upon the brow,
It doesn't actually break up the lines like this in the output... do we want it to be the actual output or this better looking one?
I kinda like the better looking one but defer to you guys if that's dishonest 😅
Movie magic yk
@charlesfrye what's the Modal-Frye Style Guide say here?
I think splitting on punctuation with newlines in the code is a good idea! I'd like for the output to be real.
@charlesfrye i feel like coding that is a can of worms. like, just breaking on `,` or `.` could have false positives. right now line breaks are based on speech breaks. using a poem as the example is fun but complicates this a bit.
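To make the false-positive concern concrete, here is a minimal sketch of naive punctuation splitting. The helper name `split_on_punctuation` and the sample sentences are hypothetical, and this is not code from the PR; it only shows what breaking after every `,` or `.` would do.

```python
import re


def split_on_punctuation(text: str) -> list[str]:
    """Naively break a line after every comma or period that is followed by whitespace."""
    parts = re.split(r"(?<=[,.])\s+", text)
    return [part for part in parts if part]


# Works nicely on the poem, where punctuation lines up with verse breaks:
poem = "Take this kiss upon the brow, And in parting from you now,"
print(split_on_punctuation(poem))

# But an abbreviation is split as if it ended a line (hypothetical sentence):
prose = "Dr. Smith paid 3.50, roughly."
print(split_on_punctuation(prose))
```

The poem splits into its two verses, while the second sentence is cut after "Dr.", which is the kind of break a reader would not expect. Requiring whitespace after the punctuation avoids splitting "3.50", but abbreviations still slip through.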
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-b852fe8.modal.run
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-ed65e78.modal.run
@@ -376,3 +377,27 @@ def preprocess_audio(audio_bytes: bytes) -> bytes:
def chunk_audio(data: bytes, chunk_size: int):
    for i in range(0, len(data), chunk_size):
        yield data[i : i + chunk_size]


def output_message_as_transcript(message: str):
This might be weird. At the very least we shouldn't change the punctuation and just do a word wrap where we limit the number of chars per line. @charlesfrye thoughts on this?
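One way the word-wrap suggestion could look, as a rough sketch: the standard library's `textwrap.fill` limits characters per line and leaves punctuation untouched. The function name `wrap_transcript` and the default width of 60 are assumptions for illustration; the body of `output_message_as_transcript` in the PR is not shown in this thread.

```python
import textwrap


def wrap_transcript(message: str, width: int = 60) -> str:
    """Wrap a transcript at word boundaries so no line exceeds `width` characters."""
    return textwrap.fill(message, width=width)


message = (
    "Take this kiss upon the brow, And in parting from you now, "
    "Thus much let me avow"
)
print(wrap_transcript(message, width=40))
```

Because the wrap only inserts newlines between words, joining the lines back with spaces recovers the original single-spaced transcript, so the printed output stays faithful to what the model returned.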
🚀 The docs preview is ready! Check it out here: https://modal-labs-examples--frontend-preview-11af6fe.modal.run
@prbot automerge
Type of Change
`/docs/examples`)
Documentation Site Checklist
☑️ Monitoring
`modal run`, or an alternative `cmd` is provided in the example frontmatter (e.g. `cmd: ["modal", "serve"]`)
`cmd` with no arguments, or the `args` are provided in the example frontmatter (e.g. `args: ["--prompt", "Formula for room temperature superconductor:"]`
`fastapi` to be installed locally (e.g. does not import `requests` or `torch` in the global scope or other code executed locally)
☑️ Content
`modal-cdn.com`
☑️ Build Stability
`v1`, not a dynamic tag like `latest`
`python_version` for the base image, if it is used
`~=x.y.z` or `==x.y`, or we expect this example to work across major versions of the dependency and are committed to maintenance across those versions
`version < 1` are pinned to patch version, `==0.y.z`
Outside Contributors
You're great! Thanks for your contribution.