[WIP] TTS Quickstart Typescript: add audio playback and mention instant mode #160

Open
wants to merge 1 commit into
base: main
48 changes: 41 additions & 7 deletions tts/tts-typescript-quickstart/index.ts
@@ -2,11 +2,12 @@ import { HumeClient } from "hume"
import fs from "fs/promises"
import path from "path"
import * as os from "os"
import * as child_process from "child_process"
import dotenv from "dotenv"

dotenv.config()

const hume = new HumeClient({
  apiKey: process.env.HUME_API_KEY!,
})

@@ -18,10 +19,35 @@ const writeResultToFile = async (base64EncodedAudio: string, filename: string) =
  console.log('Wrote', filePath)
}

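const startAudioPlayer = () => {
  // Pipe audio into ffplay through stdin; '-i -' tells ffplay to read its input from stdin.
  const proc = child_process.spawn('ffplay', ['-nodisp', '-autoexit', '-infbuf', '-i', '-'], {
    detached: true,
    stdio: ['pipe', 'ignore', 'ignore'],
  })

  proc.on('error', (err) => {
    if ((err as NodeJS.ErrnoException).code === 'ENOENT') {
      console.error('ffplay not found. Please install ffmpeg to play audio.')
    } else {
      // Surface other process errors instead of silently dropping them.
      console.error('ffplay error:', err)
    }
  })

  return {
    sendAudio: (audio: string) => {
      const buffer = Buffer.from(audio, "base64")
      proc.stdin.write(buffer)
    },
    stop: () => {
      // Close stdin so ffplay exits when playback finishes, and let Node exit without waiting for it.
      proc.stdin.end()
      proc.unref()
    }
  }
}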
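
Review note: `startAudioPlayer` only learns that `ffplay` is missing after the stream has already started. A minimal sketch of an up-front check follows, assuming it is acceptable to run the player once with its version flag before streaming. The helper name `isCommandAvailable` is hypothetical and not part of this PR.

```typescript
import { spawnSync } from "child_process"

// Hypothetical helper (not in the PR): returns true if `command` can be spawned.
// The default argument assumes an ffmpeg-style tool that accepts `-version`.
const isCommandAvailable = (command: string, versionArgs: string[] = ["-version"]): boolean => {
  const result = spawnSync(command, versionArgs, { stdio: "ignore" })
  // spawnSync sets `result.error` when the process could not be spawned
  // (for example with code ENOENT when the executable is not on PATH).
  return result.error === undefined
}
```

With this in place, `main` could call `isCommandAvailable('ffplay')` before `startAudioPlayer()` and skip playback (while still writing the files) when it returns `false`.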

const main = async () => {
  await fs.mkdir(outputDir)
  console.log('Writing to', outputDir)

  const speech1 = await hume.tts.synthesizeJson({
    utterances: [{
      description: "A refined, British aristocrat",
@@ -35,7 +61,7 @@ const main = async () => {
    name,
    generationId: speech1.generations[0].generationId,
  })

  const speech2 = await hume.tts.synthesizeJson({
    utterances: [{
      voice: { name },
@@ -48,7 +74,7 @@ const main = async () => {
  })
  await writeResultToFile(speech2.generations[0].audio, "speech2_0")
  await writeResultToFile(speech2.generations[1].audio, "speech2_1")

  const speech3 = await hume.tts.synthesizeJson({
    utterances: [{
      voice: { name },
@@ -62,15 +88,23 @@ const main = async () => {
  })
  await writeResultToFile(speech3.generations[0].audio, "speech3_0")

  let i = 0
  const audioPlayer = startAudioPlayer()
  for await (const snippet of await hume.tts.synthesizeJsonStreaming({
    context: {
      generationId: speech3.generations[0].generationId,
    },
    utterances: [{ text: "He's drawn the bow..." }, { text: "he's fired the arrow..." }, { text: "I can't believe it! A perfect bullseye!" }],
    // Uncomment to reduce latency to < 500ms, at a 10% higher cost
    // instantMode: true,
    //
    // By default, the audio data of each chunk returned by `synthesizeJsonStreaming` is a standalone MP3 file.
    // The player started by `startAudioPlayer` expects a single continuous audio stream. You can pass the
    // `stripHeaders` option to remove the headers from each chunk so that the streamed audio plays as a single file.
    // TODO: stripHeaders: true
  })) {
    await writeResultToFile(snippet.audio, `speech4_${i++}`)
    audioPlayer.sendAudio(snippet.audio)
  }
  audioPlayer.stop()
}

main().then(() => console.log('Done')).catch(console.error)
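
Review note: `sendAudio` ignores the return value of `proc.stdin.write`, so a fast stream can keep buffering audio in memory while the player falls behind. The sketch below shows one way to respect backpressure, assuming the caller is willing to `await` each write. `writeBase64Chunk` is a hypothetical helper, not something this PR or the Hume SDK provides.

```typescript
import { Writable } from "stream"
import { once } from "events"

// Hypothetical helper (not in the PR): decode one base64-encoded audio chunk and
// write it to `sink`, waiting for 'drain' if the stream's internal buffer is full.
// Returns the number of decoded bytes written.
const writeBase64Chunk = async (sink: Writable, base64Audio: string): Promise<number> => {
  const buffer = Buffer.from(base64Audio, "base64")
  if (!sink.write(buffer)) {
    // write() returned false: pause until the stream signals it can accept more data.
    await once(sink, "drain")
  }
  return buffer.length
}
```

Inside the streaming loop this would be used as `await writeBase64Chunk(proc.stdin, snippet.audio)`, which requires `sendAudio` to become an async function.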
8 changes: 4 additions & 4 deletions tts/tts-typescript-quickstart/package-lock.json


2 changes: 1 addition & 1 deletion tts/tts-typescript-quickstart/package.json
@@ -12,7 +12,7 @@
  "description": "<div align=\"center\"> <img src=\"https://storage.googleapis.com/hume-public-logos/hume/hume-banner.png\"> <h1>Text-to-Speech | TypeScript Quickstart</h1> <p> <strong>Jumpstart your development with Hume's OCTAVE TTS API!</strong> </p> </div>",
  "dependencies": {
    "dotenv": "^16.4.7",
-    "hume": "^0.9.18"
+    "hume": "^0.10.1"
  },
  "devDependencies": {
    "@types/node": "^22.14.0",