add Python interface #817

Draft
wants to merge 6 commits into
base: main
Conversation

Picus303
Contributor

This PR makes Fish Speech usable as an importable Python library, for easy integration into other code.
It adds the sub-directory lib to the main directory and relies mainly on the dedicated Pipeline class.

Here is a usage example:

# Optional: silence log and progress-bar output in the terminal
import os
from loguru import logger

logger.remove()
os.environ["TQDM_DISABLE"] = "1"

# Prepare the models
from fish_speech.lib import Pipeline

model = Pipeline(
    llama_path = "models/fish-speech-1_5",
    vqgan_path = "models/vqgan-1_5.pth",
)

# Create a reference audio
ref = model.make_reference("ref.wav", "reference text")

# Generate audio (no streaming)
output = model.generate("text to generate.", ref)

# Generate audio (streaming)
import numpy as np

generator = model.generate("text to generate.", ref, streaming=True)

parts = []
for part in generator:
    parts.append(part)
    print(part.shape)

output = np.concatenate(parts, axis=0)

# Save the output to a file
sample_rate = model.sample_rate

import soundfile as sf
sf.write("output.wav", output, sample_rate)

This PR is not complete yet. It is still missing:
1 - Documentation. Question: where do you want to put it?
2 - A fix for path handling. The code still depends on .project-root to manage paths, which makes it impossible to install in non-editable mode and forces the user to keep the source code in a separate folder. I'd be glad to hear suggestions for this part.
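One possible direction for the second point, offered only as a sketch: files that ship inside the package can be located relative to the installed package itself rather than relative to a repository marker. The example below uses the standard-library `importlib.resources.files` (Python 3.9+). The helper name `package_resource` is hypothetical, and the demo looks up a file in the standard-library `json` package only so that it runs anywhere; with this PR the package name would presumably be `fish_speech`, and which files it would need to locate this way is not stated here.

```python
from importlib.resources import files


def package_resource(package: str, *parts: str):
    """Return a handle to a file bundled inside an installed package.

    Hypothetical helper: it resolves the path from the package's install
    location, so it does not depend on a .project-root marker being present.
    """
    resource = files(package)
    for part in parts:
        # Descend one path component at a time.
        resource = resource.joinpath(part)
    return resource


if __name__ == "__main__":
    # Demo with a standard-library package so the sketch is runnable as-is.
    decoder = package_resource("json", "decoder.py")
    print(decoder.is_file())
```

This only helps for files that are actually included in the built distribution; anything that lives outside the package directory would still need separate handling.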

@Picus303 Picus303 marked this pull request as draft January 10, 2025 16:25
@Whale-Dolphin Whale-Dolphin marked this pull request as ready for review January 12, 2025 10:44
@Picus303 Picus303 marked this pull request as draft January 12, 2025 10:45
Contributor

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Feb 12, 2025
@tarun7r tarun7r mentioned this pull request Mar 18, 2025
6 tasks
@organics2016

organics2016 commented Apr 2, 2025

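I have tried this PR. The interface works very well and is elegantly designed, and I hope the author does not give up on it.

There are still a few small issues.

When I install directly from git with pip,
pip install fish-speech@git+https://github.com/Picus303/fish-speech.git --no-cache-dir
and then call this PR's interface, the following line of code raises an error because ".project-root" cannot be found.

pyrootutils.setup_root(__file__, indicator=".project-root", pythonpath=True)

My workaround was to fork the repository and comment that line out, after which everything works. This is not a good solution and it may affect other features. I am noting it here as a reference for the compatibility work needed when this PR is merged.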
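A less invasive variant of that workaround, sketched here under assumptions rather than taken from the PR, is to check for the marker first and skip the `pyrootutils` call when it is absent (as in a non-editable install). The helper names `find_marker_root` and `setup_root_if_present` are hypothetical. Whether the rest of the code works without the root being set up would still have to be verified; the report above only says that commenting the line out worked for the commenter's use.

```python
from pathlib import Path
from typing import Optional


def find_marker_root(start: str, indicator: str = ".project-root") -> Optional[Path]:
    """Walk upward from `start` and return the first directory containing
    `indicator`, or None if no parent directory has it."""
    start_dir = Path(start).resolve().parent
    for candidate in (start_dir, *start_dir.parents):
        if (candidate / indicator).exists():
            return candidate
    return None


def setup_root_if_present(module_file: str, indicator: str = ".project-root") -> Optional[Path]:
    """Call pyrootutils only when the marker exists (source checkout).

    Returns the root that was found, or None when the call was skipped.
    """
    root = find_marker_root(module_file, indicator)
    if root is None:
        # Installed package without the marker: skip instead of raising.
        return None
    import pyrootutils  # imported lazily; only needed in a source checkout

    pyrootutils.setup_root(module_file, indicator=indicator, pythonpath=True)
    return root
```

At the call site this would replace the bare `pyrootutils.setup_root(...)` line with `setup_root_if_present(__file__)`.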

2 participants