This repository was archived by the owner on Dec 21, 2023. It is now read-only.

Commit 0c2bf08

Update README.md
1 parent 7c00f49 commit 0c2bf08

1 file changed: +64 -0 lines changed

README.md

Lines changed: 64 additions & 0 deletions
@@ -1,3 +1,67 @@
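# Tacotron-2-Chinese

## Pretrained model

[Model trained for 100K steps on the Biaobei dataset](https://github.com/JasonWei512/Tacotron-2-Original/releases/download/Biaobei-100K-Taoctron/logs-Tacotron-2.zip)

Tacotron only, no WaveNet (the loss kept exploding whenever WaveNet was trained).

Trained on the Biaobei dataset. To avoid running out of GPU memory, the corpus was downsampled with ffmpeg from 48 kHz to 36 kHz.
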
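The ffmpeg downsampling described above can be scripted for a whole folder of recordings. The following is a minimal sketch, not part of this repository: the helper names `build_ffmpeg_cmd` and `resample_dir`, the separate output directory, and the assumption that `ffmpeg` is on `PATH` are all illustrative choices. Only the standard `-i`, `-ar`, and `-y` ffmpeg options are used.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, sample_rate, ffmpeg="ffmpeg"):
    """Return the ffmpeg argument list that resamples one file."""
    # -y overwrites an existing output file; -ar sets the output sample rate.
    return [ffmpeg, "-y", "-i", str(src), "-ar", str(sample_rate), str(dst)]


def resample_dir(in_dir, out_dir, sample_rate, ffmpeg="ffmpeg"):
    """Resample every .wav file in in_dir into out_dir; return the paths written."""
    in_dir, out_dir = Path(in_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(in_dir.glob("*.wav")):
        dst = out_dir / src.name
        # check=True raises CalledProcessError if ffmpeg exits with an error.
        subprocess.run(build_ffmpeg_cmd(src, dst, sample_rate, ffmpeg), check=True)
        written.append(dst)
    return written


# Hypothetical usage (directory names are assumptions):
# resample_dir("BZNSYP/Wave", "BZNSYP/Wave_resampled", 22050)
```

Writing to a separate directory keeps the original 48 kHz files intact. Pass whichever target rate your hyperparameters expect.
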
## Installing dependencies

1. Install Python 3 and Tensorflow

2. Install the system dependencies:
```
apt-get install -y libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0 ffmpeg libav-tools
```

3. Install the Python requirements:
```
pip install -r requirements.txt
```

## Training

1. **Download the [Biaobei dataset](https://weixinxcxdb.oss-cn-beijing.aliyuncs.com/gwYinPinKu/BZNSYP.rar) and extract it into `Tacotron-2`**

The directory structure should look like this:

```
Tacotron-2
  |- BZNSYP
      |- PhoneLabeling
      |- ProsodyLabeling
      |- Wave
```

2. **Downsample the audio with ffmpeg**
```
ffmpeg -i input.wav -ar 22050 output.wav
```

3. **Preprocess the data**
```
python preprocess.py --dataset='Biaobei'
```

4. **Train the model (resumes automatically from the latest checkpoint)**
```
python train.py --model='Tacotron-2'
```

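5. **Synthesize speech from the latest checkpoint**

```
python synthesize.py --model='Tacotron-2' --text_list='path_to_text_file.txt'
```

Without WaveNet, Tacotron outputs a mel spectrogram, post-processing converts it to a linear spectrogram, and Griffin-Lim generates the waveform from that.
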
61+
 
62+
63+
 
64+
165
# Tacotron-2:
266
Tensorflow implementation of DeepMind's Tacotron-2. A deep neural network architecture described in this paper: [Natural TTS synthesis by conditioning Wavenet on MEL spectogram predictions](https://arxiv.org/pdf/1712.05884.pdf)
367
