**aeneas** is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).

- * Version: 1.5.0.3
- * Date: 2016-04-23
+ * Version: 1.5.1.0
+ * Date: 2016-07-25
* Developed by: [ReadBeyond](http://www.readbeyond.it/)
* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -87,6 +87,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).
### Installation

+ All-in-one installers are available for Mac OS X and Windows,
+ and a Bash script for deb-based Linux distributions (Debian, Ubuntu)
+ is provided in this repository.
+ It is also possible to download a VirtualBox+Vagrant virtual machine.
+ Please see the
+ [INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
+ for detailed, step-by-step installation procedures for different operating systems.
+
+ The generic OS-independent procedure is simple:
+
1. Install
[Python](https://python.org/) (2.7.x preferred),
[FFmpeg](https://www.ffmpeg.org/), and
@@ -102,20 +112,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).
pip install aeneas
```

- See the
- [INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
- for detailed, step-by-step procedures for Linux, OS X, and Windows.
-
-
- ## Usage
-
- 1. To **check** whether you installed **aeneas** correctly, run:
+ 4. To **check** whether you installed **aeneas** correctly, run:

```bash
python -m aeneas.diagnostics
```

- 2. Run without arguments to get the **usage message**:
+
+ ## Usage
+
+ 1. Run without arguments to get the **usage message**:

```bash
python -m aeneas.tools.execute_task
@@ -131,7 +137,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
python -m aeneas.tools.execute_task --examples-all
```

- 3. To **compute a synchronization map** `map.json` for a pair
+ 2. To **compute a synchronization map** `map.json` for a pair
(`audio.mp3`, `text.txt` in
[plain](http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN)
text format), you can run:
@@ -169,7 +175,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
[documentation](http://www.readbeyond.it/aeneas/docs/)
for details.

- 4. If you have several tasks to process,
+ 3. If you have several tasks to process,
you can create a **job container**
to batch process them:

@@ -222,12 +228,12 @@ which explains how to use the built-in command line tools.
* Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
* Input audio file formats: all those readable by `ffmpeg`
* Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB, TSV, TTML, TXT, VTT, XML
- * Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+ * Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
* MFCC and DTW computed via Python C extensions to reduce the processing time
- * On Linux, eSpeak called via a Python C extension for faster audio synthesis
- * Batch processing of multiple audio/text pairs
* Several built-in TTS engine wrappers: eSpeak (default, FLOSS), Festival (FLOSS), Nuance TTS API (commercial)
- * Use custom TTS engine wrappers besides the built-in ones
+ * Default TTS (eSpeak) called via a Python C extension for fast audio synthesis
+ * A custom, user-provided TTS engine Python wrapper can be used instead of the built-in ones (included example for speect)
+ * Batch processing of multiple audio/text pairs
* Download audio from a YouTube video
* In multilevel mode, recursive alignment from paragraph to sentence to word level
* Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
@@ -236,13 +242,14 @@ which explains how to use the built-in command line tools.
* Output an HTML file for fine tuning the sync map manually (`finetuneas` project)
* Execution parameters tunable at runtime
* Code suitable for Web app deployment (e.g., on-demand cloud computing)
+ * Extensive test suite including 898 unit/integration/performance tests, that run and must pass before each release


## Limitations and Missing Features

* Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
- * Audio is assumed to be spoken: not suitable/YMMV for song captioning
- * No protection against memory trashing if you feed extremely long audio files
+ * Audio is assumed to be spoken: not suitable for song captioning, YMMV for CC applications
+ * No protection against memory trashing if you feed extremely long audio files (>1.5h per single audio file)
* [Open issues](https://github.com/readbeyond/aeneas/issues)


@@ -340,6 +347,9 @@ for its asynchronous usage.
**Chris Hubbard** prepared the files for
packaging aeneas as a Debian/Ubuntu `.deb`.

+ **Daniel Bair**, **Chris Hubbard**, and **Richard Margetts**
+ packaged the installers for Mac OS X and Windows.
+
**Firat Ozdemir** contributed the `finetuneas`
HTML/JS code for fine tuning sync maps in the browser.
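
The Usage hunks above call `python -m aeneas.tools.execute_task`, but the diff context cuts off before the full command for computing `map.json`. The following is a minimal sketch of how such a call might be assembled from Python. It assumes, and the diff does not confirm, that the tool takes the audio path, the text path, a `|`-separated `key=value` configuration string, and the output path, in that order, and that `task_language`, `is_text_type`, and `os_task_file_format` are valid configuration keys. The helper names `build_task_config` and `build_execute_task_command` are hypothetical and not part of aeneas.

```python
import subprocess
import sys


def build_task_config(language="eng", text_type="plain", output_format="json"):
    """Join key=value pairs with '|' into a task configuration string.

    The three keys used here are assumptions about the aeneas configuration
    format; check the aeneas documentation before relying on them.
    """
    pairs = [
        ("task_language", language),
        ("is_text_type", text_type),
        ("os_task_file_format", output_format),
    ]
    return "|".join("{0}={1}".format(key, value) for key, value in pairs)


def build_execute_task_command(audio_path, text_path, sync_map_path, config=None):
    """Return the argv list for one audio/text pair (argument order is assumed)."""
    if config is None:
        config = build_task_config()
    return [
        sys.executable, "-m", "aeneas.tools.execute_task",
        audio_path, text_path, config, sync_map_path,
    ]


def run_task(command):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(command)


if __name__ == "__main__":
    command = build_execute_task_command("audio.mp3", "text.txt", "map.json")
    # Only print the command here; call run_task(command) once aeneas,
    # FFmpeg, and eSpeak are installed and the input files exist.
    print(" ".join(command))
```

Passing the arguments as a list rather than a shell string avoids having to quote the `|` characters in the configuration string.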