**aeneas** is a Python library and a set of tools to automagically synchronize audio and text.

- * Version: 1.3.3
- * Date: 2015-12-20
+ * Version: 1.4.0
+ * Date: 2016-01-15
* Developed by: [ReadBeyond](http://www.readbeyond.it/)
* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -75,10 +75,11 @@ or raw CSV/SSV/TSV/TXT/XML for further processing.
1. a reasonably recent machine (recommended 4 GB RAM, 2 GHz 64bit CPU)
2. `ffmpeg` and `ffprobe` executables available in your `$PATH`
3. `espeak` executable available in your `$PATH`
- 4. Python 2.7.x
- 5. Python modules `BeautifulSoup`, `lxml`, and `numpy`
- 6. (Optional, but strongly recommended) Python C headers to compile the Python C extensions
- 7. (Optional, required only for downloading audio from YouTube) Python module `pafy`
+ 4. Python 2.7 (Linux, OS X, Windows) or 3.4 or later (Linux, OS X)
+ 5. Python modules `BeautifulSoup4`, `lxml`, and `numpy`
+ 6. (Optional, strongly recommended) Python C headers to compile the Python C extensions
+ 7. (Optional, strongly recommended if you plan to use the CLI tools) A shell supporting UTF-8
+ 8. (Optional, only required if you plan to download audio from YouTube) Python module `pafy`

Depending on the format(s) of audio files you work with,
you might need to install additional audio codecs for `ffmpeg`.
@@ -87,38 +88,59 @@ for `espeak`, depending on the language(s) you work on.
(Installing _all_ the codecs and _all_ the voices available
might be a good idea.)

- If installing the above dependencies proves difficult on your OS,
- you are strongly encouraged to use
- [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant),
- which provides **aeneas** inside a virtualized Debian image
- running under [VirtualBox](https://www.virtualbox.org/)
- and [Vagrant](http://www.vagrantup.com/).
-
### Supported Platforms

**aeneas** has been developed and tested on **Debian 64bit**,
which is the **only supported OS** at the moment.
+
(Do you need official support for another OS?
Consider [sponsoring](#supporting) this project!)

- However, **aeneas** has been confirmed to work
- on other Linux distributions (Ubuntu, Slackware),
- on Mac OS X 10.9 and 10.10,
- and on Windows Vista/7/8.1/10.
-
- Whatever your OS is, make sure
- `ffmpeg`, `ffprobe` (which is part of `ffmpeg` distribution), and `espeak`
- are properly installed and
- callable by the `subprocess` Python module.
+ However, **aeneas** has been confirmed to work on the following systems:
+
+ | OS             | 32/64 bit | Python 2.7 | Python 3.4/3.5 |
+ |----------------|-----------|------------|----------------|
+ | Debian         | 64        | Yes        | Yes            |
+ | Debian         | 32        | Yes        | Yes            |
+ | Ubuntu         | 64        | Yes        | Yes            |
+ | Gentoo         | 64        | Yes        | Unknown        |
+ | Slackware      | 64        | Yes        | Unknown        |
+ | Mac OS X 10.9  | 64        | Yes (1)    | Unknown (1)    |
+ | Mac OS X 10.10 | 64        | Yes (1)    | Unknown (1)    |
+ | Mac OS X 10.11 | 64        | Yes (1)    | Unknown (1)    |
+ | Windows Vista  | 32        | Yes (1)    | Yes (1, 2)     |
+ | Windows 7      | 64        | Yes (1)    | Yes (1, 2)     |
+ | Windows 8.1    | 64        | Yes (1)    | Unknown (1, 2) |
+ | Windows 10     | 64        | Yes (1)    | Yes (1, 2)     |
+
+ **Notes**
+ (1) The ``cew`` Python C extension to speed up text synthesis
+ is available only on Linux at the moment.
+ (2) On Windows and Python 3.4/3.5, compiling the Python C extensions
+ is quite complex; however, running **aeneas** in pure Python mode
+ has been confirmed to work.
+
+ Anyway, **aeneas** should work on any OS, at least in pure Python mode,
+ provided that:
+
+ 1. the required Python modules `BeautifulSoup4`, `lxml`, and `numpy` are installed, and
+ 2. `ffmpeg`, `ffprobe` (which is part of `ffmpeg` distribution), and `espeak`
+ are installed and callable by the `subprocess` Python module.
A way to ensure the latter consists
in adding these three executables to your `PATH` environment variable.
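
Whether the three executables are reachable can be checked before running **aeneas**. The following is a minimal sketch and not part of **aeneas**: the helper names `find_missing_tools` and `is_callable` are hypothetical, and it assumes Python 3.3 or later, where `shutil.which` is available.

```python
import os
import shutil
import subprocess

# Executables that the requirements above expect to find on PATH.
REQUIRED_TOOLS = ("ffmpeg", "ffprobe", "espeak")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be located via PATH."""
    return [name for name in tools if shutil.which(name) is None]


def is_callable(command):
    """Return True if `command` (a list of arguments) runs and exits with status 0."""
    try:
        with open(os.devnull, "wb") as devnull:
            return subprocess.call(command, stdout=devnull, stderr=devnull) == 0
    except OSError:
        # Raised when the executable does not exist or cannot be started.
        return False


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("ffmpeg, ffprobe, and espeak were found on PATH")
```

For example, `is_callable(["ffmpeg", "-version"])` goes one step further than the PATH lookup and confirms that `ffmpeg` actually starts when invoked through `subprocess`.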

+ All strings and text files read by **aeneas** are expected to be UTF-8 encoded,
+ and all text files written by **aeneas** are UTF-8 encoded.
+ Therefore, it is strongly recommended to run the **aeneas** CLI tools
+ on a shell with UTF-8 encoding and to convert any input text file to UTF-8.
+
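
Converting an input text file to UTF-8 can be done with the Python standard library alone. The sketch below is illustrative and not part of **aeneas**: the function names `convert_to_utf8` and `is_valid_utf8` are hypothetical, and the source encoding must be known and supplied by the caller, since it cannot be detected reliably.

```python
import io


def is_valid_utf8(path):
    """Return True if the bytes of the file at `path` decode cleanly as UTF-8."""
    with io.open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(src_path, dst_path, src_encoding):
    """Re-encode the text file `src_path` from `src_encoding` to UTF-8 at `dst_path`."""
    # newline="" keeps the original line endings untouched in both directions.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with io.open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
```

A call such as `convert_to_utf8("text.txt", "text.utf8.txt", "latin-1")` writes a UTF-8 copy and leaves the original file unchanged; the file names and the `latin-1` encoding here are only examples.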
If installing **aeneas** natively on your OS proves difficult,
you are strongly encouraged to use
[aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant),
which provides **aeneas** inside a virtualized Debian image
running under [VirtualBox](https://www.virtualbox.org/)
- and [Vagrant](http://www.vagrantup.com/).
+ and [Vagrant](http://www.vagrantup.com/), which can be installed
+ on any modern OS (Linux, Mac OS X, Windows).

### Installation
@@ -127,7 +149,7 @@ and [Vagrant](http://www.vagrantup.com/).
1. Make sure you have
`ffmpeg`, `ffprobe` (usually provided by the `ffmpeg` package),
and `espeak` installed and available on your command line.
- You also need Python 2.x and its "developer" package
+ You also need Python and its "developer" package
containing the C headers (`python-dev` or similar).

2. Install `aeneas` system-wise with `pip`:
@@ -160,7 +182,7 @@ you can install all the dependencies by downloading and running
just make sure you have
`ffmpeg`, `ffprobe` (usually provided by the `ffmpeg` package),
and `espeak` installed and available on your command line.
- You also need Python 2.x and its "developer" package
+ You also need Python and its "developer" package
containing the C headers (`python-dev` or similar).

2. Clone the `aeneas` repo, install Python dependencies, and compile C extensions:
@@ -195,6 +217,10 @@ based on
[these directions](https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ),
written by Richard Margetts.

+ Please note that on Windows it is recommended to run **aeneas**
+ with Python 2.7, since compiling the C extensions on Python 3.4 or 3.5
+ requires [a complex setup process](http://stackoverflow.com/questions/29909330/microsoft-visual-c-compiler-for-python-3-4).
+
#### Mac OS X

Feel free to jump to step 9 if you already have
@@ -282,55 +308,55 @@ Feel free to jump to step 9 if you already have
1. Install `aeneas` as described above. (Only the first time!)

2. Open a command prompt/shell/terminal and go to the root directory
- of the aeneas repository, that is, the one containing the `README.md` and `VERSION` files.
- (This step is not needed if you installed `aeneas` with `pip`,
- since you will have the `aeneas` module available system-wise.)
+ of the aeneas repository, that is, the one containing the `README.md` and `VERSION` files.
+ (This step is not needed if you installed `aeneas` with `pip`,
+ since you will have the `aeneas` module available system-wise.)

3. To compute a synchronization map `map.json` for a pair
- (`audio.mp3`, `text.txt` in `plain` text format), you can run:
+ (`audio.mp3`, `text.txt` in `plain` text format), you can run:

```bash
$ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
```

- The third parameter (the _configuration string_) can specify several parameters/options.
- See the [documentation](http://www.readbeyond.it/aeneas/docs/)
- or use the `-h` switch for details.
-
- 4. To compute a synchronization map `map.smil` for a pair
- (`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
- you can run:
+ To compute a synchronization map `map.smil` for a pair
+ (`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
+ you can run:

```bash
$ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
```

- 5. If you have several tasks to run,
- you can create a job container and a configuration file,
- and run them all at once:
+ The third parameter (the _configuration string_) can specify several other parameters/options.
+ See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+ or use the `-h` switch for details.
+
+ 4. If you have several tasks to run,
+ you can create a job container and a configuration file,
+ and run them all at once:

```bash
$ python -m aeneas.tools.execute_job job.zip /tmp/
```

- File `job.zip` should contain a `config.txt` or `config.xml`
- configuration file, providing **aeneas**
- with all the information needed to parse the input assets
- and format the output sync map files.
- See the [documentation](http://www.readbeyond.it/aeneas/docs/)
- or use the `-h` switch for details.
+ File `job.zip` should contain a `config.txt` or `config.xml`
+ configuration file, providing **aeneas**
+ with all the information needed to parse the input assets
+ and format the output sync map files.
+ See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+ or use the `-h` switch for details.

- You might want to run `execute_task` or `execute_job`
- with `-h` to get an usage message and some examples:
+ 5. You might want to run `execute_task` or `execute_job`
+ with `-h` to get an usage message and some examples:

- ```bash
- $ python -m aeneas.tools.execute_task -h
- $ python -m aeneas.tools.execute_job -h
- ```
+ ```bash
+ $ python -m aeneas.tools.execute_task -h
+ $ python -m aeneas.tools.execute_job -h
+ ```

- See the [documentation](http://www.readbeyond.it/aeneas/docs/)
- for an introduction to the concepts of `task` and `job`,
- and for the list of all the available options.
+ See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+ for an introduction to the concepts of `task` and `job`,
+ and for the list of all the available options.
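
The configuration string passed to `execute_task` in step 3 is a list of `key=value` items joined by `|`. When the command is launched from a script, building that string from a list of pairs avoids typing it by hand. The following is a minimal sketch, not part of **aeneas**: the helper names `build_config_string` and `build_execute_task_command` are hypothetical, and the keys and values are the ones shown in the first example of step 3.

```python
def build_config_string(options):
    """Join (key, value) pairs as key=value items separated by '|'."""
    return "|".join("{}={}".format(key, value) for key, value in options)


def build_execute_task_command(audio_path, text_path, options, output_path):
    """Return the execute_task command line as a list of arguments."""
    return [
        "python", "-m", "aeneas.tools.execute_task",
        audio_path,
        text_path,
        build_config_string(options),
        output_path,
    ]


# Options taken from the plain-text example in step 3.
plain_text_options = [
    ("task_language", "en"),
    ("os_task_file_format", "json"),
    ("is_text_type", "plain"),
]

command = build_execute_task_command(
    "audio.mp3", "text.txt", plain_text_options, "map.json"
)
```

Passing `command` to `subprocess.call` as a list, without a shell, means the `|` characters in the configuration string need no quoting.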

## Documentation
@@ -366,28 +392,30 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
* Code suitable for a Web app deployment (e.g., on-demand AWS instances)
* Adjustable splitting times, including a max character/second constraint for CC applications
* Automated detection of audio head/tail
- * MFCC and DTW computed as Python C extensions to reduce the processing time
+ * MFCC and DTW computed via Python C extensions to reduce the processing time
* On Linux, `espeak` called via a Python C extension for faster audio synthesis
* Output an HTML file (from `finetuneas` project) for fine tuning the sync map manually

+
## Limitations and Missing Features

* Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
* Audio is assumed to be spoken: not suitable/YMMV for song captioning
* No protection against memory trashing if you feed extremely long audio files
* On Mac OS X and Windows, audio synthesis might be slow if you have thousands of text fragments

+
## TODO List

* Improving robustness against music in background
- * Isolate non-speech intervals (music, prolonged silence)
+ * Isolating non-speech intervals (music, prolonged silence)
* Automated text fragmentation based on audio analysis
* Auto-tuning DTW parameters
* Reporting the alignment score
* Improving (removing?) dependency from `espeak`, `ffmpeg`, `ffprobe` executables
* Multilevel sync map granularity (e.g., multilevel SMIL output)
* Better documentation
- * Testing other approaches, like HMM
+ * Testing other approaches, like GMM/HMM/NN (e.g., using HTK or Kaldi)
* Publishing the package on Debian repo

Would you like to see one of the above points done?
@@ -572,6 +600,9 @@ of downloading audio from YouTube
for the first time available
also on [PyPI](https://pypi.python.org/pypi/aeneas/)

+ **January 2016**: release of v1.4.0,
+ supporting both Python 2.7 and 3.4 or later
+
## Acknowledgments

Many thanks to **Nicola Montecchio**,