**aeneas** is a Python library and a set of tools to automagically synchronize audio and text.

- * Version: 1.1.2
- * Date: 2015-09-24
+ * Version: 1.2.0
+ * Date: 2015-09-27
* Developed by: [ReadBeyond](http://www.readbeyond.it/)
* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -17,7 +17,7 @@ and an audio file containing the narration of the (same) text.
For example, given [this text file](aeneas/tests/res/container/job/assets/p001.xhtml)
and [this audio file](aeneas/tests/res/container/job/assets/p001.mp3),
- **aeneas** computes the following map:
+ **aeneas** computes the following abstract map:

```
[00:00:00.000, 00:00:02.680] <=> 1
@@ -37,28 +37,28 @@ and [this audio file](aeneas/tests/res/container/job/assets/p001.mp3),
[00:00:48.000, 00:00:53.280] <=> To eat the world's due, by the grave and thee.
```
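As a minimal sketch (not part of **aeneas** itself), lines of the abstract map above could be read back in Python as follows. It assumes every line has exactly the `[start, end] <=> text` shape shown in the example; the names `parse_map_line` and `to_seconds` are hypothetical helpers introduced only for illustration.

```python
import re

# One line of the abstract map, as printed in the example above:
# "[HH:MM:SS.mmm, HH:MM:SS.mmm] <=> fragment text"
LINE_RE = re.compile(
    r"^\[(\d{2}):(\d{2}):(\d{2}\.\d{3}), "
    r"(\d{2}):(\d{2}):(\d{2}\.\d{3})\] <=> (.*)$"
)


def to_seconds(hours, minutes, seconds):
    """Convert the three captured time fields to a number of seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_map_line(line):
    """Return (begin_seconds, end_seconds, text) for one map line.

    Hypothetical helper: raises ValueError if the line does not match
    the assumed format.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a sync map line: %r" % line)
    h1, m1, s1, h2, m2, s2, text = match.groups()
    return (to_seconds(h1, m1, s1), to_seconds(h2, m2, s2), text)


begin, end, text = parse_map_line("[00:00:00.000, 00:00:02.680] <=> 1")
```

For real processing, one of the machine-readable output formats is likely a better starting point than this textual rendering.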
- Moreover, the map can be output in several formats: SMIL for EPUB 3,
- SRT/TTML/VTT for closed captioning, JS for Web usage,
+ The map can be output to file in several formats: SMIL for EPUB 3,
+ SRT/TTML/VTT for closed captioning, JSON/RBSE for Web usage,
or raw CSV/SSV/TSV/TXT/XML for further processing.


## System Requirements, Supported Platforms and Installation

### System Requirements

- 1. 2 GB RAM (4 GB recommended), 2 GHz CPU (3 GHz 64bit recommended)
- 2. `ffmpeg` and `ffprobe` executable available in your `$PATH` (`apt-get install ffmpeg*` from [`deb-multimedia`](http://www.deb-multimedia.org/))
- 3. `espeak` executable available in your `$PATH` (`apt-get install espeak*`)
+ 1. a reasonably recent machine (recommended 4 GB RAM, 2 GHz 64bit CPU)
+ 2. `ffmpeg` and `ffprobe` executables available in your `$PATH`
+ 3. `espeak` executable available in your `$PATH`
4. Python 2.7.x
- 5. Python optional modules `BeautifulSoup`, `lxml`, `numpy`, and `scikits.audiolab` (`pip install ...`)
- 6. (Optional but strongly suggested) Python C headers to compile the Python C extensions (`apt-get install python-dev`)
+ 5. Python modules `BeautifulSoup`, `lxml`, `numpy`, and `scikits.audiolab`
+ 6. (Optional but strongly suggested) Python C headers to compile the Python C extensions

Depending on the format(s) of audio files you work with,
you might need to install additional audio codecs for `ffmpeg`.
Similarly, you might need to install additional voices
for `espeak`, depending on the language(s) you work on.
(Installing _all_ the codecs and _all_ the voices available
- in the Debian repository might be a good idea.)
+ might be a good idea.)

If installing the above dependencies proves difficult on your OS,
consider using the [Vagrant box](http://www.vagrantup.com)
@@ -68,87 +68,92 @@ created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
**aeneas** has been developed and tested on **Debian 64bit**,
which is the **only supported OS** at the moment.
- Other Linux distributions should be good too.

- However, it should work on Mac OS X and Windows as well,
- once you make sure `ffmpeg`, `ffprobe` and `espeak`
+ However, **aeneas** has been confirmed to work
+ on other Linux distributions (Ubuntu, Slackware),
+ on Mac OS X (with developer tools installed) and on Windows Vista/7/8.1/10.
+
+ Whatever your OS is, make sure
+ `ffmpeg`, `ffprobe` (which is part of `ffmpeg` distribution), and `espeak`
are properly installed and
callable by the `subprocess` Python module.
A way to ensure the latter consists
- in adding the three executables to your `$PATH`.
- Alternatively, you can use VirtualBox
+ in adding these three executables to your `$PATH`.
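The `$PATH` requirement above can be checked by hand with a few lines of Python. The following is only a sketch under stated assumptions: it is not the project's `check_dependencies.py`, and `find_executable` and `missing_executables` are hypothetical names. It searches the directories listed in `$PATH` for each of the three executables named in the text.

```python
import os


def find_executable(name, path=None):
    """Return the full path of `name` if found on the search path, else None.

    Hypothetical helper: `path` defaults to the PATH environment variable.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    # On Windows, executables usually carry an extension such as .EXE
    extensions = [""]
    if os.name == "nt":
        extensions += os.environ.get("PATHEXT", ".EXE").split(os.pathsep)
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        for extension in extensions:
            candidate = os.path.join(directory, name + extension)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


def missing_executables(names=("ffmpeg", "ffprobe", "espeak")):
    """Return the subset of `names` that cannot be found on the search path."""
    return [name for name in names if find_executable(name) is None]


if __name__ == "__main__":
    missing = missing_executables()
    if missing:
        print("Not found in PATH: " + ", ".join(missing))
    else:
        print("ffmpeg, ffprobe and espeak were all found in PATH")
```

Finding an executable this way does not prove that it runs correctly (codecs and voices may still be missing), so it complements rather than replaces the project's own dependency check.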
+
+ If installing **aeneas** natively on your OS proves difficult,
+ you can use VirtualBox and [Vagrant](http://www.vagrantup.com)
to run **aeneas** inside a virtualized Debian image,
- for example using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
+ using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).

### Installation

- ```bash
- $ git clone https://github.com/readbeyond/aeneas.git
- $ cd aeneas
- $ pip install -r requirements.txt
- $ python setup.py build_ext --inplace
- $ python check_dependencies.py
- ```
+ #### Linux and Mac OS X

- If the last command prints a success message,
- you have all the required dependencies installed
- and you can confidently run **aeneas** in production.
-
- If you are a user of a `deb`-based Linux distribution
- (e.g., Debian, Ubuntu),
+ 1. If you are a user of a `deb`-based Linux distribution
+ (e.g., Debian or Ubuntu),
you can install all the dependencies by running
[the provided `install_dependencies.sh` script](install_dependencies.sh)

- ```bash
- $ sudo bash install_dependencies.sh
- ```
+ ```bash
+ $ sudo bash install_dependencies.sh
+ ```
+
+ 2. If you have another Linux distribution or Mac OS X,
+ just make sure you have
+ `ffmpeg`, `ffprobe` (part of the `ffmpeg` package),
+ and `espeak` installed and available on your command line.
+ You also need Python 2.x and its "developer" package
+ containing the C headers.
+
+ 3. Run the following commands:
+
+ ```bash
+ $ git clone https://github.com/readbeyond/aeneas.git
+ $ cd aeneas
+ $ pip install -r requirements.txt
+ $ python setup.py build_ext --inplace
+ $ python check_dependencies.py
+ ```

- Then, run `python setup.py build_ext --inplace` and `python check_dependencies.py` as above.
+ If the last command prints a success message,
+ you have all the required dependencies installed
+ and you can confidently run **aeneas** in production.

- If you are a Windows user, please read the installation instructions
+ #### Windows
+
+ Please read the installation instructions
contained in the
- ["Using aeneas for Audio-Text Synchronization" PDF](http://software.sil.org/scriptureappbuilder/resources/)
+ ["Using aeneas for Audio-Text Synchronization" PDF](http://software.sil.org/scriptureappbuilder/resources/),
based on
[these directions](https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ),
written by Richard Margetts.

- If installing natively proves difficult on your OS,
- consider using the [Vagrant box](http://www.vagrantup.com)
- created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
-

## Usage

- 1. Clone this GitHub repo:
+ 1. Install `aeneas` as described above. (Only the first time!)

- ```bash
- $ git clone https://github.com/readbeyond/aeneas.git
- ```
+ 2. Open a command prompt/shell/terminal and go to the root directory
+ of the aeneas repository, that is, the one containing this `README.md` file.

- 2. Enter the root directory:
+ 3. To compute a synchronization map `map.json` for a pair
+ (`audio.mp3`, `text.txt` in `plain` format), you can run:

```bash
- $ cd aeneas
+ $ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
```

- 3. (Optional, but strongly suggested) Compile the Python C extensions:
-
- ```bash
- $ python setup.py build_ext --inplace
- ```
+ The third parameter (the _configuration string_) can specify several parameters/options.
+ See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.

- 4. To compute a SMIL synchronization map `map.smil` for a pair
- (`audio.mp3`, `text.txt`), you can run:
+ 4. To compute a synchronization map `map.smil` for a pair
+ (`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
+ you can run:

```bash
- $ python -m aeneas.tools.execute_task audio.mp3 text.txt config_string map.smil
+ $ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
```
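The configuration strings in the two commands above are `key=value` pairs joined by `|`. A minimal sketch of assembling one programmatically is given below. It assumes that keys and values never contain `|` (no escaping is attempted), and `build_config_string` and `build_command` are hypothetical helpers, not functions provided by **aeneas**.

```python
def build_config_string(parameters):
    """Join (key, value) pairs as 'key=value|key=value'.

    Assumption: neither keys nor values contain '|'; escaping is not handled.
    """
    return "|".join("%s=%s" % (key, value) for key, value in parameters)


def build_command(audio_path, text_path, parameters, map_path):
    """Return the argument list for the execute_task invocation shown above."""
    return [
        "python", "-m", "aeneas.tools.execute_task",
        audio_path, text_path, build_config_string(parameters), map_path,
    ]


# Same parameters as the plain-text example in step 3
command = build_command(
    "audio.mp3",
    "text.txt",
    [
        ("task_language", "en"),
        ("os_task_file_format", "json"),
        ("is_text_type", "plain"),
    ],
    "map.json",
)
```

The resulting list could be handed to `subprocess.call(command)`. Passing the arguments as a list rather than as a single shell string means the `|` characters are not interpreted by a shell as pipes.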
- `config_string` is string containing all the
- parameters to parse `text.txt` correctly and to
- format `map.smil` as desired.
- See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.
-
5. If you have several tasks to run,
you can create a job container and a configuration file,
and run them all at once:
@@ -163,8 +168,8 @@ and run them all at once:
and format the output sync map files.
See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.

- You might want to run the above modules without arguments
- to get their manual:
+ You might want to run `execute_task` or `execute_job`
+ without arguments to get an usage message and some examples:

` ` ` bash
$ python -m aeneas.tools.execute_task
@@ -202,20 +207,20 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
* Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
* Input audio file formats: all those supported by `ffmpeg`
* Batch processing
- * Output sync map formats: CSV, JS, SMIL, TSV, TTML, TXT, VTT, XML
- * Supported (= tested) languages: BG, CA, CY, DA, DE, EL, EN, ES, ET, FI, FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK, SR, SV, TR, UK
+ * Output sync map formats: CSV, JSON, SMIL, SSV, TSV, TTML, TXT, VTT, XML
+ * Tested languages: BG, CA, CY, DA, DE, EL, EN, ES, ET, FA, FI, FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK, SR, SV, SW, TR, UK
* Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
* Code suitable for a Web app deployment (e.g., on-demand AWS instances)
* Adjustable splitting times, including a max character/second constraint for CC applications
+ * Automated detection of audio head/tail
* MFCC and DTW computed as Python C extensions to reduce the processing time


## Limitations and Missing Features

* Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
* Audio is assumed to be spoken: not suitable/YMMV for song captioning
- * DTW computation is memory hungry
- * No protection against memory trashing
+ * No protection against memory trashing if you feed extremely long audio files


## TODO List
@@ -228,7 +233,6 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
* Improving (removing?) dependency from `espeak`, `ffmpeg`, `ffprobe` executables
* Multilevel sync map granularity (e.g., multilevel SMIL output)
* Supporting input text encodings other than UTF-8
- * Adding (i.e., testing) more languages
* Better documentation
* Testing other approaches, like HMM
* Publishing the package on PyPI
@@ -292,6 +296,8 @@ No copy rights were harmed in the making of this project.
* **August 2015**: [Michele Gianella](https://plus.google.com/+michelegianella/about) partially sponsored the port of the MFCC/DTW code to C (v1.1.0)

+ * **September 2015**: friends in West Africa partially sponsored the development of the head/tail detection code (v1.2.0)
+
### Supporting

Would you like supporting the development of **aeneas**?

If you are able to contribute code directly,
that's great!
- Feel free to open a pull request,
- we will be glad to have a look at it.
+
+ Please do not work on the `master` branch.
+ Instead, please create a new branch,
+ and open a pull request from there.
+ I will be glad to have a look at it!

Please make your code consistent with
the existing code base style
@@ -366,6 +375,9 @@ and a Web application
**August 2015**: release of v1.1.0, including Python C extensions
to speed the computation of audio/text alignment up

+ **September 2015**: release of v1.2.0,
+ including code to automatically detect the audio head/tail
+
## Acknowledgments

Many thanks to **Nicola Montecchio**,