Commit ccab249

Merge pull request #9 from pettarin/master
Release as v1.2.0.
2 parents 47da67b + 18d9dfb commit ccab249

171 files changed: +8797 -2672 lines changed

README.md

+80-68
@@ -2,8 +2,8 @@
22

33
**aeneas** is a Python library and a set of tools to automagically synchronize audio and text.
44

5-
* Version: 1.1.2
6-
* Date: 2015-09-24
5+
* Version: 1.2.0
6+
* Date: 2015-09-27
77
* Developed by: [ReadBeyond](http://www.readbeyond.it/)
88
* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
99
* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -17,7 +17,7 @@ and an audio file containing the narration of the (same) text.
1717

1818
For example, given [this text file](aeneas/tests/res/container/job/assets/p001.xhtml)
1919
and [this audio file](aeneas/tests/res/container/job/assets/p001.mp3),
20-
**aeneas** computes the following map:
20+
**aeneas** computes the following abstract map:
2121

2222
```
2323
[00:00:00.000, 00:00:02.680] <=> 1
@@ -37,28 +37,28 @@ and [this audio file](aeneas/tests/res/container/job/assets/p001.mp3),
3737
[00:00:48.000, 00:00:53.280] <=> To eat the world's due, by the grave and thee.
3838
```
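The abstract map above is plain text, so it is easy to post-process. The following is a minimal sketch, not part of **aeneas**: it assumes each line has exactly the `[start, end] <=> text` shape shown above, and the names `parse_map_line` and `to_millis` are hypothetical helpers introduced only for illustration.

```python
import re

# One line of the abstract map, as printed in the README:
# "[HH:MM:SS.mmm, HH:MM:SS.mmm] <=> fragment text"
# (assumption: the layout is exactly the one shown in the example)
LINE_RE = re.compile(
    r"^\[(\d{2}):(\d{2}):(\d{2})\.(\d{3}), "
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{3})\] <=> (.*)$"
)


def to_millis(hours, minutes, seconds, millis):
    """Convert the four timestamp fields to integer milliseconds."""
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def parse_map_line(line):
    """Return (begin_ms, end_ms, text) for one abstract map line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("not an abstract map line: %r" % line)
    fields = match.groups()
    return to_millis(*fields[0:4]), to_millis(*fields[4:8]), fields[8]


if __name__ == "__main__":
    print(parse_map_line("[00:00:00.000, 00:00:02.680] <=> 1"))
```

Times are kept as integer milliseconds to avoid floating-point rounding when fragments are compared or summed. For actual processing, the JSON or CSV output formats are probably a better starting point than this human-readable listing.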
3939

40-
Moreover, the map can be output in several formats: SMIL for EPUB 3,
41-
SRT/TTML/VTT for closed captioning, JS for Web usage,
40+
The map can be output to file in several formats: SMIL for EPUB 3,
41+
SRT/TTML/VTT for closed captioning, JSON/RBSE for Web usage,
4242
or raw CSV/SSV/TSV/TXT/XML for further processing.
4343

4444

4545
## System Requirements, Supported Platforms and Installation
4646

4747
### System Requirements
4848

49-
1. 2 GB RAM (4 GB recommended), 2 GHz CPU (3 GHz 64bit recommended)
50-
2. `ffmpeg` and `ffprobe` executable available in your `$PATH` (`apt-get install ffmpeg*` from [`deb-multimedia`](http://www.deb-multimedia.org/))
51-
3. `espeak` executable available in your `$PATH` (`apt-get install espeak*`)
49+
1. a reasonably recent machine (recommended 4 GB RAM, 2 GHz 64bit CPU)
50+
2. `ffmpeg` and `ffprobe` executables available in your `$PATH`
51+
3. `espeak` executable available in your `$PATH`
5252
4. Python 2.7.x
53-
5. Python optional modules `BeautifulSoup`, `lxml`, `numpy`, and `scikits.audiolab` (`pip install ...`)
54-
6. (Optional but strongly suggested) Python C headers to compile the Python C extensions (`apt-get install python-dev`)
53+
5. Python modules `BeautifulSoup`, `lxml`, `numpy`, and `scikits.audiolab`
54+
6. (Optional but strongly suggested) Python C headers to compile the Python C extensions
5555

5656
Depending on the format(s) of audio files you work with,
5757
you might need to install additional audio codecs for `ffmpeg`.
5858
Similarly, you might need to install additional voices
5959
for `espeak`, depending on the language(s) you work on.
6060
(Installing _all_ the codecs and _all_ the voices available
61-
in the Debian repository might be a good idea.)
61+
might be a good idea.)
6262

6363
If installing the above dependencies proves difficult on your OS,
6464
consider using the [Vagrant box](http://www.vagrantup.com)
@@ -68,87 +68,92 @@ created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
6868

6969
**aeneas** has been developed and tested on **Debian 64bit**,
7070
which is the **only supported OS** at the moment.
71-
Other Linux distributions should be good too.
7271

73-
However, it should work on Mac OS X and Windows as well,
74-
once you make sure `ffmpeg`, `ffprobe` and `espeak`
72+
However, **aeneas** has been confirmed to work
73+
on other Linux distributions (Ubuntu, Slackware),
74+
on Mac OS X (with developer tools installed) and on Windows Vista/7/8.1/10.
75+
76+
Whatever your OS is, make sure
77+
`ffmpeg`, `ffprobe` (which is part of the `ffmpeg` distribution), and `espeak`
7578
are properly installed and
7679
callable by the `subprocess` Python module.
7780
A way to ensure the latter consists
78-
in adding the three executables to your `$PATH`.
79-
Alternatively, you can use VirtualBox
81+
in adding these three executables to your `$PATH`.
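A quick way to see whether the three executables are reachable through `$PATH` is sketched below. This is only an illustration and not the project's `check_dependencies.py`: the helper names `find_executable` and `missing_tools` are hypothetical, and the check only looks for executable files on the search path without actually running them.

```python
import os


def find_executable(name, path=None):
    """Return the full path of `name` if found on the search path, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    # On Windows, executables usually carry an extension listed in PATHEXT.
    extensions = [""]
    if os.name == "nt":
        extensions += os.environ.get("PATHEXT", ".EXE").split(os.pathsep)
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries
        for extension in extensions:
            candidate = os.path.join(directory, name + extension)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


def missing_tools(names=("ffmpeg", "ffprobe", "espeak"), path=None):
    """Return the names of the tools that cannot be found on the search path."""
    return [name for name in names if find_executable(name, path) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found in PATH: " + ", ".join(missing))
    else:
        print("ffmpeg, ffprobe and espeak were all found in PATH")
```

The search is written by hand so that the same code runs on Python 2.7, which **aeneas** requires, as well as on Python 3.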
82+
83+
If installing **aeneas** natively on your OS proves difficult,
84+
you can use VirtualBox and [Vagrant](http://www.vagrantup.com)
8085
to run **aeneas** inside a virtualized Debian image,
81-
for example using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
86+
using [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
8287

8388
### Installation
8489

85-
```bash
86-
$ git clone https://github.com/readbeyond/aeneas.git
87-
$ cd aeneas
88-
$ pip install -r requirements.txt
89-
$ python setup.py build_ext --inplace
90-
$ python check_dependencies.py
91-
```
90+
#### Linux and Mac OS X
9291

93-
If the last command prints a success message,
94-
you have all the required dependencies installed
95-
and you can confidently run **aeneas** in production.
96-
97-
If you are a user of a `deb`-based Linux distribution
98-
(e.g., Debian, Ubuntu),
92+
1. If you are a user of a `deb`-based Linux distribution
93+
(e.g., Debian or Ubuntu),
9994
you can install all the dependencies by running
10095
[the provided `install_dependencies.sh` script](install_dependencies.sh)
10196

102-
```bash
103-
$ sudo bash install_dependencies.sh
104-
```
97+
```bash
98+
$ sudo bash install_dependencies.sh
99+
```
100+
101+
2. If you have another Linux distribution or Mac OS X,
102+
just make sure you have
103+
`ffmpeg`, `ffprobe` (part of the `ffmpeg` package),
104+
and `espeak` installed and available on your command line.
105+
You also need Python 2.x and its "developer" package
106+
containing the C headers.
107+
108+
3. Run the following commands:
109+
110+
```bash
111+
$ git clone https://github.com/readbeyond/aeneas.git
112+
$ cd aeneas
113+
$ pip install -r requirements.txt
114+
$ python setup.py build_ext --inplace
115+
$ python check_dependencies.py
116+
```
105117

106-
Then, run `python setup.py build_ext --inplace` and `python check_dependencies.py` as above.
118+
If the last command prints a success message,
119+
you have all the required dependencies installed
120+
and you can confidently run **aeneas** in production.
107121

108-
If you are a Windows user, please read the installation instructions
122+
#### Windows
123+
124+
Please read the installation instructions
109125
contained in the
110-
["Using aeneas for Audio-Text Synchronization" PDF](http://software.sil.org/scriptureappbuilder/resources/)
126+
["Using aeneas for Audio-Text Synchronization" PDF](http://software.sil.org/scriptureappbuilder/resources/),
111127
based on
112128
[these directions](https://groups.google.com/d/msg/aeneas-forced-alignment/p9cb1FA0X0I/8phzUgIqBAAJ),
113129
written by Richard Margetts.
114130

115-
If installing natively proves difficult on your OS,
116-
consider using the [Vagrant box](http://www.vagrantup.com)
117-
created by [aeneas-vagrant](https://github.com/readbeyond/aeneas-vagrant).
118-
119131

120132
## Usage
121133

122-
1. Clone this GitHub repo:
134+
1. Install `aeneas` as described above. (Only the first time!)
123135

124-
```bash
125-
$ git clone https://github.com/readbeyond/aeneas.git
126-
```
136+
2. Open a command prompt/shell/terminal and go to the root directory
137+
of the aeneas repository, that is, the one containing this `README.md` file.
127138

128-
2. Enter the root directory:
139+
3. To compute a synchronization map `map.json` for a pair
140+
(`audio.mp3`, `text.txt` in `plain` format), you can run:
129141

130142
```bash
131-
$ cd aeneas
143+
$ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
132144
```
133145

134-
3. (Optional, but strongly suggested) Compile the Python C extensions:
135-
136-
```bash
137-
$ python setup.py build_ext --inplace
138-
```
146+
The third parameter (the _configuration string_) can specify several parameters/options.
147+
See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.
139148

140-
4. To compute a SMIL synchronization map `map.smil` for a pair
141-
(`audio.mp3`, `text.txt`), you can run:
149+
4. To compute a synchronization map `map.smil` for a pair
150+
(`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
151+
you can run:
142152

143153
```bash
144-
$ python -m aeneas.tools.execute_task audio.mp3 text.txt config_string map.smil
154+
$ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
145155
```
146156

147-
`config_string` is string containing all the
148-
parameters to parse `text.txt` correctly and to
149-
format `map.smil` as desired.
150-
See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.
151-
152157
5. If you have several tasks to run,
153158
you can create a job container and a configuration file,
154159
and run them all at once:
@@ -163,8 +168,8 @@ and run them all at once:
163168
and format the output sync map files.
164169
See the [documentation](http://www.readbeyond.it/aeneas/docs/) for details.
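The configuration strings used in the commands above are `key=value` pairs joined by `|`. The sketch below builds such a string from a list of pairs and assembles the `execute_task` command line. It is an illustration under assumptions, not an **aeneas** API: the helper names are hypothetical, and it assumes that neither keys nor values contain a `|` character (whether **aeneas** supports any escaping is not stated here).

```python
def build_config_string(options):
    """Join (key, value) pairs as 'key=value|key=value'."""
    return "|".join("%s=%s" % (key, value) for key, value in options)


def parse_config_string(config_string):
    """Split 'key=value|key=value' back into a list of (key, value) pairs."""
    pairs = []
    for item in config_string.split("|"):
        # Split on the first '=' only, so values may contain '='.
        key, _, value = item.partition("=")
        pairs.append((key, value))
    return pairs


def execute_task_command(audio, text, options, output):
    """Return the argument list for the execute_task tool shown above."""
    return [
        "python", "-m", "aeneas.tools.execute_task",
        audio, text, build_config_string(options), output,
    ]


if __name__ == "__main__":
    options = [
        ("task_language", "en"),
        ("os_task_file_format", "json"),
        ("is_text_type", "plain"),
    ]
    print(" ".join(execute_task_command("audio.mp3", "text.txt", options, "map.json")))
```

Passing the returned list to `subprocess.call` (without `shell=True`) hands the configuration string over as a single argument, so the `|` characters need no shell quoting.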
165170

166-
You might want to run the above modules without arguments
167-
to get their manual:
171+
You might want to run `execute_task` or `execute_job`
172+
without arguments to get a usage message and some examples:
168173

169174
```bash
170175
$ python -m aeneas.tools.execute_task
@@ -202,20 +207,20 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
202207
* Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
203208
* Input audio file formats: all those supported by `ffmpeg`
204209
* Batch processing
205-
* Output sync map formats: CSV, JS, SMIL, TSV, TTML, TXT, VTT, XML
206-
* Supported (= tested) languages: BG, CA, CY, DA, DE, EL, EN, ES, ET, FI, FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK, SR, SV, TR, UK
210+
* Output sync map formats: CSV, JSON, SMIL, SSV, TSV, TTML, TXT, VTT, XML
211+
* Tested languages: BG, CA, CY, DA, DE, EL, EN, ES, ET, FA, FI, FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK, SR, SV, SW, TR, UK
207212
* Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
208213
* Code suitable for a Web app deployment (e.g., on-demand AWS instances)
209214
* Adjustable splitting times, including a max character/second constraint for CC applications
215+
* Automated detection of audio head/tail
210216
* MFCC and DTW computed as Python C extensions to reduce the processing time
211217

212218

213219
## Limitations and Missing Features
214220

215221
* Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
216222
* Audio is assumed to be spoken: not suitable/YMMV for song captioning
217-
* DTW computation is memory hungry
218-
* No protection against memory thrashing
223+
* No protection against memory thrashing if you feed extremely long audio files
219224

220225

221226
## TODO List
@@ -228,7 +233,6 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
228233
* Improving (removing?) dependency on `espeak`, `ffmpeg`, `ffprobe` executables
229234
* Multilevel sync map granularity (e.g., multilevel SMIL output)
230235
* Supporting input text encodings other than UTF-8
231-
* Adding (i.e., testing) more languages
232236
* Better documentation
233237
* Testing other approaches, like HMM
234238
* Publishing the package on PyPI
@@ -292,6 +296,8 @@ No copy rights were harmed in the making of this project.
292296

293297
* **August 2015**: [Michele Gianella](https://plus.google.com/+michelegianella/about) partially sponsored the port of the MFCC/DTW code to C (v1.1.0)
294298

299+
* **September 2015**: friends in West Africa partially sponsored the development of the head/tail detection code (v1.2.0)
300+
295301
### Supporting
296302

297303
Would you like to support the development of **aeneas**?
@@ -311,8 +317,11 @@ Feel free to [get in touch](mailto:[email protected]).
311317

312318
If you are able to contribute code directly,
313319
that's great!
314-
Feel free to open a pull request,
315-
we will be glad to have a look at it.
320+
321+
Please do not work on the `master` branch.
322+
Instead, please create a new branch,
323+
and open a pull request from there.
324+
I will be glad to have a look at it!
316325
317326
Please make your code consistent with
318327
the existing code base style
@@ -366,6 +375,9 @@ and a Web application
366375
**August 2015**: release of v1.1.0, including Python C extensions
367376
to speed the computation of audio/text alignment up
368377
378+
**September 2015**: release of v1.2.0,
379+
including code to automatically detect the audio head/tail
380+
369381
## Acknowledgments
370382
371383
Many thanks to **Nicola Montecchio**,
