Commit 2d7c18d

Merge pull request #78 from pettarin/nextmajor
aeneas v1.5.0
2 parents 5554379 + a50fce0 commit 2d7c18d

389 files changed

+20014
-9675
lines changed

MANIFEST.in

+7
@@ -1,3 +1,10 @@
+recursive-include aeneas/cdtw *
+recursive-include aeneas/cew *
+recursive-include aeneas/cint *
+recursive-include aeneas/cmfcc *
+recursive-include aeneas/cwave *
+recursive-include aeneas/extra *
+prune aeneas/extra/ctw_speect
 recursive-include aeneas/res *
 recursive-include aeneas/tools/res *
 include aeneas_check_setup.py

README.md

+131-85
Large diffs are not rendered by default.

README.rst

+116-90
@@ -4,16 +4,18 @@ aeneas
 **aeneas** is a Python/C library and a set of tools to automagically
 synchronize audio and text (aka forced alignment).
 
-- Version: 1.4.1
-- Date: 2016-02-13
+- Version: 1.5.0
+- Date: 2016-04-02
 - Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 - Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 - License: the GNU Affero General Public License Version 3 (AGPL v3)
 
 - Quick Links: `Home <http://www.readbeyond.it/aeneas/>`__ -
 `GitHub <https://github.com/readbeyond/aeneas/>`__ -
-`PyPI <https://pypi.python.org/pypi/aeneas/>`__ - `API
-Docs <http://www.readbeyond.it/aeneas/docs/>`__ - `Mailing
+`PyPI <https://pypi.python.org/pypi/aeneas/>`__ -
+`Docs <http://www.readbeyond.it/aeneas/docs/>`__ -
+`Tutorial <http://www.readbeyond.it/aeneas/docs/clitutorial.html>`__
+- `Mailing
 List <https://groups.google.com/d/forum/aeneas-forced-alignment>`__ -
 `Web App <http://aeneasweb.org>`__
 
@@ -34,25 +36,31 @@ interval in the audio file:
 
 ::
 
-1 => [00:00:00.000, 00:00:02.680]
-From fairest creatures we desire increase, => [00:00:02.680, 00:00:05.480]
-That thereby beauty's rose might never die, => [00:00:05.480, 00:00:08.640]
-But as the riper should by time decease, => [00:00:08.640, 00:00:11.960]
-His tender heir might bear his memory: => [00:00:11.960, 00:00:15.280]
-But thou contracted to thine own bright eyes, => [00:00:15.280, 00:00:18.520]
-Feed'st thy light's flame with self-substantial fuel, => [00:00:18.520, 00:00:22.760]
-Making a famine where abundance lies, => [00:00:22.760, 00:00:25.720]
-Thy self thy foe, to thy sweet self too cruel: => [00:00:25.720, 00:00:31.240]
-Thou that art now the world's fresh ornament, => [00:00:31.240, 00:00:34.280]
-And only herald to the gaudy spring, => [00:00:34.280, 00:00:36.960]
-Within thine own bud buriest thy content, => [00:00:36.960, 00:00:40.640]
-And tender churl mak'st waste in niggarding: => [00:00:40.640, 00:00:43.600]
-Pity the world, or else this glutton be, => [00:00:43.600, 00:00:48.000]
-To eat the world's due, by the grave and thee. => [00:00:48.000, 00:00:53.280]
-
-This synchronization map can be output to file in several formats: SMIL
-for EPUB 3, SBV/SRT/SUB/TTML/VTT for closed captioning, JSON/RBSE for
-Web usage, or raw CSV/SSV/TSV/TXT/XML for further processing.
+1 => [00:00:00.000, 00:00:02.640]
+From fairest creatures we desire increase, => [00:00:02.640, 00:00:05.880]
+That thereby beauty's rose might never die, => [00:00:05.880, 00:00:09.240]
+But as the riper should by time decease, => [00:00:09.240, 00:00:11.920]
+His tender heir might bear his memory: => [00:00:11.920, 00:00:15.280]
+But thou contracted to thine own bright eyes, => [00:00:15.280, 00:00:18.800]
+Feed'st thy light's flame with self-substantial fuel, => [00:00:18.800, 00:00:22.760]
+Making a famine where abundance lies, => [00:00:22.760, 00:00:25.680]
+Thy self thy foe, to thy sweet self too cruel: => [00:00:25.680, 00:00:31.240]
+Thou that art now the world's fresh ornament, => [00:00:31.240, 00:00:34.400]
+And only herald to the gaudy spring, => [00:00:34.400, 00:00:36.920]
+Within thine own bud buriest thy content, => [00:00:36.920, 00:00:40.640]
+And tender churl mak'st waste in niggarding: => [00:00:40.640, 00:00:43.640]
+Pity the world, or else this glutton be, => [00:00:43.640, 00:00:48.080]
+To eat the world's due, by the grave and thee. => [00:00:48.080, 00:00:53.240]
+
+.. figure:: wiki/align.png
+:alt: Waveform with aligned labels, detail
+
+Waveform with aligned labels, detail
+
+This synchronization map can be output to file in several formats: EAF
+for research purposes, SMIL for EPUB 3, SBV/SRT/SUB/TTML/VTT for closed
+captioning, JSON for Web usage, or raw AUD/CSV/SSV/TSV/TXT/XML for
+further processing.
 
 System Requirements, Supported Platforms and Installation
 ---------------------------------------------------------
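Reviewer sketch (not part of the commit): the listing in the hunk above pairs each text fragment with a `[begin, end]` interval in `HH:MM:SS.mmm` form. Assuming only that textual layout, the following Python helper turns such lines into `(text, begin_seconds, end_seconds)` tuples. The names `LINE_RE`, `to_seconds`, and `parse_listing` are hypothetical and are not part of the aeneas API.

```python
import re

# Hypothetical helper, not part of aeneas: matches listing lines such as
#   From fairest creatures we desire increase, => [00:00:02.640, 00:00:05.880]
LINE_RE = re.compile(
    r"^(?P<text>.*?)\s*=>\s*"
    r"\[(?P<begin>\d{2}:\d{2}:\d{2}\.\d{3}),\s*"
    r"(?P<end>\d{2}:\d{2}:\d{2}\.\d{3})\]\s*$"
)


def to_seconds(timestamp):
    """Convert an HH:MM:SS.mmm string into seconds (float)."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_listing(lines):
    """Return (text, begin, end) tuples; lines that do not match are skipped."""
    fragments = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        fragments.append(
            (
                match.group("text"),
                to_seconds(match.group("begin")),
                to_seconds(match.group("end")),
            )
        )
    return fragments
```

Feeding it the first two lines of the new listing gives one tuple per fragment, which makes it easy to check, for example, that each fragment begins where the previous one ends.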
@@ -66,20 +74,17 @@ System Requirements
 3. `FFmpeg <https://www.ffmpeg.org/>`__
 4. `eSpeak <http://espeak.sourceforge.net/>`__
 5. Python modules ``BeautifulSoup4``, ``lxml``, and ``numpy``
-6. Python C headers to compile the Python C extensions (Optional but
+6. Python C headers to compile the Python C extensions (optional but
 strongly recommended)
-7. A shell supporting UTF-8 (Optional but strongly recommended)
-8. Python module ``pafy`` (Optional, only required if you want to
-download audio from YouTube)
+7. A shell supporting UTF-8 (optional but strongly recommended)
 
 Supported Platforms
 ~~~~~~~~~~~~~~~~~~~
 
 **aeneas** has been developed and tested on **Debian 64bit**, which is
-the **only supported OS** at the moment.
-
-However, **aeneas** has been confirmed to work on other Linux
-distributions, OS X, and Windows. See the `PLATFORMS
+the **only supported OS** at the moment. Nevertheless, **aeneas** has
+been confirmed to work on other Linux distributions, OS X, and Windows.
+See the `PLATFORMS
 file <https://github.com/readbeyond/aeneas/blob/master/wiki/PLATFORMS.md>`__
 for the details.
 
@@ -115,37 +120,45 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
 Usage
 -----
 
-1. To check that you installed ``aeneas`` correctly, run:
+1. To **check** whether you installed **aeneas** correctly, run:
 
 ``bash python -m aeneas.diagnostics``
 
-2. Run ``execute_task`` or ``execute_job`` with ``-h`` (resp.,
-``--help``) to get a short (resp., long) usage message:
+2. Run without arguments to get the **usage message**:
 
 .. code:: bash
 
-python -m aeneas.tools.execute_task -h
-python -m aeneas.tools.execute_job -h
+python -m aeneas.tools.execute_task
+python -m aeneas.tools.execute_job
+
+You can also get a list of **live examples** that you can immediately
+run on your machine thanks to the included files:
 
-The above commands also print a list of live usage examples that you
-can immediately run on your machine, thanks to the included example
-files.
+.. code:: bash
 
-3. To compute a synchronization map ``map.json`` for a pair
+python -m aeneas.tools.execute_task --examples
+python -m aeneas.tools.execute_task --examples-all
+
+3. To **compute a synchronization map** ``map.json`` for a pair
 (``audio.mp3``, ``text.txt`` in
-```plain`` <http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN>`__
+`plain <http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN>`__
 text format), you can run:
 
 .. code:: bash
 
 python -m aeneas.tools.execute_task \
 audio.mp3 \
 text.txt \
-"task_language=en|os_task_file_format=json|is_text_type=plain" \
+"task_language=eng|os_task_file_format=json|is_text_type=plain" \
 map.json
 
-To compute a synchronization map ``map.smil`` for a pair (``audio.mp3``,
-```page.xhtml`` <http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.UNPARSED>`__
+(The command has been split into lines with ``\`` for visual clarity; in
+production you can have the entire command on a single line and/or you
+can use shell variables.)
+
+To **compute a synchronization map** ``map.smil`` for a pair
+(``audio.mp3``,
+`page.xhtml <http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.UNPARSED>`__
 containing fragments marked by ``id`` attributes like ``f001``), you can
 run:
 
@@ -155,80 +168,89 @@ run:
 python -m aeneas.tools.execute_task \
 audio.mp3 \
 page.xhtml \
-"task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" \
+"task_language=eng|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" \
 map.smil
 ```
 
-The third parameter (the *configuration string*) can specify several
-other parameters/options. See the
+As you can see, the third argument (the *configuration string*)
+specifies the parameters controlling the I/O formats and the processing
+options for the task. Consult the
 `documentation <http://www.readbeyond.it/aeneas/docs/>`__ for details.
 
-4. If you have several tasks to process, you can create a job container
-and a configuration file, to process them all at once:
+4. If you have several tasks to process, you can create a **job
+container** to batch process them:
 
 .. code:: bash
 
 python -m aeneas.tools.execute_job job.zip output_directory
 
 File ``job.zip`` should contain a ``config.txt`` or ``config.xml``
 configuration file, providing **aeneas** with all the information needed
-to parse the input assets and format the output sync map files. See the
-`documentation <http://www.readbeyond.it/aeneas/docs/>`__ for details.
+to parse the input assets and format the output sync map files. Consult
+the `documentation <http://www.readbeyond.it/aeneas/docs/>`__ for
+details.
 
-The `documentation <http://www.readbeyond.it/aeneas/docs/>`__ provides
-an introduction to the concepts of
-```task`` <http://www.readbeyond.it/aeneas/docs/#tasks>`__ and
-```job`` <http://www.readbeyond.it/aeneas/docs/#job>`__, and it lists of
-all the options and tools available in the library.
+The `documentation <http://www.readbeyond.it/aeneas/docs/>`__ contains a
+highly suggested
+`tutorial <http://www.readbeyond.it/aeneas/docs/clitutorial.html>`__
+which explains how to use the built-in command line tools.
 
 Documentation and Support
 -------------------------
 
-Documentation: http://www.readbeyond.it/aeneas/docs/
-
-High level description of how aeneas works:
-`HOWITWORKS <https://github.com/readbeyond/aeneas/blob/master/wiki/HOWITWORKS.md>`__
-
-Tutorial: `A Practical Introduction To The aeneas
-Package <http://www.albertopettarin.it/blog/2015/05/21/a-practical-introduction-to-the-aeneas-package.html>`__
-
-Mailing list: https://groups.google.com/d/forum/aeneas-forced-alignment
-
-Changelog: http://www.readbeyond.it/aeneas/docs/changelog.html
-
-Development history:
-`HISTORY <https://github.com/readbeyond/aeneas/blob/master/wiki/HISTORY.md>`__
+- Documentation: http://www.readbeyond.it/aeneas/docs/
+- Command line tools tutorial:
+http://www.readbeyond.it/aeneas/docs/clitutorial.html
+- Library tutorial:
+http://www.readbeyond.it/aeneas/docs/libtutorial.html
+- Old, verbose tutorial: `A Practical Introduction To The aeneas
+Package <http://www.albertopettarin.it/blog/2015/05/21/a-practical-introduction-to-the-aeneas-package.html>`__
+- Mailing list:
+https://groups.google.com/d/forum/aeneas-forced-alignment
+- Changelog: http://www.readbeyond.it/aeneas/docs/changelog.html
+- High level description of how **aeneas** works:
+`HOWITWORKS <https://github.com/readbeyond/aeneas/blob/master/wiki/HOWITWORKS.md>`__
+- Development history:
+`HISTORY <https://github.com/readbeyond/aeneas/blob/master/wiki/HISTORY.md>`__
 
 Supported Features
 ------------------
 
-- Input text files in plain, parsed, subtitles, or unparsed format
+- Input text files in ``parsed``, ``plain``, ``subtitles``, or
+``unparsed`` (XML) format
+- Multilevel input text files in ``mplain`` and ``munparsed`` (XML)
+format
 - Text extraction from XML (e.g., XHTML) files using ``id`` and
 ``class`` attributes
 - Arbitrary text fragment granularity (single word, subphrase, phrase,
 paragraph, etc.)
-- Input audio file formats: all those supported by ``ffmpeg``
-- Possibility of downloading the audio file from a YouTube video
-- Batch processing
-- Output sync map formats: CSV, JSON, RBSE, SMIL, SSV, TSV, TTML, TXT,
-VTT, XML
-- Tested languages: BG, CA, CY, CS, DA, DE, EL, EN, EO, ES, ET, FA, FI,
-FR, GA, GRC, HR, HU, IS, IT, LA, LT, LV, NL, NO, RO, RU, PL, PT, SK,
-SR, SV, SW, TR, UK
+- Input audio file formats: all those readable by ``ffmpeg``
+- Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB,
+TSV, TTML, TXT, VTT, XML
+- Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO,
+EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD,
+NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+- MFCC and DTW computed via Python C extensions to reduce the
+processing time
+- On Linux, eSpeak called via a Python C extension for faster audio
+synthesis
+- Batch processing of multiple audio/text pairs
+- Several built-in TTS engine wrappers: eSpeak (default, FLOSS),
+Festival (FLOSS), Nuance TTS API (commercial)
+- Use custom TTS engine wrappers besides the built-in ones
+- Download audio from a YouTube video
+- In multilevel mode, recursive alignment from paragraph to sentence to
+word level
 - Robust against misspelled/mispronounced words, local rearrangements
 of words, background noise/sporadic spikes
-- Code suitable for a Web app deployment (e.g., on-demand AWS
-instances)
 - Adjustable splitting times, including a max character/second
 constraint for CC applications
 - Automated detection of audio head/tail
-- MFCC and DTW computed via Python C extensions to reduce the
-processing time
-- On Linux, ``espeak`` called via a Python C extension for faster audio
-synthesis
-- Output an HTML file (from ``finetuneas`` project) for fine tuning the
-sync map manually
+- Output an HTML file for fine tuning the sync map manually
+(``finetuneas`` project)
 - Execution parameters tunable at runtime
+- Code suitable for Web app deployment (e.g., on-demand cloud
+computing)
 
 Limitations and Missing Features
 --------------------------------
@@ -238,8 +260,6 @@ Limitations and Missing Features
 - Audio is assumed to be spoken: not suitable/YMMV for song captioning
 - No protection against memory trashing if you feed extremely long
 audio files
-- On Mac OS X and Windows, audio synthesis might be slow if you have
-thousands of text fragments
 - `Open issues <https://github.com/readbeyond/aeneas/issues>`__
 
 License
@@ -252,7 +272,7 @@ details.
 
 Licenses for third party code and files included in **aeneas** can be
 found in the
-`licenses/ <https://github.com/readbeyond/aeneas/blob/master/licenses/README.md>`__
+`licenses <https://github.com/readbeyond/aeneas/blob/master/licenses/README.md>`__
 directory.
 
 No copy rights were harmed in the making of this project.
@@ -278,6 +298,9 @@ Sponsors
 - **October 2015**: an anonymous donation sponsored the development of
 the "YouTube downloader" option (v1.3.0)
 
+- **April 2016**: the Fruch Foundation kindly sponsored the development
+and documentation of v1.5.0
+
 Supporting
 ~~~~~~~~~~
 
@@ -337,6 +360,9 @@ asynchronous usage.
 **Chris Hubbard** prepared the files for packaging aeneas as a
 Debian/Ubuntu ``.deb``.
 
+**Firat Ozdemir** contributed the ``finetuneas`` HTML/JS code for fine
+tuning sync maps in the browser.
+
 All the mighty `GitHub
 contributors <https://github.com/readbeyond/aeneas/graphs/contributors>`__,
 and the members of the `Google
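
Reviewer sketch (not part of the commit): the Usage section of the README diff above calls `python -m aeneas.tools.execute_task` with four positional arguments (audio file, text file, configuration string, output file), where the configuration string is a `|`-separated list of `key=value` pairs. A minimal Python sketch of assembling that call follows. The function names `build_config_string`, `build_execute_task_command`, and `run_task` are hypothetical helpers introduced here, not aeneas API; only the command line shape is taken from the README.

```python
import subprocess
import sys


def build_config_string(parameters):
    """Join (key, value) pairs as key=value, separated by '|'.

    A list of pairs is used, rather than a dict, to keep the
    parameters in the order in which they were given.
    """
    return "|".join("{}={}".format(key, value) for key, value in parameters)


def build_execute_task_command(audio_path, text_path, parameters, output_path):
    """Return the argument list mirroring the README invocation."""
    return [
        sys.executable,
        "-m",
        "aeneas.tools.execute_task",
        audio_path,
        text_path,
        build_config_string(parameters),
        output_path,
    ]


def run_task(audio_path, text_path, parameters, output_path):
    """Run the task; this assumes aeneas is installed for this interpreter."""
    command = build_execute_task_command(
        audio_path, text_path, parameters, output_path
    )
    subprocess.check_call(command)
```

Passing the command as a list, without a shell, avoids having to quote the `|` characters in the configuration string. For the first README example, the parameters would be `[("task_language", "eng"), ("os_task_file_format", "json"), ("is_text_type", "plain")]`.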

VERSION

+1-1
@@ -1 +1 @@
-1.4.1
+1.5.0
