Commit d7dbb8c

Merge pull request #94 from readbeyond/devel
aeneas v1.5.1
2 parents faeaff6 + d30ad36 commit d7dbb8c

154 files changed: +2064 -1705 lines changed

MANIFEST.in

+1
@@ -14,6 +14,7 @@ prune docs/build
 include CHANGELOG
 include LICENSE
 recursive-include licenses *
+include output/.gitignore
 include README.md
 include README.rst
 include requirements.txt

README.md

+29-19
@@ -2,8 +2,8 @@

 **aeneas** is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).

-* Version: 1.5.0.3
-* Date: 2016-04-23
+* Version: 1.5.1.0
+* Date: 2016-07-25
 * Developed by: [ReadBeyond](http://www.readbeyond.it/)
 * Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
 * License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -87,6 +87,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).

 ### Installation

+All-in-one installers are available for Mac OS X and Windows,
+and a Bash script for deb-based Linux distributions (Debian, Ubuntu)
+is provided in this repository.
+It is also possible to download a VirtualBox+Vagrant virtual machine.
+Please see the
+[INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
+for detailed, step-by-step installation procedures for different operating systems.
+
+The generic OS-independent procedure is simple:
+
 1. Install
 [Python](https://python.org/) (2.7.x preferred),
 [FFmpeg](https://www.ffmpeg.org/), and
@@ -102,20 +112,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).
 pip install aeneas
 ```

-See the
-[INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
-for detailed, step-by-step procedures for Linux, OS X, and Windows.
-
-
-## Usage
-
-1. To **check** whether you installed **aeneas** correctly, run:
+4. To **check** whether you installed **aeneas** correctly, run:

 ```bash
 python -m aeneas.diagnostics
 ```

-2. Run without arguments to get the **usage message**:
+
+## Usage
+
+1. Run without arguments to get the **usage message**:

 ```bash
 python -m aeneas.tools.execute_task
@@ -131,7 +137,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
 python -m aeneas.tools.execute_task --examples-all
 ```

-3. To **compute a synchronization map** `map.json` for a pair
+2. To **compute a synchronization map** `map.json` for a pair
 (`audio.mp3`, `text.txt` in
 [plain](http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN)
 text format), you can run:
@@ -169,7 +175,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
 [documentation](http://www.readbeyond.it/aeneas/docs/)
 for details.

-4. If you have several tasks to process,
+3. If you have several tasks to process,
 you can create a **job container**
 to batch process them:

@@ -222,12 +228,12 @@ which explains how to use the built-in command line tools.
 * Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
 * Input audio file formats: all those readable by `ffmpeg`
 * Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB, TSV, TTML, TXT, VTT, XML
-* Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+* Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
 * MFCC and DTW computed via Python C extensions to reduce the processing time
-* On Linux, eSpeak called via a Python C extension for faster audio synthesis
-* Batch processing of multiple audio/text pairs
 * Several built-in TTS engine wrappers: eSpeak (default, FLOSS), Festival (FLOSS), Nuance TTS API (commercial)
-* Use custom TTS engine wrappers besides the built-in ones
+* Default TTS (eSpeak) called via a Python C extension for fast audio synthesis
+* A custom, user-provided TTS engine Python wrapper can be used instead of the built-in ones (included example for speect)
+* Batch processing of multiple audio/text pairs
 * Download audio from a YouTube video
 * In multilevel mode, recursive alignment from paragraph to sentence to word level
 * Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
@@ -236,13 +242,14 @@ which explains how to use the built-in command line tools.
 * Output an HTML file for fine tuning the sync map manually (`finetuneas` project)
 * Execution parameters tunable at runtime
 * Code suitable for Web app deployment (e.g., on-demand cloud computing)
+* Extensive test suite including 898 unit/integration/performance tests, that run and must pass before each release


 ## Limitations and Missing Features

 * Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
-* Audio is assumed to be spoken: not suitable/YMMV for song captioning
-* No protection against memory trashing if you feed extremely long audio files
+* Audio is assumed to be spoken: not suitable for song captioning, YMMV for CC applications
+* No protection against memory trashing if you feed extremely long audio files (>1.5h per single audio file)
 * [Open issues](https://github.com/readbeyond/aeneas/issues)


@@ -340,6 +347,9 @@ for its asynchronous usage.
 **Chris Hubbard** prepared the files for
 packaging aeneas as a Debian/Ubuntu `.deb`.

+**Daniel Bair**, **Chris Hubbard**, and **Richard Margetts**
+packaged the installers for Mac OS X and Windows.
+
 **Firat Ozdemir** contributed the `finetuneas`
 HTML/JS code for fine tuning sync maps in the browser.

README.rst

+35-21
@@ -4,8 +4,8 @@ aeneas
 **aeneas** is a Python/C library and a set of tools to automagically
 synchronize audio and text (aka forced alignment).

-- Version: 1.5.0.3
-- Date: 2016-04-23
+- Version: 1.5.1.0
+- Date: 2016-07-25
 - Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 - Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 - License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -100,6 +100,16 @@ modern OS (Linux, Mac OS X, Windows).
 Installation
 ~~~~~~~~~~~~

+All-in-one installers are available for Mac OS X and Windows, and a Bash
+script for deb-based Linux distributions (Debian, Ubuntu) is provided in
+this repository. It is also possible to download a VirtualBox+Vagrant
+virtual machine. Please see the `INSTALL
+file <https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md>`__
+for detailed, step-by-step installation procedures for different
+operating systems.
+
+The generic OS-independent procedure is simple:
+
 1. Install `Python <https://python.org/>`__ (2.7.x preferred),
 `FFmpeg <https://www.ffmpeg.org/>`__, and
 `eSpeak <http://espeak.sourceforge.net/>`__
@@ -114,18 +124,14 @@ Installation
 pip install numpy
 pip install aeneas

-See the `INSTALL
-file <https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md>`__
-for detailed, step-by-step procedures for Linux, OS X, and Windows.
+4. To **check** whether you installed **aeneas** correctly, run:
+
+``bash python -m aeneas.diagnostics``

 Usage
 -----

-1. To **check** whether you installed **aeneas** correctly, run:
-
-``bash python -m aeneas.diagnostics``
-
-2. Run without arguments to get the **usage message**:
+1. Run without arguments to get the **usage message**:

 .. code:: bash

@@ -140,7 +146,7 @@ Usage
 python -m aeneas.tools.execute_task --examples
 python -m aeneas.tools.execute_task --examples-all

-3. To **compute a synchronization map** ``map.json`` for a pair
+2. To **compute a synchronization map** ``map.json`` for a pair
 (``audio.mp3``, ``text.txt`` in
 `plain <http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN>`__
 text format), you can run:
@@ -178,7 +184,7 @@ specifies the parameters controlling the I/O formats and the processing
 options for the task. Consult the
 `documentation <http://www.readbeyond.it/aeneas/docs/>`__ for details.

-4. If you have several tasks to process, you can create a **job
+3. If you have several tasks to process, you can create a **job
 container** to batch process them:

 .. code:: bash
@@ -229,17 +235,19 @@ Supported Features
 - Input audio file formats: all those readable by ``ffmpeg``
 - Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB,
 TSV, TTML, TXT, VTT, XML
-- Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO,
-EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD,
-NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+- Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU,
+ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN,
+LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE,
+TUR, UKR
 - MFCC and DTW computed via Python C extensions to reduce the
 processing time
-- On Linux, eSpeak called via a Python C extension for faster audio
-synthesis
-- Batch processing of multiple audio/text pairs
 - Several built-in TTS engine wrappers: eSpeak (default, FLOSS),
 Festival (FLOSS), Nuance TTS API (commercial)
-- Use custom TTS engine wrappers besides the built-in ones
+- Default TTS (eSpeak) called via a Python C extension for fast audio
+synthesis
+- A custom, user-provided TTS engine Python wrapper can be used instead
+of the built-in ones (included example for speect)
+- Batch processing of multiple audio/text pairs
 - Download audio from a YouTube video
 - In multilevel mode, recursive alignment from paragraph to sentence to
 word level
@@ -253,15 +261,18 @@ Supported Features
 - Execution parameters tunable at runtime
 - Code suitable for Web app deployment (e.g., on-demand cloud
 computing)
+- Extensive test suite including 898 unit/integration/performance
+tests, that run and must pass before each release

 Limitations and Missing Features
 --------------------------------

 - Audio should match the text: large portions of spurious text or audio
 might produce a wrong sync map
-- Audio is assumed to be spoken: not suitable/YMMV for song captioning
+- Audio is assumed to be spoken: not suitable for song captioning, YMMV
+for CC applications
 - No protection against memory trashing if you feed extremely long
-audio files
+audio files (>1.5h per single audio file)
 - `Open issues <https://github.com/readbeyond/aeneas/issues>`__

 License
@@ -362,6 +373,9 @@ asynchronous usage.
 **Chris Hubbard** prepared the files for packaging aeneas as a
 Debian/Ubuntu ``.deb``.

+**Daniel Bair**, **Chris Hubbard**, and **Richard Margetts** packaged
+the installers for Mac OS X and Windows.
+
 **Firat Ozdemir** contributed the ``finetuneas`` HTML/JS code for fine
 tuning sync maps in the browser.

VERSION

+1-1
@@ -1 +1 @@
-1.5.0
+1.5.1
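
Review note: this release bumps the same version string in `VERSION` and in the `__version__` attribute of each module. The sketch below shows one way such strings could be checked for consistency before tagging a release. It is only an illustration: the helper names `find_version` and `mismatched`, the in-memory `sources` mapping, and the `stale_module.py` entry are hypothetical and are not part of aeneas.

```python
import re

# Matches a module-level assignment such as: __version__ = "1.5.1"
VERSION_RE = re.compile(r'^__version__\s*=\s*"([^"]+)"', re.MULTILINE)


def find_version(source):
    """Return the __version__ string found in ``source``, or None."""
    match = VERSION_RE.search(source)
    return match.group(1) if match else None


def mismatched(expected, sources):
    """Return the sorted names whose __version__ differs from ``expected``."""
    return sorted(
        name for name, text in sources.items()
        if find_version(text) != expected
    )


# File contents are inlined here so the sketch runs without a checkout.
sources = {
    "aeneas/__init__.py": '__license__ = "GNU AGPL v3"\n__version__ = "1.5.1"\n',
    "aeneas/audiofile.py": '__version__ = "1.5.1"\n',
    "stale_module.py": '__version__ = "1.5.0"\n',  # hypothetical file left behind
}
stale = mismatched("1.5.1", sources)
```

In a real checkout the `sources` mapping would be built by reading the files from disk, and `expected` would come from the `VERSION` file.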

aeneas/__init__.py

+1-1
@@ -13,7 +13,7 @@
 Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
 """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"

aeneas/adjustboundaryalgorithm.py

+1-1
@@ -30,7 +30,7 @@
 Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
 """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"

aeneas/analyzecontainer.py

+1-1
@@ -32,7 +32,7 @@
 Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
 """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"

aeneas/audiofile.py

+72-1
@@ -37,7 +37,7 @@
 Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
 """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"

@@ -116,6 +116,77 @@ class AudioFile(Loggable):
 :type logger: :class:`~aeneas.logger.Logger`
 """

+FILE_EXTENSIONS = [
+u"3g2",
+u"3gp",
+u"aa",
+u"aa3",
+u"aac",
+u"aax",
+u"aiff",
+u"alac",
+u"amr",
+u"ape",
+u"asf",
+u"at3",
+u"at9",
+u"au",
+u"avi",
+u"awb",
+u"celt",
+u"dct",
+u"dss",
+u"dvf",
+u"eac",
+u"flac",
+u"flv",
+u"gsm",
+u"m4a",
+u"m4b",
+u"m4p",
+u"m4v",
+u"mid",
+u"midi",
+u"mkv",
+u"mmf",
+u"mov",
+u"mp2",
+u"mp3",
+u"mp4",
+u"mpc",
+u"mpeg",
+u"mpg",
+u"mpv",
+u"msv",
+u"oga",
+u"ogg",
+u"ogv",
+u"oma",
+u"opus",
+u"pcm",
+u"qt",
+u"ra",
+u"ram",
+u"raw",
+u"riff",
+u"rm",
+u"rmvb",
+u"shn",
+u"sln",
+u"theora",
+u"tta",
+u"vob",
+u"vorbis",
+u"vox",
+u"wav",
+u"webm",
+u"wma",
+u"wmv",
+u"wv",
+u"yuv",
+]
+""" Extensions of common formats for audio (and video) files. """
+
 TAG = u"AudioFile"

 def __init__(self, file_path=None, is_mono_wave=False, rconf=None, logger=None):
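
Review note: the diff adds `FILE_EXTENSIONS` to `AudioFile` but the visible hunks do not show where it is consumed. A minimal sketch of how such a list might be used to filter candidate paths is given below. The function `has_known_extension` and the shortened `KNOWN_EXTENSIONS` list are illustrative assumptions, not code from this commit.

```python
import os

# A small subset of the extensions added in this commit; the full list
# is the FILE_EXTENSIONS class attribute of AudioFile.
KNOWN_EXTENSIONS = [u"flac", u"m4a", u"mp3", u"ogg", u"wav"]


def has_known_extension(path, extensions=KNOWN_EXTENSIONS):
    """Return True if the (case-insensitive) extension of ``path`` is listed."""
    # os.path.splitext returns (root, ".ext"); drop the leading dot
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    return extension in extensions


candidates = [u"chapter1.MP3", u"notes.txt", u"intro.wav", u"README"]
audio_like = [p for p in candidates if has_known_extension(p)]
```

Note that an extension check of this kind only inspects the file name; it says nothing about whether the file content is actually decodable.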

aeneas/audiofilemfcc.py

+2-2
@@ -29,7 +29,7 @@
 Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
 """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"

@@ -134,7 +134,7 @@ def __init__(
 self._compute_mfcc_c_extension,
 self._compute_mfcc_pure_python,
 (),
-c_extension=self.rconf[RuntimeConfiguration.C_EXTENSIONS]
+rconf=self.rconf
 )
 self.audio_length = self.audio_file.audio_length
 if audio_file_was_none:
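
Review note: the hunk above replaces a single boolean argument (`c_extension=...`) with the whole runtime configuration (`rconf=self.rconf`), so the callee can decide for itself whether to try the C extension. The callee is not visible in this diff, so the sketch below only illustrates the general "C extension with pure Python fallback" pattern. The name `run_with_fallback`, the `"c_extensions"` key, and the use of a plain dict as configuration are assumptions made for the example, not the aeneas API.

```python
def run_with_fallback(c_function, py_function, args, rconf):
    """
    Try the C-backed function when the configuration allows it,
    and fall back to the pure Python one otherwise or on failure.
    Return a (backend_name, result) pair.
    """
    if rconf.get("c_extensions", True):
        try:
            return ("c", c_function(*args))
        except Exception:
            # extension not compiled or failed at runtime: fall through
            pass
    return ("python", py_function(*args))


def compute_c_ok(values):
    # Stand-in for a compiled routine that is available
    return sum(values)


def compute_c_missing(values):
    # Stand-in for a compiled routine that could not be imported
    raise ImportError("C extension not compiled")


def compute_py(values):
    # Pure Python implementation producing the same result
    total = 0
    for value in values:
        total += value
    return total


args = ([1, 2, 3],)
with_c = run_with_fallback(compute_c_ok, compute_py, args, {"c_extensions": True})
without_c = run_with_fallback(compute_c_missing, compute_py, args, {"c_extensions": True})
disabled = run_with_fallback(compute_c_ok, compute_py, args, {"c_extensions": False})
```

Passing the configuration object rather than one flag keeps the call sites stable if further switches are added later, at the cost of coupling the helper to the configuration type.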

aeneas/cdtw/000_compile_driver.sh

+1-1
@@ -1,6 +1,6 @@
 #!/bin/bash

-gcc cdtw_driver.c cdtw_func.c cint.c -o cdtw_driver -lm -Wall -pedantic -std=c99
+gcc cdtw_driver.c cdtw_func.c ../cint/cint.c -o cdtw_driver -lm -Wall -pedantic -std=c99

aeneas/cdtw/900_clean.sh

+3
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+rm -rf build __pycache__ *.so cdtw_driver
