
Commit b399383

Merge pull request #51 from pettarin/port3
Added release date for v1.4.0
2 parents 893bbc7 + 3f3bd9a commit b399383

17 files changed: +193 -148 lines changed

.gitignore

+1
@@ -3,6 +3,7 @@
 *.pyo
 *.swp
 *.so
+.pybuild
 aeneas.egg-info
 aeneas/build
 bak

README.md

+35 -32
@@ -3,7 +3,7 @@
 **aeneas** is a Python library and a set of tools to automagically synchronize audio and text.
 
 * Version: 1.4.0
-* Date: 2016-01-??
+* Date: 2016-01-15
 * Developed by: [ReadBeyond](http://www.readbeyond.it/)
 * Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
 * License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -107,6 +107,7 @@ However, **aeneas** has been confirmed to work on the following systems:
 | Slackware | 64 | Yes | Unknown |
 | Mac OS X 10.9 | 64 | Yes (1) | Unknown (1) |
 | Mac OS X 10.10 | 64 | Yes (1) | Unknown (1) |
+| Mac OS X 10.11 | 64 | Yes (1) | Unknown (1) |
 | Windows Vista | 32 | Yes (1) | Yes (1, 2) |
 | Windows 7 | 64 | Yes (1) | Yes (1, 2) |
 | Windows 8.1 | 64 | Yes (1) | Unknown (1, 2) |
@@ -119,7 +120,7 @@ is available only on Linux at the moment.
 is quite complex; however, running **aeneas** in pure Python mode
 has been confirmed to work.
 
-In any case, **aeneas** should work on any OS, at least in pure Python mode,
+Anyway, **aeneas** should work on any OS, at least in pure Python mode,
 provided that:
 
 1. the required Python modules `BeautifulSoup4`, `lxml`, and `numpy` are installed, and
@@ -307,55 +308,55 @@ Feel free to jump to step 9 if you already have
 1. Install `aeneas` as described above. (Only the first time!)
 
 2. Open a command prompt/shell/terminal and go to the root directory
-of the aeneas repository, that is, the one containing the `README.md` and `VERSION` files.
-(This step is not needed if you installed `aeneas` with `pip`,
-since you will have the `aeneas` module available system-wise.)
+of the aeneas repository, that is, the one containing the `README.md` and `VERSION` files.
+(This step is not needed if you installed `aeneas` with `pip`,
+since you will have the `aeneas` module available system-wise.)
 
 3. To compute a synchronization map `map.json` for a pair
-(`audio.mp3`, `text.txt` in `plain` text format), you can run:
+(`audio.mp3`, `text.txt` in `plain` text format), you can run:
 
 ```bash
 $ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
 ```
 
-The third parameter (the _configuration string_) can specify several parameters/options.
-See the [documentation](http://www.readbeyond.it/aeneas/docs/)
-or use the `-h` switch for details.
-
-4. To compute a synchronization map `map.smil` for a pair
-(`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
-you can run:
+To compute a synchronization map `map.smil` for a pair
+(`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
+you can run:
 
 ```bash
 $ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
 ```
 
-5. If you have several tasks to run,
-you can create a job container and a configuration file,
-and run them all at once:
+The third parameter (the _configuration string_) can specify several other parameters/options.
+See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+or use the `-h` switch for details.
+
+4. If you have several tasks to run,
+you can create a job container and a configuration file,
+and run them all at once:
 
 ```bash
 $ python -m aeneas.tools.execute_job job.zip /tmp/
 ```
 
-File `job.zip` should contain a `config.txt` or `config.xml`
-configuration file, providing **aeneas**
-with all the information needed to parse the input assets
-and format the output sync map files.
-See the [documentation](http://www.readbeyond.it/aeneas/docs/)
-or use the `-h` switch for details.
+File `job.zip` should contain a `config.txt` or `config.xml`
+configuration file, providing **aeneas**
+with all the information needed to parse the input assets
+and format the output sync map files.
+See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+or use the `-h` switch for details.
 
-You might want to run `execute_task` or `execute_job`
-with `-h` to get an usage message and some examples:
+5. You might want to run `execute_task` or `execute_job`
+with `-h` to get an usage message and some examples:
 
-```bash
-$ python -m aeneas.tools.execute_task -h
-$ python -m aeneas.tools.execute_job -h
-```
+```bash
+$ python -m aeneas.tools.execute_task -h
+$ python -m aeneas.tools.execute_job -h
+```
 
-See the [documentation](http://www.readbeyond.it/aeneas/docs/)
-for an introduction to the concepts of `task` and `job`,
-and for the list of all the available options.
+See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+for an introduction to the concepts of `task` and `job`,
+and for the list of all the available options.
 
 
 ## Documentation
@@ -391,17 +392,19 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
 * Code suitable for a Web app deployment (e.g., on-demand AWS instances)
 * Adjustable splitting times, including a max character/second constraint for CC applications
 * Automated detection of audio head/tail
-* MFCC and DTW computed as Python C extensions to reduce the processing time
+* MFCC and DTW computed via Python C extensions to reduce the processing time
 * On Linux, `espeak` called via a Python C extension for faster audio synthesis
 * Output an HTML file (from `finetuneas` project) for fine tuning the sync map manually
 
+
 ## Limitations and Missing Features
 
 * Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
 * Audio is assumed to be spoken: not suitable/YMMV for song captioning
 * No protection against memory trashing if you feed extremely long audio files
 * On Mac OS X and Windows, audio synthesis might be slow if you have thousands of text fragments
 
+
 ## TODO List
 
 * Improving robustness against music in background
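
Reviewer note: the configuration string in the README usage examples above is a list of `key=value` pairs joined by `|`. Below is a minimal sketch of assembling such a string programmatically, assuming nothing beyond that visible format. `build_config_string` is a hypothetical helper written for illustration, not part of the aeneas API, and only keys that appear in the README examples are used.

```python
def build_config_string(options):
    """Join (key, value) pairs into a 'key=value|key=value' string.

    Hypothetical helper for illustration only; not provided by aeneas.
    """
    return "|".join("{}={}".format(key, value) for key, value in options)


# Keys and values copied from the first README usage example.
options = [
    ("task_language", "en"),
    ("os_task_file_format", "json"),
    ("is_text_type", "plain"),
]
config_string = build_config_string(options)
print(config_string)
```

The resulting string could then be passed as the third argument to `python -m aeneas.tools.execute_task`, as in the README examples.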

README.rst

+29 -26
@@ -5,7 +5,7 @@ aeneas
 synchronize audio and text.
 
 - Version: 1.4.0
-- Date: 2016-01-??
+- Date: 2016-01-15
 - Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 - Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 - License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -130,6 +130,8 @@ However, **aeneas** has been confirmed to work on the following systems:
 +------------------+-------------+--------------+------------------+
 | Mac OS X 10.10 | 64 | Yes (1) | Unknown (1) |
 +------------------+-------------+--------------+------------------+
+| Mac OS X 10.11 | 64 | Yes (1) | Unknown (1) |
++------------------+-------------+--------------+------------------+
 | Windows Vista | 32 | Yes (1) | Yes (1, 2) |
 +------------------+-------------+--------------+------------------+
 | Windows 7 | 64 | Yes (1) | Yes (1, 2) |
144146
3.4/3.5, compiling the Python C extensions is quite complex; however,
145147
running **aeneas** in pure Python mode has been confirmed to work.
146148

147-
In any case, **aeneas** should work on any OS, at least in pure Python
148-
mode, provided that:
149+
Anyway, **aeneas** should work on any OS, at least in pure Python mode,
150+
provided that:
149151

150152
1. the required Python modules ``BeautifulSoup4``, ``lxml``, and
151153
``numpy`` are installed, and
@@ -356,40 +358,41 @@ Usage
 
 $ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
 
-The third parameter (the *configuration string*) can specify several
-parameters/options. See the
-`documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
-``-h`` switch for details.
+To compute a synchronization map ``map.smil`` for a pair (``audio.mp3``,
+``page.xhtml`` containing fragments marked by ``id`` attributes like
+``f001``), you can run:
 
-4. To compute a synchronization map ``map.smil`` for a pair
-(``audio.mp3``, ``page.xhtml`` containing fragments marked by ``id``
-attributes like ``f001``), you can run:
+::
 
-.. code:: bash
+```bash
+$ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
+```
 
-$ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
+The third parameter (the *configuration string*) can specify several
+other parameters/options. See the
+`documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
+``-h`` switch for details.
 
-5. If you have several tasks to run, you can create a job container and
+4. If you have several tasks to run, you can create a job container and
 a configuration file, and run them all at once:
 
 .. code:: bash
 
 $ python -m aeneas.tools.execute_job job.zip /tmp/
 
-File ``job.zip`` should contain a ``config.txt`` or ``config.xml``
-configuration file, providing **aeneas** with all the information
-needed to parse the input assets and format the output sync map
-files. See the
-`documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
-``-h`` switch for details.
+File ``job.zip`` should contain a ``config.txt`` or ``config.xml``
+configuration file, providing **aeneas** with all the information needed
+to parse the input assets and format the output sync map files. See the
+`documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
+``-h`` switch for details.
 
-You might want to run ``execute_task`` or ``execute_job`` with ``-h`` to
-get an usage message and some examples:
+5. You might want to run ``execute_task`` or ``execute_job`` with ``-h``
+to get an usage message and some examples:
 
-.. code:: bash
+.. code:: bash
 
-$ python -m aeneas.tools.execute_task -h
-$ python -m aeneas.tools.execute_job -h
+$ python -m aeneas.tools.execute_task -h
+$ python -m aeneas.tools.execute_job -h
 
 See the `documentation <http://www.readbeyond.it/aeneas/docs/>`__ for an
 introduction to the concepts of ``task`` and ``job``, and for the list
@@ -438,8 +441,8 @@ Supported Features
 - Adjustable splitting times, including a max character/second
 constraint for CC applications
 - Automated detection of audio head/tail
-- MFCC and DTW computed as Python C extensions to reduce the processing
-time
+- MFCC and DTW computed via Python C extensions to reduce the
+processing time
 - On Linux, ``espeak`` called via a Python C extension for faster audio
 synthesis
 - Output an HTML file (from ``finetuneas`` project) for fine tuning the

aeneas/tests/long_test_job.py

+2 -2
@@ -13,10 +13,10 @@ def execute(self, path):
         output_path = gf.tmp_directory()
         executor = ExecuteJob(job=None)
         executor.load_job_from_container(input_path)
-        self.assertNotEqual(executor.job, None)
+        self.assertIsNotNone(executor.job)
         executor.execute()
         result_path = executor.write_output_container(output_path)
-        self.assertNotEqual(result_path, None)
+        self.assertIsNotNone(result_path)
         self.assertTrue(gf.file_exists(result_path))
         executor.clean()
         gf.delete_directory(output_path)
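
Reviewer note: the switch from `assertEqual(x, None)` / `assertNotEqual(x, None)` to `assertIsNone(x)` / `assertIsNotNone(x)` replaces an equality check (`==`) with an identity check (`is`). The sketch below illustrates the difference with a contrived class; `AlwaysEqual` is hypothetical and not taken from the aeneas code base.

```python
import unittest


class AlwaysEqual(object):
    """Hypothetical object whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


class NoneAssertionDemo(unittest.TestCase):
    def test_equality_check_can_be_fooled(self):
        # Passes: assertEqual relies on ==, which AlwaysEqual overrides.
        self.assertEqual(AlwaysEqual(), None)

    def test_identity_check_is_strict(self):
        # assertIsNone relies on `is`, so a non-None object is rejected.
        with self.assertRaises(AssertionError):
            self.assertIsNone(AlwaysEqual())

    def test_none_passes_identity_check(self):
        self.assertIsNone(None)


# Run the demo explicitly so the outcome can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(NoneAssertionDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

For return values that are plain `None` or ordinary objects, both forms behave the same. The identity form also produces a more specific failure message.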

aeneas/tests/long_test_task.py

+3 -1
@@ -19,7 +19,7 @@ def execute(self, config_string, audio_path, text_path):
         executor.execute()
         task.sync_map_file_path_absolute = tmp_path
         result_path = task.output_sync_map_file()
-        self.assertNotEqual(result_path, None)
+        self.assertIsNotNone(result_path)
         self.assertEqual(result_path, tmp_path)
         self.assertGreater(len(gf.read_file_bytes(result_path)), 0)
         gf.delete_file(handler, tmp_path)
@@ -68,6 +68,8 @@ def test_formats(self):
             "res/inputtext/sonnet_plain.txt"
         )
 
+    # TODO more tests
+
 if __name__ == '__main__':
     unittest.main()
 

aeneas/tests/test_analyzecontainer.py

+5 -5
@@ -85,13 +85,13 @@ def test_not_container(self):
     def test_container_not_existing(self):
         analyzer = AnalyzeContainer(Container(self.NOT_EXISTING_PATH))
         job = analyzer.analyze()
-        self.assertEqual(job, None)
+        self.assertIsNone(job)
 
     def test_analyze_empty_container(self):
         for f in self.EMPTY_CONTAINERS:
             analyzer = AnalyzeContainer(Container(f))
             job = analyzer.analyze()
-            self.assertEqual(job, None)
+            self.assertIsNone(job)
 
     def test_analyze(self):
         for f in self.FILES:
@@ -102,19 +102,19 @@ def test_analyze(self):
     def test_wizard_container_not_existing(self):
         analyzer = AnalyzeContainer(Container(self.NOT_EXISTING_PATH))
         job = analyzer.analyze(config_string=u"foo")
-        self.assertEqual(job, None)
+        self.assertIsNone(job)
 
     def test_wizard_analyze_empty_container(self):
         for f in self.EMPTY_CONTAINERS:
             analyzer = AnalyzeContainer(Container(f))
             job = analyzer.analyze(config_string=u"foo")
-            self.assertEqual(job, None)
+            self.assertIsNone(job)
 
     def test_wizard_analyze_valid(self):
         f = self.FILES[0]
         analyzer = AnalyzeContainer(Container(gf.absolute_path(f["path"], __file__)))
         job = analyzer.analyze(config_string=self.CONFIG_STRING)
-        self.assertNotEqual(job, None)
+        self.assertIsNotNone(job)
         self.assertEqual(len(job), f["length"])
 
 if __name__ == '__main__':

aeneas/tests/test_audiofile.py

+3 -3
@@ -149,21 +149,21 @@ def test_load_not_wave_file(self):
     def test_load_data(self):
         audiofile = self.load(self.AUDIO_FILE_PATH_MFCC)
         audiofile.load_data()
-        self.assertNotEqual(audiofile.audio_data, None)
+        self.assertIsNotNone(audiofile.audio_data)
         audiofile.clear_data()
 
     def test_clear_data(self):
         audiofile = self.load(self.AUDIO_FILE_PATH_MFCC)
         audiofile.load_data()
         audiofile.clear_data()
-        self.assertEqual(audiofile.audio_data, None)
+        self.assertIsNone(audiofile.audio_data)
 
     def test_extract_mfcc(self):
         audiofile = self.load(self.AUDIO_FILE_PATH_MFCC)
         audiofile.load_data()
         audiofile.extract_mfcc()
         audiofile.clear_data()
-        self.assertNotEqual(audiofile.audio_mfcc, None)
+        self.assertIsNone(audiofile.audio_data)
         self.assertEqual(audiofile.audio_mfcc.shape[0], 13)
         self.assertEqual(audiofile.audio_mfcc.shape[1], 1332)
 

aeneas/tests/test_container.py

+8 -8
@@ -188,18 +188,18 @@ def test_is_entry_safe_true(self):
     def test_read_entry_not_existing(self):
         cont = Container(self.NOT_EXISTING)
         with self.assertRaises(TypeError):
-            self.assertEqual(cont.read_entry(self.EXPECTED_ENTRIES[0]), None)
+            self.assertIsNone(cont.read_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_read_entry_empty_file(self):
         for f in self.EMPTY_FILES:
             cont = Container(f)
             with self.assertRaises(OSError):
-                self.assertEqual(cont.read_entry(self.EXPECTED_ENTRIES[0]), None)
+                self.assertIsNone(cont.read_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_read_entry_empty_directory(self):
         output_path = gf.tmp_directory()
         cont = Container(output_path)
-        self.assertEqual(cont.read_entry(self.EXPECTED_ENTRIES[0]), None)
+        self.assertIsNone(cont.read_entry(self.EXPECTED_ENTRIES[0]))
         gf.delete_directory(output_path)
 
     def test_read_entry_existing(self):
def test_read_entry_existing(self):
@@ -208,24 +208,24 @@ def test_read_entry_existing(self):
             f = self.FILES[key]
             cont = Container(f["path"])
             result = cont.read_entry(entry)
-            self.assertNotEqual(result, None)
+            self.assertIsNotNone(result)
             self.assertEqual(len(result), f["config_size"])
 
     def test_find_entry_not_existing(self):
         cont = Container(self.NOT_EXISTING)
         with self.assertRaises(TypeError):
-            self.assertEqual(cont.find_entry(self.EXPECTED_ENTRIES[0]), None)
+            self.assertIsNone(cont.find_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_find_entry_empty_file(self):
         for f in self.EMPTY_FILES:
             cont = Container(f)
             with self.assertRaises(OSError):
-                self.assertEqual(cont.find_entry(self.EXPECTED_ENTRIES[0]), None)
+                self.assertIsNone(cont.find_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_find_entry_empty_directory(self):
         output_path = gf.tmp_directory()
         cont = Container(output_path)
-        self.assertEqual(cont.find_entry(self.EXPECTED_ENTRIES[0]), None)
+        self.assertIsNone(cont.find_entry(self.EXPECTED_ENTRIES[0]))
         gf.delete_directory(output_path)
 
     def test_find_entry_existing(self):
def test_find_entry_existing(self):
@@ -250,7 +250,7 @@ def test_read_entry_missing(self):
             f = self.FILES[key]
             cont = Container(f["path"])
             result = cont.read_entry(entry)
-            self.assertEqual(result, None)
+            self.assertIsNone(result)
 
     def test_find_entry_missing(self):
         entry = "config_not_existing.txt"
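
Reviewer note: in the `test_*_not_existing` and `test_*_empty_file` cases above, the `assertIsNone` call sits inside an `assertRaises` block. If the wrapped call raises, the exception propagates before the inner assertion runs, so those tests verify only the exception type. A minimal sketch of that behaviour follows; `read_entry` here is a hypothetical stand-in that always raises, not the aeneas `Container.read_entry` method.

```python
import unittest


def read_entry(path):
    """Hypothetical stand-in that always raises, mimicking a missing container."""
    raise TypeError("container does not exist: {}".format(path))


class RaisesDemo(unittest.TestCase):
    def test_inner_assertion_is_skipped(self):
        reached = []
        with self.assertRaises(TypeError):
            value = read_entry("missing.zip")  # raises here
            reached.append(value)              # never executed
            self.assertIsNone(value)           # never executed
        # Nothing after the raising call ran inside the block.
        self.assertEqual(reached, [])


# Run the demo explicitly so the outcome can be inspected afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RaisesDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If the intent is only to check the exception, the inner `assertIsNone` wrapper could be dropped without changing what those tests verify.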
