readbeyond
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 35 additions & 32 deletions
diff --git a/README.rst b/README.rst
Lines changed: 29 additions & 26 deletions
diff --git a/aeneas/tests/long_test_job.py b/aeneas/tests/long_test_job.py
Lines changed: 2 additions & 2 deletions
diff --git a/aeneas/tests/long_test_task.py b/aeneas/tests/long_test_task.py
Lines changed: 3 additions & 1 deletion
diff --git a/aeneas/tests/test_analyzecontainer.py b/aeneas/tests/test_analyzecontainer.py
Lines changed: 5 additions & 5 deletions
diff --git a/aeneas/tests/test_audiofile.py b/aeneas/tests/test_audiofile.py
Lines changed: 3 additions & 3 deletions
diff --git a/aeneas/tests/test_container.py b/aeneas/tests/test_container.py
Lines changed: 8 additions & 8 deletions
@@ -3,6 +3,7 @@
 *.pyo
 *.swp
 *.so
+.pybuild
 aeneas.egg-info
 aeneas/build
 bak
 
@@ -3,7 +3,7 @@
 **aeneas** is a Python library and a set of tools to automagically synchronize audio and text.
 
 * Version: 1.4.0
-* Date: 2016-01-??
+* Date: 2016-01-15
 * Developed by: [ReadBeyond](http://www.readbeyond.it/)
 * Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
 * License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -107,6 +107,7 @@ However, **aeneas** has been confirmed to work on the following systems:
 | Slackware      | 64        | Yes        | Unknown         |
 | Mac OS X 10.9  | 64        | Yes (1)    | Unknown (1)     |
 | Mac OS X 10.10 | 64        | Yes (1)    | Unknown (1)     |
+| Mac OS X 10.11 | 64        | Yes (1)    | Unknown (1)     |
 | Windows Vista  | 32        | Yes (1)    | Yes (1, 2)      |
 | Windows 7      | 64        | Yes (1)    | Yes (1, 2)      |
 | Windows 8.1    | 64        | Yes (1)    | Unknown (1, 2)  |
@@ -119,7 +120,7 @@ is available only on Linux at the moment.
 is quite complex; however, running **aeneas** in pure Python mode
 has been confirmed to work.
 
-In any case, **aeneas** should work on any OS, at least in pure Python mode,
+Anyway, **aeneas** should work on any OS, at least in pure Python mode,
 provided that:
 
 1. the required Python modules `BeautifulSoup4`, `lxml`, and `numpy` are installed, and
@@ -307,55 +308,55 @@ Feel free to jump to step 9 if you already have
 1. Install `aeneas` as described above. (Only the first time!)
 
 2. Open a command prompt/shell/terminal and go to the root directory
-of the aeneas repository, that is, the one containing the `README.md` and `VERSION` files.
-(This step is not needed if you installed `aeneas` with `pip`,
-since you will have the `aeneas` module available system-wise.)
+   of the aeneas repository, that is, the one containing the `README.md` and `VERSION` files.
+   (This step is not needed if you installed `aeneas` with `pip`,
+   since you will have the `aeneas` module available system-wide.)
 
 3. To compute a synchronization map `map.json` for a pair
-(`audio.mp3`, `text.txt` in `plain` text format), you can run:
+   (`audio.mp3`, `text.txt` in `plain` text format), you can run:
 
     ```bash
     $ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
     ```
 
-    The third parameter (the _configuration string_) can specify several parameters/options.
-    See the [documentation](http://www.readbeyond.it/aeneas/docs/)
-    or use the `-h` switch for details.
-
-4. To compute a synchronization map `map.smil` for a pair
-(`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
-you can run:
+   To compute a synchronization map `map.smil` for a pair
+   (`audio.mp3`, `page.xhtml` containing fragments marked by `id` attributes like `f001`),
+   you can run:
 
     ```bash
     $ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
     ```
 
-5. If you have several tasks to run,
-you can create a job container and a configuration file,
-and run them all at once:
+   The third parameter (the _configuration string_) can specify several other parameters/options.
+   See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+   or use the `-h` switch for details.
+
+4. If you have several tasks to run,
+   you can create a job container and a configuration file,
+   and run them all at once:
 
     ```bash
     $ python -m aeneas.tools.execute_job job.zip /tmp/
     ```
 
-    File `job.zip` should contain a `config.txt` or `config.xml`
-    configuration file, providing **aeneas**
-    with all the information needed to parse the input assets
-    and format the output sync map files.
-    See the [documentation](http://www.readbeyond.it/aeneas/docs/)
-    or use the `-h` switch for details.
+   File `job.zip` should contain a `config.txt` or `config.xml`
+   configuration file, providing **aeneas**
+   with all the information needed to parse the input assets
+   and format the output sync map files.
+   See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+   or use the `-h` switch for details.
 
-You might want to run `execute_task` or `execute_job`
-with `-h` to get an usage message and some examples:
+5. You might want to run `execute_task` or `execute_job`
+   with `-h` to get a usage message and some examples:
 
-```bash
-$ python -m aeneas.tools.execute_task -h
-$ python -m aeneas.tools.execute_job -h
-```
+    ```bash
+    $ python -m aeneas.tools.execute_task -h
+    $ python -m aeneas.tools.execute_job -h
+    ```
 
-See the [documentation](http://www.readbeyond.it/aeneas/docs/)
-for an introduction to the concepts of `task` and  `job`,
-and for the list of all the available options.
+   See the [documentation](http://www.readbeyond.it/aeneas/docs/)
+   for an introduction to the concepts of `task` and  `job`,
+   and for the list of all the available options.
 
 
 ## Documentation
@@ -391,17 +392,19 @@ Changelog: [http://www.readbeyond.it/aeneas/docs/changelog.html](http://www.read
 * Code suitable for a Web app deployment (e.g., on-demand AWS instances)
 * Adjustable splitting times, including a max character/second constraint for CC applications
 * Automated detection of audio head/tail
-* MFCC and DTW computed as Python C extensions to reduce the processing time
+* MFCC and DTW computed via Python C extensions to reduce the processing time
 * On Linux, `espeak` called via a Python C extension for faster audio synthesis
 * Output an HTML file (from `finetuneas` project) for fine tuning the sync map manually
 
+
 ## Limitations and Missing Features 
 
 * Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
 * Audio is assumed to be spoken: not suitable/YMMV for song captioning
 * No protection against memory trashing if you feed extremely long audio files
 * On Mac OS X and Windows, audio synthesis might be slow if you have thousands of text fragments
 
+
 ## TODO List
 
 * Improving robustness against music in background
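
Note on the configuration string in the README hunk above: the examples pass it as a single argument made of `key=value` pairs joined by `|`. A minimal sketch of assembling that string and the matching argument list might look like the following. The helper names `build_config_string` and `build_execute_task_argv` are hypothetical and not part of aeneas; only the keys and the module path are taken from the README examples.

```python
from collections import OrderedDict


def build_config_string(options):
    """Join key=value pairs with '|', mirroring the README examples.

    Hypothetical helper, not part of aeneas.
    """
    return "|".join("{}={}".format(key, value) for key, value in options.items())


def build_execute_task_argv(audio, text, options, output):
    """Return the argument list for the execute_task tool (nothing is run here)."""
    return [
        "python", "-m", "aeneas.tools.execute_task",
        audio, text, build_config_string(options), output,
    ]


# Keys copied from the first README example; the order is kept for readability.
options = OrderedDict([
    ("task_language", "en"),
    ("os_task_file_format", "json"),
    ("is_text_type", "plain"),
])

config_string = build_config_string(options)
argv = build_execute_task_argv("audio.mp3", "text.txt", options, "map.json")
print(config_string)
```

Passing `argv` as a list to something like `subprocess.call` would avoid shell quoting problems with the `|` characters, which is why the sketch builds a list rather than one command string.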
 
@@ -5,7 +5,7 @@ aeneas
 synchronize audio and text.
 
 -  Version: 1.4.0
--  Date: 2016-01-??
+-  Date: 2016-01-15
 -  Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 -  Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 -  License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -130,6 +130,8 @@ However, **aeneas** has been confirmed to work on the following systems:
 +------------------+-------------+--------------+------------------+
 | Mac OS X 10.10   | 64          | Yes (1)      | Unknown (1)      |
 +------------------+-------------+--------------+------------------+
+| Mac OS X 10.11   | 64          | Yes (1)      | Unknown (1)      |
++------------------+-------------+--------------+------------------+
 | Windows Vista    | 32          | Yes (1)      | Yes (1, 2)       |
 +------------------+-------------+--------------+------------------+
 | Windows 7        | 64          | Yes (1)      | Yes (1, 2)       |
@@ -144,8 +146,8 @@ is available only on Linux at the moment. (2) On Windows and Python
 3.4/3.5, compiling the Python C extensions is quite complex; however,
 running **aeneas** in pure Python mode has been confirmed to work.
 
-In any case, **aeneas** should work on any OS, at least in pure Python
-mode, provided that:
+Anyway, **aeneas** should work on any OS, at least in pure Python mode,
+provided that:
 
 1. the required Python modules ``BeautifulSoup4``, ``lxml``, and
    ``numpy`` are installed, and
@@ -356,40 +358,41 @@ Usage
 
        $ python -m aeneas.tools.execute_task audio.mp3 text.txt "task_language=en|os_task_file_format=json|is_text_type=plain" map.json
 
-   The third parameter (the *configuration string*) can specify several
-   parameters/options. See the
-   `documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
-   ``-h`` switch for details.
+To compute a synchronization map ``map.smil`` for a pair (``audio.mp3``,
+``page.xhtml`` containing fragments marked by ``id`` attributes like
+``f001``), you can run:
 
-4. To compute a synchronization map ``map.smil`` for a pair
-   (``audio.mp3``, ``page.xhtml`` containing fragments marked by ``id``
-   attributes like ``f001``), you can run:
+::
 
-   .. code:: bash
+    ```bash
+    $ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
+    ```
 
-       $ python -m aeneas.tools.execute_task audio.mp3 page.xhtml "task_language=en|os_task_file_format=smil|os_task_file_smil_audio_ref=audio.mp3|os_task_file_smil_page_ref=page.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" map.smil
+The third parameter (the *configuration string*) can specify several
+other parameters/options. See the
+`documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
+``-h`` switch for details.
 
-5. If you have several tasks to run, you can create a job container and
+4. If you have several tasks to run, you can create a job container and
    a configuration file, and run them all at once:
 
    .. code:: bash
 
        $ python -m aeneas.tools.execute_job job.zip /tmp/
 
-   File ``job.zip`` should contain a ``config.txt`` or ``config.xml``
-   configuration file, providing **aeneas** with all the information
-   needed to parse the input assets and format the output sync map
-   files. See the
-   `documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
-   ``-h`` switch for details.
+File ``job.zip`` should contain a ``config.txt`` or ``config.xml``
+configuration file, providing **aeneas** with all the information needed
+to parse the input assets and format the output sync map files. See the
+`documentation <http://www.readbeyond.it/aeneas/docs/>`__ or use the
+``-h`` switch for details.
 
-You might want to run ``execute_task`` or ``execute_job`` with ``-h`` to
-get an usage message and some examples:
+5. You might want to run ``execute_task`` or ``execute_job`` with ``-h``
+   to get a usage message and some examples:
 
-.. code:: bash
+   .. code:: bash
 
-    $ python -m aeneas.tools.execute_task -h
-    $ python -m aeneas.tools.execute_job -h
+       $ python -m aeneas.tools.execute_task -h
+       $ python -m aeneas.tools.execute_job -h
 
 See the `documentation <http://www.readbeyond.it/aeneas/docs/>`__ for an
 introduction to the concepts of ``task`` and ``job``, and for the list
@@ -438,8 +441,8 @@ Supported Features
 -  Adjustable splitting times, including a max character/second
    constraint for CC applications
 -  Automated detection of audio head/tail
--  MFCC and DTW computed as Python C extensions to reduce the processing
-   time
+-  MFCC and DTW computed via Python C extensions to reduce the
+   processing time
 -  On Linux, ``espeak`` called via a Python C extension for faster audio
    synthesis
 -  Output an HTML file (from ``finetuneas`` project) for fine tuning the
 
@@ -13,10 +13,10 @@ def execute(self, path):
         output_path = gf.tmp_directory()
         executor = ExecuteJob(job=None)
         executor.load_job_from_container(input_path)
-        self.assertNotEqual(executor.job, None)
+        self.assertIsNotNone(executor.job)
         executor.execute()
         result_path = executor.write_output_container(output_path)
-        self.assertNotEqual(result_path, None)
+        self.assertIsNotNone(result_path)
         self.assertTrue(gf.file_exists(result_path))
         executor.clean()
         gf.delete_directory(output_path)
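
Note on the job test above: it loads a container through `load_job_from_container`, and the README describes such a container as a zip holding a `config.txt` or `config.xml` plus the input assets. A rough sketch of packaging one with the standard library could look like this. The helper `make_job_container`, the `assets/` layout, and the placeholder configuration text are assumptions for illustration only; the real configuration keys are listed in the aeneas documentation.

```python
import os
import tempfile
import zipfile


def make_job_container(zip_path, config_text, assets):
    """Write config.txt plus the given assets into a zip file.

    Hypothetical helper, not part of aeneas. `assets` maps archive
    names to their contents.
    """
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("config.txt", config_text)
        for name, data in assets.items():
            zf.writestr(name, data)
    return zip_path


tmp_dir = tempfile.mkdtemp()
job_zip = make_job_container(
    os.path.join(tmp_dir, "job.zip"),
    # Placeholder only: this is NOT a valid aeneas configuration.
    "placeholder: replace with the keys from the aeneas documentation\n",
    {"assets/text.txt": "First fragment\nSecond fragment\n"},
)

# List what ended up in the archive.
with zipfile.ZipFile(job_zip) as zf:
    entries = sorted(zf.namelist())
print(entries)
```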
 
@@ -19,7 +19,7 @@ def execute(self, config_string, audio_path, text_path):
         executor.execute()
         task.sync_map_file_path_absolute = tmp_path
         result_path = task.output_sync_map_file()
-        self.assertNotEqual(result_path, None)
+        self.assertIsNotNone(result_path)
         self.assertEqual(result_path, tmp_path)
         self.assertGreater(len(gf.read_file_bytes(result_path)), 0)
         gf.delete_file(handler, tmp_path)
@@ -68,6 +68,8 @@ def test_formats(self):
                 "res/inputtext/sonnet_plain.txt"
             )
 
+    # TODO more tests
+
 if __name__ == '__main__':
     unittest.main()
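
Note on the assertion changes in these test files: `assertEqual(x, None)` goes through `==`, so an object with a permissive `__eq__` can satisfy it, while `assertIsNone(x)` checks identity. The sketch below illustrates the difference with a made-up class `AlwaysEqual`; it is not taken from aeneas and only shows why the identity-based assertions are the stricter check.

```python
import unittest


class AlwaysEqual(object):
    """Made-up object that claims to be equal to anything."""

    def __eq__(self, other):
        return True

    def __ne__(self, other):
        return False


class NoneAssertionDemo(unittest.TestCase):
    def test_equality_check_is_fooled(self):
        # Passes, although the object is clearly not None.
        self.assertEqual(AlwaysEqual(), None)

    def test_identity_check_is_not_fooled(self):
        # assertIsNone uses "is", so it fails for the same object.
        with self.assertRaises(AssertionError):
            self.assertIsNone(AlwaysEqual())


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NoneAssertionDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```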
 
 
@@ -85,13 +85,13 @@ def test_not_container(self):
     def test_container_not_existing(self):
         analyzer = AnalyzeContainer(Container(self.NOT_EXISTING_PATH))
         job = analyzer.analyze()
-        self.assertEqual(job, None)
+        self.assertIsNone(job)
 
     def test_analyze_empty_container(self):
         for f in self.EMPTY_CONTAINERS:
             analyzer = AnalyzeContainer(Container(f))
             job = analyzer.analyze()
-            self.assertEqual(job, None)
+            self.assertIsNone(job)
 
     def test_analyze(self):
         for f in self.FILES:
@@ -102,19 +102,19 @@ def test_analyze(self):
     def test_wizard_container_not_existing(self):
         analyzer = AnalyzeContainer(Container(self.NOT_EXISTING_PATH))
         job = analyzer.analyze(config_string=u"foo")
-        self.assertEqual(job, None)
+        self.assertIsNone(job)
 
     def test_wizard_analyze_empty_container(self):
         for f in self.EMPTY_CONTAINERS:
             analyzer = AnalyzeContainer(Container(f))
             job = analyzer.analyze(config_string=u"foo")
-            self.assertEqual(job, None)
+            self.assertIsNone(job)
 
     def test_wizard_analyze_valid(self):
         f = self.FILES[0]
         analyzer = AnalyzeContainer(Container(gf.absolute_path(f["path"], __file__)))
         job = analyzer.analyze(config_string=self.CONFIG_STRING)
-        self.assertNotEqual(job, None)
+        self.assertIsNotNone(job)
         self.assertEqual(len(job), f["length"])
 
 if __name__ == '__main__':
 
@@ -149,21 +149,21 @@ def test_load_not_wave_file(self):
     def test_load_data(self):
         audiofile = self.load(self.AUDIO_FILE_PATH_MFCC)
         audiofile.load_data()
-        self.assertNotEqual(audiofile.audio_data, None)
+        self.assertIsNotNone(audiofile.audio_data)
         audiofile.clear_data()
 
     def test_clear_data(self):
         audiofile = self.load(self.AUDIO_FILE_PATH_MFCC)
         audiofile.load_data()
         audiofile.clear_data()
-        self.assertEqual(audiofile.audio_data, None)
+        self.assertIsNone(audiofile.audio_data)
 
     def test_extract_mfcc(self):
         audiofile = self.load(self.AUDIO_FILE_PATH_MFCC)
         audiofile.load_data()
         audiofile.extract_mfcc()
         audiofile.clear_data()
-        self.assertNotEqual(audiofile.audio_mfcc, None)
+        self.assertIsNotNone(audiofile.audio_mfcc)
         self.assertEqual(audiofile.audio_mfcc.shape[0], 13)
         self.assertEqual(audiofile.audio_mfcc.shape[1], 1332)
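
Note on the array-valued attributes checked above: `audio_mfcc` exposes `.shape`, so it is presumably a NumPy array (an assumption; the diff does not show its type). With array-like objects whose comparison operators work element by element, `assertNotEqual(x, None)` can raise instead of asserting, because the comparison result has no single truth value. The sketch below uses pure-Python stand-ins (`ArrayLike`, `ElementwiseResult`, both hypothetical) to avoid depending on NumPy.

```python
import unittest


class ElementwiseResult(object):
    """Stand-in for the result of an element-wise comparison."""

    def __bool__(self):
        raise ValueError("truth value of an element-wise comparison is ambiguous")

    __nonzero__ = __bool__  # Python 2 spelling of __bool__


class ArrayLike(object):
    """Stand-in for an array whose == and != compare element by element."""

    def __eq__(self, other):
        return ElementwiseResult()

    def __ne__(self, other):
        return ElementwiseResult()


class ArrayNoneDemo(unittest.TestCase):
    def test_not_equal_raises(self):
        # assertNotEqual evaluates "not (first != second)", which needs a bool.
        with self.assertRaises(ValueError):
            self.assertNotEqual(ArrayLike(), None)

    def test_is_not_none_works(self):
        # The identity check never calls the comparison operators.
        self.assertIsNotNone(ArrayLike())


array_suite = unittest.defaultTestLoader.loadTestsFromTestCase(ArrayNoneDemo)
array_result = unittest.TextTestRunner(verbosity=0).run(array_suite)
```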
 
 
@@ -188,18 +188,18 @@ def test_is_entry_safe_true(self):
     def test_read_entry_not_existing(self):
         cont = Container(self.NOT_EXISTING)
         with self.assertRaises(TypeError):
-            self.assertEqual(cont.read_entry(self.EXPECTED_ENTRIES[0]), None)
+            self.assertIsNone(cont.read_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_read_entry_empty_file(self):
         for f in self.EMPTY_FILES:
             cont = Container(f)
             with self.assertRaises(OSError):
-                self.assertEqual(cont.read_entry(self.EXPECTED_ENTRIES[0]), None)
+                self.assertIsNone(cont.read_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_read_entry_empty_directory(self):
         output_path = gf.tmp_directory()
         cont = Container(output_path)
-        self.assertEqual(cont.read_entry(self.EXPECTED_ENTRIES[0]), None)
+        self.assertIsNone(cont.read_entry(self.EXPECTED_ENTRIES[0]))
         gf.delete_directory(output_path)
 
     def test_read_entry_existing(self):
@@ -208,24 +208,24 @@ def test_read_entry_existing(self):
             f = self.FILES[key]
             cont = Container(f["path"])
             result = cont.read_entry(entry)
-            self.assertNotEqual(result, None)
+            self.assertIsNotNone(result)
             self.assertEqual(len(result), f["config_size"])
 
     def test_find_entry_not_existing(self):
         cont = Container(self.NOT_EXISTING)
         with self.assertRaises(TypeError):
-            self.assertEqual(cont.find_entry(self.EXPECTED_ENTRIES[0]), None)
+            self.assertIsNone(cont.find_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_find_entry_empty_file(self):
         for f in self.EMPTY_FILES:
             cont = Container(f)
             with self.assertRaises(OSError):
-                self.assertEqual(cont.find_entry(self.EXPECTED_ENTRIES[0]), None)
+                self.assertIsNone(cont.find_entry(self.EXPECTED_ENTRIES[0]))
 
     def test_find_entry_empty_directory(self):
         output_path = gf.tmp_directory()
         cont = Container(output_path)
-        self.assertEqual(cont.find_entry(self.EXPECTED_ENTRIES[0]), None)
+        self.assertIsNone(cont.find_entry(self.EXPECTED_ENTRIES[0]))
         gf.delete_directory(output_path)
 
     def test_find_entry_existing(self):
@@ -250,7 +250,7 @@ def test_read_entry_missing(self):
             f = self.FILES[key]
             cont = Container(f["path"])
             result = cont.read_entry(entry)
-            self.assertEqual(result, None)
+            self.assertIsNone(result)
 
     def test_find_entry_missing(self):
         entry = "config_not_existing.txt"