Commit ca562a7

Merge branch 'master' of github.com:aleks-v-k/textract
2 parents 124c44f + 102a584 commit ca562a7

24 files changed, +147 -84 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -25,6 +25,9 @@ var/
 pip-log.txt
 pip-delete-this-directory.txt

+# Virtual environments
+**/venv*
+
 # Unit test / coverage reports
 htmlcov/
 .tox/

.pyup.yml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 update: all
 branch: master
 schedule: "every two weeks"
-pin: True
+pin: False
 requirements:
 - requirements/python:
 updates: all

.travis.yml

Lines changed: 4 additions & 3 deletions
@@ -1,5 +1,5 @@
-sudo: required
-dist: bionic
+dist: focal
+os: linux

 language: python
 python:
@@ -9,6 +9,7 @@ python:
 # install system dependencies here with apt-get.
 before_install:
 - sudo ./provision/debian.sh
+- python -m pip install --upgrade pip

 # install python dependencies including this package in the travis
 # virtualenv
@@ -27,7 +28,7 @@ script:
 - cd tests && make && cd -
 - nosetests --with-coverage --cover-package=textract
 - cd tests && pytest && cd -
-- pycodestyle textract/ bin/textract
+# - pycodestyle textract/ bin/textract
 - if [[ $TRAVIS_PYTHON_VERSION == 3.7 ]];
 then cd docs && make html && cd -;
 fi

README.rst

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 ..
 .. * bumpversion {major|minor|patch}
 .. * git push && git push --tags
-.. * python setup.py sdist upload
+.. * twine upload -r textract dist/*
 .. * convert into release https://github.com/deanmalmgren/textract/releases

 textract

docs/changelog.rst

Lines changed: 10 additions & 0 deletions
@@ -10,6 +10,14 @@ latest changes in development for next release
 ----------------------------------------------

 .. THANKS FOR CONTRIBUTING; ADD YOUR UNRELEASED CHANGES HERE!
+1.6.5
+-------------------
+
+* switched epub parsing to MIT license compatible package (`#411`_ by
+`@jhale1805`_)
+
+1.6.4
+-------------------

 * several bug fixes, including:

@@ -276,6 +284,7 @@ latest changes in development for next release
 .. _@eiotec: https://github.com/eiotec
 .. _@evfredericksen: https://github.com/evfredericksen
 .. _@jaraco: https://github.com/jaraco
+.. _@jhale1805: https://github.com/jhale1805
 .. _@jsmith-mploir: https://github.com/jsmith-mploir
 .. _@kokxx: https://github.com/Kokxx
 .. _@levivm: https://github.com/levivm
@@ -356,3 +365,4 @@ latest changes in development for next release
 .. _#149: https://github.com/deanmalmgren/textract/issues/149
 .. _#150: https://github.com/deanmalmgren/textract/issues/150
 .. _#162: https://github.com/deanmalmgren/textract/issues/162
+.. _#411: https://github.com/deanmalmgren/textract/issues/411

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@
 # built documents.
 #
 # The short X.Y version.
-release = version = "1.6.3"
+release = version = "1.6.5"

 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.

docs/index.rst

Lines changed: 3 additions & 1 deletion
@@ -6,7 +6,7 @@
 textract
 ================================

-As undesireable as it might be, more often than not there is extremely
+As undesirable as it might be, more often than not there is extremely
 useful information embedded in Word documents, PowerPoint
 presentations, PDFs, etc---so-called "dark data"---that would be
 valuable for further textual analysis and visualization. While
@@ -44,6 +44,8 @@ file types by either mentioning them on the `issue tracker

 * ``.csv`` via python builtins

+* ``.tsv`` and ``.tab`` via python builtins
+
 * ``.doc`` via `antiword`_

 * ``.docx`` via `python-docx2txt`_
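
Note on the ``.tsv`` / ``.tab`` entry added above: the diff only says these formats are handled "via python builtins" and does not show the parser itself. As a rough illustration of what that could look like, here is a minimal sketch using the standard-library ``csv`` module with a tab delimiter. The function name ``extract_tsv_text`` is hypothetical and is not taken from textract; joining cells with single spaces is likewise an assumption about the desired output.

```python
import csv
import io


def extract_tsv_text(stream):
    """Return the cells of a tab-delimited stream as plain text.

    Hypothetical helper for illustration only; not textract's API.
    Cells in a row are joined with spaces, rows with newlines.
    """
    reader = csv.reader(stream, delimiter="\t")
    return "\n".join(" ".join(row) for row in reader)


if __name__ == "__main__":
    # newline="" lets the csv module handle line endings itself.
    sample = io.StringIO("name\tcity\nAda\tLondon\n", newline="")
    print(extract_tsv_text(sample))
```

For a file on disk, the same function would accept the object returned by ``open(path, newline="")``.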

docs/installation.rst

Lines changed: 13 additions & 1 deletion
@@ -43,7 +43,7 @@ pypi.

 .. code-block:: bash

-brew cask install xquartz
+brew install --cask xquartz
 brew install poppler antiword unrtf tesseract swig
 pip install textract

@@ -62,6 +62,18 @@ pypi.
 homebrew, you may also need to install the python
 development header files for textract to properly install.

+FreeBSD
+-------
+
+Setting up this package on FreeBSD pretty much follows the steps for
+Ubuntu / Debian while using ``pkg`` as package manager.
+
+.. code-block:: bash
+
+pkg install lang/python38 devel/py-pip textproc/libxml2 textproc/libxslt textproc/antiword textproc/unrtf \
+graphics/poppler print/pstotext graphics/tesseract audio/flac multimedia/ffmpeg audio/lame audio/sox \
+graphics/jpeg-turbo
+pip install textract

 Don't see your operating system installation instructions here?
 ---------------------------------------------------------------

provision/debian.sh

Lines changed: 0 additions & 1 deletion
@@ -16,6 +16,5 @@ base=$(pwd)

 # Install all of the dependencies required in the examples.
 # http://docs.travis-ci.com/user/installing-dependencies/#Installing-Ubuntu-packages
-add-apt-repository ppa:mc3man/trusty-media -y
 apt-get update -qq
 sed 's/\(.*\)\#.*/\1/' < $base/requirements/debian | xargs apt-get install -y --fix-missing

requirements/debian

Lines changed: 0 additions & 1 deletion
@@ -10,7 +10,6 @@ make
 # these packages are required by python-docx, which depends on lxml
 # and requires these things
 python-dev
-python-pip
 libxml2-dev
 libxslt1-dev
