Commit cd6902d

Author: Guanheng Zhang

Merge branch 'master' into release/0.8

2 parents: 3546693 + 45b21de

6 files changed: +31 -15 lines


.circleci/unittest/linux/scripts/environment.yml

Lines changed: 0 additions & 1 deletion

@@ -5,7 +5,6 @@ dependencies:
   - codecov
   - pip
   - pip:
-    - clang-format
     - dataclasses
     - nltk
     - requests

.circleci/unittest/linux/scripts/run_style_checks.sh

Lines changed: 2 additions & 2 deletions

@@ -20,9 +20,9 @@ if [ "${status}" -ne 0 ]; then
 fi

 printf "\x1b[34mRunning clang-format: "
-clang-format --version
+./clang-format --version
 printf "\x1b[0m\n"
-git-clang-format origin/master
+git-clang-format --binary ./clang-format origin/master
 git diff --exit-code
 status=$?
 exit_status="$((exit_status+status))"
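The context lines above show the script's error-handling convention: each check stores `$?` and adds it into a running `exit_status`, so one failing linter does not stop the remaining checks. A self-contained sketch of that pattern, with `true`/`false` as stand-ins for the real check commands:

```shell
#!/bin/sh
# Accumulate exit statuses across independent checks so that a single
# failure does not prevent later checks from running; the final sum is
# non-zero if any check failed.
exit_status=0

true                                   # stand-in for a passing check
status=$?
exit_status="$((exit_status+status))"

false                                  # stand-in for a failing check
status=$?
exit_status="$((exit_status+status))"

echo "${exit_status}"
```

Exiting with the accumulated value at the end lets CI report failure while still printing output from every check.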

.circleci/unittest/linux/scripts/setup_env.sh

Lines changed: 10 additions & 0 deletions

@@ -14,6 +14,11 @@ env_dir="${root_dir}/env"

 cd "${root_dir}"

+case "$(uname -s)" in
+  Darwin*) os=MacOSX;;
+  *) os=Linux
+esac
+
 # 1. Install conda at ./conda
 if [ ! -d "${conda_dir}" ]; then
     printf "* Installing conda\n"
@@ -32,6 +37,11 @@ conda activate "${env_dir}"
 # 3. Install Conda dependencies
 printf "* Installing dependencies (except PyTorch)\n"
 conda env update --file "${this_dir}/environment.yml" --prune
+if [ "${os}" == Linux ] ; then
+    clangformat_path="${root_dir}/clang-format"
+    curl https://oss-clang-format.s3.us-east-2.amazonaws.com/linux64/clang-format-linux64 -o "${clangformat_path}"
+    chmod +x "${clangformat_path}"
+fi

 # 4. Download
 printf "* Downloading SpaCy English models\n"
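The `uname -s` switch added above can be factored into a small function for illustration; `detect_os` is a hypothetical name, not part of the actual script:

```shell
#!/bin/sh
# Map a `uname -s` kernel name to the OS label used elsewhere in the
# script (e.g. to pick a platform-specific download). "Darwin" means
# macOS; anything else is treated as Linux here, mirroring the diff.
detect_os() {
  case "$1" in
    Darwin*) echo MacOSX ;;
    *)       echo Linux  ;;
  esac
}

detect_os "Darwin"        # prints MacOSX
detect_os "$(uname -s)"   # prints the label for the current kernel
```

Taking the kernel name as an argument (rather than calling `uname` inside) keeps the mapping easy to exercise on any platform.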

README.rst

Lines changed: 16 additions & 5 deletions

@@ -15,8 +15,11 @@ This repository consists of:
 * `torchtext.data <#data>`_: Generic data loaders, abstractions, and iterators for text (including vocabulary and word vectors)
 * `torchtext.datasets <#datasets>`_: Pre-built loaders for common NLP datasets

-Note: we are currently re-designing the torchtext library to make it more compatible with pytorch (e.g. ``torch.utils.data``). Several datasets have been written with the new abstractions in `torchtext.experimental <https://github.com/pytorch/text/tree/master/torchtext/experimental>`_ folder. We also created an issue to discuss the new abstraction, and users are welcome to leave feedback `link <https://github.com/pytorch/text/issues/664>`_.
+Note: we are currently re-designing the torchtext library to make it more compatible with pytorch (e.g. ``torch.utils.data``). Several datasets have been written with the new abstractions in `torchtext.experimental <https://github.com/pytorch/text/tree/master/torchtext/experimental>`_ folder. We also created an issue to discuss the new abstraction, and users are welcome to leave feedback `link <https://github.com/pytorch/text/issues/664>`_. These prototype building blocks and datasets in the experimental folder are available in the nightly release only. The nightly packages are accessible via Pip and Conda for Windows, Mac, and Linux. For example, Linux users can install the nightly wheels with the following command::

+    pip install --pre torch torchtext -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
+
+For more detail instructions, please refer to `Install PyTorch <https://pytorch.org/get-started/locally/>`_. It should be noted that the new building blocks are still under development, and the APIs have not been solidified.

 Installation
 ============
@@ -28,15 +31,17 @@ We recommend Anaconda as Python package management system. Please refer to `pyto
    :widths: 10, 10, 10

    nightly build, master, 3.6+
-   1.5, 0.5, 3.5+
-   1.4, 0.4, "2.7, 3.5+"
+   1.7, 0.8, 3.6+
+   1.6, 0.7, 3.6+
+   1.5, 0.6, 3.5+
+   1.4, 0.5, "2.7, 3.5+"
    0.4 and below, 0.2.3, "2.7, 3.5+"

-Using conda;::
+Using conda::

    conda install -c pytorch torchtext

-Using pip;::
+Using pip::

    pip install torchtext

@@ -64,7 +69,13 @@ To build torchtext from source, you need ``git``, ``CMake`` and C++11 compiler s
    git clone https://github.com/pytorch/text torchtext
    cd torchtext
    git submodule update --init --recursive
+
+   # Linux
    python setup.py clean install
+
+   # OSX
+   MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py clean install
+
    # or ``python setup.py develop`` if you are making modifications.

 **Note**

setup.py

Lines changed: 1 addition & 1 deletion

@@ -82,7 +82,7 @@ def run(self):
     license='BSD',

     install_requires=[
-        'tqdm', 'requests', 'torch', 'numpy', 'sentencepiece'
+        'tqdm', 'requests', 'torch', 'numpy'
     ],
     python_requires='>=3.5',
     classifiers=[
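After this change, sentencepiece is no longer a hard requirement of the package. As a minimal sketch (package metadata omitted; the `setup_kwargs` name is illustrative, not from torchtext's setup.py), the arguments that would be forwarded to `setuptools.setup()` now declare only these runtime dependencies:

```python
# Illustrative sketch of the dependency declaration after this commit:
# sentencepiece is no longer listed in install_requires, so
# `pip install torchtext` will not pull it in automatically.
setup_kwargs = {
    'install_requires': ['tqdm', 'requests', 'torch', 'numpy'],
    'python_requires': '>=3.5',
}

assert 'sentencepiece' not in setup_kwargs['install_requires']
```

Dropping a heavyweight optional dependency from `install_requires` is a common way to keep the base install lean while still supporting the feature for users who install the extra package themselves.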

test/data/test_functional.py

Lines changed: 2 additions & 6 deletions

@@ -4,7 +4,6 @@
 import uuid
 import unittest

-import sentencepiece as spm
 import torch
 from torchtext.data.functional import (
     generate_sp_model,
@@ -38,11 +37,8 @@ def test_generate_sp_model(self):
         model_prefix = os.path.join(dir_name, f'spm_user_{uuid.uuid4()}')
         model_file = f'{model_prefix}.model'
         generate_sp_model(data_path, vocab_size=23456, model_prefix=model_prefix)
-
-        sp_user = spm.SentencePieceProcessor()
-        sp_user.Load(model_file)
-
-        self.assertEqual(len(sp_user), 23456)
+        sp_model = load_sp_model(model_file)
+        self.assertEqual(sp_model.GetPieceSize(), 23456)

     def test_sentencepiece_numericalizer(self):
         test_sample = 'SentencePiece is an unsupervised text tokenizer and detokenizer'
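The test change above removes the module-level `import sentencepiece` in favor of torchtext's own `load_sp_model` helper, so the test file no longer fails to import when sentencepiece is absent. The same decoupling idea — defer an optional dependency until it is actually needed — can be sketched with a small helper (`optional_import` is a hypothetical name, not torchtext API):

```python
import importlib


def optional_import(name):
    """Import a module only when it is actually requested, returning
    None instead of raising if the package is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# json ships with the standard library, so this succeeds...
assert optional_import("json") is not None
# ...while a missing optional dependency degrades gracefully to None.
assert optional_import("definitely_not_installed_pkg") is None
```

Callers can then check for `None` and raise a targeted "please install X" error only on the code paths that truly need the package.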

0 commit comments