
Commit 2c4ce95

huydhn and NicolasHug authored
Cherry pick #2241 and #2242 (#2244)
* Remove torchdata dependency from package and from CI (#2241)
* Fix torchdata import error (#2242)
* Remove stuff
* stuff
* lint

---------

Co-authored-by: Nicolas Hug <[email protected]>
1 parent 57ed43c commit 2c4ce95

55 files changed, +72 -191 lines changed

.circleci/unittest/linux/scripts/install.sh

-5
@@ -1,7 +1,6 @@
 #!/usr/bin/env bash

 unset PYTORCH_VERSION
-unset TORCHDATA_VERSION
 # For unittest, nightly PyTorch is used as the following section,
 # so no need to set PYTORCH_VERSION.
 # In fact, keeping PYTORCH_VERSION forces us to hardcode PyTorch version in config.
@@ -30,10 +29,6 @@ printf "* Installing PyTorch\n"
 )


-printf "Installing torchdata nightly with portalocker\n"
-pip install "portalocker>=2.0.0"
-pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu
-
 printf "* Installing torchtext\n"
 python setup.py develop

.circleci/unittest/windows/scripts/install.sh

-5
@@ -1,7 +1,6 @@
 #!/usr/bin/env bash

 unset PYTORCH_VERSION
-unset TORCHDATA_VERSION
 # For unittest, nightly PyTorch is used as the following section,
 # so no need to set PYTORCH_VERSION.
 # In fact, keeping PYTORCH_VERSION forces us to hardcode PyTorch version in config.
@@ -19,10 +18,6 @@ conda activate ./env
 printf "* Installing PyTorch\n"
 conda install -y -c "pytorch-${UPLOAD_CHANNEL}" ${CONDA_CHANNEL_FLAGS} pytorch cpuonly

-printf "* Installing torchdata nightly with portalocker\n"
-pip install "portalocker>=2.0.0"
-pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu
-
 printf "* Installing pywin32_postinstall script\n"
 curl --output pywin32_postinstall.py https://raw.githubusercontent.com/mhammond/pywin32/main/pywin32_postinstall.py
 python pywin32_postinstall.py -install

.github/workflows/build-conda-linux.yml

+1-1
@@ -29,7 +29,7 @@ jobs:
       matrix:
         include:
           - repository: pytorch/text
-            pre-script: packaging/install_torchdata.sh
+            pre-script: ""
             post-script: ""
             conda-package-directory: packaging/torchtext
             smoke-test-script: test/smoke_tests/smoke_tests.py

.github/workflows/build-conda-m1.yml

+1-1
@@ -28,7 +28,7 @@ jobs:
       matrix:
         include:
           - repository: pytorch/text
-            pre-script: packaging/install_torchdata.sh
+            pre-script: ""
             post-script: ""
             conda-package-directory: packaging/torchtext
             smoke-test-script: test/smoke_tests/smoke_tests.py

.github/workflows/build-conda-windows.yml

+1-1
@@ -29,7 +29,7 @@ jobs:
       matrix:
         include:
           - repository: pytorch/text
-            pre-script: packaging/install_torchdata.sh
+            pre-script: ""
             post-script: ""
             conda-package-directory: packaging/torchtext
             smoke-test-script: test/smoke_tests/smoke_tests.py

.github/workflows/build-wheels-linux.yml

+1-1
@@ -34,7 +34,7 @@ jobs:
       matrix:
         include:
           - repository: pytorch/text
-            pre-script: packaging/install_torchdata.sh
+            pre-script: ""
             post-script: ""
             smoke-test-script: test/smoke_tests/smoke_tests.py
             package-name: torchtext

.github/workflows/build-wheels-m1.yml

+1-1
@@ -32,7 +32,7 @@ jobs:
       matrix:
         include:
           - repository: pytorch/text
-            pre-script: packaging/install_torchdata.sh
+            pre-script: ""
             post-script: ""
             package-name: torchtext
             smoke-test-script: test/smoke_tests/smoke_tests.py

.github/workflows/build-wheels-windows.yml

+1-1
@@ -33,7 +33,7 @@ jobs:
       matrix:
         include:
           - repository: pytorch/text
-            pre-script: packaging/install_torchdata.sh
+            pre-script: ""
             env-script: packaging/vc_env_helper.bat
             post-script: ""
             smoke-test-script: test/smoke_tests/smoke_tests.py

.github/workflows/codeql.yml

-1
@@ -31,7 +31,6 @@ jobs:
     - name: Install Torch
       run: |
         python -m pip install cmake
-        python -m pip install --quiet --pre torch torchdata -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
         sudo ln -s /usr/bin/ninja /usr/bin/ninja-build

     - name: Build TorchText

.github/workflows/integration-test.yml

+1-3
@@ -39,15 +39,13 @@ jobs:
 python -m spacy download en_core_web_sm
 printf "* Downloading SpaCy German models\n"
 python -m spacy download de_core_news_sm
-# Install PyTorch, Torchvision, and TorchData
+# Install PyTorch, Torchvision
 set -ex
 conda install \
   --yes \
   -c "pytorch-${CHANNEL}" \
   -c nvidia "pytorch-${CHANNEL}"::pytorch[build="*${VERSION}*"] \
   "${CUDATOOLKIT}"
-printf "Installing torchdata nightly\n"
-python3 -m pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu
 python3 setup.py develop
 # Install integration test dependencies
 python3 -m pip --quiet install parameterized

.github/workflows/test-linux-cpu.yml

+1-4
@@ -50,16 +50,13 @@ jobs:
 printf "* Downloading SpaCy German models\n"
 python -m spacy download de_core_news_sm

-# Install PyTorch, Torchvision, and TorchData
+# Install PyTorch, Torchvision
 set -ex
 conda install \
   --yes \
   -c "pytorch-${CHANNEL}" \
   -c nvidia "pytorch-${CHANNEL}"::pytorch[build="*${VERSION}*"] \
   "${CUDATOOLKIT}"
-printf "Installing torchdata nightly\n"
-python3 -m pip install "portalocker>=2.0.0"
-python3 -m pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu
 python3 setup.py develop
 python3 -m pip install parameterized

.github/workflows/test-linux-gpu.yml

+1-4
@@ -54,17 +54,14 @@ jobs:
 printf "* Downloading SpaCy German models\n"
 python -m spacy download de_core_news_sm

-# Install PyTorch and TorchData
+# Install PyTorch
 set -ex
 conda install \
   --yes \
   --quiet \
   -c "pytorch-${CHANNEL}" \
   -c nvidia "pytorch-${CHANNEL}"::pytorch[build="*${VERSION}*"] \
   "${CUDATOOLKIT}"
-printf "Installing torchdata nightly\n"
-python3 -m pip install "portalocker>=2.0.0"
-python3 -m pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu --quiet
 python3 setup.py develop
 python3 -m pip install parameterized --quiet

.github/workflows/test-macos-cpu.yml

+1-4
@@ -55,7 +55,7 @@ jobs:
 printf "* Downloading SpaCy German models\n"
 python -m spacy download de_core_news_sm

-# Install PyTorch, Torchvision, and TorchData
+# Install PyTorch, Torchvision
 set -ex
 conda install \
   --yes \
@@ -64,9 +64,6 @@ jobs:
   "${MKL_CONSTRAINT}" \
   pytorch \
   "${CUDATOOLKIT}"
-printf "Installing torchdata nightly\n"
-python3 -m pip install "portalocker>=2.0.0"
-python3 -m pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu
 python3 setup.py develop
 python3 -m pip install parameterized

.github/workflows/test-windows-cpu.yml

+1-4
@@ -51,15 +51,12 @@ jobs:
 printf "* Downloading SpaCy German models\n"
 python -m spacy download de_core_news_sm

-# Install PyTorch, Torchvision, and TorchData
+# Install PyTorch, Torchvision
 conda install \
   --yes \
   -c "pytorch-${CHANNEL}" \
   pytorch \
   cpuonly
-printf "Installing torchdata nightly\n"
-python -m pip install "portalocker>=2.0.0"
-python -m pip install --pre torchdata --index-url https://download.pytorch.org/whl/nightly/cpu

 printf "* Installing pywin32_postinstall script\n"
 curl --output pywin32_postinstall.py https://raw.githubusercontent.com/mhammond/pywin32/main/pywin32_postinstall.py

.github/workflows/validate-binaries.yml

+5
@@ -43,6 +43,11 @@ on:
         default: ""
         required: false
         type: string
+      pytorch_version:
+        description: "PyTorch version to validate (ie. 2.0, 2.2.2, etc.) - optional"
+        default: ""
+        required: false
+        type: string
 jobs:
   validate-binaries:
     uses: pytorch/test-infra/.github/workflows/validate-domain-library.yml@release/2.2

README.rst

+3
@@ -12,6 +12,9 @@
 torchtext
 +++++++++

+CAUTION: As of September 2023 we have paused active development of TorchText because our focus has shifted away from building out this library offering.
+We will continue to release new versions but do not anticipate any new feature development as we figure out future investments in this space.
+
 This repository consists of:

 * `torchtext.datasets <https://github.com/pytorch/text/tree/main/torchtext/datasets>`_: The raw text iterators for common NLP datasets

packaging/install_torchdata.sh

-40
This file was deleted.

packaging/pkg_helpers.bash

-12
@@ -190,14 +190,6 @@ setup_pip_pytorch_version() {
       -f https://download.pytorch.org/whl/torch_stable.html \
       -f "https://download.pytorch.org/whl/${UPLOAD_CHANNEL}/torch_${UPLOAD_CHANNEL}.html"
   fi
-  if [[ -z "$TORCHDATA_VERSION" ]]; then
-    pip_install --pre torchdata -f "https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html"
-    export TORCHDATA_VERSION="$(pip show torchdata | grep ^Version: | sed 's/Version: *//' | sed 's/+.\+//')"
-  else
-    pip_install "torchdata==$TORCHDATA_VERSION" \
-      -f https://download.pytorch.org/whl/torch_stable.html \
-      -f "https://download.pytorch.org/whl/${UPLOAD_CHANNEL}/torch_${UPLOAD_CHANNEL}.html"
-  fi
 }

 # Fill PYTORCH_VERSION with the latest conda nightly version, and
@@ -232,10 +224,6 @@ setup_conda_pytorch_constraint() {
       export CONDA_EXTRA_BUILD_CONSTRAINT="- mkl<=2021.2.0"
     fi
   fi
-  if [[ -z "$TORCHDATA_VERSION" ]]; then
-    export TORCHDATA_VERSION="$(conda search --json 'torchdata[channel=pytorch-nightly]' | ${PYTHON} -c "import sys, json, re; print(re.sub(r'\\+.*$', '', json.load(sys.stdin)['torchdata'][-1]['version']))")"
-  fi
-  export CONDA_TORCHDATA_CONSTRAINT="- torchdata==$TORCHDATA_VERSION"
 }

 # Translate CUDA_VERSION into CUDA_CUDATOOLKIT_CONSTRAINT
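
Both removed TORCHDATA_VERSION branches stripped everything from the first "+" onward in the detected version string before exporting it. A minimal Python sketch of that normalization follows, using the same regular expression as the removed conda branch. The helper name strip_local_suffix and the sample version strings are hypothetical and do not appear in the repository.

```python
import re


def strip_local_suffix(version: str) -> str:
    """Drop a local version suffix such as "+cpu" (hypothetical helper)."""
    # Same pattern as the removed conda branch: delete "+" and everything after it.
    return re.sub(r"\+.*$", "", version)


# Illustrative, made-up version strings.
print(strip_local_suffix("0.8.0.dev20231201+cpu"))
print(strip_local_suffix("0.7.1"))
```

The stripped value is what the removed code exported as TORCHDATA_VERSION and interpolated into the conda constraint "- torchdata==$TORCHDATA_VERSION".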

packaging/torchtext/meta.yaml

-1
Original file line numberDiff line numberDiff line change
@@ -24,7 +24,6 @@ requirements:
2424
- requests
2525
- tqdm
2626
{{ environ.get('CONDA_PYTORCH_CONSTRAINT') }}
27-
{{ environ.get('CONDA_TORCHDATA_CONSTRAINT') }}
2827

2928
build:
3029
string: py{{py}}

pytest.ini

+1
@@ -1,4 +1,5 @@
 [pytest]
+addopts = --ignore-glob=test/torchtext_unittest/datasets/*
 testpaths = test/
 python_paths = ./
 markers =
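
The added addopts line tells pytest to skip collection for paths matching a shell-style glob. As a rough illustration of which files such a pattern covers, the sketch below applies Python's fnmatch to relative paths. This is a simplification: pytest's actual path resolution during collection is not reproduced here, and the test file names are hypothetical.

```python
from fnmatch import fnmatch

# Pattern taken from the addopts line in the diff.
PATTERN = "test/torchtext_unittest/datasets/*"

# Hypothetical file names, used only to illustrate the matching.
candidates = [
    "test/torchtext_unittest/datasets/test_agnews.py",
    "test/torchtext_unittest/test_vocab.py",
]

for path in candidates:
    status = "ignored" if fnmatch(path, PATTERN) else "collected"
    print(f"{path}: {status}")
```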

requirements.txt

-1
@@ -19,7 +19,6 @@ Sphinx
 pytest
 expecttest
 parameterized
-torchdata>0.5

 # Lets pytest find our code by automatically modifying PYTHONPATH
 pytest-pythonpath

setup.py

+1-5
@@ -63,14 +63,10 @@ def _init_submodule():
 print("-- Building version " + VERSION)

 pytorch_package_version = os.getenv("PYTORCH_VERSION")
-torchdata_package_version = os.getenv("TORCHDATA_VERSION")

 pytorch_package_dep = "torch"
 if pytorch_package_version is not None:
     pytorch_package_dep += "==" + pytorch_package_version
-torchdata_package_dep = "torchdata"
-if torchdata_package_version is not None:
-    torchdata_package_dep += "==" + torchdata_package_version


 class clean(distutils.command.clean.clean):
@@ -104,7 +100,7 @@ def run(self):
     description="Text utilities, models, transforms, and datasets for PyTorch.",
     long_description=read("README.rst"),
     license="BSD",
-    install_requires=["tqdm", "requests", pytorch_package_dep, "numpy", torchdata_package_dep],
+    install_requires=["tqdm", "requests", pytorch_package_dep, "numpy"],
     python_requires=">=3.8",
     classifiers=[
         "Programming Language :: Python :: 3.8",
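
After this change, setup.py pins only torch, and only when PYTORCH_VERSION is set in the environment. The sketch below isolates that logic in a standalone function so it can be exercised without running a build. The function name build_install_requires and the env parameter are hypothetical and are not part of setup.py.

```python
import os


def build_install_requires(env=None):
    """Return the install_requires list, pinning torch if PYTORCH_VERSION is set.

    Hypothetical helper: setup.py computes the same value inline at module level.
    """
    env = os.environ if env is None else env
    torch_dep = "torch"
    version = env.get("PYTORCH_VERSION")
    if version is not None:
        torch_dep += "==" + version
    # torchdata is no longer listed as a hard dependency.
    return ["tqdm", "requests", torch_dep, "numpy"]


print(build_install_requires({}))
print(build_install_requires({"PYTORCH_VERSION": "2.2.0"}))
```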

test/smoke_tests/smoke_tests.py

-22
@@ -1,28 +1,6 @@
 """Run smoke tests"""

-import os
-import re
-
-import torchdata
 import torchtext
-import torchtext.version  # noqa: F401
-
-NIGHTLY_ALLOWED_DELTA = 3
-channel = os.getenv("MATRIX_CHANNEL")
-
-
-def validateTorchdataVersion():
-    from datetime import datetime
-
-    date_t_str = re.findall(r"dev\d+", torchdata.__version__)[0]
-    date_t_delta = datetime.now() - datetime.strptime(date_t_str[3:], "%Y%m%d")
-
-    if date_t_delta.days >= NIGHTLY_ALLOWED_DELTA:
-        raise RuntimeError(f"torchdata binary {torchdata.__version__} is more than {NIGHTLY_ALLOWED_DELTA} days old!")
-

-if channel == "nightly":
-    validateTorchdataVersion()

 print("torchtext version is ", torchtext.__version__)
-print("torchdata version is ", torchdata.__version__)
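
The removed validateTorchdataVersion() parsed the devYYYYMMDD date out of the nightly version string and failed when the build was at least NIGHTLY_ALLOWED_DELTA days old. The sketch below restates that check as pure functions that take the version string and the current time as arguments, so the logic can be tried without torchdata installed. The function names and sample versions are hypothetical.

```python
import re
from datetime import datetime

NIGHTLY_ALLOWED_DELTA = 3  # same threshold as the removed smoke test


def nightly_age_days(version: str, now: datetime) -> int:
    """Return the age in whole days of a nightly build (hypothetical helper)."""
    match = re.search(r"dev(\d{8})", version)
    if match is None:
        raise ValueError(f"no devYYYYMMDD date found in {version!r}")
    build_date = datetime.strptime(match.group(1), "%Y%m%d")
    return (now - build_date).days


def check_nightly(version: str, now: datetime) -> None:
    """Raise RuntimeError if the nightly is too old, as the removed check did."""
    if nightly_age_days(version, now) >= NIGHTLY_ALLOWED_DELTA:
        raise RuntimeError(
            f"binary {version} is more than {NIGHTLY_ALLOWED_DELTA} days old!"
        )


# Made-up version string and date, for illustration only.
check_nightly("0.8.0.dev20231201+cpu", datetime(2023, 12, 3))
```

Passing the clock in explicitly is a design choice of this sketch; the removed code called datetime.now() directly.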

torchtext/_download_hooks.py

-1
@@ -4,7 +4,6 @@

 # This is to allow monkey-patching in fbcode
 from torch.hub import load_state_dict_from_url  # noqa
-from torchdata.datapipes.iter import HttpReader, GDriveReader  # noqa F401
 from tqdm import tqdm


torchtext/datasets/ag_news.py

+1-2
@@ -2,8 +2,6 @@
 from functools import partial
 from typing import Union, Tuple

-from torchdata.datapipes.iter import FileOpener, IterableWrapper
-from torchtext._download_hooks import HttpReader
 from torchtext._internal.module_utils import is_module_available
 from torchtext.data.datasets_utils import (
     _wrap_split_argument,
@@ -65,6 +63,7 @@ def AG_NEWS(root: str, split: Union[Tuple[str], str]):
         raise ModuleNotFoundError(
             "Package `torchdata` not found. Please install following instructions at https://github.com/pytorch/data"
         )
+    from torchdata.datapipes.iter import FileOpener, GDriveReader, HttpReader, IterableWrapper  # noqa

     url_dp = IterableWrapper([URL[split]])
     cache_dp = url_dp.on_disk_cache(
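
The dataset modules now import torchdata inside the function body, after the availability check, so importing torchtext no longer fails when torchdata is absent. A minimal sketch of this deferred-import pattern follows. It uses importlib.util.find_spec as a stand-in for torchtext's is_module_available helper, which is an assumption about equivalent behavior, and load_optional is a hypothetical name.

```python
import importlib
import importlib.util


def load_optional(module_name: str, install_hint: str):
    """Import an optional top-level module at call time, not at import time.

    Hypothetical helper; find_spec stands in for torchtext's is_module_available.
    """
    if importlib.util.find_spec(module_name) is None:
        raise ModuleNotFoundError(
            f"Package `{module_name}` not found. {install_hint}"
        )
    return importlib.import_module(module_name)


def dataset_factory():
    # The optional dependency is resolved only when the dataset is requested.
    json_mod = load_optional("json", "It ships with the standard library.")
    return json_mod.dumps({"split": "train"})


print(dataset_factory())
```

With this layout, a missing optional package surfaces as an error only when a caller invokes the function that needs it.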

torchtext/datasets/amazonreviewfull.py

+1-2
@@ -2,8 +2,6 @@
 from functools import partial
 from typing import Union, Tuple

-from torchdata.datapipes.iter import FileOpener, IterableWrapper
-from torchtext._download_hooks import GDriveReader
 from torchtext._internal.module_utils import is_module_available
 from torchtext.data.datasets_utils import (
     _wrap_split_argument,
@@ -79,6 +77,7 @@ def AmazonReviewFull(root: str, split: Union[Tuple[str], str]):
         raise ModuleNotFoundError(
             "Package `torchdata` not found. Please install following instructions at https://github.com/pytorch/data"
         )
+    from torchdata.datapipes.iter import FileOpener, GDriveReader, HttpReader, IterableWrapper  # noqa

     url_dp = IterableWrapper([URL])
     cache_compressed_dp = url_dp.on_disk_cache(
