
Commit d817cc2

Merge pull request #99 from libffcv/main
Update no_jit_assert branch with bug fixes
2 parents eb06acf + 1d94d23 commit d817cc2

File tree

7 files changed (+51, -10 lines)


docker/Dockerfile

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+FROM pytorch/pytorch:latest
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+        software-properties-common \
+        build-essential \
+        curl \
+        git \
+        ffmpeg
+
+RUN conda create -n ffcv python=3.9 \
+        cupy \
+        pkg-config \
+        compilers \
+        libjpeg-turbo \
+        opencv \
+        pytorch \
+        torchvision \
+        cudatoolkit=11.3 \
+        numba -c pytorch -c conda-forge
+
+RUN echo "source activate" >> ~/.bashrc
+RUN echo "conda activate ffcv" >> ~/.bashrc
+
+RUN git clone https://github.com/libffcv/ffcv.git
+
+RUN conda run -n ffcv pip install ffcv
+
+# To test:
+# 1- build the Dockerfile (e.g. docker build -t ffcv .)
+# 2- login to the docker container (e.g. docker run -it --gpus all ffcv bash)
+# 3- cd ffcv/examples/cifar
+# 4- bash train_cifar.sh
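As a quick sanity check for step 2 above, the following is a minimal smoke-test sketch (not part of the commit); it assumes you are inside the running container with the ``ffcv`` conda environment active.

    # Hypothetical smoke test: run inside the container after `conda activate ffcv`.
    # It only checks that the packages the Dockerfile installs import cleanly
    # and that PyTorch can see the GPU passed in with `--gpus all`.
    import ffcv    # installed via `conda run -n ffcv pip install ffcv`
    import numba   # pulled in by the `conda create` step
    import torch   # from the pytorch/pytorch base image

    print("ffcv loaded from:", ffcv.__file__)
    print("numba version:", numba.__version__)
    print("CUDA available:", torch.cuda.is_available())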

docs/ffcv_examples/cifar10.rst

Lines changed: 2 additions & 2 deletions
@@ -106,8 +106,8 @@ For the model, we use a custom ResNet-9 architecture from `KakaoBrain <https://g

 class Mul(ch.nn.Module):
     def __init__(self, weight):
-        super(Mul, self).__init__()
-        self.weight = weight
+        super(Mul, self).__init__()
+        self.weight = weight
     def forward(self, x): return x * self.weight

 class Flatten(ch.nn.Module):
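For context, here is a self-contained sketch of the two helper modules around this change; the ``Mul`` body matches the corrected lines above, while the ``Flatten`` body and the ``ch`` alias for ``torch`` are assumptions for illustration.

    import torch as ch  # assumed alias, matching the `ch.` prefix in the snippet

    class Mul(ch.nn.Module):
        def __init__(self, weight):
            super(Mul, self).__init__()   # the two lines touched by this commit
            self.weight = weight
        def forward(self, x): return x * self.weight

    class Flatten(ch.nn.Module):
        # Body assumed: flatten all dimensions after the batch dimension.
        def forward(self, x): return x.view(x.size(0), -1)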

docs/index.rst

Lines changed: 2 additions & 0 deletions
@@ -16,6 +16,8 @@ Install ``ffcv``:
     conda activate ffcv
     pip install ffcv

+We also provide a `Dockerfile <https://github.com/libffcv/ffcv/blob/main/docker/Dockerfile>`_ that installs ``ffcv`` in a few steps.
+

 Introduction
 ------------

docs/making_dataloaders.rst

Lines changed: 9 additions & 2 deletions
@@ -49,6 +49,13 @@ takes an ``enum`` provided by :class:`ffcv.loader.OrderOption`:
     # Memory-efficient but not truly random loading
     # Speeds up loading over RANDOM when the whole dataset does not fit in RAM!
     ORDERING = OrderOption.QUASI_RANDOM
+
+.. note::
+    The ``order`` options require different amounts of RAM, so choose one based on how much RAM is available on your machine.
+
+    - ``RANDOM`` requires the most RAM, since it has to cache the entire dataset in order to sample perfectly at random. If the available RAM is not enough, it throws an exception.
+    - ``QUASI_RANDOM`` requires much less RAM than ``RANDOM``, but a bit more than ``SEQUENTIAL``, since it caches only part of the samples. Use it when the entire dataset cannot fit in RAM.
+    - ``SEQUENTIAL`` requires the least RAM. It only keeps a few samples loaded ahead of time for the upcoming training iterations.

 Pipelines
 '''''''''
@@ -165,12 +172,12 @@ Other options

 You can also specify the following additional options when constructing an :class:`ffcv.loader.Loader`:

-- ``os_cache``: If True, the entire dataset is cached
+- ``os_cache``: If ``True``, the OS automatically determines whether the dataset is held in memory, depending on available RAM. If ``False``, FFCV manages the caching, and the amount of RAM needed depends on the ``order`` option.
 - ``distributed``: For training on :ref:`multiple GPUs<Scenario: Multi-GPU training (1 model, multiple GPUs)>`
 - ``seed``: Specify the random seed for batch ordering
 - ``indices``: Provide indices to load a subset of the dataset
 - ``custom_fields``: For specifying decoders for fields with custom encoders
-- ``drop_last``: If True, drops the last non-full batch from each iteration
+- ``drop_last``: If ``True``, drops the last non-full batch from each iteration
 - ``batches_ahead``: Set the number of batches prepared in advance. Increasing it absorbs variation in processing time to make sure the training loop does not stall for too long to process batches. Decreasing it reduces RAM usage.
 - ``recompile``: Recompile every iteration. Useful if you have transforms that change their behavior from epoch to epoch, for instance code that uses the shape as a compile-time param. (But if they just change their memory usage, e.g., the resolution changes, it's not necessary.)

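To make these options concrete, here is a minimal sketch of constructing a loader with an explicit ``order`` and ``os_cache`` setting; the dataset path, batch size, and worker count are placeholder values, and the ``pipelines`` argument is omitted.

    from ffcv.loader import Loader, OrderOption

    loader = Loader(
        '/path/to/dataset.beton',        # placeholder path to a written dataset
        batch_size=256,                  # placeholder
        num_workers=8,                   # placeholder
        order=OrderOption.QUASI_RANDOM,  # good shuffling without caching the whole dataset
        os_cache=False,                  # let FFCV manage caching; RAM use then follows `order`
        drop_last=True,
    )

    for batch in loader:
        ...  # training step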

docs/parameter_tuning.rst

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ Scenario: Large scale datasets
 If your dataset is too large to be cached on the machine, we recommend:

 - Use ``os_cache=False``. Since the data can't be cached, FFCV will have to read it over and over. Having FFCV take over caching from the operating system is beneficial, as it knows in advance which samples will be needed in the future and can load them ahead of time.
-- For ``order``, we recommend using the ``QUASI_RANDOM`` traversal order if you need randomness but perfect uniform sampling isn't mission critical. This will optimize the order to minimize the reads on the underlying storage while maintaining very good randomness properties. If you have experience with the ``shuffle()`` function of ``webdataset`` and the quality of the randomness wasn't sufficient, we still suggest you give ``QUASI_RANDOM`` a try as it should be significantly better.
+- For ``order``, we recommend using the ``QUASI_RANDOM`` traversal order if you need randomness but perfect uniform sampling isn't mission critical. This will optimize the order to minimize the reads on the underlying storage while maintaining very good randomness properties. If you have experience with the ``shuffle()`` function of ``webdataset`` and the quality of the randomness wasn't sufficient, we still suggest you give ``QUASI_RANDOM`` a try as it should be significantly better. Using ``RANDOM`` is infeasible in this situation because it needs to load the entire dataset in RAM, causing an out-of-memory exception.


 Scenario: Multi-GPU training (1 model, multiple GPUs)

docs/quickstart.rst

Lines changed: 4 additions & 4 deletions
@@ -18,15 +18,15 @@ PyTorch datasets and `WebDatasets <https://github.com/webdataset/webdataset>`_):
 # Pass a type for each data field
 writer = DatasetWriter(write_path, {
     # Tune options to optimize dataset size, throughput at train-time
-    'image': RGBImageField({
+    'image': RGBImageField(
         max_resolution=256,
         jpeg_quality=jpeg_quality
-    }),
+    ),
     'label': IntField()
 })

 # Write dataset
-writer.from_indexed_dataset(ds)
+writer.from_indexed_dataset(my_dataset)

 Then replace your old loader with the `ffcv` loader at train time (in PyTorch,
 no other changes required!):
@@ -58,4 +58,4 @@ no other changes required!):
 for epoch in range(epochs):
     ...

-See :ref:`here <Getting started>` for a more detailed guide to deploying `ffcv` for your dataset.
+See :ref:`here <Getting started>` for a more detailed guide to deploying `ffcv` for your dataset.
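Putting the corrected snippet together, here is a hedged end-to-end sketch of the writer call; ``write_path``, ``jpeg_quality``, and ``my_dataset`` are placeholders standing in for whatever the surrounding tutorial defines.

    from ffcv.writer import DatasetWriter
    from ffcv.fields import RGBImageField, IntField

    write_path = '/path/to/output.beton'  # placeholder output path
    jpeg_quality = 90                     # placeholder compression setting
    my_dataset = ...                      # any indexed (map-style) dataset of (image, label) pairs

    # Pass a type for each data field; the field options are keyword arguments, not a dict.
    writer = DatasetWriter(write_path, {
        'image': RGBImageField(
            max_resolution=256,
            jpeg_quality=jpeg_quality,
        ),
        'label': IntField(),
    })

    # Write dataset
    writer.from_indexed_dataset(my_dataset)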

docs/writing_datasets.rst

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ returns an input vector and its corresponding label:
         self.Y = np.randn(N)

     def __getitem__(self, idx):
-        return (self.X[idx], self.Y[idx])
+        return (self.X[idx].astype('float32'), self.Y[idx])

     def __len__(self):
         return len(self.X)
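For completeness, here is a runnable sketch of the toy dataset around the changed line; the class name and the constructor body are assumptions (note ``np.random.randn`` rather than the ``np.randn`` shorthand in the context line above), and the ``float32`` cast is the change this diff introduces.

    import numpy as np

    class LinearRegressionDataset:  # hypothetical name for the docs' toy dataset
        def __init__(self, N, d):
            # Assumed constructor: random inputs X and targets Y.
            self.X = np.random.randn(N, d)
            self.Y = np.random.randn(N)

        def __getitem__(self, idx):
            # Cast the input to float32 so its dtype matches the field declared at write time.
            return (self.X[idx].astype('float32'), self.Y[idx])

        def __len__(self):
            return len(self.X)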
