Commit a7e46d2
switch GPU workers to generic-worker & dockerize GPU tasks (#700)
* switch GPU workers to d2g images
* Switch GPU tasks to run within docker
The new image we're upgrading GPU workers to uses Ubuntu 24.04, which makes it incompatible with various parts of the pipeline (mostly due to Python package pinning). As it turns out, the easiest way to fix this is to dockerize the GPU tasks.
We need slight updates to GPU task payloads to accommodate this.
This will fix #391.
* pull in cuda-toolkit in GPU tasks to make necessary libraries available
Now that we're running inside a docker image we don't have these available on the filesystem already. I explored the idea of installing them into the Docker image, but it's quite impractical. The host image is Ubuntu 24.04, and the containers Ubuntu 22.04. We need to have a matching toolkit version, and we require version 12 at this point, which isn't available on 22.04.
* enable cache for ~/.task-cache/pip
Without this we end up with these files being inaccessible in subsequent tasks.
* fix: add 'requests' to ctranslate2 requirements
This has always been needed, but on the previous image it happened to be available on the host system.
* update interactive task documentation
All tasks now use the docker-worker format; no need to distinguish between GPU and non-GPU tasks.
Also add a bit more detail about working in the container.
* drop now-unused cuda-toolkit-11 toolchain
* feat: use volume mount for artifacts
This ensures artifacts will be uploaded if a spot termination happens.

1 parent d97a199 commit a7e46d2
File tree
49 files changed: +456 -232 lines changed
- docs/training
- pipeline
  - bicleaner
  - translate/requirements
- taskcluster
  - docker
    - base
    - inference
    - test
    - toolchain-build
    - train
  - kinds
    - alignments-backtranslated
    - alignments-original
    - alignments-student
    - analyze-corpus
    - analyze-mono
    - bicleaner-model
    - bicleaner
    - cefilter
    - clean-corpus
    - clean-mono
    - collect-corpus
    - collect-mono-src
    - collect-mono-trg
    - dataset
    - evaluate-quantized
    - evaluate-teacher-ensemble
    - evaluate
    - export
    - extract-best
    - fetch
    - finetune-student
    - merge-corpus
    - merge-devset
    - merge-mono
    - merge-translated
    - quantize
    - score
    - shortlist
    - split-corpus
    - split-mono-src
    - split-mono-trg
    - toolchain
    - train-backwards
    - train-student
    - train-teacher
    - train-vocab
    - translate-corpus
    - translate-mono-src
    - translate-mono-trg