5531 | 5531 | </span> |
5532 | 5532 | </a> |
5533 | 5533 |
5534 | | -</li> |
5535 | | - |
5536 | | - <li class="md-nav__item"> |
5537 | | - <a href="#apply-runtimeclassname-to-abcdesktop-config-release-43" class="md-nav__link"> |
5538 | | - <span class="md-ellipsis"> |
5539 | | - Apply runtimeClassName to abcdesktop config (release >= 4.3 ) |
5540 | | - </span> |
5541 | | - </a> |
5542 | | - |
5543 | 5534 | </li> |
5544 | 5535 |
5545 | 5536 | <li class="md-nav__item"> |
@@ -5752,6 +5743,9 @@ <h3 id="create-an-ephemeral-container-inside-simple-pod_1">Create an ephemeral c |
5752 | 5743 | - name: NVIDIA_DRIVER_CAPABILITIES |
5753 | 5744 | value: all |
5754 | 5745 | </code></pre></div> |
| 5746 | +<blockquote> |
| 5747 | +<p><code>NVIDIA_VISIBLE_DEVICES</code> is set to the GPU UUID.</p> |
| 5748 | +</blockquote> |
5755 | 5749 | <p>Run a debug ephemeral container in <code>nvidia-pod</code></p> |
5756 | 5750 | <div class="highlight"><pre><span></span><code>kubectl debug -it nvidia-pod --image=ubuntu --target=cuda-container --profile=general --custom=custom-profile-nvidia-gpu.yaml -- nvidia-smi -L |
5757 | 5751 | </code></pre></div> |
@@ -5802,44 +5796,6 @@ <h3 id="delete-nvidia-pod">Delete <code>nvidia-pod</code></h3> |
5802 | 5796 | </code></pre></div> |
5803 | 5797 | <h2 id="conclusion">Conclusion</h2> |
5804 | 5798 | <p>Setting <code>runtimeClassName: nvidia</code> in the pod manifest allows ephemeral containers to share the pod's GPU.</p> |
5805 | | -<h2 id="apply-runtimeclassname-to-abcdesktop-config-release-43">Apply <code>runtimeClassName</code> to abcdesktop config (release >= 4.3 )</h2> |
5806 | | -<p>Get the <code>od.config</code> file</p> |
5807 | | -<p>If you don't already have the config file <code>od.config</code>, run the command line </p> |
5808 | | -<div class="highlight"><pre><span></span><code>kubectl -n abcdesktop get configmap abcdesktop-config -o jsonpath='{.data.od\.config}' > od.config |
5809 | | -</code></pre></div> |
5810 | | -<ul> |
5811 | | -<li>Edit <code>od.config</code> and update the dictionary <code>desktop.pod</code> to add <code>'runtimeClassName':'nvidia'</code> in <code>spec</code> and save your od.config file.</li> |
5812 | | -</ul> |
5813 | | -<div class="highlight"><pre><span></span><code>desktop.pod : { |
5814 | | - # default spec for all containers |
5815 | | - # can be overwritten on dedicated container spec |
5816 | | - # value inside mustrache like {{ uidNumber }} is replaced by context run value |
5817 | | - # for example {{ uidNumber }} is the uid number define in ldap server |
5818 | | - 'spec' : { |
5819 | | - 'shareProcessNamespace': False, |
5820 | | - 'securityContext': { |
5821 | | - 'supplementalGroups': [ '{{ supplementalGroups }}' ], |
5822 | | - 'runAsUser': '{{ uidNumber }}', |
5823 | | - 'runAsGroup': '{{ gidNumber }}' |
5824 | | - }, |
5825 | | - 'tolerations': [], |
5826 | | - 'runtimeClassName': 'nvidia' |
5827 | | - }, |
5828 | | - ... |
5829 | | -</code></pre></div> |
5830 | | -<ul> |
5831 | | -<li>Update the configmap <code>abcdesktop-config</code></li> |
5832 | | -</ul> |
5833 | | -<div class="highlight"><pre><span></span><code>kubectl create -n abcdesktop configmap abcdesktop-config --from-file=od.config -o yaml --dry-run=client | kubectl replace -n abcdesktop -f - |
5834 | | -</code></pre></div> |
5835 | | -<ul> |
5836 | | -<li>Restart deployment <code>pyos-od</code></li> |
5837 | | -</ul> |
5838 | | -<div class="highlight"><pre><span></span><code>kubectl rollout restart deployment pyos-od -n abcdesktop |
5839 | | -</code></pre></div> |
5840 | | -<ul> |
5841 | | -<li>Create a new desktop pod to check the <code>runtimeClassName</code></li> |
5842 | | -</ul> |
5843 | 5799 | <h2 id="links">Links</h2> |
5844 | 5800 | <ul> |
5845 | 5801 | <li>nvidia gpu-operator/23.6.2</li> |
|