| 1 | +--- |
| 2 | +title: How to use a custom environment in the VEDA JupyterHub |
| 3 | +--- |
| 4 | + |
| 5 | +The VEDA JupyterHub lists a few pre-configured environments (collections of |
| 6 | +Python or R libraries and/or other software) as **Server Options** intended to |
| 7 | +cover most common geospatial data science needs. These are pre-built Docker |
| 8 | +images ready for users to launch. |
| 9 | + |
| 10 | +If you need a different configuration, there are two options for launching a |
| 11 | +*custom environment*: |
| 12 | + |
| 13 | +- **Use some other pre-built, public Docker image** (see |
| 14 | + [Environment: Other](#environment-other)) |
| 15 | +- **Build your own Docker image** (see |
| 16 | + [Environment: Build your own image](#environment-build-your-own-image)) |
| 17 | + |
| 18 | +Each option is covered in the following sections. |
| 19 | + |
| 20 | +## Environment: Other |
| 21 | + |
| 22 | +If you know of a pre-built, *publicly accessible Docker image* that meets your |
| 23 | +needs, go to the **Server Options** page, open the **Environment** drop-down |
| 24 | +list, choose **Other...**, and enter the public identifier of the image in the |
| 25 | +**Custom image** field. |
| 26 | + |
| 27 | +For example, if a Modified Pangeo Notebook environment were not already one of |
| 28 | +the pre-built environments listed, you could use the standard, publicly |
| 29 | +available Pangeo Notebook image yourself, like so: |
| 30 | + |
| 31 | + |
| 32 | + |
| 33 | +At a minimum, the **Custom image** field must include both a name and a tag in |
| 34 | +the form `NAME:TAG`. There is no assumption that a name without a tag implies |
| 35 | +the `latest` tag, so a tag must be supplied. |
| 36 | + |
| 37 | +When the hostname of the image registry is not specified, it defaults to |
| 38 | +`docker.io`, the canonical hostname of the official Docker registry. Therefore, |
| 39 | +the following custom image identifiers are equivalent: |
| 40 | + |
| 41 | +- `pangeo/pangeo-notebook:2025.01.10` (registry hostname is _implicitly_ `docker.io`) |
| 42 | +- `docker.io/pangeo/pangeo-notebook:2025.01.10` |
| 43 | + |
| 44 | +If you wish to use an image hosted in a registry other than `docker.io`, the |
| 45 | +custom image identifier must also include the registry hostname (in addition to |
| 46 | +a name and a tag). For example, images hosted in the GitHub Container Registry |
| 47 | +use the hostname `ghcr.io`, such as this "Tidyverse based R image with Python": |
| 48 | +`ghcr.io/nmfs-opensci/container-images/py-rocket-base:4.4-3.10` |
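
The identifier rules above (a mandatory tag, and `docker.io` as the default registry) can be sketched in a few lines of Python. This is only an illustration, not the JupyterHub's actual validation logic: the function name `normalize_image` is hypothetical, and the test for whether the first path component is a registry hostname is an assumed heuristic.

```python
def normalize_image(identifier: str, default_registry: str = "docker.io") -> str:
    """Return REGISTRY/NAME:TAG, raising ValueError if the tag is missing."""
    first, slash, rest = identifier.partition("/")
    # Assumed heuristic: the first path component is a registry hostname only
    # if it contains a dot or a port separator, or is exactly "localhost".
    has_registry = bool(slash) and (
        "." in first or ":" in first or first == "localhost"
    )
    if has_registry:
        registry, path = first, rest
    else:
        registry, path = default_registry, identifier
    # Split on the last colon so that dots and dashes in the tag are preserved.
    name, colon, tag = path.rpartition(":")
    if not colon or not name or not tag or "/" in tag:
        raise ValueError(f"expected NAME:TAG, got {identifier!r}")
    return f"{registry}/{name}:{tag}"


if __name__ == "__main__":
    # Both forms from the list above normalize to the same identifier.
    print(normalize_image("pangeo/pangeo-notebook:2025.01.10"))
    print(normalize_image("docker.io/pangeo/pangeo-notebook:2025.01.10"))
```

Under these assumptions, an identifier without a tag (such as `pangeo/pangeo-notebook`) raises a `ValueError` rather than silently falling back to `latest`, mirroring the rule stated above.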
| 49 | + |
| 50 | +::: {.callout-warning} |
| 51 | +If you specify `latest` for the image tag, keep in mind that the image tagged |
| 52 | +`latest` may change over time, because it typically refers to the most recently |
| 53 | +published version. To "pin" the image, use a specific tag instead, as shown in |
| 54 | +the examples above, so that you launch an environment from the same image |
| 55 | +version each time. |
| 56 | +::: |
| 57 | + |
| 58 | +Once you have specified your custom image, choose your desired **Resource |
| 59 | +Allocation** and press the **Start** button at the bottom of the form to launch |
| 60 | +your environment. |
| 61 | + |
| 62 | +## Environment: Build your own image |
| 63 | + |
| 64 | +When you wish to use a custom environment, but are not aware of a publicly |
| 65 | +available Docker image that suits your needs, the **Environment** option **Build |
| 66 | +your own image** provides a relatively easy and convenient means for building |
| 67 | +your desired image, without requiring knowledge of Docker. |
| 68 | + |
| 69 | +This option leverages the |
| 70 | +[repo2docker](https://repo2docker.readthedocs.io/en/stable/index.html) tool to |
| 71 | +handle the grunt work of building a custom Docker image for you, by allowing you |
| 72 | +to **fully describe your desired image via files in a public code repository**. |
| 73 | + |
| 74 | +::: {.callout-note} |
| 75 | +Currently, the VEDA JupyterHub supports only **GitHub** as a repository provider |
| 76 | +for use with repo2docker. |
| 77 | +::: |
| 78 | + |
| 79 | +### Describing Your Image |
| 80 | + |
| 81 | +The beauty of **repo2docker** is that it will use the contents of a public code |
| 82 | +repository (repo) to build a Docker image, doing all of the heavy lifting for |
| 83 | +you. You do not need to know how to write a `Dockerfile` because it generates |
| 84 | +one for you. Hence the name *repo2docker*. |
| 85 | + |
| 86 | +In brief, repo2docker allows you to describe the following within a public |
| 87 | +GitHub repository (for full details, see [Configuring your repository]): |
| 88 | + |
| 89 | +- A **list of packages and runtimes** to install (**in addition to** a |
| 90 | + [base set of packages]), specified via various files, such as the following |
| 91 | + (among others): |
| 92 | + - `environment.yml`: for any kind of conda package (and/or `pip` package), |
| 93 | + including specific versions of programming languages |
| 94 | + - `install.R`: for R packages (if not using `environment.yml`) |
| 95 | + - `requirements.txt`: for `pip` packages (if not using `environment.yml`) |
| 96 | + - `runtime.txt`: for specifying versions of runtimes (such as Python or R) |
| 97 | + when other files (such as `install.R` and `requirements.txt`) do not support |
| 98 | + doing so |
| 99 | + - `apt.txt`: for specifying Ubuntu packages installed via `apt` |
| 100 | +- *Optionally*, a **post-build script** to run after all packages and runtimes |
| 101 | + are installed (this must be named [postBuild]) |
| 102 | +- *Optionally*, a **pre-session script** to run each time you launch a new |
| 103 | + environment with your image (this must be named [start]) |
| 104 | + |
| 105 | +Such files from your repository are used to construct a `Dockerfile` on your |
| 106 | +behalf, so you don't have to write one yourself unless you really want to. If |
| 107 | +your repo contains a `Dockerfile`, repo2docker will ignore all other files in |
| 108 | +your repo that it would otherwise use to generate a `Dockerfile`, and simply use |
| 109 | +the `Dockerfile` from your repo directly. |
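
The precedence rule described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions, not repo2docker's actual detection code: the function name `files_used_for_build` is hypothetical, only the file names mentioned in this section are considered, and files are assumed to sit at the top level of the repository.

```python
# File names taken from the list above; repo2docker recognizes others as well.
CONFIG_FILES = (
    "environment.yml",
    "install.R",
    "requirements.txt",
    "runtime.txt",
    "apt.txt",
    "postBuild",
    "start",
)


def files_used_for_build(repo_files):
    """Return the configuration files that would drive the image build."""
    present = set(repo_files)
    # A Dockerfile takes precedence: all other configuration files are ignored.
    if "Dockerfile" in present:
        return ["Dockerfile"]
    return [name for name in CONFIG_FILES if name in present]


if __name__ == "__main__":
    print(files_used_for_build(["README.md", "environment.yml", "postBuild"]))
    print(files_used_for_build(["Dockerfile", "environment.yml"]))
```

This kind of check can be a quick reminder that adding a `Dockerfile` to a repository changes how every other configuration file is treated.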
| 110 | + |
| 111 | +For complete details on what repo2docker supports, see the following: |
| 112 | + |
| 113 | +- [Configuring your repository] |
| 114 | +- [Where to put configuration files] |
| 115 | +- [Architecture] |
| 116 | +- [Frequently Asked Questions (FAQ)] |
| 117 | +- [Example repositories] |
| 118 | + |
| 119 | +Once you have described your image in your repository, you are ready to build |
| 120 | +it, as described in the next section. |
| 121 | + |
| 122 | +### Building Your Image |
| 123 | + |
| 124 | +When you choose **Build your own image** from the **Environment** list, you must |
| 125 | +specify values for two other fields, **Repository** and **Git Ref**, which |
| 126 | +identify the repository to use and the commit within that repository to build |
| 127 | +from (valid values for these fields are explained below the screenshot): |
| 128 | + |
| 129 | + |
| 130 | + |
| 131 | +In the **Repository** field, you must specify a public GitHub repository in |
| 132 | +either of the following forms: |
| 133 | + |
| 134 | +- **Full URL** (e.g., <https://github.com/binder-examples/conda>) |
| 135 | +- **Namespace/name pair** (e.g., `binder-examples/conda`), where a namespace |
| 136 | + is either a GitHub *username* or *organization* (`binder-examples` in this |
| 137 | + example). |
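
As a rough illustration of the two accepted forms, the following Python sketch reduces either one to a `namespace/name` pair. The function name `normalize_repository` is hypothetical, and the checks are assumptions based only on the description above, not the form's real validation.

```python
from urllib.parse import urlparse


def normalize_repository(value: str) -> str:
    """Reduce a full GitHub URL or a namespace/name pair to namespace/name."""
    value = value.strip()
    if "://" in value:
        parsed = urlparse(value)
        # Only GitHub is supported as a repository provider.
        if parsed.netloc.lower() != "github.com":
            raise ValueError(f"not a GitHub URL: {value!r}")
        path = parsed.path
    else:
        path = value
    # Drop empty segments produced by leading or trailing slashes.
    parts = [part for part in path.split("/") if part]
    if len(parts) != 2:
        raise ValueError(f"expected namespace/name, got {value!r}")
    namespace, name = parts
    return f"{namespace}/{name}"


if __name__ == "__main__":
    print(normalize_repository("https://github.com/binder-examples/conda"))
    print(normalize_repository("binder-examples/conda"))
```

Both calls in the usage example yield the same pair, which reflects the point that the two forms identify the same repository.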
| 138 | + |
| 139 | +In the **Git Ref** field, if you want repo2docker to use the most recent changes |
| 140 | +(latest commit) on the default branch of the repository (typically `main`), then |
| 141 | +use the default value **HEAD** to indicate the latest commit. Alternatively, |
| 142 | +you may specify a **branch name**, a **tag**, or a specific **commit |
| 143 | +identifier** to indicate a different commit for repo2docker to use. |
| 144 | + |
| 145 | +::: {.callout-note} |
| 146 | +A **Git Ref** such as **HEAD** or a **branch name** may reference different |
| 147 | +commits over time because it represents a *logical* commit, not a *specific* |
| 148 | +commit: **HEAD** resolves to the *latest* commit on the default branch, and a |
| 149 | +**branch name** resolves to the *latest* commit on that branch (whichever |
| 150 | +*specific* commit that happens to be at the current moment). |
| 151 | + |
| 152 | +This means that each time you choose to build your image using such a *logical* |
| 153 | +value for **Git Ref**, the system will rebuild your image if the logical |
| 154 | +reference points to a different commit than it did the last time you built your |
| 155 | +image. This is useful for making a series of alternating commits and builds |
| 156 | +during "development" of your image. |
| 157 | +::: |
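
To make the distinction between *logical* and *specific* references concrete, here is a small Python heuristic that treats a value as pinned only when it looks like a full 40-character commit hash. The function name `is_pinned_ref` is hypothetical, and the rule is an assumption for illustration: abbreviated hashes are not recognized, and tags are reported as not pinned because a name alone cannot distinguish a tag from a branch.

```python
import re

# A full Git commit hash (SHA-1) is 40 hexadecimal characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_pinned_ref(git_ref: str) -> bool:
    """Return True if git_ref looks like a full commit hash."""
    return _FULL_SHA.fullmatch(git_ref.strip().lower()) is not None


if __name__ == "__main__":
    for ref in ("HEAD", "main", "0123456789abcdef0123456789abcdef01234567"):
        kind = "specific commit" if is_pinned_ref(ref) else "logical reference"
        print(f"{ref}: {kind}")
```

A value reported as a logical reference may lead to a rebuild on a later launch, as described in the note above, whereas a full commit hash always identifies the same commit.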
| 158 | + |
| 159 | +Once you've specified both **Repository** and **Git Ref**, click the **Build |
| 160 | +image** button to trigger repo2docker to build your image (which may take |
| 161 | +several minutes), and you should see log messages appear below the **Build |
| 162 | +image** button, similar to the following: |
| 163 | + |
| 164 | + |
| 165 | + |
| 166 | +Once your image is built, you should see something similar to this: |
| 167 | + |
| 168 | + |
| 169 | + |
| 170 | +At this point, your image is ready for use. As with all other environment |
| 171 | +options, select your desired **Resource Allocation** and click the **Start** |
| 172 | +button to launch a Docker container using your custom image. |
| 173 | + |
| 174 | +::: {.callout-note} |
| 175 | +When your server is ready, the conda environment named "notebook" will be |
| 176 | +activated for you. Even if your repository contains an `environment.yml` file |
| 177 | +with a `name` entry, the name specified within your file will be ignored. While |
| 178 | +all dependencies in your file will be installed, the name of the environment |
| 179 | +will always be "notebook". |
| 180 | +::: |
| 181 | + |
| 182 | +[base set of packages]: |
| 183 | + https://github.com/jupyterhub/repo2docker/blob/HEAD/repo2docker/buildpacks/conda/environment.yml |
| 184 | +[postBuild]: |
| 185 | + https://repo2docker.readthedocs.io/en/stable/config_files.html#postbuild-run-code-after-installing-the-environment |
| 186 | +[start]: |
| 187 | + https://repo2docker.readthedocs.io/en/stable/config_files.html#start-run-code-before-the-user-sessions-starts |
| 188 | +[Configuring your repository]: |
| 189 | + https://repo2docker.readthedocs.io/en/stable/configuration/index.html |
| 190 | +[Architecture]: |
| 191 | + https://repo2docker.readthedocs.io/en/stable/architecture.html |
| 192 | +[Frequently Asked Questions (FAQ)]: |
| 193 | + https://repo2docker.readthedocs.io/en/stable/faq.html |
| 194 | +[Where to put configuration files]: |
| 195 | + https://repo2docker.readthedocs.io/en/stable/usage.html#where-to-put-configuration-files |
| 196 | +[Example repositories]: |
| 197 | + https://github.com/binder-examples |