Admin Menus

Logging in with an admin account will reveal an extra Administration menu on the bottom left of the sidebar. User information registered in Backend.AI is listed in the Users tab. Users with the superadmin role can see all users' information and can create and deactivate users.

User ID (email), Name (username), Role, and Description (User Description) can be filtered by typing text in the search box on each column header.

Create and update users

A user can be created by clicking the '+Create User' button. Note that the password must be at least 8 characters long and include at least 1 letter, 1 special character, and 1 number. The maximum length allowed for E-Mail, User Name, and Full Name is 64 characters.
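The password rule above can be expressed as a small check. The following is a minimal Python sketch, not Backend.AI's actual validation code: the function name `check_password` is hypothetical, and treating "special character" as ASCII punctuation is an assumption.

```python
import string

def check_password(password: str, min_length: int = 8) -> list[str]:
    """Return a list of rule violations; an empty list means the password passes."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not any(ch.isalpha() for ch in password):
        problems.append("no letter")
    if not any(ch.isdigit() for ch in password):
        problems.append("no number")
    # Assumption: "special character" means ASCII punctuation such as ! or #.
    if not any(ch in string.punctuation for ch in password):
        problems.append("no special character")
    return problems

print(check_password("abc123"))       # ['shorter than 8 characters', 'no special character']
print(check_password("Str0ng!pass"))  # []
```

Returning the list of violations rather than a single boolean makes it easy to tell the user which part of the rule failed.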

If a user with the same email or username already exists, the user account cannot be created. Try a different email or username.

Check that the user has been created.

Click the green button in the Controls panel for more detailed user information. You can also check the domain and project information where the user belongs.

Click the 'Setting (Gear)' button in the Controls panel to update the information of an existing user. The user's name, password, activation state, etc. can be changed. The User ID (email) cannot be changed.

The user create/update dialog contains the following fields:

  • E-Mail: The user's email address, used as the login ID. Cannot be changed after creation.

  • Username: A unique identifier for the user (up to 64 characters).

  • Full Name: The user's display name (up to 64 characters).

  • Password: Must be at least 8 characters and include at least 1 letter, 1 special character, and 1 number.

  • Description: An optional description for the user (up to 500 characters).

  • User Status: Indicates the user's status. Inactive users cannot log in. Before Verification indicates that the user needs an additional step to activate the account, such as email verification or approval from an admin. Inactive users are listed separately in the Inactive tab.

  • Role: The user's role (user, admin, superadmin). Available options depend on the current user's permissions.

  • Domain: The domain to which the user belongs. This field is shown in both the create and update dialogs.

  • Projects: Select one or more projects for the user to belong to. The available projects depend on the domain shown in the dialog.

  • Require password change?: If the admin has chosen random passwords while creating users in batches, this field can be set to ON to indicate that a password change is required. Users will see a top bar notifying them to update their password, but this is only a descriptive flag and has no effect on actual use.

  • Enable sudo session: Allow the user to use sudo in the compute session. This is useful when the user needs to install packages or run commands that require root privileges. However, it is not recommended to enable this option for all users, as it may cause security issues.

  • 2FA Enabled: A flag indicating whether the user uses two-factor authentication. When using two-factor authentication, users are additionally required to enter an OTP code when logging in. Administrators can only disable two-factor authentication for other users.

  • Resource Policy: From Backend.AI version 24.09, you can select the user resource policy to which the user belongs. For more information about user resource policies, please refer to the user resource policy section.

  • Allowed Client IPs: Restrict which IP addresses can access the system using this user account. Enter IP addresses or CIDR notation (e.g., 10.20.30.40, 10.20.30.0/24). If left empty, access from any IP is allowed.

  • Container UID: The numeric User ID assigned to processes inside the container. This is useful when the container needs to match a specific UID for file permission purposes.

  • Container GID: The default numeric Group ID assigned to processes inside the container.

  • Supplementary GID: Additional numeric Group IDs assigned to container processes. Enter multiple GIDs separated by commas.

  • Main Access Key: (Edit only) Select the main access key used for API authentication among the user's keypairs.
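To illustrate how the Allowed Client IPs field can mix single addresses and CIDR ranges, here is a minimal sketch using Python's standard `ipaddress` module. The helper `is_client_allowed` is hypothetical and is not part of Backend.AI. It only mirrors the behavior described above, including the rule that an empty list allows access from any IP.

```python
import ipaddress

def is_client_allowed(client_ip: str, allowed: list[str]) -> bool:
    """Return True if client_ip matches any entry; an empty list allows everyone."""
    if not allowed:
        return True
    addr = ipaddress.ip_address(client_ip)
    for entry in allowed:
        # ip_network accepts both a single address (treated as /32) and CIDR notation.
        network = ipaddress.ip_network(entry.strip(), strict=False)
        if addr in network:
            return True
    return False

allowed = ["10.20.30.40", "10.20.30.0/24"]
print(is_client_allowed("10.20.30.99", allowed))  # True
print(is_client_allowed("10.20.31.1", allowed))   # False
print(is_client_allowed("192.0.2.1", []))         # True
```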

Bulk Create Users

:::note This feature is available only on Backend.AI Manager version 26.2.0 or later. :::

When you need to create multiple user accounts at once, you can use the Bulk Create Users feature. On Manager 26.2.0 or later, an ellipsis (...) dropdown button appears next to the Create User button on the Users page. Click this dropdown button and select Bulk Create Users to open the bulk creation dialog.

The bulk creation dialog contains the following fields. An info banner at the top of the dialog explains that emails and usernames will be auto-generated by appending zero-padded sequential numbers to the prefix.

  • Email prefix (before @): The prefix portion of the auto-generated email addresses. Must contain only letters, numbers, dots, hyphens, or underscores (max 30 characters).
  • Email suffix (after @): The domain portion of the auto-generated email addresses. This field displays a @ prefix automatically (max 30 characters).
  • Number of users: The number of user accounts to create (1 to 100). A live email preview is displayed below this field, showing the email addresses that will be generated. For 4 or fewer users, all emails are shown. For more than 4, the first two, an ellipsis, and the last email are displayed (e.g., student01@example.com, student02@example.com ... student10@example.com).
  • Password: A shared initial password for all created users. The same password rules apply as for single user creation (at least 8 characters with at least 1 letter, 1 special character, and 1 number).
  • Password change required: Defaults to ON for bulk-created users. When enabled, each user will be prompted to change their password on first login.
  • Domain: The domain to which the created users will belong.
  • Other fields such as Role, Status, Resource Policy, and Projects are the same as single user creation.

Usernames and email addresses are auto-generated based on the prefix and suffix you provide. For example, if you set the email prefix to student and the email suffix to example.com, and the number of users to 10, the following accounts will be created:

| Username | Email |
| --- | --- |
| student01 | student01@example.com |
| student02 | student02@example.com |
| ... | ... |
| student10 | student10@example.com |

:::note Sequential numbers are zero-padded based on the total number of users. For example, 3 users produce student1 to student3, 10 users produce student01 to student10, and 100 users produce student001 to student100. :::

:::warning If some of the generated usernames or email addresses already exist, the operation will partially succeed. A warning message will display how many users were successfully created and how many failed. :::
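The naming and preview rules described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the WebUI's implementation: the function names `generate_accounts` and `preview_emails` are hypothetical, and the padding width is assumed to equal the number of digits in the requested user count, which matches the examples above.

```python
def generate_accounts(prefix: str, domain: str, count: int) -> list[tuple[str, str]]:
    """Return (username, email) pairs with zero-padded sequential numbers."""
    if not 1 <= count <= 100:
        raise ValueError("count must be between 1 and 100")
    width = len(str(count))  # 3 -> 1 digit, 10 -> 2 digits, 100 -> 3 digits
    accounts = []
    for i in range(1, count + 1):
        username = f"{prefix}{i:0{width}d}"
        accounts.append((username, f"{username}@{domain}"))
    return accounts

def preview_emails(emails: list[str]) -> str:
    """Show all emails for 4 or fewer; otherwise the first two, an ellipsis, and the last."""
    if len(emails) <= 4:
        return ", ".join(emails)
    return f"{emails[0]}, {emails[1]} ... {emails[-1]}"

accounts = generate_accounts("student", "example.com", 10)
print(accounts[0])   # ('student01', 'student01@example.com')
print(preview_emails([email for _, email in accounts]))
# student01@example.com, student02@example.com ... student10@example.com
```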

Inactivate user account

Deleting user accounts is not allowed, even for superadmins, in order to keep per-user usage statistics, retain metrics, and prevent accidental account loss. Instead, admins can deactivate user accounts to keep users from logging in. Click the delete icon in the Controls panel. A popover asking for confirmation appears, and you can deactivate the user by clicking the Deactivate button.

To reactivate a user, go to the Inactive tab on the Users page and set the status of the target user to Active.

:::note Deactivating or reactivating a user does not change the user's credentials, because a user account can have multiple keypairs, which makes it hard to decide which credential should be reactivated. :::

Manage User's Keypairs

Each user account usually has one or more keypairs. A keypair is used for API authentication to the Backend.AI server after the user logs in. Login requires authentication via user email and password, but every request the user sends to the server is authenticated based on the keypair.

A user can have multiple keypairs, but to reduce the user's burden of managing keypairs, we are currently using only one of the user's keypairs to send requests. Also, when you create a new user, a keypair is automatically created, so you do not need to create and assign a keypair manually in most cases.

Keypairs are listed on the Credentials tab of the Users page. Active keypairs are shown immediately; to see inactive keypairs, click the Inactive panel at the bottom.

As in the Users tab, you can use the buttons in the Controls panel to view or update keypair details. Click the green info icon button to see the details of a keypair. If necessary, you can copy the secret key by clicking the copy button.

You can modify the resource policy and rate limit of the keypair by clicking the blue 'Setting (Gear)' button. Please keep in mind that if the 'Rate Limit' value is small, API operations such as login may be blocked.

You can also deactivate or reactivate a keypair by clicking the red 'Deactivate' button or the black 'Activate' button in the control column. Unlike the Users tab, the Inactive tab here allows permanent deletion of keypairs. However, you cannot permanently delete a keypair if it is currently being used as a user's main access key.

If you accidentally deleted a keypair, you can re-create a keypair for the user by clicking the '+ ADD CREDENTIAL' button in the upper right corner.

The Rate Limit field specifies the maximum number of requests that can be sent to the Backend.AI server in 15 minutes. For example, if it is set to 1000 and the keypair sends more than 1000 API requests within 15 minutes, the server returns an error and does not accept the request. It is recommended to use the default value and increase it when the user's API request frequency grows.
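A per-keypair limit of this kind can be pictured as a counter over a 15-minute window. The sketch below is illustrative only. How the Backend.AI server actually counts requests (for example, fixed versus sliding windows) is not described here, so the sliding window, the class name `RateLimiter`, and the explicit `now` argument are all assumptions made for the example.

```python
from collections import deque

class RateLimiter:
    """Allow at most `limit` requests within any `window_sec`-second window."""

    def __init__(self, limit: int, window_sec: float = 15 * 60):
        self.limit = limit
        self.window_sec = window_sec
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Forget accepted requests that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_sec:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False  # the server would answer with an error here
        self._timestamps.append(now)
        return True

limiter = RateLimiter(limit=1000)
results = [limiter.allow(now=10.0) for _ in range(1001)]
print(results.count(True), results[-1])  # 1000 False
print(limiter.allow(now=10.0 + 900))     # True
```

The second call shows why a blocked keypair recovers on its own: once the earlier requests are older than the window, new requests are accepted again.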

Share project storage folders with project members

Backend.AI provides storage folders for projects in addition to each user's own storage folders. A project storage folder is a folder belonging to a specific project, not a specific user, and can be accessed by all users in that project.

:::note Project folders can be created only by administrators. Normal users can only access the contents of the project folder created by the administrator. Depending on the system settings, project folders may not be allowed. :::

First, log in with an admin account and create a project folder. After moving to the Data page, click 'Create Folder' to open the folder creation dialog. Enter the folder name and set the Type to Project. When the type is set to Project, the folder is automatically assigned to the project selected in the project selector in the header. Permission is set to Read-Only.

After confirming that the folder has been created, log in with User B's account and check that the newly created project folder is displayed on the Data & Storage page without any invitation procedure. You can also see that R (Read Only) is displayed in the Permission panel.

Manage Model Cards

Model cards in the Model Store are created and managed through the Admin Model Store Management interface. Each model card is linked to a storage folder (vfolder) that contains the actual model files.

Setting Up the Model Store Folder

:::note If the model is hosted on Hugging Face as a gated model, you will need to request access before downloading. Refer to Gated models for details. :::

First, set the project to model-store.

Go to the Data page and click the Create Folder button. Configure the folder as follows:

  • Usage Mode: Model
  • Type: Project
  • Permission: Read-Write

After creating the folder, download the model files into it. You can mount the model folder during session creation and use tools such as huggingface-cli to download model weights.

:::note You need to download the model files manually into the folder. For instructions on how to download from Hugging Face, refer to Downloading models. :::

Once the folder and its model files are ready, create a model card through the Admin Model Store Management interface and link it to this folder.

Model Definition File (Advanced — Custom Runtime)

For the Custom runtime variant, you can optionally place a model-definition.yaml file in the model folder. This file tells Backend.AI how to start and operate the inference server during serving — including the startup command, health check settings, and any pre-start actions such as downloading model weights at launch time.

:::note Runtime variants such as vLLM, SGLang, NVIDIA NIM, and Modular MAX do not require a model-definition.yaml file. These variants handle model configuration automatically based on the selected settings. :::

The following is an example model-definition.yaml that starts a vLLM server using the Custom variant:

models:
  - name: "Llama-3.1-8B-Instruct"
    model_path: "/models/Llama-3.1-8B-Instruct"
    service:
      pre_start_actions:
        - action: run_command
          args:
            command:
              - huggingface-cli
              - download
              - --local-dir
              - /models/Llama-3.1-8B-Instruct
              - --token
              - hf_****
              - meta-llama/Llama-3.1-8B-Instruct
      start_command:
        - /usr/bin/python
        - -m
        - vllm.entrypoints.openai.api_server
        - --model
        - /models/Llama-3.1-8B-Instruct
        - --served-model-name
        - Llama-3.1-8B-Instruct
        - --tensor-parallel-size
        - "1"
        - --host
        - "0.0.0.0"
        - --port
        - "8000"
        - --max-model-len
        - "4096"
      port: 8000
      health_check:
        path: /v1/models
        max_retries: 500

For a full description of the model definition format, refer to the Model Definition Guide in the Model Serving documentation.
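One common mistake in a definition like the one above is letting the `port` field drift from the `--port` argument in `start_command`. The following Python sketch shows a sanity check over a dictionary that mirrors the `service` section. It is not Backend.AI's own validation: the function `check_service` and both checks are hypothetical, and requiring the health check path to start with `/` is an assumption.

```python
def check_service(service: dict) -> list[str]:
    """Return a list of likely configuration mistakes (empty if none found)."""
    problems = []
    command = [str(arg) for arg in service.get("start_command", [])]
    port = service.get("port")
    if "--port" in command:
        idx = command.index("--port")
        # The value following --port should equal the declared service port.
        if idx + 1 >= len(command) or command[idx + 1] != str(port):
            problems.append("start_command --port does not match service port")
    path = service.get("health_check", {}).get("path", "")
    # Assumption: the health check path is an absolute HTTP path.
    if path and not path.startswith("/"):
        problems.append("health_check path should start with '/'")
    return problems

service = {
    "start_command": ["/usr/bin/python", "-m", "vllm.entrypoints.openai.api_server",
                      "--model", "/models/Llama-3.1-8B-Instruct",
                      "--host", "0.0.0.0", "--port", "8000"],
    "port": 8000,
    "health_check": {"path": "/v1/models", "max_retries": 500},
}
print(check_service(service))                    # []
print(check_service({**service, "port": 8080}))  # ['start_command --port does not match service port']
```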

:::note To enable the Deploy button on a model card in the Model Store, include service-definition.toml in the linked folder so Backend.AI can read the model service configuration. Add model-definition.yaml only when you use the Custom runtime variant; preset runtime variants (such as vLLM, SGLang, NVIDIA NIM, and Modular MAX) do not require it. For details on the service definition file, refer to the Service Definition File section in the Model Serving documentation. :::

Admin Features

Admin Serving Page

Administrators and superadmins can access the Admin Serving page, which provides a cross-project view of all endpoints. This page shows the Project column in addition to the standard endpoint list columns, allowing admins to manage services across all projects.

The Admin Serving page has two tabs:

  • Serving: Displays the endpoint list across all projects, with the same lifecycle and property filters as the user-facing Serving page.
  • Model Store Management: Available to superadmins only. See the section below.

Admin Model Store Management

Superadmins can manage model cards through the Model Store Management tab on the Admin Serving page.

The list provides the following columns:

  • Name: The unique identifier of the model card.
  • Title: The human-readable display name.
  • Category: The model category (e.g., LLM).
  • Task: The inference task type (e.g., text-generation).
  • Access Level: Shows a green Public tag when the model card is publicly accessible, or a default Private tag otherwise.
  • Domain: The domain that owns the model card.
  • Project: The project that owns the model card.
  • Created At: The timestamp when the model card was created.

You can filter the list by Name using the property filter bar at the top. Edit and delete action icons are shown directly in the Name cell of each row.

To delete multiple model cards at once, select the rows you want to remove using the checkboxes and click the red trash-bin button next to the selection count. A confirmation dialog appears before the cards are deleted.

Creating a Model Card

Click the Create Model Card button to open the creation modal. Fill in the following fields:

  • Name (required): A unique identifier for the model card.

  • Title: A human-readable display name.

  • Description: A detailed description of the model.

  • Author: The model creator or organization.

  • Model Version: The version of the model.

  • Task: The inference task type (e.g., text-generation).

  • Category: The model category (e.g., LLM).

  • Framework: The ML framework used (e.g., PyTorch, TensorFlow).

  • Label: Tags for categorization and filtering.

  • License: The license under which the model is distributed.

  • Architecture: The model architecture (e.g., Transformer).

  • README: A markdown README for the model.

  • Domain: The domain to associate the model card with.

  • Project ID (required): The project that owns the model card.

  • VFolder (required): The storage folder containing the model files.

  • Access Level: Controls who can see the model card in the user-facing Model Store.

    • Internal: Visible only to administrators of the owning domain and project. Regular users cannot see internal cards in their Model Store.
    • Public: Visible to all users who have access to the owning project.

Editing a Model Card

Click the edit icon next to the model card name to modify an existing model card. The edit modal opens with previously entered fields already filled in.

Deleting Model Cards

You can delete an individual model card by clicking the delete icon next to its name, or perform bulk deletion by selecting multiple model cards with the row checkboxes and clicking the red trash-bin button next to the selection count.

Prometheus Query Presets

Backend.AI lets administrators define reusable Prometheus query presets that auto-scaling rules and other monitoring features can reference by name. A preset bundles a metric name, a PromQL query template, an optional time window, and optional filter / group labels so operators do not have to retype the same query for every rule.

The presets are managed from the Prometheus Preset tab on the Admin Deployments page (/admin-deployments?tab=prometheus-preset).

:::note This tab is admin-only and is visible only when the Backend.AI Manager advertises the prometheus-query-preset capability. If the tab does not appear in your environment, your Manager build does not yet support this feature. :::

List & Filter

The preset table lists all Prometheus query presets across the cluster. Each row shows:

  • Name: A unique, human-readable identifier for the preset. The cell also exposes inline Edit and Delete actions.
  • ID: The preset's internal identifier.
  • Metric Name: The metric this preset reports (used as the display label by consumers such as auto-scaling rules).
  • Query Template: The PromQL expression that will be executed. The cell is copyable — hover over the value and click the copy icon to copy the full template to the clipboard. This is useful when you want to paste the template into a Prometheus UI to verify the result.
  • Time Window: The default look-back window (for example, 5m) used when the query references a range vector.
  • Category: The optional category the preset belongs to (with the resolved category name and the category ID).
  • Options: The optional Filter Labels and Group Labels that consumers can apply on top of the preset.
  • Created At / Updated At: Timestamps maintained automatically by the server.

You can search and narrow the list with the property filter above the table, and click any column header to change the sort order.

Column Settings Persistence

The table includes a column-settings control that lets you hide columns you do not need and reorder the visible columns. Your choices are persisted across sessions per browser, so the table opens with your preferred layout the next time you visit the tab. Resetting the column settings restores the default Backend.AI layout.

Create a Preset

Click Add Preset at the top right of the table to open the Create Preset modal.

The modal contains the following fields:

  • Name: The preset's unique name. Must be unique across all Prometheus query presets.
  • Description: A free-form description shown alongside the preset in selectors.
  • Category: An optional category for grouping related presets. Leave empty for No category.
  • Metric Name: The metric label that consumers (for example, auto-scaling rules) will display.
  • Query Template: The PromQL expression to execute. As you type, a live preview area below the field calls the server's adminPrometheusQueryPresetPreview query and shows the current value the query returns against your Prometheus instance, so you can verify the template works before saving. The preview is debounced and updates automatically as you edit.
  • Time Window: The default range-vector window, for example 5m. Leave empty if the query does not use a range vector.
  • Filter Labels: Optional list of label selectors that consumers can apply on top of the preset.
  • Group Labels: Optional list of labels to group the query result by.

Click Create to save the preset. On success, the preset appears in the list and a confirmation toast is shown.
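To make the roles of Time Window, Filter Labels, and Group Labels more concrete, the sketch below composes a standard PromQL expression from those three pieces. It is purely illustrative. Backend.AI's actual query template syntax and the way it substitutes these values are not described here, so the function `build_promql`, the use of `rate` and `sum by`, and the metric name `http_requests_total` are all assumptions.

```python
from typing import Optional

def build_promql(metric: str,
                 time_window: Optional[str] = None,
                 filters: Optional[dict] = None,
                 group_by: Optional[list] = None) -> str:
    """Compose a PromQL expression from a metric, label filters, a window, and grouping."""
    selector = metric
    if filters:
        # Filter labels become label matchers inside the selector braces.
        matchers = ",".join(f'{key}="{value}"' for key, value in sorted(filters.items()))
        selector += "{" + matchers + "}"
    # A time window turns the selector into a range vector, here consumed by rate().
    expr = f"rate({selector}[{time_window}])" if time_window else selector
    if group_by:
        # Group labels aggregate the result per label combination.
        expr = f"sum by ({', '.join(group_by)}) ({expr})"
    return expr

print(build_promql("http_requests_total", "5m", {"job": "api"}, ["instance"]))
# sum by (instance) (rate(http_requests_total{job="api"}[5m]))
```

Pasting the resulting expression into a Prometheus UI is one way to confirm that a template returns the series you expect before saving it as a preset.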

Edit a Preset

Click the Edit action in the Name cell of a preset row to open the Edit Preset modal. The modal is pre-populated with the preset's current values and exposes the same fields as the Create dialog, including the live preview area for the Query Template.

Click Save to apply your changes. Consumers of the preset (for example, auto-scaling rules referencing it) automatically pick up the new query template the next time they evaluate the metric.

Delete a Preset

Click the Delete action in the Name cell of a preset row to open the deletion confirmation modal.

:::danger Deleting a Prometheus query preset is permanent and cannot be undone. Auto-scaling rules and other features that reference the deleted preset will lose their query template and may stop functioning until they are reconfigured to point at a different preset. :::

Because deletion is irreversible, the dialog requires you to type the preset's name into the confirmation input before the Delete button becomes enabled. This typed-confirmation pattern (BAIConfirmModalWithInput) is used consistently across Backend.AI for permanent-delete actions. Type the exact preset name shown in the dialog title and click Delete to confirm.

Manage Resource Policies

Keypair Resource Policy

In Backend.AI, administrators have the ability to set limits on the total resources available for each keypair, user, and project. Resource policies enable you to define the maximum allowed resources and other compute session-related settings. Additionally, it is possible to create multiple resource policies for different needs, such as user or research requirements, and apply them on an individual basis.

The Resource Policies page allows administrators to view a list of all registered resource policies. Administrators can review the resource policies established for keypairs, users, and projects directly on this page. Let's begin by examining the resource policies for keypairs. In the figure below, there are three policies in total (gardener, student, default). The infinity symbol (∞) indicates that no resource restrictions have been applied to those resources.

The user account being used in this guide is currently assigned to the default resource policy. This can be verified in the Credentials tab on the Users page. You can also confirm that all resource policies are set to default in the Resource Policies panel.

To modify resource policies, click the 'Setting (Gear)' in the Control column of the default policy group. In the Update Resource Policy dialog, every option is editable except for Policy Name, which serves as the primary key for distinguishing resource policies in the list. Uncheck the Unlimited checkbox at the bottom of CPU, RAM, and fGPU, and set the resource limits to the desired values. Ensure that the allocated resources are less than the total hardware capacity. In this case, set CPU, RAM, and fGPU to 2, 4, and 1 respectively. Click the OK button to apply the updated resource policy.

Each option in the resource policy dialog is described below.

  • Resource Policy

    • CPU: Specify the maximum number of CPU cores. (max value: 512)
    • Memory: Specify the maximum amount of memory in GB. It is good practice to set memory to twice the maximum GPU memory. (max value: 1024)
    • CUDA-capable GPU: Specify the maximum number of physical GPUs. If fractional GPU is enabled by the server, this setting has no effect. (max value: 64)
    • CUDA-capable GPU (fractional): Fractional GPU (fGPU) splits a single GPU into multiple partitions in order to use the GPU efficiently. Note that the minimum amount of fGPU required differs for each image. If fractional GPU is not enabled by the server, this setting has no effect. (max value: 256)
  • Sessions

    • Cluster Size: Set the maximum limit for the number of multi-containers or multi-nodes that can be configured when creating a session.
    • Session Lifetime (sec.): Limits the maximum lifetime of a compute session, counted from its reservation and covering the active statuses, including PENDING and RUNNING. After this time, the session is force-terminated even if it is fully utilized. This is useful for preventing sessions from running indefinitely.
    • Max Pending Session Count: Maximum number of compute sessions that can be in the PENDING status simultaneously.
    • Concurrent Jobs: Maximum number of concurrent compute sessions per keypair. If this value is set to 3, for example, users bound to this resource policy cannot create more than 3 compute sessions simultaneously. (max value: 100)
    • Idle timeout (sec.): Configurable period of time during which the user can leave their session untouched. If there is no activity at all on a compute session for the duration of the idle timeout, the session will be garbage collected and destroyed automatically. The criteria for idleness can vary and are set by the administrators. (max value: 15552000 (approx. 180 days))
    • Max Concurrent SFTP Sessions: Maximum number of concurrent SFTP sessions.
  • Folders

    • Allowed hosts: Backend.AI supports multiple NFS mountpoints. This field limits access to them. Even if an NFS named "data-1" is mounted on Backend.AI, users cannot access it unless it is allowed by the resource policy.
    • (Deprecated since 23.09.4) Max. #: The maximum number of storage folders that can be created/invited. (max value: 100)
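The way these limits interact with a new session request can be sketched as follows. This is a simplified model rather than the Backend.AI scheduler's logic: the class `KeypairResourcePolicy`, its field names, and the function `violations` are hypothetical, and `None` is used to model the Unlimited checkbox (displayed as ∞). The limits use the example values from above (CPU 2, RAM 4, fGPU 1).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class KeypairResourcePolicy:
    name: str
    max_cpu: Optional[float] = None      # None models "Unlimited" (shown as ∞)
    max_mem_gb: Optional[float] = None
    max_fgpu: Optional[float] = None
    max_concurrent_sessions: Optional[int] = None

def violations(policy: KeypairResourcePolicy, in_use: dict, request: dict,
               running_sessions: int) -> list[str]:
    """Return the limits a new session request would break (empty if allowed)."""
    problems = []
    limits = {"cpu": policy.max_cpu, "mem_gb": policy.max_mem_gb, "fgpu": policy.max_fgpu}
    for key, limit in limits.items():
        # Limits apply to the keypair's total, so add what is already in use.
        total = in_use.get(key, 0) + request.get(key, 0)
        if limit is not None and total > limit:
            problems.append(f"{key}: {total} exceeds limit {limit}")
    if (policy.max_concurrent_sessions is not None
            and running_sessions + 1 > policy.max_concurrent_sessions):
        problems.append("concurrent session limit reached")
    return problems

policy = KeypairResourcePolicy("default", max_cpu=2, max_mem_gb=4, max_fgpu=1,
                               max_concurrent_sessions=3)
print(violations(policy, {"cpu": 1}, {"cpu": 2, "mem_gb": 1}, running_sessions=1))
# ['cpu: 3 exceeds limit 2']
```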

In the keypair resource policy list, check that the Resources value of the default policy has been updated.

You can create a new resource policy by clicking the '+ Create' button. Each setting value is the same as described above.

To associate a resource policy with a keypair, go to the Credentials tab of the Users page, click the gear button located in the Controls column of the desired keypair, and click the Select Policy field to choose the policy.

You can also delete a resource policy by clicking the trash can icon in the Control column. When you click the icon, a confirmation popup appears. Click the 'Delete' button to delete the policy.

:::note If any users (including inactive users) are still assigned to the resource policy to be deleted, the deletion may fail. Before deleting a resource policy, please make sure that no users remain under the resource policy. :::

If you want to hide or show specific columns, click the 'Setting (Gear)' at the bottom right of the table. This will bring up a dialog where you can select the columns you want to display.

User Resource Policy

Starting from version 24.03, Backend.AI supports user resource policy management. While each user can have multiple keypairs, a user can have only one user resource policy. On the user resource policy page, administrators can set restrictions on folder-related settings such as Max Folder Count and Max Folder Size, as well as individual resource limits such as Max Session Count Per Model Session and Max Customized Image Count.

To create a new user resource policy, click the Create button.

  • Name: The name of the user resource policy.
  • Max Folder Count: The maximum number of folders that the user can create. If the user's folder count exceeds this value, the user cannot create a new folder. If set to Unlimited, it is displayed as "∞".
  • Max Folder Size: The maximum size of the user's storage space. If the user's storage space exceeds this value, the user cannot create a new data folder. If set to Unlimited, it is displayed as "∞".
  • Max Session Count Per Model Session: The maximum number of available sessions per model service created by a user. Increasing this value can put a heavy load on the session scheduler and potentially lead to system downtime, so please be cautious when adjusting this setting.
  • Max Customized Image Count: The maximum number of customized images that the user can create. If the user's customized image count exceeds this value, the user cannot create a new customized image. For more information about customized images, please refer to the My Environments section.

To update, click the 'Setting (Gear)' button in the control column. To delete, click the trash can button.

:::note Changing a resource policy may affect all users who use that policy, so use it with caution. :::

As with the keypair resource policy, you can display only the columns you want by clicking the 'Setting (Gear)' button at the bottom right of the table.

Project Resource Policy

Starting from version 24.03, Backend.AI supports project resource policy management. Project resource policies manage storage space (quota) and folder-related limitations for projects.

Click the Project tab of the Resource Policies page to see the list of project resource policies.

To create a new project resource policy, click the + Create button at the top right of the table.

  • Name: The name of the project resource policy.
  • Max Folder Count: The maximum number of project folders that an administrator can create. If the project folder count exceeds this value, the administrator will not be able to create a new project folder. If set to Unlimited, it will be displayed as "∞".
  • Max Folder Size: The maximum size of the project's storage space. If the project's storage space exceeds this value, the administrator cannot create a new project folder. If set to Unlimited, it is displayed as "∞".
  • Max Network Count: The maximum number of networks that can be created for the project (available since Backend.AI version 24.12). If set to Unlimited, it is displayed as "∞".

The meaning of each field is similar to the user resource policy. The difference is that the project resource policy is applied to the project folders, while the user resource policy is applied to the user folders.

If you want to make changes, click the Setting (Gear) button in the control column. Resource policy names cannot be edited. Deletion can be done by clicking the trash can icon button.

:::note Changing a resource policy may affect all users who use that policy, so use it with caution. :::

You can select and display only the columns you want by clicking the Setting (Gear) button at the bottom right of the table.

To save the current resource policy as a file, click the 'more' button in the upper right of each tab and select the 'Export CSV' menu item.

Unified View for Pending Sessions

From Backend.AI version 25.13.0, a unified view for pending sessions is available in the Admin Menu. The Admin Session page provides a unified view of all pending sessions within a selected resource group. The index number displayed next to the status indicates the session's position in the queue, that is, the order in which sessions will be created once sufficient resources become available.

Similar to the Session page, you can click the session name to open a drawer that displays detailed information about the session.

Fair Share Scheduler

From Backend.AI core version 26.2.0, the Fair Share Scheduler page is available in the Administration menu. This feature allows administrators to manage fair share scheduling weights across a hierarchical structure of resource groups, domains, projects, and users.

Fair share scheduling allocates compute resources based on historical usage patterns, ensuring that resources are distributed fairly among users. Users who have consumed fewer resources in the past receive higher scheduling priority, while those who have used more resources are given lower priority. Administrators can fine-tune this behavior by adjusting weights at each level of the hierarchy.

:::note The Fair Share Scheduler is only available when a resource group's scheduler type is set to FAIR_SHARE. To configure the scheduler type for a resource group, refer to the Manage resource group section. :::

To access this feature, click the Scheduler menu item in the Administration section of the sidebar. The page displays a Fair Share Setting tab with a 4-step drill-down interface.

The page is organized into four hierarchical steps:

  1. Resource Group: Configure core fair share parameters for each resource group
  2. Domain: Set weights for domains within a resource group
  3. Project: Set weights for projects within a domain
  4. User: Set weights for individual users within a project

A step indicator bar at the top of the page shows your current position in the hierarchy. Completed steps display the name of the selected item. You can click on any completed step to navigate back to that level.

If the selected resource group does not have its scheduler type set to FAIR_SHARE, a warning alert is displayed indicating that the Fair Share Scheduler is not enabled for that resource group.

At each step, the following common features are available:

  • Filtering: Use the property-based search filter to narrow results by name. At the User step, additional filters for email and active status are available.
  • Sorting: Click column headers to sort the table by that column.
  • Pagination: Navigate through results with configurable page size.
  • Auto-refresh: Data refreshes automatically every 7 seconds. A manual refresh button is also available.

Resource Group

The Resource Group step displays a table of all resource groups with their fair share configuration.

The table includes the following columns:

  • Name: The resource group name. Click the name to drill into the domain-level settings for that resource group.
  • Control: A settings (gear) button that opens the Resource Group Fair Share Settings modal.
  • Allocation: Resource usage showing used/capacity for each resource type allocated to the resource group (e.g., CPU, Memory, CUDA GPU).
  • Resource Weight: Per-resource-type weights. Displays "default" if using the default weight.
  • Default Weight: The fallback weight value for domains, projects, and users without a specified weight.
  • Decay Unit: The period (in days) for aggregating usage.
  • Half Life: The period (in days) over which the usage reflection rate decreases by half.
  • Lookback: The range (in days) of usage history reflected in calculations.

Resource Group Fair Share Settings

Click the settings (gear) button in the Control column of a resource group to open the Fair Share Settings modal.

:::warning Changes are not immediately reflected in Fair Share calculations and may take approximately 5 minutes due to the calculation cycle. :::

The modal contains the following fields:

  • Resource Group: Read-only field showing the resource group name.
  • Half Life: The period over which the usage reflection rate decreases by half, specified in days (minimum 1). For example, if set to 7 days, usage from 7 days ago is calculated at 50%, and usage from 14 days ago at 25%. It is recommended to set this as a multiple of the decay unit.
  • Lookback: The range of usage history reflected in Fair Share calculations, specified in days (minimum 1). Usage prior to this period is excluded from calculations. It is recommended to set this as a multiple of the half life.
  • Default Weight: The default value applied to domains, projects, and users without a specified weight (minimum 1, step 0.1).
  • Resource Weights: Per-resource-type weights (e.g., CPU, Memory, GPU), each with a minimum value of 1 and step 0.1. This section is only displayed if resource weights exist for the resource group.
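The interaction between Half Life and Lookback can be illustrated with a short sketch. It assumes plain exponential decay, which matches the 7-day example above (50% at 7 days, 25% at 14 days). The function names and the input shape are hypothetical, and the actual scheduler may aggregate usage differently (for example, per decay unit).

```python
def decay_factor(age_days: float, half_life_days: float) -> float:
    """Fraction of usage that still counts after `age_days` (assumed exponential decay)."""
    return 0.5 ** (age_days / half_life_days)


def decayed_usage(daily_usage: dict[int, float], half_life_days: int, lookback_days: int) -> float:
    """Sum usage keyed by age in days, dropping anything older than the lookback window."""
    total = 0.0
    for age_days, amount in daily_usage.items():
        if age_days > lookback_days:
            continue  # outside the lookback range: excluded from the calculation
        total += amount * decay_factor(age_days, half_life_days)
    return total


# Hypothetical usage: 10 units today, 7 days ago, 14 days ago, and 40 days ago.
usage = {0: 10.0, 7: 10.0, 14: 10.0, 40: 10.0}
print(decayed_usage(usage, half_life_days=7, lookback_days=28))  # 10 + 5 + 2.5 = 17.5
```

With a 28-day lookback, the entry from 40 days ago contributes nothing, while the remaining entries are halved once per elapsed half life.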

Domain

After selecting a resource group, the Domain step displays a table of domains with their fair share weights and usage within that resource group.

The table includes the following columns:

  • Name: The domain name. Click the name to drill into project-level settings for that domain.
  • Control: A settings (gear) button that opens the weight setting modal for this domain.
  • Weight: The current weight value. Displays "default" if using the default weight.
  • Fair Share Factor: The scheduling priority calculated by the scheduler. Higher values indicate higher priority.
  • Resource Allocation: Average daily decayed resource usage per resource type (CPU, Memory, GPU / Day).
  • Modified At: The last modification timestamp.
  • Created At: The creation timestamp.

You can select multiple rows using the checkboxes on the left side of the table. When rows are selected, two additional buttons appear:

  • Usage Graph (chart icon): Opens the Usage History modal for the selected items.
  • Bulk Edit (gear icon): Opens the weight setting modal to edit weights for all selected items at once.

Project

After selecting a domain, the Project step displays a table of projects with the same column structure as the Domain step. Click a project name to drill into the User step.

The same bulk operations (Usage Graph and Bulk Edit) are available when rows are selected.

User

After selecting a project, the User step displays a table of individual users with their fair share weights and usage.

The table includes the following columns:

  • Email: The user's email address.
  • Name: The user's name.
  • Control: A settings (gear) button that opens the weight setting modal for this user.
  • Weight: The current weight value. Displays "default" if using the default weight.
  • Fair Share Factor: The scheduling priority calculated by the scheduler.
  • Resource Allocation: Average daily decayed resource usage per resource type.
  • Modified At: The last modification timestamp.
  • Created At: The creation timestamp.

:::note At the User step, additional filter properties are available: email, name, and active status. :::

The same bulk operations (Usage Graph and Bulk Edit) are available when rows are selected.

Editing Fair Share Weights

To edit the fair share weight for a domain, project, or user, click the settings (gear) button in the Control column of the desired row. This opens the weight setting modal.

:::warning Changes are not immediately reflected in Fair Share calculations and may take approximately 5 minutes due to the calculation cycle. :::

In single-edit mode, the modal displays the entity name (read-only) and a weight input field.

  • Weight: The multiplier that determines Fair Share scheduling priority. Higher weight results in higher priority. The default value is "1.0". A weight of "2.0" has twice the priority of "1.0". The minimum value is 1 with a step of 0.1.

To edit weights for multiple items at once, select the desired rows using the checkboxes in the table, then click the Bulk Edit (gear icon) button. In bulk-edit mode, the modal displays a tag list of all selected entities and a single weight input that will be applied to all of them.

:::note If the selected resource group does not have its scheduler type set to FAIR_SHARE, a warning alert is displayed in the modal. :::

Viewing Usage History

To view the usage history for domains, projects, or users, select the desired rows using the checkboxes in the table, then click the Usage Graph (chart icon) button. This opens the Usage History modal.

The modal displays the following:

  • Date range picker: Select a date range for the usage history. Presets are available for Last 7 Days, Last 30 Days, and Last 90 Days.
  • Refresh button: Manually refresh the usage data.
  • Context information: Shows the resource group, domain, and project (depending on the current step).
  • Selected entities: Displayed as tags showing the names of the selected items.
  • Usage chart: A chart showing the average daily resource usage over the selected period.

Manage Images

Admins can manage images, which are used in creating a compute session, in the Images tab of the Environments page. In the tab, meta information of all images currently in the Backend.AI server is displayed. You can check information such as registry, architecture, namespace, image name, digest, and minimum resources required for each image. For images downloaded to one or more agent nodes, there will be an installed tag in the Status column.

:::note The feature to install images by selecting specific agents is currently under development. :::

The image list displays additional columns for more detailed image information:

  • Architecture: The CPU architecture of the image (e.g., x86_64, aarch64).
  • Namespace: The namespace of the image within the registry.
  • Base Image Name: The base name of the image, with alias tags for easier identification.
  • Version: The version tag of the image.
  • Tags: Detailed tags associated with the image, displayed as double tags with aliases.

You can select multiple uninstalled images and click the Install button to install them on available agent nodes in bulk.

You can change the minimum resource requirements for each image by clicking the 'Setting (Gear)' in the Controls panel. Each image has hardware and resource requirements for minimal operation. (For example, a GPU-only image requires at least a minimum GPU allocation.) The default minimum values are embedded in the image's metadata. If a compute session is requested with fewer resources than the image's minimum, the request is not cancelled; instead, it is automatically raised to the image's minimum resource requirements and the session is then created.

:::note Don't change the minimum resource requirements to an amount less than the predefined value! The minimum resource requirements included in the image metadata are values that have been tested and determined. If you are not sure what minimum value to set, leave the default unchanged. :::
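The automatic adjustment described above can be pictured as raising each requested amount to the image's minimum. The following is a minimal sketch under that assumption; the function name and the resource keys (`cpu`, `mem_gib`, `gpu`) are hypothetical and do not reflect Backend.AI's internal slot names.

```python
def apply_image_minimums(requested: dict[str, float], minimums: dict[str, float]) -> dict[str, float]:
    """Raise each requested amount to the image's minimum instead of rejecting the request."""
    adjusted = dict(requested)
    for resource, minimum in minimums.items():
        # A resource missing from the request is treated as a request for 0.
        adjusted[resource] = max(adjusted.get(resource, 0), minimum)
    return adjusted


# Hypothetical GPU image that needs at least 1 CPU core, 1 GiB of memory, and 1 GPU.
minimums = {"cpu": 1, "mem_gib": 1.0, "gpu": 1}
print(apply_image_minimums({"cpu": 4, "mem_gib": 0.5}, minimums))
# {'cpu': 4, 'mem_gib': 1.0, 'gpu': 1}
```

Requests that already exceed the minimum are left untouched; only the shortfalls are raised.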

Additionally, you can add or modify the supported apps for each image by clicking the 'Apps' icon located in the Controls column. Clicking the icon displays the name of each app and its corresponding port number.

In this interface, you can add supported custom applications by clicking the '+ Add' button below. To delete an application, simply click the 'red trash can' button on the right side of each row.

:::note You need to reinstall the image after changing the managed app. :::

Manage docker registry

You can click the Registries tab on the Environments page to see information about the Docker registries that are currently connected. cr.backend.ai is registered by default, and it is a Harbor-based registry.

:::note In the offline environment, the default registry is not accessible, so click the trash icon on the right to delete it. :::

Click the refresh icon in Controls to update image metadata for Backend.AI from the connected registry. Images in the registry that do not have Backend.AI labels are not updated.

You can add your own private docker registry by clicking the '+ Add Registry' button. The registry creation dialog contains the following fields:

  • Registry Name: A unique name for the registry (up to 50 characters). Must match the prefix used in image names stored in the registry.
  • Registry URL: The URL of the registry. A scheme such as http:// or https:// must be explicitly included.
  • Username: Optional. Fill in if you have separate authentication settings in the registry.
  • Password: Optional. When editing an existing registry, check the "Change Password" checkbox to modify it.
  • Registry Type: Select the type of registry. Supported types include: docker, harbor, harbor2, github, gitlab, ecr, and ecr-public.
  • Project Name: The project or namespace in the registry (required). Use the full path including namespace and project name for GitLab registries.
  • Extra Information: A JSON string for additional configuration needed for each registry type. This field is available from version 24.09.3.

GitLab Container Registry Configuration

When adding a GitLab container registry, you must specify the api_endpoint in the Extra Information field. This is required because GitLab uses separate endpoints for the container registry and the GitLab API.

For GitLab.com (public instance):

  • Registry URL: https://registry.gitlab.com
  • Extra Information: {"api_endpoint": "https://gitlab.com"}

For self-hosted (on-premise) GitLab:

  • Registry URL: Your GitLab registry URL (e.g., https://registry.example.com)
  • Extra Information: {"api_endpoint": "https://gitlab.example.com"}

:::note The api_endpoint should point to your GitLab instance URL, not the registry URL. :::

Additional configuration notes:

  • Project path format: When specifying the project, use the full path including namespace and project name (e.g., namespace/project-name). Both components are required for the registry to function correctly.

  • Access token permissions: The access token used for the registry must have both read_registry and read_api scopes. The read_api scope is required for Backend.AI to query the GitLab API for image metadata during rescan operations.
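The GitLab requirements above can be checked before filling in the dialog. The sketch below is illustrative only: the dictionary keys (`registry_url`, `project`, `extra_information`) and the function name are hypothetical stand-ins for the form fields, not part of any Backend.AI API.

```python
import json
from urllib.parse import urlparse


def check_gitlab_registry(fields: dict[str, str]) -> list[str]:
    """Return a list of problems found in the (hypothetical) form values."""
    problems = []
    if urlparse(fields["registry_url"]).scheme not in ("http", "https"):
        problems.append("Registry URL must include http:// or https://")
    if "/" not in fields["project"].strip("/"):
        problems.append("Project must be a full path such as namespace/project-name")
    try:
        extra = json.loads(fields["extra_information"])
    except json.JSONDecodeError:
        problems.append("Extra Information must be valid JSON")
        return problems
    api_endpoint = extra.get("api_endpoint") if isinstance(extra, dict) else None
    if not api_endpoint:
        problems.append("Extra Information must contain api_endpoint")
    elif api_endpoint.rstrip("/") == fields["registry_url"].rstrip("/"):
        problems.append("api_endpoint must be the GitLab instance URL, not the registry URL")
    return problems


form = {
    "registry_url": "https://registry.gitlab.com",
    "project": "namespace/project-name",
    "extra_information": json.dumps({"api_endpoint": "https://gitlab.com"}),
}
print(check_gitlab_registry(form))  # []
```

An empty list means the values satisfy the rules listed in this section; it does not verify that the access token has the required scopes.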

You can also update the information of an existing registry, except the Registry Name.

After creating a registry and updating the image metadata, users still cannot use the images immediately. You must enable the registry by toggling the Enabled switch in the registry list to allow users to access images from the registry.

Manage resource preset

The predefined resource presets are displayed in the Resource allocation panel when creating a compute session. Superadmins can manage these resource presets.

Go to the Resource Presets tab on the Environments page. You can check the list of currently defined resource presets.

You can set the resources (CPU, RAM, fGPU, etc.) provided by a resource preset by clicking the 'Setting (Gear)' in the Controls panel. The Create or Modify Resource Preset modal shows fields for the resources currently available. Depending on your server's settings, certain resources may not be visible. After setting the resources to the desired values, save the preset and check that it is displayed when creating a compute session. If the available resources are less than the amount defined in a preset, that preset is not shown.

The resource preset dialog includes:

  • Preset Name: A unique name for the preset (only alphanumeric characters, periods, hyphens, and underscores allowed).
  • Resource Group: (Conditional) Associate the preset with a specific resource group.
  • Resource Preset: Dynamic fields for each available resource type (CPU, Memory, GPU, etc.). Memory fields support dynamic unit input (MiB, GiB, TiB, PiB).
  • Shared Memory: The amount of shared memory allocated for the preset. This value must be less than the Memory value.
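The name and shared memory rules in the list above can be expressed as a small validation sketch. It assumes that "alphanumeric" means ASCII letters and digits and that the units are binary multiples of MiB; the function names are hypothetical and are not taken from the WebUI source.

```python
import re

# Assumed pattern: ASCII letters, digits, periods, hyphens, and underscores only.
PRESET_NAME = re.compile(r"[A-Za-z0-9._-]+")
UNIT_TO_MIB = {"MiB": 1, "GiB": 1024, "TiB": 1024**2, "PiB": 1024**3}


def to_mib(value: float, unit: str) -> float:
    """Convert a memory amount in the given binary unit to MiB."""
    return value * UNIT_TO_MIB[unit]


def validate_preset(name: str, memory: tuple[float, str], shared_memory: tuple[float, str]) -> list[str]:
    """Check the preset name and that shared memory is strictly less than memory."""
    errors = []
    if not PRESET_NAME.fullmatch(name):
        errors.append("invalid preset name")
    if to_mib(*shared_memory) >= to_mib(*memory):
        errors.append("shared memory must be less than memory")
    return errors


print(validate_preset("gpu.small_v1", (8, "GiB"), (512, "MiB")))  # []
print(validate_preset("gpu small", (1, "GiB"), (2, "GiB")))
# ['invalid preset name', 'shared memory must be less than memory']
```

Converting both values to a common unit before comparing avoids mistakes such as treating 512 MiB as larger than 8 GiB.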

You can also create a resource preset by clicking the '+ Create Presets' button at the top right of the Resource Presets tab. You cannot create a resource preset with a name that already exists, since the name is the key value that distinguishes each resource preset.

Manage agent nodes

Superadmins can view the list of agent worker nodes currently connected to Backend.AI by visiting the Resources page. You can check each agent node's IP, connection time, actual resources currently in use, etc. The WebUI does not provide functions to manipulate agent nodes.

Query agent nodes

You can also see the exact resource usage of an agent worker node by clicking the note icon in the Control panel.

On the Terminated tab, you can check information about agents that were once connected and then terminated or disconnected. It can be used as a reference for node management. If the list is empty, no disconnection or termination has occurred.

Set schedulable status of agent nodes

You may want to prevent new compute sessions from being scheduled to an Agent service without stopping it. In this case, you can disable the Schedulable status of the Agent. Then, you can block the creation of a new session while preserving the existing sessions on the Agent.

Manage resource group

Agents can be grouped into units called resource (scaling) groups. For example, let's say there are 3 agents with V100 GPUs and 2 agents with P100 GPUs. If you want to expose the two types of GPUs to users separately, you can group the three V100 agents into one resource group and the remaining two P100 agents into another resource group.

Adding a specific agent to a specific resource group is not currently handled in the WebUI; it can be done by editing the agent config file in the installation location and restarting the agent daemon. Resource groups can be managed in the Resource Group tab of the Resources page.

You can edit a resource group by clicking the 'Setting (Gear)' in the Control panel. In the Select scheduler field, you can choose the scheduling method for creating a compute session. Currently, there are four types: FIFO, LIFO, DRF, and FAIR_SHARE. FIFO and LIFO create the first-enqueued or the last-enqueued compute session in the job queue first, respectively. DRF stands for Dominant Resource Fairness, and it aims to distribute resources as fairly as possible among users. FAIR_SHARE allocates compute resources based on historical usage patterns. For more details, refer to the Fair Share Scheduler section. You can deactivate a resource group by turning off Active Status.

The resource group edit dialog contains the following additional fields:

  • Allowed session types: Since users can choose the type of session, the resource group can allow certain types. You should allow at least one session type. The allowed session types are Interactive, Batch, Inference, and System.
  • App Proxy Server Address: Sets the App Proxy (formerly WSProxy) address for the resource group's Agents to use. If you set a URL in this field, App Proxy will relay the traffic of an app like Jupyter directly to the compute session via Agent bypassing Manager (v2 API). By enabling the v2 API, you can lower the Manager's burden when using app services. If a direct connection from App Proxy to the Agent node is not available, leave this field blank to fall back to the v1 API.
  • App Proxy API Token: The API token for authenticating with the App Proxy server.
  • Active: Toggle the active status of the resource group.
  • Public: When enabled, the resource group is visible to all users.
  • Pending timeout: A compute session will be canceled if it stays in PENDING status for longer than the Pending timeout. Set this value to prevent a session from remaining PENDING indefinitely. Set it to zero (0) if you do not want to apply the pending timeout feature.
  • Retries to skip pending session: The number of retries the scheduler tries before skipping a PENDING session. It can be configured to prevent the situation where one PENDING session blocks the scheduling of the subsequent sessions indefinitely (Head-of-line blocking, HOL). If no value is specified, the global value in Etcd will be used (num retries to skip, default three times).
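The effect of "Retries to skip pending session" on head-of-line blocking can be sketched with a toy queue. This is not Backend.AI's scheduler: the `PendingSession` class, the `pick_next` function, and the single GPU count used as the only resource are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingSession:
    name: str
    required_gpus: int
    retries: int = 0  # failed scheduling attempts so far


def pick_next(queue: list[PendingSession], free_gpus: int, max_retries: int) -> Optional[PendingSession]:
    """Return the first session that fits, skipping sessions that used up their retries."""
    for session in queue:
        if session.required_gpus <= free_gpus:
            return session
        if session.retries < max_retries:
            session.retries += 1
            return None  # keep waiting for this session; queue order is preserved
        # Retry budget exhausted: skip this session and consider the next one.
    return None


# A session needing 8 GPUs sits ahead of one needing 1 GPU; only 2 GPUs are free.
queue = [PendingSession("big", 8), PendingSession("small", 1)]
for tick in range(1, 5):
    chosen = pick_next(queue, free_gpus=2, max_retries=3)
    print(tick, chosen.name if chosen else None)
# 1 None
# 2 None
# 3 None
# 4 small
```

Without a retry limit, the "small" session in this toy example would wait for as long as the "big" session stays pending.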

You can create a new resource group by clicking the '+ Create' button. As with other creation options, you cannot create a resource group with a name that already exists, since the name is the key value.

Storages

On the Storages tab, you can see what kinds of mount volumes (usually NFS) exist. Since version 23.03, per-user and per-project quota settings are provided on storage that supports quota management. With this feature, admins can easily manage and monitor the exact amount of storage used by each user and project folder.

To set a quota, first open the Storages tab on the Resources page, and then click 'Setting (Gear)' in the Control column.

:::note Please note that quota settings are only available on storage that provides quota support (e.g. XFS, CephFS, NetApp, Purestorage, etc.). Although you can see storage usage on the quota setting page regardless of the storage type, you cannot configure a quota on storage that does not support quota configuration internally. :::

Quota Setting Panel

The Quota setting page has two panels.

  • Overview panel

    • Usage: Shows the actual usage of the selected storage.
    • Endpoint: Represents the mount point of the selected storage.
    • Backend Type: The type of storage.
    • Capabilities: The features supported by the selected storage.
  • Quota Settings

    • For User: Configure per-user quota settings here.
    • For Project: Configure per-project (project folder) quota settings here.
    • ID: The user or project ID.
    • Hard Limit (GB): The hard limit currently set for the selected quota.
    • Control: Edit the hard limit or delete the quota setting.

Set User Quota

In Backend.AI, there are two types of vfolders: those created by users and those created by admins (project folders). This section shows how to check and configure the per-user quota setting. First, make sure the active tab of the Quota Settings panel is For User. Then, select the user whose quota you want to check or edit. If a quota is already set, the table shows the quota ID, which corresponds to the user's ID, along with the current configuration.

To edit the quota, click the Edit button in the Control column. A small modal opens in which you can configure the quota setting. After entering the desired amount, click the OK button; otherwise, the changes will not be applied.

Set Project Quota

Setting a quota on a project folder is similar to setting a user quota. The difference is that setting a project quota requires one more step: selecting the domain that the project belongs to. You first select the domain, and then select the project. The rest is the same.

Unset Quota

You can also unset a quota. Note that after a quota setting is removed, the quota automatically follows the user or project default quota, which cannot be set in the WebUI. To change the default quota setting, you may need to access the admin-only page. When you click the Unset button in the Control column, a small snackbar message appears and asks you to confirm that you really want to delete the current quota setting. If you click the OK button in the snackbar message, the quota setting is deleted and the quota is reset to the corresponding default, which depends on the quota type (user / project).

:::note If there is no per-user or per-project quota configuration, the corresponding values in the user or project resource policy are used as defaults. For example, if no hard limit value is set for a quota, the max_vfolder_size value in the resource policy is used as the default value. :::
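The fallback described in the note can be written as a one-step lookup. In this sketch, `effective_hard_limit` is a hypothetical helper, and both arguments are assumed to use the same unit; only the `max_vfolder_size` policy field name comes from the note above.

```python
from typing import Optional


def effective_hard_limit(quota_hard_limit: Optional[int], policy_max_vfolder_size: int) -> int:
    """Use the explicit quota if one is set; otherwise fall back to the resource policy."""
    if quota_hard_limit is not None:
        return quota_hard_limit
    return policy_max_vfolder_size


print(effective_hard_limit(None, policy_max_vfolder_size=100))  # 100
print(effective_hard_limit(50, policy_max_vfolder_size=100))    # 50
```

Unsetting a quota corresponds to the first call: with no explicit hard limit, the policy value applies.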

Download session lists

:::note This feature is currently not available on the default Session page. To use this feature, please enable 'Classic Session list page' option in the 'Switch back to the Classic UI' section on the User Setting page. For more details, please refer to Backend.AI User Settings section. :::

Admins have an additional feature on the Session page. On the right side of the FINISHED tab, there is a menu marked with '...'. When you click this menu, an 'export CSV' sub-menu item appears.

If you click this menu item, you can download information about the compute sessions created so far in CSV format. When the dialog opens, enter an appropriate file name (if necessary) and click the EXPORT button to get the CSV file. Note that a file name can have up to 255 characters.

System settings

On the Configurations page, you can see the main settings of the Backend.AI server. Currently, it provides several controls for viewing and changing settings.

You can change the image auto-install and update rule by selecting one of Digest, Tag, or None. A digest is a kind of checksum for the image; it verifies the integrity of the image and also makes downloads more efficient by reusing duplicated layers. The Tag option is intended for development only, since it does not guarantee the integrity of the image.

:::note Don't change rule selection unless you completely understand the meaning of each rule. :::
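The difference between the Digest and Tag rules comes down to what each one identifies. The sketch below uses a SHA-256 hash over placeholder bytes to stand in for an image digest; the byte strings, the `python:latest` tag, and the `digest` helper are made up for illustration and do not reproduce how a registry computes manifest digests.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content-derived identifier: changes whenever the content changes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


old_image = b"layer-data-v1"  # placeholder for image content
new_image = b"layer-data-v2"

# A tag is only a mutable label that can be moved to different content.
tags = {"python:latest": digest(old_image)}
tags["python:latest"] = digest(new_image)  # same tag, different image

print(digest(old_image) == digest(old_image))  # True
print(digest(old_image) == digest(new_image))  # False
print(tags["python:latest"] == digest(old_image))  # False
```

Because the tag alone cannot reveal that the content behind it changed, comparing digests is the more reliable way to decide whether an installed image is up to date.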

The Configurations page also displays the status of plugins and enterprise features:

Plugins:

  • Open Source CUDA GPU support: Status of CUDA GPU support.
  • ROCm GPU support: Status of ROCm GPU support.

Enterprise Features:

  • Fractional GPU: Fractional GPU (fGPU) virtualization for sharing GPUs across sessions.

Backend.AI supports a wide range of AI accelerators across multiple vendors:

  • NVIDIA
    • Spark (GB10)
    • Blackwell (B300, B200, RTX Pro 6000, etc.)
    • Hopper (H200, H100 NVL, etc.)
    • Grace Superchip (GB300, GB200, GH200, etc.)
    • Turing (Titan RTX, RTX 8000, T4)
    • Ampere (A100, A40, A10, etc.)
    • Ada Lovelace (L40S, L4)
    • Jetson (TX, Xavier, Orin, Thor, etc.)
  • Intel
    • Gaudi 3
    • Gaudi 2
    • Gaudi 1
    • Arc
  • AMD
    • Instinct MI Series (including MI300X)
    • MI300A
    • MI250
  • Rebellions
    • ATOM Max
    • ATOM+
    • REBEL
  • FuriosaAI
    • RNGD
  • Tenstorrent
    • Wormhole n150s
    • Wormhole n300s
  • Google
    • TPU v7 (Ironwood)
    • Coral TPU v5p
    • Coral TPU v5e
    • TPU v4
  • Graphcore
    • C600 IPU
    • Bow IPU
  • HyperAccel
    • LPU
  • Groq
    • LPU
  • Cerebras
    • WSE-3
  • SambaNova
    • SN40L

When a user launches a multi-node cluster session, a feature introduced in version 20.09, Backend.AI dynamically creates an overlay network to support private inter-node communication. Admins can set the Maximum Transmission Unit (MTU) value for the overlay network if they are certain that the value will improve network speed.

:::note For more information about Backend.AI Cluster session, please refer to Backend.AI Cluster Compute Session section. :::

You can edit the configuration per job scheduler by clicking the Scheduler's config button. The values in the scheduler setting are the defaults to use when there is no scheduler setting in each resource group. If there is a resource group-specific setting, this value will be ignored.

Currently supported scheduling methods include FIFO, LIFO, and DRF. Each method is the same as the corresponding scheduling method described above. Scheduler options include session creation retries, which is the number of times session creation is retried if it fails. If the session cannot be created within that number of retries, the request is ignored and Backend.AI processes the next request. Currently, changes are only possible when the scheduler is FIFO.

:::note We will continue to add a broader range of setting controls. :::

:::note System settings are default settings. If a resource group has its own value, that value overrides the one configured in system settings. :::

Server management

Go to the Maintenance page and you will see some buttons to manage the server.

  • RECALCULATE USAGE: Occasionally, due to unstable network connections or container management problems in the Docker daemon, the resources that Backend.AI records as occupied may not match the resources actually used by containers. In this case, click the RECALCULATE USAGE button to manually correct the resource occupancy.
  • RESCAN IMAGES: Update image meta information from all registered Docker registries. It can be used when a new image is pushed to a Backend.AI-connected docker registry.

:::note We will continue to add other settings needed for management, such as removing unused images or registering periodic maintenance schedules. :::

Detailed Information

On the Information page, you can see detailed information and the status of each feature. To see the Manager version and API version, check the Core panel. To see whether each Backend.AI component is compatible, check the Component panel.

:::note This page is only for showing current information. :::

RBAC Management

RBAC (Role-Based Access Control) Management allows superadmins to define roles with fine-grained permissions and assign them to users. You can control which actions specific users are allowed to perform on various resources throughout the Backend.AI system.

:::note RBAC Management is only available to superadmins and requires Backend.AI Manager version 26.4.0 or later. :::

For detailed information about managing roles, permissions, and user assignments, refer to the dedicated RBAC Management page.