Skip to content

[RunPod] Use zone to provision in a specific data center ID #5166

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Kovbo
Copy link
Contributor

@Kovbo Kovbo commented Apr 10, 2025

RunPod has a data_center_id property that enables deployment in a specific region.

Currently, SkyPilot’s region property uses RunPod’s country_code property, which only allows selecting the deployment country. An additional zone parameter to deploy in a specific data center.

Copy link
Collaborator

@Michaelvll Michaelvll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for adding this @Kovbo! Left a question below. : )

@@ -305,7 +305,7 @@ def launch(cluster_name: str, node_type: str, instance_type: str, region: str,
'min_vcpu_count': 4 * gpu_quantity,
'min_memory_in_gb': gpu_specs['memoryInGb'] * gpu_quantity,
'gpu_count': gpu_quantity,
'country_code': region,
'data_center_id': region,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are changing this, do we need to update SkyPilot runpod catalog below as well to make the region points to data_center_id?
https://github.com/skypilot-org/skypilot-catalog/blob/master/catalogs/v6/runpod/vms.csv

Copy link
Contributor Author

@Kovbo Kovbo Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we need to update the region column and list region codes (datacenter codes) instead of countries.

Will contact RunPod to be sure, but I think the price in all regions is the same, as well as the available GPUs.

Should I change it?
Is there a more efficient way to structure the list, rather than displaying 20+ rows with identical information for different regions? For example, could we consolidate them into a single row with the regions comma-separated like here?

Also, the price seems to be outdated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be awesome if you could help update the catalog as well, so that we can get this PR in. Either consolidating them or not should be fine. : )

If you would like conslidate them, you can update the way runpod catalog is loaded here.

Actually, I am wondering whether the data center id is more similar to the availability zone concept in other clouds? Should we keep the region code and add another zone column for the data center id?

For the outdated price, yes, a automatic catalog fetcher should be implemented like the other clouds did, so that the pricing / offerings could be updated automatically: https://github.com/skypilot-org/skypilot/tree/master/sky/clouds/service_catalog/data_fetchers
The fetcher could be done later if it takes much more effort : )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I can update the catalog.

This is a good point about the zone! I don't have a strong opinion, but it may be a good idea to keep the current country as a region to keep it backward compatible and speed up provisioning if no region/zone is specified.

I assume that setting a zone parameter will automatically pass it to config.provider_config['availability_zone'], and we can access it in the run_instances function?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! You may edit the following function to return an additional zone information in the dict with key 'availability_zone':

return {
'instance_type': instance_type,
'custom_resources': custom_resources,
'region': region.name,
'image_id': image_id,
'use_spot': use_spot,
'bid_per_gpu': str(hourly_cost),
'docker_username_for_runpod': docker_username_for_runpod,
}

And add the same key to the runpod specific configuration template:

region: "{{region}}"

similar to the other cloud:
availability_zone: {{zones}}

After those, in the run_instances function, you will be able to find the config in the config.provider_config['availability_zone'] : )

Bohdan Kovalevskyi added 2 commits April 11, 2025 17:28
@Kovbo Kovbo changed the title Use data center ID as the region identifier instead of the country code [RunPod] [RunPod] Use data center ID as the region identifier instead of the country code Apr 12, 2025
@Kovbo Kovbo changed the title [RunPod] Use data center ID as the region identifier instead of the country code [RunPod] Use zone to provision in a specific data center ID Apr 12, 2025
@Kovbo Kovbo marked this pull request as draft April 12, 2025 00:44
@Kovbo Kovbo marked this pull request as ready for review April 23, 2025 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants