sciarrilli

Issue #, if available:

Description of changes:

Added three notebooks for the fine-tuning section of the workshop. Also added a scripts folder with fine-tuning and eval scripts.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@ravees ravees self-requested a review October 9, 2025 20:23
@ravees
Contributor

ravees commented Oct 9, 2025

@sciarrilli - Great content. See initial feedback below. I'll provide more as I review:

  1. Can you use the same pip install pattern as the other notebooks (for example, the agents-intro notebook): !pip install --upgrade -r requirements.txt -q
  2. To load the SageMaker managed MLflow tracking server ARN, read the TRACKING_SERVER_ARN variable from the stored value.
  3. Should we add instructions for getting the hf_token, especially for participants new to HF?
  4. Is there a reason to check the notebook cell output into the repo?
  5. Can you move the variables you set into separate cells where possible and add a comment or markdown note to explain them to the reader? For example: pytorch_image = '763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.8.0-gpu-py312-cu129-ubuntu22.04-sagemaker'
  6. Remove the hardcoded values such as the S3 bucket URL and tracking server ARN, for example: 'MLFLOW_TRACKING_URI', 'arn:aws:sagemaker:us-east-1:198346569064:mlflow-tracking-server/vlm-finetuning-server', "adapter_path": 's3://sagemaker-us-east-1-198346569064/qwen3-06b-fine-tuned/'
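
For item 6, here is a minimal sketch of one way to keep account-specific values out of the notebook by reading them from the environment. The variable names AWS_REGION, WORKSHOP_BUCKET, and ADAPTER_PREFIX, and the helper load_workshop_config, are hypothetical and only for illustration; the workshop may already have its own mechanism for passing these values.

```python
import os

# Hypothetical setting names; replace with whatever the workshop environment provides.
REQUIRED_VARS = ("AWS_REGION", "WORKSHOP_BUCKET", "TRACKING_SERVER_ARN")


def load_workshop_config(env=None):
    """Collect deployment-specific values from the environment instead of hardcoding them."""
    env = os.environ if env is None else env

    # Fail early with a clear message if a required setting is absent or empty.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError(f"Missing required settings: {', '.join(missing)}")

    bucket = env["WORKSHOP_BUCKET"]
    # Hypothetical optional override so the adapter prefix is not pinned to one model.
    prefix = env.get("ADAPTER_PREFIX", "qwen3-06b-fine-tuned").strip("/")

    return {
        "region": env["AWS_REGION"],
        "MLFLOW_TRACKING_URI": env["TRACKING_SERVER_ARN"],
        "adapter_path": f"s3://{bucket}/{prefix}/",
    }
```

A notebook cell could then call load_workshop_config() once and pass the resulting values to the training and eval scripts, so no account ID or bucket name is checked in.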

@sciarrilli
Author

@ravees

  1. done
  2. I don't understand what you are asking here?
  3. let me double check this. we might not need to set the hf_token
  4. done
  5. done
  6. Let me know the MLflow app name in the workshop studio and I will retrieve the MLflow server ARN. I need to rewrite the eval workshop to pull down the tar.gz from the last training job.
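
For item 6, a rough sketch of how both lookups might work with the boto3 SageMaker client, assuming the tracking server name is known and that "last training job" means the most recently created completed job. The helper names are hypothetical, and the client is passed in so the functions can be exercised without AWS credentials.

```python
def get_tracking_server_arn(sm_client, server_name):
    """Look up the MLflow tracking server ARN from its name."""
    response = sm_client.describe_mlflow_tracking_server(
        TrackingServerName=server_name
    )
    return response["TrackingServerArn"]


def get_latest_model_artifact(sm_client, name_contains=None):
    """Return the S3 URI of the model.tar.gz from the newest completed training job."""
    kwargs = {
        "SortBy": "CreationTime",
        "SortOrder": "Descending",
        "StatusEquals": "Completed",
        "MaxResults": 1,
    }
    if name_contains:
        # Optional filter so unrelated jobs in the same account are skipped.
        kwargs["NameContains"] = name_contains

    summaries = sm_client.list_training_jobs(**kwargs)["TrainingJobSummaries"]
    if not summaries:
        raise LookupError("No completed training jobs found")

    job_name = summaries[0]["TrainingJobName"]
    job = sm_client.describe_training_job(TrainingJobName=job_name)
    return job["ModelArtifacts"]["S3ModelArtifacts"]


# Example use in a notebook (requires boto3 and AWS credentials):
#   import boto3
#   sm = boto3.client("sagemaker")
#   arn = get_tracking_server_arn(sm, "<tracking-server-name>")
#   artifact_uri = get_latest_model_artifact(sm)
```

The tracking server name is left as a placeholder here since it depends on the workshop studio setup.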
