docs: add MLflow integration documentation to fine-tuning examples #1
Open
briangallagher wants to merge 118 commits into
… Hat OpenShift AI
…anite/granite-3.3-2b-instruct` model on the Alpaca dataset. Update training function and arguments for better demonstration, and improve markdown documentation for clarity.
Fixed MD032 errors by ensuring lists are surrounded by blank lines.
Signed-off-by: Brian Gallagher <briangal@gmail.com>
Signed-off-by: Brian Gallagher <briangal@gmail.com>
Signed-off-by: Brian Gallagher <briangal@gmail.com>
….2-finetuning-examples feat: add rhoai 3.2 fine tuning examples
…validation
- Fix PVC mount path: notebook mounts shared PVC at /opt/app-root/src/shared
- Add NOTEBOOK_SHARED_PATH and TRAINING_POD_PATH for correct path handling
- Use PreTrainedTokenizerFast to bypass AutoTokenizer hub validation
- Load model config explicitly to avoid hub validation issues
- Set HF_HUB_OFFLINE and TRANSFORMERS_OFFLINE env vars

- Add detailed model description (Qwen 2.5 1.5B Instruct)
- Add dataset documentation (Stanford Alpaca format and structure)
- Add training configuration tables with parameter explanations
- Add progress tracking architecture explanation
- Add PVC mounting and checkpoint structure documentation
- Add summary with what you accomplished and next steps
- Add quick reference tables for TransformersTrainer parameters

- Add comprehensive README with setup instructions
- Add images for workbench setup walkthrough
- Include model/dataset documentation
- Add validation configuration
- Add TransformersTrainer quick reference

- Fix syntax error: add missing if statement for final_path check
- Remove trailing whitespace from blank lines
- Remove unnecessary 'r' mode argument from open() calls

…nctionality
- Replace "Monitor Training Progress" section with "Follow Job Logs" for better context
- Stream job logs in real-time during training
- Add job status retrieval and detailed progress metrics display
- Improve error handling and namespace retrieval logic
- Clean up unnecessary imports and comments

… and clarity
- Update PVC mount paths to reflect SDK's automatic mounting at /mnt/kubeflow-checkpoints
- Clarify comments regarding model/data paths and checkpointing
- Improve documentation on checkpoint configuration and training arguments
- Ensure consistent output messages for checkpoints in both notebook and training pods
…pynb Co-authored-by: Rob Bell <robell@redhat.com>
…-pre-commit Fix: Test case for no stored outputs and consistent versions for code-quality
* doc-updates-model-serve-1
* doc-update minor fix module >step
* doc-update fix typos
* doc-updates change prereq to previous modules
* doc-update markdown in notebooks
* doc-update SME review comments
…e/model_serve_1
Signed-off-by: Saad Zaher <szaher@redhat.com>
…ng-3.4GA-examples Update for 3.4 and refactor to simplify
* Feature: AutoML time series forecasting tutorial (electricity sample) (red-hat-data-services#62)
* Replace main branch with rhoai-3.4 in autorag .md files
* docs: Updated Workbech section and documentation links (red-hat-data-services#69)
  Assisted By Cursor
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* automl time-series tutorial draft
  Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
  Assisted-by: Cursor
* Update time_series_forecasting_tutorial.md
  Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
  Assisted-by: Cursor
* promo known_covariates_names example added
  Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
  Assisted-by: Cursor
* replace the git urls from autox to main
  Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
  Assisted-by: Cursor
  # Conflicts:
  # examples/autorag/readme.md
* clean up the dev preview status mentions
  Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
  Assisted-by: Cursor
* adding time-series pipeline to readme
  Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
  Assisted-by: Cursor
* Change branch reference from main -> rhoai-3.4.
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* Improved formatting, and notebook section
  Assisted by Cursor
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* docs: Updated Workbech section and documentation links
  Assisted By Cursor
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* chore: AutoML examples update how to get artifacts
  Assisted by Cursor
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* chore: Updated Model registry anbd deployment steps
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
---------
Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Co-authored-by: Michal Steczko <msteczko@redhat.com>
Co-authored-by: Dorota Laczak <dlaczak@redhat.com>
* updated images
* updated image
  Signed-off-by: ZabinskiMichal <mzabinsk@redhat.com>
* updated images
* updated pipeline (red-hat-data-services#72)
* Custom column names in TS scenario tutorial update (red-hat-data-services#73)
* updated tutorial
* deleted unnecessary instrucion part
* last small change
* Update documentation with the latest UI changes (red-hat-data-services#75)
  Signed-off-by: MichalSteczko <msteczko@redhat.com>
* chore: Removed pipeline.yaml file for AutoRAG (red-hat-data-services#79)
  Assisted by Cursor
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* docs(AutoML): Updated AutoML tutorials with UI path (red-hat-data-services#77)
  updatedTabular and TimeSeries tutorials to new UI flow
* chore: Fixed for issues found by Markdownlinter
  Assisted by Claude Code
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* chore: Fixed issues found by CodeRabbit in PR review.
  Assisted by Claude Code
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
* chore: Updated KServe AutoGluon Server repo link
  Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
---------
Signed-off-by: Dorota Laczak <dlaczak@redhat.com>
Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Signed-off-by: ZabinskiMichal <mzabinsk@redhat.com>
Signed-off-by: MichalSteczko <msteczko@redhat.com>
Co-authored-by: Lukasz Cmielowski <lcmielow@redhat.com>
Co-authored-by: Michal Steczko <msteczko@redhat.com>
Co-authored-by: ZabinskiMichal <mzabinsk@redhat.com>
Co-authored-by: Michał Żabiński <85452231+ZabinskiMichal@users.noreply.github.com>
Signed-off-by: Saad Zaher <szaher@redhat.com>
8b53222 to 704462c
Signed-off-by: Saad Zaher <szaher@redhat.com>
…ed-hat-ai-examples into ray-rag-pipeline
Signed-off-by: Saad Zaher <szaher@redhat.com>
…s/ray-rag-pipeline add rag example using Ray Data, kfp, docling
Co-authored-by: Cursor <cursoragent@cursor.com>
…59077 RHOAIENG-59077: Add Ray Data & Docling RAG example
e34a451 to 5b069d1
- Add shared MLflow guide (examples/fine-tuning/mlflow.md) covering enabling the operator, creating the CR, and viewing experiments
- Link to the shared guide from lora, osft, and sft READMEs
- Add screenshots showing the Experiments page and run metrics
- Note that the KB article link requires Red Hat Customer Portal login

Co-authored-by: Cursor <cursoragent@cursor.com>
55ebb20 to cd1c7ed
Summary
The interactive notebooks already contain the MLFLOW_EXPERIMENT_NAME environment variable cells; this PR adds the missing README documentation so users know how to take advantage of the feature.
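For reviewers who have not opened the notebooks, the following is a minimal sketch of what such an environment-variable cell might look like. The notebook cells themselves are not shown in this PR description, so the helper `resolve_experiment_name`, the fallback name, and the example value `lora-granite-demo` are all hypothetical; only the `MLFLOW_EXPERIMENT_NAME` variable name comes from the summary above.

```python
import os


def resolve_experiment_name(default: str = "fine-tuning-example") -> str:
    """Return the experiment name from MLFLOW_EXPERIMENT_NAME, or a fallback.

    Hypothetical helper for illustration; it is not taken from the notebooks.
    """
    name = os.environ.get("MLFLOW_EXPERIMENT_NAME", "").strip()
    return name or default


# Assumed usage: set the variable in a notebook cell before training starts,
# so that any MLflow-aware code reading the environment picks it up.
os.environ["MLFLOW_EXPERIMENT_NAME"] = "lora-granite-demo"  # hypothetical name

print(resolve_experiment_name())  # prints "lora-granite-demo"
```

Reading the value through a small helper with a fallback keeps a run from failing when the variable is unset or blank; whether the notebooks do this is an assumption.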
Test plan
Made with Cursor