odh-1736 adding customize models content #1048
Conversation
Walkthrough
Adds multiple AsciiDoc assembly and module documents for a model customization workflow: 4 new assemblies and ~12 new modules covering environment setup, Python index/mirroring, container image builds, data preparation (including synthetic data), training, examples, and support guidance. Integrates these into the main customization guide.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 6
🧹 Nitpick comments (4)
modules/about-the-python-index.adoc (1)
12-12: Minor: Inconsistent capitalization in table header. Line 12 uses "UBi9" which appears inconsistent with standard naming conventions. Consider standardizing to "UBI9" or "UBI 9" for clarity.
modules/estimate-memory-usage.adoc (1)
8-9: Consider converting file references to actual links. Lines 8-9 reference code files as inline code but don't provide clickable links. Consider whether these should be external links to the repository or if they serve as documentation references only.
modules/import-example-notebooks.adoc (1)
15-34: Inconsistent URL formatting in table. Line 16 uses proper AsciiDoc `link:` syntax, but lines 21, 26, and 31 use backticks without `link:` syntax for repository URLs. For consistency and to ensure clickable links in rendered documentation, apply the same formatting pattern to all repository URLs. Consider whether all should use the `link:` syntax or if backticks are intentional. Apply this diff to standardize URL formatting:

-|`https://github.com/Red-Hat-AI-Innovation-Team/sdg_hub.git`
+|link:https://github.com/Red-Hat-AI-Innovation-Team/sdg_hub.git[https://github.com/Red-Hat-AI-Innovation-Team/sdg_hub.git]
-|`https://github.com/Red-Hat-AI-Innovation-Team/training_hub.git`
+|link:https://github.com/Red-Hat-AI-Innovation-Team/training_hub.git[https://github.com/Red-Hat-AI-Innovation-Team/training_hub.git]
-|`https://github.com/red-hat-data-services/red-hat-ai-examples.git`
+|link:https://github.com/red-hat-data-services/red-hat-ai-examples.git[https://github.com/red-hat-data-services/red-hat-ai-examples.git]

modules/compare-the-performance-of-osft-and-sft.adoc (1)
24-26: Improve readability of performance comparison explanations. Lines 24-26 contain dense, complex technical information with overly long sentences (each exceeds 200 characters). Break these into shorter, more digestible segments with better structural formatting. Consider using separate bullet points or definition lists for each metric.
For example, for the memory scaling explanation, separate the concept definition from the formula and the practical implication:
-* *Memory scaling:* OSFT memory scales linearly with the unfreeze rank ratio (URR) which is a hyperparameter for OSFT that is a value between 0 and 1 representing the fraction of the matrix rank that is unfrozen and updated during fine-tuning. A rough comparison can be expressed as OSFT Memory ~ 3r times SFT Memory where r is the URR unfreeze rank ratio — the fraction of the matrix being fine-tuned. At URR = 1/3, OSFT and SFT have similar memory usage. But in most post-training setups, URR values below 1/3 are sufficient for learning new tasks, making OSFT notably lighter in memory.
+* *Memory scaling:* OSFT memory scales linearly with the unfreeze rank ratio (URR), a hyperparameter representing the fraction of the matrix rank that is unfrozen and updated during fine-tuning. URR is a value between 0 and 1.
+
+The rough memory comparison is: OSFT Memory ≈ 3r × SFT Memory, where r is the URR value.
+
+At URR = 1/3, OSFT and SFT have similar memory usage. However, in most post-training setups, URR values below 1/3 are sufficient for learning new tasks, making OSFT notably lighter in memory.

Similarly, restructure the training time explanation for clarity.
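For reference, a brief worked example of the scaling relationship described above; the SFT memory figure and the URR value are assumed, illustrative numbers, not values from the documentation:

```latex
% OSFT memory relative to SFT, with r the unfreeze rank ratio (URR)
M_{\mathrm{OSFT}} \approx 3r \cdot M_{\mathrm{SFT}}
% Illustrative (assumed) values: r = 0.25 and an SFT peak memory of 80 GB
M_{\mathrm{OSFT}} \approx 3 \cdot 0.25 \cdot 80\,\mathrm{GB} = 60\,\mathrm{GB}
% At r = 1/3 the two methods use roughly the same memory
```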
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- assemblies/generate-synthetic-data-to-augment-real-data.adoc (1 hunks)
- assemblies/prepare-your-data-for-ai-consumption.adoc (1 hunks)
- assemblies/set-up-your-working-environment.adoc (1 hunks)
- assemblies/train-the-model-by-using-your-prepared-data.adoc (1 hunks)
- customize-models-to-build-gen-ai-applications.adoc (1 hunks)
- modules/about-the-python-index.adoc (1 hunks)
- modules/build-a-custom-container-image.adoc (1 hunks)
- modules/clone-an-example-git-repository.adoc (1 hunks)
- modules/compare-the-performance-of-osft-and-sft.adoc (1 hunks)
- modules/end-to-end-model-customization-workflow.adoc (1 hunks)
- modules/estimate-memory-usage.adoc (1 hunks)
- modules/explore-the-sdg-hub-examples.adoc (1 hunks)
- modules/explore-the-training-hub-examples.adoc (1 hunks)
- modules/import-example-notebooks.adoc (1 hunks)
- modules/mirror-the-python-index.adoc (1 hunks)
- modules/overview-of-the-model-customization-workflow.adoc (1 hunks)
- modules/support-philosophy.adoc (1 hunks)
🔇 Additional comments (12)
assemblies/prepare-your-data-for-ai-consumption.adoc (1)
1-20: Assembly structure looks good. The module declaration, context handling, conditional blocks, and includes follow the proper AsciiDoc assembly patterns. The leveloffset=+1 for nested modules is correct.
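For readers unfamiliar with the pattern being praised here, a minimal sketch of a typical assembly skeleton follows; the included module name is hypothetical and the exact attribute names may differ from what this repository actually uses:

```asciidoc
ifdef::context[:parent-context: {context}]

[id='prepare-your-data-for-ai-consumption_{context}']
= Prepare your data for AI consumption
:context: prepare-your-data-for-ai-consumption

// Nested modules are pulled in one heading level down
include::modules/example-module.adoc[leveloffset=+1]

// Restore the caller's context when the assembly ends
ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
```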
modules/mirror-the-python-index.adoc (1)
12-12: Verify test URL is intentional. Line 12 uses `cuda-ubi9-test` in the URL. Confirm whether this test endpoint is correct for production documentation, or if it should match the production endpoint (`cuda-ubi9`) referenced in modules/about-the-python-index.adoc.
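To illustrate why the endpoint name matters, here is a hypothetical sketch of how a workbench might be pointed at the production index path; the host placeholder and the `/simple/` suffix are assumptions, not the repository's actual values:

```asciidoc
[source,bash]
----
# Hypothetical example: configure pip to use the production index path
# rather than the test path
pip config set global.index-url https://<python-index-host>/cuda-ubi9/simple/
----
```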
modules/support-philosophy.adoc (1)
1-30: Well-structured support philosophy document. The document clearly articulates support scope, benefits, and encourages appropriate community engagement. Proper use of AsciiDoc formatting with clear sections and bullet points. All attribute references appear consistent with documentation standards.
modules/overview-of-the-model-customization-workflow.adoc (1)
12-15: Clarify commented-out xref and add missing cross-reference. Line 13 has a commented-out xref for "prepare-your-data-for-ai-consumption". Either activate this xref or explain why it's commented out. Additionally, line 15 mentions "Automate data processing steps by building AI pipelines" but lacks a cross-reference—consider whether this should link to relevant documentation.
Verify that:
- The xref at line 13 should be active (referencing prepare-your-data-for-ai-consumption)
- The text at line 15 should have a corresponding xref or link
- All xref IDs exist in the referenced assemblies/modules
assemblies/generate-synthetic-data-to-augment-real-data.adoc (1)
1-15: Assembly structure is well-formed. The module declaration, context handling, and include directive follow proper AsciiDoc assembly patterns. The description clearly explains SDG Hub functionality, and the leveloffset is correctly set for nested content inclusion.
modules/end-to-end-model-customization-workflow.adoc (1)
1-8: LGTM. Appropriate introductory concept module with well-formed external links.

modules/clone-an-example-git-repository.adoc (1)
11-11: Verify Table 2 reference. Line 11 references "Table 2" but the table definition is not visible in this module. Confirm this table exists in the source document or clarify the intent.

modules/explore-the-sdg-hub-examples.adoc (1)
1-45: LGTM. Well-structured procedure module with clear prerequisites, comprehensive steps, and proper cross-references. Code examples and resource links are properly formatted.

assemblies/set-up-your-working-environment.adoc (1)
1-19: LGTM. Assembly structure is correct with proper context handling and module includes. The intentional leveloffset=+2 on line 16 establishes the correct document hierarchy.

modules/build-a-custom-container-image.adoc (1)
6-6: Verify Table 1 reference. Line 6 references "Table 1" but the table definition is not visible in this module. Confirm this table exists in the source document or clarify the intent.

customize-models-to-build-gen-ai-applications.adoc (1)
25-26: Clarify data processing assembly placeholder. Lines 25-26 contain a commented-out include for a data processing assembly. Confirm whether this is intentionally deferred or if a reference should be added now. The comment should be removed in the final version if no assembly is planned for this location.

modules/compare-the-performance-of-osft-and-sft.adoc (1)
20-20: Clarify the reference to the examples/docs directory. The reference to "The `examples/docs` directory" is vague. Please provide either the full path or a more specific reference to improve findability. Consider linking to specific documentation modules if these are available.
The branch was updated from 6c501da to d5df8d5.
Actionable comments posted: 0
♻️ Duplicate comments (1)
modules/build-a-custom-container-image.adoc (1)
33-53: Fix incomplete and incorrect placeholders in code example to make it functional. The container building example contains multiple incomplete placeholders and syntax errors that prevent users from following it successfully:
- Line 37: `podman pull <link to registry.redhat>` is a non-functional placeholder. Must specify the actual registry URL.
- Line 41: `FROM <image name>` needs a concrete base image reference.
- Line 52: `podman build <new_image_name>` is incomplete—missing the required `-t` flag for tagging and `.` for build context.
- Line 51 comment: The `#Build the custom image` comment uses Dockerfile syntax but appears after the Dockerfile block, creating confusion about command flow.

Apply this diff to provide complete, functional example code:
[source, bash]
----
-podman pull <link to registry.redhat>
+podman pull registry.redhat.io/ubi9/python-39:latest # Log in via `podman login registry.redhat.io`
+
+# Create a Dockerfile
+cat > Dockerfile << 'EOF'
-FROM <image name>
+FROM registry.redhat.io/ubi9/python-39:latest

# Install software
# Install Python, pip, and then the docling library
RUN pip3 install docling

# Define the default command to run when the container starts
CMD ["docling", "--help"]
-
-#Build the custom image
-podman build <new_image_name>
+EOF
+
+# Build the custom image
+podman build -t my-custom-docling-image:latest .
----

This ensures users have:
- A working pull command with the correct Red Hat registry
- A complete, valid Dockerfile with proper FROM and RUN instructions
- A complete podman build command with tag (`-t`) and build context (`.`)
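As a possible follow-up once the diff above is applied, the module could also show how to verify the build. This sketch simply reuses the image tag suggested in the diff; the verification step itself is an addition, not part of the proposed module content:

```asciidoc
[source,bash]
----
# Verify that the custom image was built and runs its default command
podman images my-custom-docling-image
podman run --rm my-custom-docling-image:latest
----
```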
🧹 Nitpick comments (1)
modules/compare-the-performance-of-osft-and-sft.adoc (1)
24-28: Slightly awkward phrasing — optional improvement. Line 24 contains redundant attribution: "URR which is a hyperparameter for OSFT that is a value between 0 and 1." The phrase combines "which is a hyperparameter" with "that is a value," which is slightly repetitive.
Consider simplifying to: "The unfreeze rank ratio (URR), a hyperparameter for OSFT with a value between 0 and 1, represents..."
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
(Same 17 files as listed in the previous review.)
🚧 Files skipped from review as they are similar to previous changes (12)
- modules/overview-of-the-model-customization-workflow.adoc
- modules/estimate-memory-usage.adoc
- assemblies/prepare-your-data-for-ai-consumption.adoc
- customize-models-to-build-gen-ai-applications.adoc
- modules/explore-the-sdg-hub-examples.adoc
- modules/clone-an-example-git-repository.adoc
- modules/mirror-the-python-index.adoc
- modules/support-philosophy.adoc
- modules/import-example-notebooks.adoc
- assemblies/set-up-your-working-environment.adoc
- modules/about-the-python-index.adoc
- assemblies/generate-synthetic-data-to-augment-real-data.adoc
🔇 Additional comments (7)
modules/build-a-custom-container-image.adoc (1)
1-31: LGTM. Module structure and basic examples are clear and well-formatted. The metadata, introduction, and simple pip install examples follow AsciiDoc conventions and provide helpful context on pre-configured Python index and system trust store usage. The three concrete library examples (docling, sdg-hub, training-hub) are clear and actionable.
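For context, the three library examples mentioned above would look roughly like the following in the module; the package names are taken from the review comment, and whether the index publishes them under exactly these names is an assumption:

```asciidoc
[source,bash]
----
# Install the document-conversion, synthetic-data, and training libraries
pip install docling
pip install sdg-hub
pip install training-hub
----
```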
modules/end-to-end-model-customization-workflow.adoc (1)
1-8: LGTM! The module is well-structured with clear reference to the Knowledge Tuning example. The external links are properly formatted with HTTPS.
assemblies/train-the-model-by-using-your-prepared-data.adoc (2)
20-24: Previous issue resolved. The duplicate word "the the" mentioned in past review comments has been corrected. Line 24 now correctly reads with a single "the" as expected.
14-18: Includes are properly structured. The module includes at lines 14, 16, and 18 use consistent leveloffset syntax and reference valid module paths.
modules/explore-the-training-hub-examples.adoc (2)
16-16: Previous xref issue resolved. The broken cross-reference that used section number "2.4.1" has been corrected to use the proper ID format `xref:clone-an-example-git-repository[...]`.
32-37: Previous security best practice applied. The insecure HTTP link to red.ht has been updated to use HTTPS (line 36), aligning with documentation security best practices.
modules/compare-the-performance-of-osft-and-sft.adoc (1)
1-32: Overall structure and technical content look good. The module clearly explains both algorithms and provides a meaningful performance comparison. The mathematical relationships (memory scaling, training time trade-offs) and continual learning benefits are well-articulated for the documentation audience.
Context:
* Red Hat AI documentation:
** link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2.25/html/working_with_distributed_workloads/running-kfto-based-distributed-training-workloads_distributed-workloads[Chapter 4. Running Training Operator-based distributed training workloads] in the *Working with distributed workloads* guide
Suggested change:
** link:{rhoaidocshome}{default-format-url}/working_with_distributed_workloads/running-kfto-based-distributed-training-workloads_distributed-workloads[Chapter 4. Running Training Operator-based distributed training workloads] in the *Working with distributed workloads* guide
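For clarity, a hypothetical sketch of how the two attributes used in this suggestion might be defined; the split between them is an assumption reconstructed from the hard-coded URL quoted above, not the repository's actual definitions:

```asciidoc
// Hypothetical attribute definitions (values are assumptions)
:rhoaidocshome: https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed
:default-format-url: /2.25/html
```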
Context:
* Example:
** link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2.25/html/openshift_ai_tutorial_-_fraud_detection_example/running-a-distributed-workload#distributing-training-jobs-with-kfto[Distributing training jobs with the Training Operator] in the *Red Hat OpenShift AI tutorial: Fraud Detection example*
Suggested change:
** link:{rhoaidocshome}{default-format-url}/openshift_ai_tutorial_-_fraud_detection_example/running-a-distributed-workload#distributing-training-jobs-with-kfto[Distributing training jobs with the Training Operator] in the *Red Hat OpenShift AI tutorial: Fraud Detection example*
What about adding conditions and links to equivalent upstream docs?
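One possible shape for such conditions, sketched with a placeholder upstream URL and an assumed `upstream` attribute; the repository's actual attribute names and conventions may differ:

```asciidoc
// Hypothetical conditional (attribute name and upstream URL are assumptions)
ifdef::upstream[]
** link:<upstream-doc-url>[Running Training Operator-based distributed training workloads]
endif::[]
ifndef::upstream[]
** link:{rhoaidocshome}{default-format-url}/working_with_distributed_workloads/running-kfto-based-distributed-training-workloads_distributed-workloads[Chapter 4. Running Training Operator-based distributed training workloads] in the *Working with distributed workloads* guide
endif::[]
```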
Context:
[id='about-the-python-index_{context}']
= About the {org-name} Python Index

{org-name} AI includes a maintained Python package index that provides secure and reliable access to supported libraries, with full support for disconnected environments. For details about {org-name} support for the Python package index, see Support philosophy: A secure platform.
Add a link to "Support philosophy: A secure platform"?
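A possible form for that link, assuming the Support philosophy module's ID follows the same `<name>_{context}` pattern shown in the quoted context above:

```asciidoc
// Hypothetical xref (target ID is an assumption)
For details about {org-name} support for the Python package index, see
xref:support-philosophy_{context}[Support philosophy: A secure platform].
```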
Context:
*Additional resources*

* link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2.25/html/managing_openshift_ai/creating-custom-workbench-images[Creating custom workbench images]
Suggested change:
* link:{rhoaidocshome}{default-format-url}/managing_openshift_ai/creating-custom-workbench-images[Creating custom workbench images]
What about adding conditions and links for equivalent upstream docs?
Context:
+
The file-browser window shows the files and directories that are saved inside your own personal space in {productname-short}.

. Bring the content of an example Git repo inside your JupyterLab environment:
Suggested change:
. Bring the content of an example Git repository inside your JupyterLab environment:
Context:
+
In most post-training setups, URR values below 1/3 are sufficient for learning new tasks, making OSFT notably lighter in memory.

* *Training time:* On datasets of equal size, OSFT typically takes about 2x longer per phase. However, since OSFT does not require replay buffers from past tasks (unlike SFT), the total training time across multiple phases or tasks is lower with clear benefits as the number of tasks grows.
Suggested change:
* *Training time:* On datasets of equal size, OSFT typically takes about 2x longer per phase. However, because OSFT does not require replay buffers from past tasks (unlike SFT), the total training time across multiple phases or tasks is lower with clear benefits as the number of tasks grows.
Context:
*Prerequisites*

* Install the Synthetic Data Generation (SDG) Hub library, as described in xref:set-up-your-working-environment[Set up your working environment].
Use correct docs link format
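The comment does not spell out the expected format; one plausible reading, assuming xref targets should carry the `_{context}` suffix used by the module IDs elsewhere in this PR:

```asciidoc
// Hypothetical corrected xref (the '_{context}' suffix is an assumption)
* Install the Synthetic Data Generation (SDG) Hub library, as described in
xref:set-up-your-working-environment_{context}[Set up your working environment].
```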
Context:
. To access the SDG Hub examples, clone the `sdg_hub` Git repository:
+
* To clone the repository from JupyterLab, follow the steps in xref:clone-an-example-git-repository[Clone an example Git repository].
Use correct docs link format
Context:
*Additional resources*

* Upstream documentation: link:https://github.com/instructlab/sdg/tree/main/docs[https://github.com/instructlab/sdg/tree/main/docs]
Suggested change:
* SDG community documentation: link:https://github.com/instructlab/sdg/tree/main/docs[https://github.com/instructlab/sdg/tree/main/docs]
Context:
*Additional resources*

* Upstream documentation: link:https://github.com/instructlab/sdg/tree/main/docs[https://github.com/instructlab/sdg/tree/main/docs]
* GitHub repository: link:https://github.com/instructlab/sdg[https://github.com/instructlab/sdg]
Suggested change:
* SDG GitHub repository: link:https://github.com/instructlab/sdg[https://github.com/instructlab/sdg]
Context:
*Prerequisites*

* Install the Training Hub library, as described in xref:set-up-your-working-environment[Set up your working environment].
Use correct docs link format
Context:
+
You can extend a base notebook to use distributed training across multiple nodes by using the KubeFlow Trainer Operator (KFTO). The KFTO abstracts the underlying infrastructure complexity of distributed training and fine-tuning of models. The iterative process of fine-tuning significantly reduces the time and resources required compared to training models from scratch.
+
For details, see xref:train-the-model-by-using-your-prepared-data[Train the model by using your prepared data].
Use correct link format
Context:
Serve and consume a customized model:: After you customize a model, you can serve your customized models as APIs (Application Programming Interfaces). Serving a model as an API enables seamless integration into existing or newly developed applications.
+
Learn more about serving and consuming a customized model link:https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2.24/html/deploying_models/deploying_models_on_the_single_model_serving_platform[Chapter 2: Deploying a model on the Single Model Serving platform] in the Deploying models guide.
Suggested change:
Learn more about serving and consuming a customized model link:{rhoaidocshome}{default-format-url}/deploying_models/deploying_models_on_the_single_model_serving_platform[Chapter 2: Deploying a model on the Single Model Serving platform] in the Deploying models guide.
What about adding links to equivalent upstream docs?
Context:
Prepare your data for AI consumption:: To prepare your data, use Docling, a powerful Python library to transform unstructured data (such as text documents, images, and audio files) into structured formats that models can consume.
//For details, see xref:prepare-your-data-for-ai-consumption[Prepare your data for AI consumption].
+
To automate data processing tasks, you can build Kubeflow Pipelines (KFP), see Automate data processing steps by building AI pipelines
Suggested change:
To automate data processing tasks, you can build Kubeflow Pipelines (KFP), see Automate data processing steps by building AI pipelines.
Add link to Automate data processing steps by building AI pipelines?