
docs: updating architecture section #174

Merged: nerdalert merged 1 commit into main from architecture-section on Oct 22, 2025

@jland-redhat jland-redhat commented Oct 21, 2025

Updating arch section to be closer to what we are delivering.

Summary by CodeRabbit

  • Documentation
    • Updated architecture documentation with clearer component descriptions and relationships.
    • Added new architectural diagrams and workflows illustrating token management, policy enforcement, and model inference processes.
    • Enhanced sections on security, observability, and platform modularity for better understanding of system design.

coderabbitai bot commented Oct 21, 2025

Walkthrough

The architecture documentation is restructured to shift from a gateway-centric model to a MaaS API-centric token management approach, integrating Kuadrant-based policy enforcement and OpenShift AI model serving. Component naming, roles, and terminology are updated throughout, with new sequence diagrams illustrating token retrieval, policy application, and model inference workflows.

Changes

Cohort / File(s): Architecture Documentation Restructure (docs/content/architecture.md)
Summary: Comprehensive restructuring of the architecture documentation:
  • renames the "High-Level Architecture" section to "Architecture";
  • replaces gateway-centric references with a MaaS API-centric token management model;
  • updates component naming and roles (MaaS API, Gateway API, RHCL, Kuadrant, Authorino, Limitador, AuthPolicy, RateLimitPolicy, TokenRateLimitPolicy, RBAC, LLMInferenceService, RHOAI);
  • reorganizes the "Core Components" and "Flows" sections;
  • replaces the "Gateway Layer" and "Management Layer" terminology;
  • adds new sequence diagrams for the token retrieval, policy engine, and model inference flows;
  • expands the observability and security sections to emphasize token usage metrics and centralized dashboards.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant MaaS API
    participant Gateway API
    participant RHCL as RHCL<br/>(Policy Engine)
    participant RHOAI as RHOAI<br/>(Model Serving)

    rect rgb(200, 220, 255)
    Note over User,MaaS API: Token Retrieval Flow
    User->>MaaS API: Request token (tier info)
    MaaS API-->>User: Return auth token
    end

    rect rgb(220, 200, 255)
    Note over User,RHOAI: Inference Request Flow
    User->>Gateway API: Inference request + token
    Gateway API->>RHCL: Apply policies (AuthPolicy, RateLimitPolicy)
    RHCL->>RHCL: Validate token & rate limits
    RHCL-->>Gateway API: Policy check passed
    Gateway API->>RHOAI: Route to model endpoint
    RHOAI-->>Gateway API: Model response
    Gateway API-->>User: Return inference result
    end
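
The two flows in the diagram can be sketched in a few lines of Python. This is a minimal in-memory simulation for illustration only, not the project's implementation: the class names (`MaaSAPI`, `PolicyEngine`, `Gateway`), the `echo` model endpoint, the tier limits, and the mapping of failures to HTTP 401/429 are all assumptions that do not appear in the PR.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class MaaSAPI:
    """Hypothetical stand-in for the MaaS API: issues tokens tied to a tier."""
    issued: dict = field(default_factory=dict)  # token -> (user, tier)

    def request_token(self, user: str, tier: str) -> str:
        token = secrets.token_hex(8)
        self.issued[token] = (user, tier)
        return token


@dataclass
class PolicyEngine:
    """Hypothetical stand-in for the policy engine (token check plus rate limit)."""
    maas: MaaSAPI
    limits: dict  # tier -> max requests (illustrative numbers, assumed)
    used: dict = field(default_factory=dict)  # token -> requests so far

    def check(self, token: str) -> tuple[bool, str]:
        entry = self.maas.issued.get(token)
        if entry is None:
            return False, "unauthorized"
        _user, tier = entry
        count = self.used.get(token, 0)
        if count >= self.limits[tier]:
            return False, "rate limited"
        self.used[token] = count + 1
        return True, "ok"


def model_endpoint(prompt: str) -> str:
    """Placeholder for the model-serving backend; simply echoes the prompt."""
    return f"echo: {prompt}"


@dataclass
class Gateway:
    """Hypothetical gateway: applies policies, then routes to the model."""
    policies: PolicyEngine

    def infer(self, token: str, prompt: str) -> tuple[int, str]:
        allowed, reason = self.policies.check(token)
        if not allowed:
            # Assumed status codes: 401 for a bad token, 429 for an exceeded limit.
            return (401 if reason == "unauthorized" else 429), reason
        return 200, model_endpoint(prompt)
```

A caller would first obtain a token with `maas.request_token("alice", "free")` and then pass it to `gateway.infer(token, prompt)`; requests with an unknown token or beyond the tier's limit are rejected before they reach the model endpoint, which mirrors the ordering shown in the diagram.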

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes consist of a single documentation file with comprehensive restructuring and terminology updates. Review effort comes from verifying architectural accuracy, consistency across the renamed components, and correct representation of the new flows, while the scope stays confined to one file.

Poem

🐰 Hop, hop! New tokens dance and flow,
MaaS API leads the grand tableau,
Kuadrant gates with policies precise,
RHOAI serves the models nice,
Architecture blooms—modular, clean, divine!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name: Status. Explanation
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The PR title "docs: updating architecture section" directly relates to the primary change in the pull request, which is a comprehensive rework of the architecture documentation in docs/content/architecture.md. The title accurately conveys that the architecture section of the documentation is being updated, which is the main focus of this changeset. While the title is somewhat generic and doesn't specify the extent of the rework (such as new diagrams, component reorganization, or the shift to MaaS API-centric design), it appropriately summarizes the change at the right level of abstraction. The title is sufficiently specific compared to terms like "misc updates" and clearly communicates which documentation section is being modified.
  • Docstring Coverage: ✅ Passed. No functions found in the changes. Docstring coverage check skipped.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b0aa13 and 629847a.

📒 Files selected for processing (1)
  • docs/content/architecture.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/content/architecture.md

[grammar] ~125-~125: Use a hyphen to join words.
Context: ...oken and enforce access control and rate limiting policies: ```mermaid graph TB ...

(QB_NEW_EN_HYPHEN)


[uncategorized] ~298-~298: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... that validates tokens - Limitador: Rate limiting service that enforces usage quotas ###...

(EN_COMPOUND_ADJECTIVE_INTERNAL)


[uncategorized] ~305-~305: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ... - Scaling: Automatic scaling based on demand - Resource Management: GPU allocati...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

🔇 Additional comments (1)
docs/content/architecture.md (1)

309-324: Well-structured summary of architecture benefits.

The Architecture Benefits section clearly articulates the security, scalability, and observability advantages of the redesigned platform. Good closure to the documentation.


- **MaaS API**: The central component for token generation and management.
- **Gateway API**: The entry point for all inference requests.
- **Kuandrant (Red Hat Connectivity Link)**: The policy engine for authentication and authorization.

⚠️ Potential issue | 🟡 Minor

Fix component name typo.

Line 52 has "Kuandrant" but should be "Kuadrant" (consistent with other references throughout the document).

-- **Kuandrant (Red Hat Connectivity Link)**: The policy engine for authentication and authorization.
+- **Kuadrant (Red Hat Connectivity Link)**: The policy engine for authentication and authorization.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

🤖 Prompt for AI Agents
In docs/content/architecture.md around line 52, the component name is misspelled
as "Kuandrant"; update the text to "Kuadrant" to match other references in the
document and ensure consistent spelling for the Red Hat Connectivity Link policy
engine.
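
Misspellings like this one can be caught mechanically. The sketch below is one possible approach, not part of the PR or of CodeRabbit: it uses the standard-library `difflib` module to flag words that nearly match a canonical component name. The `CANONICAL` list, the function name `find_near_misses`, and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
import difflib
import re

# Assumed list of canonical component names to check against.
CANONICAL = ["Kuadrant", "Authorino", "Limitador"]


def find_near_misses(text, canonical=CANONICAL, cutoff=0.8):
    """Return (line_number, word, suggestion) for words that almost match a canonical name."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            if word in canonical:
                continue  # exact matches are correct spellings
            match = difflib.get_close_matches(word, canonical, n=1, cutoff=cutoff)
            if match:
                findings.append((lineno, word, match[0]))
    return findings


sample = (
    "- **Kuandrant (Red Hat Connectivity Link)**: The policy engine.\n"
    "- **Authrino**: Validates tokens.\n"
    "- **Limitador**: Enforces usage quotas.\n"
)
for lineno, word, suggestion in find_near_misses(sample):
    print(f"line {lineno}: {word!r} looks like {suggestion!r}")
```

The cutoff trades false positives against missed typos; a lower value would flag more ordinary words, so it is worth tuning against the actual document before relying on the output.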


### Inference Service Component Details

Once a user has obtained their token through the MaaS API, they can use it to make inference requests to the Gateway API. RHCL's Application Connectivity Policies then validate the token and enforce access control and rate limiting policies:

⚠️ Potential issue | 🟡 Minor

Use hyphen in compound adjective.

"Rate limiting" should be hyphenated when used as a compound adjective modifying "policies".

-Once a user has obtained their token through the MaaS API, they can use it to make inference requests to the Gateway API. RHCL's Application Connectivity Policies then validate the token and enforce access control and rate limiting policies:
+Once a user has obtained their token through the MaaS API, they can use it to make inference requests to the Gateway API. RHCL's Application Connectivity Policies then validate the token and enforce access control and rate-limiting policies:

🤖 Prompt for AI Agents
In docs/content/architecture.md around line 125, the phrase "rate limiting
policies" is used as a compound adjective and should be hyphenated; update the
text to use "rate-limiting policies" so the compound modifier is grammatically
correct.

Gateway-->>Client: Response
participant GatewayAPI[Gateway API]
participant Kuadrant[Kuadrant]
participant Authrino[Authrino]

⚠️ Potential issue | 🟡 Minor

Fix participant name spelling in sequence diagram.

Line 240 shows "Authrino" but the component is defined as "Authorino" on line 140. Update for consistency.

-    participant Authrino[Authrino]
+    participant Authrino[Authorino]
🤖 Prompt for AI Agents
In docs/content/architecture.md around line 240 the sequence diagram participant
is misspelled as "Authrino" while the component is defined as "Authorino" on
line 140; update the participant declaration at line 240 to "Authorino" so the
name matches the defined component exactly to maintain consistency in the
diagram.

- **Gateway API**: Entry point for all inference requests
- **Kuadrant**: Policy attachment point for authentication and authorization
- **Authrino**: Authentication and authorization service that validates tokens
- **Limitador**: Rate limiting service that enforces usage quotas

⚠️ Potential issue | 🟡 Minor

Use hyphen in compound adjective.

"Rate limiting" should be hyphenated when used as a compound adjective modifying "service".

-  - **Limitador**: Rate limiting service that enforces usage quotas
+  - **Limitador**: Rate-limiting service that enforces usage quotas

🤖 Prompt for AI Agents
In docs/content/architecture.md around line 298, the phrase "Rate limiting
service that enforces usage quotas" uses "Rate limiting" as a compound adjective
and should be hyphenated; update the text to read "Rate-limiting service that
enforces usage quotas" so the compound modifier is correct.


@nerdalert nerdalert left a comment


LGTM


openshift-ci bot commented Oct 22, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jland-redhat, nerdalert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [jland-redhat,nerdalert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@nerdalert nerdalert merged commit e09cc22 into main Oct 22, 2025
7 of 8 checks passed
@bartoszmajsak bartoszmajsak deleted the architecture-section branch October 22, 2025 13:59