[CAI-741][CAI-762] Chatbot-index creates new indexes on Redis#2008
* Add package.json to parser app
* Add changeset, update package-lock
* Update apps/parser/package.json

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add package.json to parser app
* Add changeset, update package-lock
* Update apps/parser/package.json
* Add puppeteer dependency in parser app
* Add changeset
* Update .changeset/three-cups-change.md
* Update apps/parser/package.json

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: marcobottaro <39835990+marcobottaro@users.noreply.github.com>
🦋 Changeset detected
Latest commit: 6c65f57
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Pull request overview
Refactors the chatbot indexer to support configurable, selective document ingestion (including structured docs) and Redis index creation/cleanup, while updating GitHub Actions inputs to drive the new CLI flags.
Changes:
- Added structured document ingestion and flag-based document selection for indexing.
- Refactored Redis index schema/creation/loading to support configurable `index_id` and targeted cleanup.
- Updated workflow/action wiring and added Terraform skill documentation files.
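The flag-based selection described in the list above could look roughly like the following. This is a minimal sketch, not the PR's actual CLI: the flag names (`--static`, `--dynamic`, `--api`, `--structured`, `--clean-redis`) and both helper functions are assumptions made for illustration.

```python
import argparse

# Hypothetical source names; the real indexer may use different ones.
SOURCES = ("static", "dynamic", "api", "structured")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one opt-in flag per document source."""
    parser = argparse.ArgumentParser(description="Create a vector index")
    for source in SOURCES:
        parser.add_argument(
            f"--{source}",
            action="store_true",
            help=f"include {source} documents in the index",
        )
    # Hypothetical flag: whether to clean up Redis as part of the run.
    parser.add_argument(
        "--clean-redis",
        action="store_true",
        help="clean up Redis as part of the indexing run",
    )
    return parser


def selected_sources(args: argparse.Namespace) -> list[str]:
    """Return the names of the sources enabled on the command line."""
    return [source for source in SOURCES if getattr(args, source)]
```

Because every flag is opt-in, a workflow can map each of its inputs to one flag and leave the rest off.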
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| apps/parser/package.json | Adds a new parser app package manifest with Puppeteer dependency. |
| apps/chatbot-index/src/modules/vector_index.py | Introduces schema factory + parameterized index build/load/cleanup behavior. |
| apps/chatbot-index/src/modules/settings.py | Switches index_id configuration to environment variable. |
| apps/chatbot-index/src/modules/documents.py | Adds structured docs ingestion and flags to select doc sources. |
| apps/chatbot-index/src/modules/create_vector_index.py | Adds CLI flags to choose which document types to include and whether to clean Redis. |
| apps/chatbot-index/src/modules/codec.py | Adds helper to safely parse JSON-encoded strings. |
| apps/chatbot-index/config/params.yaml | Removes index_id from params configuration. |
| .github/workflows/chatbot_create_index.yaml | Adds workflow input to choose indexing mode; passes inputs to composite action. |
| .github/actions/chatbot/action.yaml | Adds composite action inputs and passes corresponding CLI flags to indexer. |
| .github/skills/terraform-style-guide/SKILL.md | Adds Terraform style guide documentation. |
| .github/skills/terraform-refactor-module/SKILL.md | Adds Terraform module refactor skill documentation. |
| .changeset/three-cups-change.md | Changeset for parser Puppeteer addition. |
| .changeset/eager-colts-smile.md | Changeset for chatbot-index structured docs support. |
| .changeset/cuddly-pumas-cross.md | Changeset for creating parser app. |
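For the `codec.py` entry in the table above, a helper that safely parses JSON-encoded strings might be sketched as follows. The function name `safe_json_loads` and its fall-back behaviour are assumptions; the actual helper in the PR may differ.

```python
import json
from typing import Any


def safe_json_loads(value: Any) -> Any:
    """Decode a JSON-encoded string, returning the input unchanged on failure."""
    # Non-string values (already-decoded dicts, None, numbers) pass through.
    if not isinstance(value, str):
        return value
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        # Not valid JSON: keep the raw string rather than raising.
        return value
```

Returning the original value instead of raising keeps ingestion running when a field is only sometimes JSON-encoded, at the cost of hiding malformed input.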
Pull request overview
Copilot reviewed 14 out of 15 changed files in this pull request and generated 8 comments.
…veloper-portal into CAI-741-refactor-for-new-index
Why use both an AWS client and an AWS resource for S3?
I think it is possible to use only the AWS resource.
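As a rough illustration of the suggestion, listing and reading objects can be done through the resource-style interface alone. This is a sketch under assumptions: `iter_s3_objects` is a hypothetical helper, the bucket name is a placeholder, and whether the resource interface covers every S3 call the indexer makes is not clear from this thread.

```python
from typing import Any, Iterator


def iter_s3_objects(s3: Any, bucket: str, prefix: str = "") -> Iterator[tuple[str, bytes]]:
    """Yield (key, body) pairs using only the resource-style S3 API.

    `s3` is expected to be the object returned by boto3.resource("s3").
    """
    for summary in s3.Bucket(bucket).objects.filter(Prefix=prefix):
        # ObjectSummary.get() issues a GetObject call; "Body" is a stream.
        yield summary.key, summary.get()["Body"].read()


# Usage (requires boto3 and AWS credentials; "my-bucket" is a placeholder):
#   import boto3
#   s3 = boto3.resource("s3")
#   for key, body in iter_s3_objects(s3, "my-bucket", prefix="docs/"):
#       print(key, len(body))
```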
    import argparse

    from src.modules.logger import get_logger
    from src.modules.vector_index import DiscoveryVectorIndex
DiscoveryVectorIndex is not present in src.modules.vector_index
I have an idea, you could rename the class : )
Pull request overview
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
…veloper-portal into CAI-741-refactor-for-new-index
Jira Pull Request Link
This Pull Request refers to the following Jira issue: CAI-741

Branch is not up to date with base branch
@batdevis it seems this Pull Request is not up to date with the base branch.

This PR exceeds the recommended size of 800 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.
List of Changes
This pull request introduces several enhancements to the chatbot indexer, with a focus on supporting structured data. The most important changes are support for structured document indexing and a refactor of the document ingestion and index creation workflow to make it more flexible.
- Introduced the `get_structured_docs()` function and updated the main document ingestion logic to accept flags for including static, dynamic, API, and structured data.
Motivation and Context
This refactor should allow the creation of new vector indexes on Redis.
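Since the index identifier now comes from the environment rather than `params.yaml`, targeting a new index could be as simple as the sketch below. The variable name `CHB_INDEX_ID`, the default value, and the helper `resolve_index_id` are all assumptions for illustration, not names taken from the PR.

```python
import os
from typing import Mapping, Optional


def resolve_index_id(
    env: Optional[Mapping[str, str]] = None,
    default: str = "default-index",
) -> str:
    """Read the index id from the environment, falling back to a default."""
    env = os.environ if env is None else env
    # Hypothetical variable name; blank or whitespace-only values are ignored.
    index_id = env.get("CHB_INDEX_ID", "").strip()
    return index_id or default
```

Under this assumption, pointing the indexer at a new Redis index only requires setting a different value for the variable before the run.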
How Has This Been Tested?
Screenshots (if appropriate):
Types of changes
Checklist: