-
Notifications
You must be signed in to change notification settings - Fork 84
Analytics pipeline creation 2 #2265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Analytics pipeline creation 2 #2265
Conversation
WalkthroughThis PR adds an AnalyticsCollectionData model and three CallbackScriptUtility methods (add_data_analytics, get_data_analytics, mark_as_processed), exposes them as predefined PyScript objects, and introduces unit tests and validation around analytics data handling. Changes
Sequence Diagram(s)sequenceDiagram
participant PyScript as PyScript
participant Utils as CallbackScriptUtility
participant DB as AnalyticsCollectionData (DB)
rect rgb(200,220,240)
note over PyScript,DB: Create analytics record
PyScript->>Utils: add_data_analytics(user, payload, bot)
Utils->>Utils: validate bot, extract fields
Utils->>DB: create & save record
DB-->>Utils: saved record (id)
Utils-->>PyScript: {"message":"success","id":id}
end
rect rgb(220,240,200)
note over PyScript,DB: Retrieve analytics records
PyScript->>Utils: get_data_analytics(collection_name, bot)
Utils->>Utils: normalize collection_name
Utils->>DB: query by bot & collection_name
DB-->>Utils: list of records
Utils->>Utils: map records -> dict list
Utils-->>PyScript: {"data":[...]}
end
rect rgb(240,220,200)
note over PyScript,DB: Mark records processed
PyScript->>Utils: mark_as_processed(user, collection_name, bot)
Utils->>DB: query matching records
DB-->>Utils: records
Utils->>DB: update each record (user, is_data_processed=true) & save
DB-->>Utils: confirm saves
Utils-->>PyScript: {"message":"marked processed"}
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Points to focus on:
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧬 Code graph analysis (1)tests/unit_test/data_processor/data_processor2_test.py (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
🔇 Additional comments (4)
Tip 📝 Customizable high-level summaries are now available in beta!You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.
Example instruction:
Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
kairon/shared/pyscript/callback_pyscript_utils.py (2)
342-342: Use explicit type hints for optional parameters.The
botparameter is optional but lacks explicitOptionalor| Nonetype annotation. Per PEP 484, implicit optional is discouraged.Apply this diff:
@staticmethod - def get_data_analytics(collection_name: str, bot: Text = None): + def get_data_analytics(collection_name: str, bot: Text | None = None): if not bot: raise Exception("Missing bot id") # ... rest of method @staticmethod - def add_data_analytics(user: str, payload: dict, bot: str = None): + def add_data_analytics(user: str, payload: dict, bot: str | None = None): if not bot: raise Exception("Missing bot id") # ... rest of method @staticmethod - def mark_as_processed(user: str, collection_name: str, bot: str = None): + def mark_as_processed(user: str, collection_name: str, bot: str | None = None): if not bot: raise Exception("Missing bot id")Also applies to: 372-372, 395-395
372-392: Consider validating payload fields in add_data_analytics.The method extracts
sourceandreceived_atfrom the payload without validation. Ifreceived_atis provided as a string rather than a datetime object, it could cause issues. Consider adding validation or documenting expected payload structure.tests/unit_test/callback/pyscript_handler_test.py (1)
3955-4034: Good test coverage for the new analytics methods.The tests validate the core functionality of
add_data_analytics,get_data_analytics, andmark_as_processed. They properly use mocks and verify the expected behavior.However, consider adding tests for edge cases:
add_data_analyticswith missingcollection_namein payloadget_data_analyticswith non-existent collectionmark_as_processedwhen no records exist (to verify the fix for the queryset check issue flagged in callback_pyscript_utils.py)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
kairon/async_callback/utils.py(1 hunks)kairon/shared/cognition/data_objects.py(1 hunks)kairon/shared/pyscript/callback_pyscript_utils.py(2 hunks)tests/unit_test/callback/pyscript_handler_test.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
kairon/shared/cognition/data_objects.py (1)
kairon/shared/data/audit/data_objects.py (1)
Auditlog(24-77)
kairon/async_callback/utils.py (1)
kairon/shared/pyscript/callback_pyscript_utils.py (3)
add_data_analytics(372-392)get_data_analytics(342-369)mark_as_processed(395-411)
kairon/shared/pyscript/callback_pyscript_utils.py (2)
kairon/shared/cognition/data_objects.py (1)
AnalyticsCollectionData(126-149)kairon/exceptions.py (1)
AppException(1-3)
tests/unit_test/callback/pyscript_handler_test.py (2)
kairon/shared/cognition/data_objects.py (1)
AnalyticsCollectionData(126-149)kairon/shared/pyscript/callback_pyscript_utils.py (3)
add_data_analytics(372-392)get_data_analytics(342-369)mark_as_processed(395-411)
🪛 Ruff (0.14.5)
kairon/shared/cognition/data_objects.py
135-135: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
142-142: Avoid specifying long messages outside the exception class
(TRY003)
145-145: Avoid specifying long messages outside the exception class
(TRY003)
kairon/shared/pyscript/callback_pyscript_utils.py
339-339: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
339-339: Create your own exception
(TRY002)
339-339: Avoid specifying long messages outside the exception class
(TRY003)
339-339: Use explicit conversion flag
Replace with conversion flag
(RUF010)
342-342: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
344-344: Create your own exception
(TRY002)
344-344: Avoid specifying long messages outside the exception class
(TRY003)
372-372: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
374-374: Create your own exception
(TRY002)
374-374: Avoid specifying long messages outside the exception class
(TRY003)
395-395: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
397-397: Create your own exception
(TRY002)
397-397: Avoid specifying long messages outside the exception class
(TRY003)
402-402: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Python CI
- GitHub Check: Analyze (python)
🔇 Additional comments (2)
kairon/shared/cognition/data_objects.py (1)
126-149: LGTM! The new analytics data model is well-structured.The
AnalyticsCollectionDataclass follows established patterns in the codebase (similar toCollectionDataandCognitionSchema). The validation ensures data integrity by enforcing non-empty collection names and type-checking the data field.kairon/async_callback/utils.py (1)
85-87: LGTM! The new analytics functions are correctly exposed.The three new analytics methods are properly bound using
partialwith thebotparameter, following the established pattern for other predefined objects in the PyScript environment.
| return media_id | ||
| except Exception as e: | ||
| raise Exception(f"encryption failed-{str(e)}") | ||
| raise Exception(f"encryption failed-{str(e)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the error message in the exception.
The error message says "encryption failed" but this method is save_as_pdf. This appears to be a copy-paste error.
Apply this diff:
- raise Exception(f"encryption failed-{str(e)}")
+ raise Exception(f"PDF generation failed-{str(e)}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| raise Exception(f"encryption failed-{str(e)}") | |
| raise Exception(f"PDF generation failed-{str(e)}") |
🧰 Tools
🪛 Ruff (0.14.5)
339-339: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling
(B904)
339-339: Create your own exception
(TRY002)
339-339: Avoid specifying long messages outside the exception class
(TRY003)
339-339: Use explicit conversion flag
Replace with conversion flag
(RUF010)
🤖 Prompt for AI Agents
In kairon/shared/pyscript/callback_pyscript_utils.py around line 339, the raised
exception message incorrectly says "encryption failed" in the save_as_pdf
method; update the exception text to reflect the actual operation (e.g.,
"save_as_pdf failed" or "PDF save failed") and include the original exception
details (str(e)) in the message so the log accurately describes the failure and
preserves error context.
| queryset = AnalyticsCollectionData.objects(bot=bot, collection_name=collection_name) | ||
|
|
||
| if not queryset: | ||
| raise AppException("No records found for given bot and collection_name") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MongoEngine queryset truthiness check may not work as expected.
The check if not queryset: on line 401 will not work correctly. MongoEngine querysets are always truthy, even when empty. To check if a queryset has no results, use .count() or iterate and check.
Apply this diff:
queryset = AnalyticsCollectionData.objects(bot=bot, collection_name=collection_name)
- if not queryset:
+ if queryset.count() == 0:
raise AppException("No records found for given bot and collection_name")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| queryset = AnalyticsCollectionData.objects(bot=bot, collection_name=collection_name) | |
| if not queryset: | |
| raise AppException("No records found for given bot and collection_name") | |
| queryset = AnalyticsCollectionData.objects(bot=bot, collection_name=collection_name) | |
| if queryset.count() == 0: | |
| raise AppException("No records found for given bot and collection_name") |
🧰 Tools
🪛 Ruff (0.14.5)
402-402: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In kairon/shared/pyscript/callback_pyscript_utils.py around lines 399 to 402,
the truthiness check `if not queryset:` is invalid because MongoEngine querysets
are always truthy; replace it with an explicit emptiness check such as `if
queryset.count() == 0:` or `if queryset.first() is None:` so the code correctly
detects no matching records and then raise the AppException.
Summary by CodeRabbit
Release Notes
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.