-
Notifications
You must be signed in to change notification settings - Fork 518
Feat/cache pricing #1884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feat/cache pricing #1884
Conversation
- Added logic to process credit billing for successful completions and user aborts. - Ensured compliance with coding standards, including optional chaining and nullish coalescing for safer property access. - Removed redundant credit billing logic from previous location to streamline the process.
- Implemented two-level caching strategy: - Global static point: caches system prompt across all sessions (index 0) - Session dynamic points: caches recent conversation history (last 3 messages before user query) - Replaced `instanceof` checks with `_getType()` method to properly handle deserialized messages - Added support for caching AI messages with tool calls
…fly-ai/refly into feat/cache-pricing
- Added SyncTokenCreditUsageJobData interface to capture detailed token usage data. - Implemented syncTokenCreditUsage method in CreditService to handle credit billing for individual model calls. - Updated SkillInvokerService to process credit billing directly upon model call completion, simplifying the billing logic. - Enhanced logging for credit billing processes to improve observability.
WalkthroughThis PR introduces per-model-call token credit billing and refactors the credit billing flow. A new Changes
Sequence DiagramsequenceDiagram
participant SkillInvoker
participant CreditService
participant Database as DB
SkillInvoker->>SkillInvoker: Invoke model call
SkillInvoker->>SkillInvoker: Collect tokens (input, output, cache)
SkillInvoker->>CreditService: syncTokenCreditUsage(data)
activate CreditService
CreditService->>Database: Validate user & check early-bird
Database-->>CreditService: User & status
CreditService->>CreditService: Compute creditCost<br/>(unit or fixed)
CreditService->>CreditService: Apply early-bird<br/>allowance
CreditService->>CreditService: Construct modelUsageDetails
alt Zero Amount
CreditService->>Database: Record zero-amount usage
else Positive Cost
CreditService->>CreditService: Deduct credits &<br/>create usage
CreditService->>Database: Update user balance
end
Database-->>CreditService: Success
CreditService-->>SkillInvoker: true/false
deactivate CreditService
SkillInvoker->>SkillInvoker: Continue execution
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
apps/api/src/modules/credit/credit.service.ts (1)
1121-1157: Consider extracting credit cost calculation to a shared helper.The credit cost calculation logic (lines 1121-1157) is nearly identical to the logic in
syncBatchTokenCreditUsage(lines 996-1027). This duplication increases maintenance burden and risk of divergence.🔎 Suggested refactor:
Extract a private helper method:
private calculateCreditCost( inputTokens: number, outputTokens: number, cacheReadTokens: number, cacheWriteTokens: number, creditBilling: CreditBilling, ): number { if (creditBilling.unit === '5k_tokens') { const perInputUnit = creditBilling.inputCost || 0; const perOutputUnit = creditBilling.outputCost || 0; const perCacheReadUnit = creditBilling.cacheReadCost ?? perInputUnit; const perCacheWriteUnit = creditBilling.cacheWriteCost ?? perInputUnit; const inputCost = (inputTokens / 5000) * perInputUnit; const outputCost = (outputTokens / 5000) * perOutputUnit; const cacheReadCost = (cacheReadTokens / 5000) * perCacheReadUnit; const cacheWriteCost = (cacheWriteTokens / 5000) * perCacheWriteUnit; return Math.ceil(inputCost + outputCost + cacheReadCost + cacheWriteCost); } else if (creditBilling.unit === '1m_tokens') { // Similar calculation with 1000000 divisor // ... } else { return Math.max(creditBilling.inputCost || 0, creditBilling.outputCost || 0); } }Then both
syncTokenCreditUsageandsyncBatchTokenCreditUsagecan call this helper.apps/api/src/modules/skill/skill-invoker.service.ts (1)
1525-1542: Clear method signature and documentation.The refactored method has a clear purpose. However, the parameter type for
actualProviderItemis an inline object type which could be extracted for better readability.🔎 Consider extracting the inline type:
+type ProviderItemBillingInfo = { + name?: string; + creditBilling?: string; + tier?: string; + provider?: { name?: string }; +}; private async processCreditBilling( user: User, resultId: string, version: number, inputTokens: number, outputTokens: number, cacheReadTokens: number, cacheWriteTokens: number, - actualProviderItem: - | { name?: string; creditBilling?: string; tier?: string; provider?: { name?: string } } - | null - | undefined, + actualProviderItem: ProviderItemBillingInfo | null | undefined, requestProviderItem?: ProviderItem, ): Promise<void> {
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
apps/api/src/modules/credit/credit.dto.ts(1 hunks)apps/api/src/modules/credit/credit.service.ts(2 hunks)apps/api/src/modules/skill/skill-invoker.service.ts(5 hunks)packages/skill-template/src/scheduler/utils/message.ts(3 hunks)
🧰 Additional context used
📓 Path-based instructions (15)
**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (.cursorrules)
**/*.{js,ts,jsx,tsx}: Always use optional chaining (?.) when accessing object properties
Always use nullish coalescing (??) or default values for potentially undefined values
Always check array existence before using array methods
Always validate object properties before destructuring
Always use single quotes for string literals in JavaScript/TypeScript code
**/*.{js,ts,jsx,tsx}: Use semicolons at the end of statements
Include spaces around operators (e.g.,a + binstead ofa+b)
Always use curly braces for control statements
Place opening braces on the same line as their statement
**/*.{js,ts,jsx,tsx}: Group import statements in order: React/framework libraries, third-party libraries, internal modules, relative path imports, type imports, style imports
Sort imports alphabetically within each import group
Leave a blank line between import groups
Extract complex logic into custom hooks
Use functional updates for state (e.g.,setCount(prev => prev + 1))
Split complex state into multiple state variables rather than single large objects
Use useReducer for complex state logic instead of multiple useState calls
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{js,ts,tsx,jsx,py,java,cpp,c,cs,rb,go,rs,php,swift,kt,scala,r,m,mm,sql}
📄 CodeRabbit inference engine (.cursor/rules/00-language-priority.mdc)
**/*.{js,ts,tsx,jsx,py,java,cpp,c,cs,rb,go,rs,php,swift,kt,scala,r,m,mm,sql}: All code comments MUST be written in English
All variable names, function names, class names, and other identifiers MUST use English words
Comments should be concise and explain 'why' rather than 'what'
Use proper grammar and punctuation in comments
Keep comments up-to-date when code changes
Document complex logic, edge cases, and important implementation details
Use clear, descriptive names that indicate purpose
Avoid abbreviations unless they are universally understood
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{js,ts,tsx,jsx}
📄 CodeRabbit inference engine (.cursor/rules/00-language-priority.mdc)
Use JSDoc style comments for functions and classes in JavaScript/TypeScript
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/01-code-style.mdc)
**/*.{js,jsx,ts,tsx}: Use single quotes for string literals in TypeScript/JavaScript
Always use optional chaining (?.) when accessing object properties in TypeScript/JavaScript
Always use nullish coalescing (??) or default values for potentially undefined values in TypeScript/JavaScript
Always check array existence before using array methods in TypeScript/JavaScript
Validate object properties before destructuring in TypeScript/JavaScript
Use ES6+ features like arrow functions, destructuring, and spread operators in TypeScript/JavaScript
Avoid magic numbers and strings - use named constants in TypeScript/JavaScript
Use async/await instead of raw promises for asynchronous code in TypeScript/JavaScript
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/03-typescript-guidelines.mdc)
**/*.{ts,tsx}: Avoid usinganytype whenever possible - useunknowntype instead with proper type guards
Always define explicit return types for functions, especially for public APIs
Prefer extending existing types over creating entirely new types
Use TypeScript utility types (Partial<T>,Pick<T, K>,Omit<T, K>,Readonly<T>,Record<K, T>) to derive new types
Use union types and intersection types to combine existing types
Always import types explicitly using theimport typesyntax
Group type imports separately from value imports
Minimize creating local type aliases for imported types
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{js,ts,jsx,tsx,css,json}
📄 CodeRabbit inference engine (.cursor/rules/04-code-formatting.mdc)
Maximum line length of 100 characters
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{js,ts,jsx,tsx,css,json,yml,yaml}
📄 CodeRabbit inference engine (.cursor/rules/04-code-formatting.mdc)
Use 2 spaces for indentation, no tabs
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{js,ts,jsx,tsx,css,json,yml,yaml,md}
📄 CodeRabbit inference engine (.cursor/rules/04-code-formatting.mdc)
No trailing whitespace at the end of lines
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{css,scss,sass,less,js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/09-design-system.mdc)
**/*.{css,scss,sass,less,js,jsx,ts,tsx}: Primary color (#155EEF) should be used for main brand color in buttons, links, and accents
Error color (#F04438) should be used for error states and destructive actions
Success color (#12B76A) should be used for success states and confirmations
Warning color (#F79009) should be used for warnings and important notifications
Info color (#0BA5EC) should be used for informational elements
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{tsx,ts}
📄 CodeRabbit inference engine (.cursor/rules/09-i18n-guidelines.mdc)
**/*.{tsx,ts}: Use the translation wrapper component and useTranslation hook in components
Ensure all user-facing text is translatable
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{tsx,ts,json}
📄 CodeRabbit inference engine (.cursor/rules/09-i18n-guidelines.mdc)
Support dynamic content with placeholders in translations
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{tsx,ts,jsx,js,vue,css,scss,less}
📄 CodeRabbit inference engine (.cursor/rules/11-ui-design-patterns.mdc)
**/*.{tsx,ts,jsx,js,vue,css,scss,less}: Use the primary blue (#155EEF) for main UI elements, CTAs, and active states
Use red (#F04438) only for errors, warnings, and destructive actions
Use green (#12B76A) for success states and confirmations
Use orange (#F79009) for warning states and important notifications
Use blue (#0BA5EC) for informational elements
Primary buttons should be solid with the primary color
Secondary buttons should have a border with transparent or light background
Danger buttons should use the error color
Use consistent padding, border radius, and hover states for all buttons
Follow fixed button sizes based on their importance and context
Use consistent border radius (rounded-lg) for all cards
Apply light shadows (shadow-sm) for card elevation
Maintain consistent padding inside cards (p-4orp-6)
Use subtle borders for card separation
Ensure proper spacing between card elements
Apply consistent styling to all form inputs
Use clear visual indicators for focus, hover, and error states in form elements
Apply proper spacing between elements using 8px, 16px, 24px increments
Ensure proper alignment of elements (left, center, or right)
Use responsive layouts that work across different device sizes
Maintain a minimum contrast ratio of 4.5:1 for text
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{tsx,ts,jsx,js,vue}
📄 CodeRabbit inference engine (.cursor/rules/11-ui-design-patterns.mdc)
**/*.{tsx,ts,jsx,js,vue}: Include appropriate loading states for async actions in buttons
Group related form elements with appropriate spacing
Provide clear validation feedback for forms
Ensure proper labeling and accessibility for form elements
Ensure all interactive elements are keyboard accessible
Include appropriate ARIA attributes for complex components
Provide alternative text for images and icons
Support screen readers with semantic HTML elements
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/08-contributing-guidelines.mdc)
**/*.{ts,tsx,js,jsx}: Follow the TypeScript/JavaScript style guidelines
Ensure code is well-tested and documented
Files:
apps/api/src/modules/credit/credit.dto.tspackages/skill-template/src/scheduler/utils/message.tsapps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
apps/api/src/**/*.{controller,service}.ts
📄 CodeRabbit inference engine (.cursor/rules/06-api-structure.mdc)
Implement proper error handling in API modules
Files:
apps/api/src/modules/skill/skill-invoker.service.tsapps/api/src/modules/credit/credit.service.ts
🧬 Code graph analysis (2)
apps/api/src/modules/credit/credit.dto.ts (1)
packages/openapi-schema/src/types.gen.ts (2)
CreditBilling(6095-6124)ModelTier(1532-1532)
apps/api/src/modules/credit/credit.service.ts (2)
apps/api/src/modules/credit/credit.dto.ts (2)
SyncTokenCreditUsageJobData(12-29)ModelUsageDetail(61-71)packages/utils/src/id.ts (1)
genCreditUsageId(285-288)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build / Build
🔇 Additional comments (13)
apps/api/src/modules/credit/credit.dto.ts (2)
1-1: LGTM!The import statement correctly adds
ModelTiertype using theimport typesyntax pattern established in the file.
8-29: Well-structured interface with clear documentation.The
SyncTokenCreditUsageJobDatainterface is well-designed for per-call billing. The JSDoc comments clearly explain the purpose and usage context.One minor inconsistency to note:
cacheReadTokensandcacheWriteTokensare required here but optional inModelUsageDetail(lines 68-69). This is acceptable since this interface is specifically for billing where cache tokens should always be tracked (even if 0), but worth keeping consistent if these interfaces are meant to be interchangeable.apps/api/src/modules/credit/credit.service.ts (4)
17-17: LGTM!Import correctly added for the new DTO type.
1092-1096: Good JSDoc documentation for the new method.The comment clearly describes the purpose and trigger point for this billing method.
1184-1198: LGTM on the zero-cost early return path.The early return correctly creates a usage record with
amount: 0anddueAmounttracking the original cost, and returnsfalseto indicate no recharge is required.
1200-1215: Credit deduction path looks correct.The method properly delegates to
deductCreditsAndCreateUsageand returns therequireRechargeflag to allow the caller to handle insufficient credits.apps/api/src/modules/skill/skill-invoker.service.ts (6)
55-55: LGTM!Import correctly updated to use the new
SyncTokenCreditUsageJobDatatype.
1067-1084: Per-call billing integration looks correct.The billing is triggered at the right point (
on_chat_model_end) with proper gating viashouldBillCredits. The condition correctly allows billing for successful completions and user aborts (since partial usage should be charged).
1548-1574: Routing logic handles Auto model correctly.The code properly checks for routed data and fetches the original item's billing configuration when the request came through an Auto model. Good fallback behavior with warning log on failure.
1576-1584: Early return on missing credit billing is appropriate.The warning log helps with debugging when billing config is missing.
1586-1600: Credit usage data construction is correct.All required fields from
SyncTokenCreditUsageJobDataare properly populated. Note thattierfrom the DTO interface is not being set here - verify if this is intentional.The
SyncTokenCreditUsageJobDatainterface includes an optionaltier?: ModelTierfield, but it's not being set in thecreditUsageDataobject. If tier-based billing or reporting is needed, this should be added:const creditUsageData: SyncTokenCreditUsageJobData = { uid: user.uid, resultId, version, inputTokens, outputTokens, cacheReadTokens, cacheWriteTokens, creditBilling, modelName: billingModelName || 'unknown', actualModelName: actualProviderItem?.name, modelProvider: actualProviderItem?.provider?.name, + tier: actualProviderItem?.tier as ModelTier | undefined, timestamp: new Date(), };
1602-1610: Credit billing processing with proper error handling.The method correctly throws
ModelUsageQuotaExceededwhen credits are insufficient, which will be caught by the caller and handled appropriately. The info log provides useful debugging information.packages/skill-template/src/scheduler/utils/message.ts (1)
198-216: LGTM! AI message caching implementation is correct.The new AI message caching logic properly:
- Converts content to text format for caching
- Applies the cachePoint marker after the content
- Preserves essential properties (
tool_calls,additional_kwargs)- Follows the same pattern as system and human message handling
This ensures AI messages in the conversation history can be cached while maintaining their full context.
| * Two-level caching strategy: | ||
| * 1. Global Static Point: After System Prompt (index 0) | ||
| * - Shared across all users and sessions | ||
| * - Contains: System Prompt + Tool Definitions + Examples | ||
| * - Benefits: Write once, reuse globally | ||
| * 2. Session Dynamic Point: After the second-to-last message (messages.length - 2) | ||
| * - Caches the conversation history for the current user | ||
| * - Benefits: Reuses multi-turn conversation context within a session | ||
| * | ||
| * LangChain AWS uses cachePoint markers: | ||
| * - Format: { cachePoint: { type: 'default' } } | ||
| * - Only cache messages before the final message (not the last user query) | ||
| * - Place the cachePoint marker AFTER the content to cache | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documentation inconsistency with implementation.
Line 122 states the Session Dynamic Point is "After the second-to-last message (messages.length - 2)", but the actual implementation (lines 138-141) caches THREE positions: length-2, length-3, and length-4.
🔎 Apply this diff to correct the documentation:
-* 2. Session Dynamic Point: After the second-to-last message (messages.length - 2)
+* 2. Session Dynamic Point: After the last 3 messages except the final user message (indices length-2, length-3, length-4)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * Two-level caching strategy: | |
| * 1. Global Static Point: After System Prompt (index 0) | |
| * - Shared across all users and sessions | |
| * - Contains: System Prompt + Tool Definitions + Examples | |
| * - Benefits: Write once, reuse globally | |
| * 2. Session Dynamic Point: After the second-to-last message (messages.length - 2) | |
| * - Caches the conversation history for the current user | |
| * - Benefits: Reuses multi-turn conversation context within a session | |
| * | |
| * LangChain AWS uses cachePoint markers: | |
| * - Format: { cachePoint: { type: 'default' } } | |
| * - Only cache messages before the final message (not the last user query) | |
| * - Place the cachePoint marker AFTER the content to cache | |
| */ | |
| * Two-level caching strategy: | |
| * 1. Global Static Point: After System Prompt (index 0) | |
| * - Shared across all users and sessions | |
| * - Contains: System Prompt + Tool Definitions + Examples | |
| * - Benefits: Write once, reuse globally | |
| * 2. Session Dynamic Point: After the last 3 messages except the final user message (indices length-2, length-3, length-4) | |
| * - Caches the conversation history for the current user | |
| * - Benefits: Reuses multi-turn conversation context within a session | |
| * | |
| * LangChain AWS uses cachePoint markers: | |
| * - Format: { cachePoint: { type: 'default' } } | |
| * - Place the cachePoint marker AFTER the content to cache | |
| */ |
🤖 Prompt for AI Agents
In packages/skill-template/src/scheduler/utils/message.ts around lines 117 to
129, the documentation incorrectly states the Session Dynamic Point caches only
the second-to-last message (messages.length - 2) while the implementation caches
three positions; update the docblock to describe that the Session Dynamic Point
places cache markers after the second-to-last, third-to-last, and fourth-to-last
messages (messages.length - 2, messages.length - 3, messages.length - 4) and
keep the note that cachePoint markers must be placed AFTER the content to cache
so the docs match the code.
| // Determine if this message should have a cache point | ||
| // 1. Global Static Point: After System Prompt (index 0) | ||
| // 2. Session Dynamic Point: After the last 3 messages except the last user message (index -2, -3, -4) | ||
| const isGlobalStaticPoint = index === 0; | ||
| const isSessionDynamicPoint = | ||
| index === messages.length - 2 || | ||
| index === messages.length - 3 || | ||
| index === messages.length - 4; | ||
| const shouldAddCachePoint = isGlobalStaticPoint || isSessionDynamicPoint; | ||
|
|
||
| if (!shouldAddCachePoint) { | ||
| return message; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential duplicate cache points for short message arrays.
When the message array has 2-4 messages, the system message at index 0 could receive duplicate cache point markers because it matches both isGlobalStaticPoint and one of the isSessionDynamicPoint conditions:
messages.length = 2: index 0 matches both global (index 0) and session (length-2 = 0)messages.length = 3: index 0 matches both global (index 0) and session (length-3 = 0)messages.length = 4: index 0 matches both global (index 0) and session (length-4 = 0)
While this won't crash, it applies caching transformation twice to the same message, which could cause unexpected behavior depending on how the cache provider handles duplicate markers.
🔎 Apply this diff to prevent duplicate cache points:
const applyContextCaching = (messages: BaseMessage[]): BaseMessage[] => {
if (messages.length <= 1) return messages;
return messages.map((message, index) => {
// Determine if this message should have a cache point
// 1. Global Static Point: After System Prompt (index 0)
// 2. Session Dynamic Point: After the last 3 messages except the last user message (index -2, -3, -4)
const isGlobalStaticPoint = index === 0;
const isSessionDynamicPoint =
- index === messages.length - 2 ||
- index === messages.length - 3 ||
- index === messages.length - 4;
+ index > 0 && (
+ index === messages.length - 2 ||
+ index === messages.length - 3 ||
+ index === messages.length - 4
+ );
const shouldAddCachePoint = isGlobalStaticPoint || isSessionDynamicPoint;🤖 Prompt for AI Agents
In packages/skill-template/src/scheduler/utils/message.ts around lines 134 to
146, the logic that marks cache points can double-mark index 0 for short
messages (length 2–4) because isSessionDynamicPoint includes indices that can
equal 0; change the condition so session dynamic points exclude index 0 (e.g.,
require index > 0) or compute session indices first and dedupe before combining
with the global static point, then use the combined unique result to decide
shouldAddCachePoint so index 0 cannot be added twice.
| // Use _getType() instead of instanceof to handle deserialized messages | ||
| const messageType = message._getType(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use .getType() method instead of _getType() for better API stability.
Current LangChain documentation recommends using .getType() instead, and official code examples use message.getType(). While _getType() may work, typeguard functions like isAIMessage() are the preferred alternative, especially for proper type checking.
🤖 Prompt for AI Agents
In packages/skill-template/src/scheduler/utils/message.ts around lines 149 to
150, the code uses the internal method _getType() which is unstable; replace the
call to message._getType() with the public API message.getType() (or, where
appropriate, use the provided typeguard helpers like
isAIMessage()/isHumanMessage() for type checks) so the code relies on the
supported, stable LangChain API and proper typeguards for deserialized messages.
Summary by CodeRabbit
New Features
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.