feat(conversation): add prompt cache + usage metrics to test caching #4154
base: main
Conversation
… works Signed-off-by: Samantha Coyle <[email protected]>
// NOTE: These are all translations due to langchaingo data types.

// extractInt64FromGenInfo extracts an int64 usage value for the given key
// from langchaingo's GenerationInfo map in the choices response.
func extractInt64FromGenInfo(genInfo map[string]any, key string) int64 {
Should we not return an error here when the value is not a number? Do we not expect other number types (uint)?
I initially defined this as int64 only in my original PR. However, after splitting this effort out (and what I'm assuming was a subsequent version bump), my conformance tests no longer returned usage metrics, which required loosening the type handling here. This change resolved the issue.
TLDR: Yes, I can add support for uint as a defensive measure and return an error for any other types.
conversation/converse.go (outdated)
	Metadata             map[string]string `json:"metadata"`
	Model                *string           `json:"model"`

	PromptCacheRetention time.Duration     `json:"promptCacheRetention"`
Should this be optional?
Suggested change:
- PromptCacheRetention time.Duration  `json:"promptCacheRetention"`
+ PromptCacheRetention *time.Duration `json:"promptCacheRetention"`
Description
This PR breaks out work from a larger contrib PR into a smaller, more focused change: #4129
It introduces response caching support via metadata passed to LLM providers. This serves as a workaround for LangChain’s WithPromptCaching(true) option, which currently sets a boolean value that fails Dapr conformance tests because OpenAI-based providers expect a duration string rather than a boolean.
To properly validate this workaround, usage metrics support was added, which required additional data type translations within LangChain.
The PR also renames CacheTTL to ResponseCacheTTL to more accurately reflect its behavior. Backward compatibility is maintained by continuing to support the original JSON tag.
Finally, the LangChain dependency was updated to support the newly required options that I am using.
Reconfirmed things work as expected:
Issue reference
We strive to have all PRs opened based on an issue, where the problem or feature has been discussed prior to implementation.
Please reference the issue this PR will close: #[issue number]
Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list:
Note: We expect contributors to open a corresponding documentation PR in the dapr/docs repository. As the implementer, you are the best person to document your work! Implementation PRs will not be merged until the documentation PR is opened and ready for review.