feat: add native Groq model integration for high-speed evaluations #2556
Jayachander123 wants to merge 5 commits into confident-ai:main from
Conversation
@Jayachander123 is attempting to deploy a commit to the Confident AI Team on Vercel. A member of the Team first needs to authorize it.
Hi team! Just a quick heads-up regarding the failing CI checks—my new code is working perfectly, but the pipeline is hitting some standard fork-related restrictions:
Let me know if you need me to adjust anything, or if you are able to trigger the test workflows on your end with the secrets enabled!
Hey @Jayachander123, thanks for raising this PR! It looks mostly good; I've just added some comments above, could you please resolve them? Let me know if you have any doubts, thank you! :)
Hi @A-Vamshi! Thank you so much for taking the time to review this. I would love to resolve your comments, but I actually don't see any inline code comments on my end in the PR diff or the conversation timeline. Please let me know what you'd like me to change and I'll get right on it! |
Hey @Jayachander123, here are the comments, hope they're visible now :)
```python
default_groq_model = "llama3-8b-8192"

# Use a standard string for the retry decorator if ProviderSlug.GROQ doesn't exist yet
retry_groq = create_retry_decorator("Groq")
```
Can we add groq as an enum member inside ProviderSlug (PS) itself? Here's where you can add it btw: https://github.com/confident-ai/deepeval/blob/main/deepeval/constants.py#L27
Done! I've added GROQ = "groq" to the ProviderSlug enum in deepeval/constants.py and updated the retry decorator to use it.
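For anyone following along, a minimal sketch of what that change could look like; the enum's other members are elided here and the `create_retry_decorator` usage is an assumption based on this thread, not copied from the repo:

```python
# Sketch of the addition to deepeval/constants.py (existing members elided).
from enum import Enum

class ProviderSlug(Enum):
    # ...other provider slugs already defined in the file...
    GROQ = "groq"

# In groq_model.py the retry decorator can then reference the enum member
# instead of a raw string, e.g.:
#   retry_groq = create_retry_decorator(ProviderSlug.GROQ)
```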
```python
    return self._client

def load_async_model(self) -> "AsyncGroq":
    """Initializes and caches the asynchronous Groq client."""
```
It seems like most of the code here is repeated for the sync and async logic; could we combine it into one load_model method and separate the two paths using self.async_mode? Please look at the existing examples to see how we can do that: https://github.com/confident-ai/deepeval/blob/main/deepeval/models/llms/grok_model.py#L256
Done! I've removed load_async_model and combined both into a single load_model(self, async_mode: bool = False) method, following the Grok model example.
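For readers, a minimal self-contained sketch of the combined loader; the attribute names and caching scheme are illustrative, not the PR's exact code:

```python
from typing import Union

from groq import Groq, AsyncGroq  # optional dependency; the real code gates this behind require_dependency


class GroqClientLoaderSketch:
    """Illustrative skeleton only — not the exact GroqModel implementation."""

    def __init__(self, api_key: str):
        self._api_key = api_key
        self._client = None
        self._async_client = None

    def load_model(self, async_mode: bool = False) -> Union[Groq, AsyncGroq]:
        """Initialize and cache the sync or async Groq client, selected by the flag."""
        if async_mode:
            if self._async_client is None:
                self._async_client = AsyncGroq(api_key=self._api_key)
            return self._async_client
        if self._client is None:
            self._client = Groq(api_key=self._api_key)
        return self._client
```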
```python
# Generation Methods
# -------------------------------------------------------------------------
@retry_groq
def generate(
```
For the generate and a_generate methods: if you look at the other model examples, we have a way to support images inside them as well; an example is here: https://github.com/confident-ai/deepeval/blob/main/deepeval/models/llms/openai_model.py#L154
I've looked through the official docs to see how to pass images; you can see them here: https://console.groq.com/docs/vision#how-to-pass-images-from-urls-as-input
It looks like the API is mostly similar to OpenAI, so the example shared above should help you implement this. Don't worry about supporting images if it's too hard; you can safely ignore this comment :)
Thank you for sharing the Groq vision docs! I will safely ignore this one for now just to keep this initial integration focused and stable, but we can definitely add multimodal support in a future PR.
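For future reference, a hedged sketch of what passing an image URL could look like, following the OpenAI-style content format described in the linked Groq vision docs; the model name below is only an example and may not match currently available vision models:

```python
from groq import Groq

client = Groq()  # picks up GROQ_API_KEY from the environment
response = client.chat.completions.create(
    model="llama-3.2-11b-vision-preview",  # illustrative vision-capable model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```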
```python
def supports_log_probs(self) -> Union[bool, None]:
    return False

def supports_temperature(self) -> Union[bool, None]:
    return True

def supports_multimodal(self) -> Union[bool, None]:
    return False

def supports_structured_outputs(self) -> Union[bool, None]:
    return True

def supports_json_mode(self) -> Union[bool, None]:
    return True
```
For these methods, we just need to add the constants inside self.model_data and return them here. I'm sharing some examples that would help you understand how to write this:
Done! I added GROQ_MODELS_DATA to deepeval/models/llms/constants.py using make_model_data (with verified Groq API pricing). I then updated __init__ to load this into self.model_data and replaced the hardcoded booleans here with dynamic lookups.
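A minimal sketch of what those dynamic lookups might look like; the field names on the model-data object are assumptions based on this thread, not deepeval's exact make_model_data schema:

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class _ModelDataSketch:
    # Stand-in for one entry of GROQ_MODELS_DATA; real field names may differ.
    supports_json_mode: bool = True
    supports_structured_outputs: bool = True
    supports_multimodal: bool = False


class GroqCapabilitySketch:
    def __init__(self, model_data: _ModelDataSketch):
        self.model_data = model_data

    def supports_json_mode(self) -> Union[bool, None]:
        return self.model_data.supports_json_mode

    def supports_multimodal(self) -> Union[bool, None]:
        return self.model_data.supports_multimodal
```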
A-Vamshi left a comment
Hey @Jayachander123, everything looks good! Would you be open to writing docs for this as well? If so, please follow the pattern used in the existing model docs in the integrations tab, thank you :) If you're not too keen on the documentation side, let me know and we can get this PR merged right away!
Hey @A-Vamshi! I'm really glad the code looks good. I went ahead and wrote the documentation as you suggested: I added groq.mdx following the exact format of the existing Gemini docs in the integrations tab, pushed the commit to this PR, and updated the description. Let me know if the docs look good or if you need any formatting tweaks before we merge! Thanks again for all the help.
Hey @Jayachander123, we're missing a few things. Specifically: add calculate_cost, use require_costs, add generation_kwargs, remove the os.environ fallback, and add cost Settings fields, from what I can see.
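For context, a minimal sketch of the kind of per-token cost calculation being requested; the pricing values and field names below are illustrative, not the PR's actual constants:

```python
from dataclasses import dataclass


@dataclass
class _PricingSketch:
    # Illustrative USD prices per one million tokens; real values belong in GROQ_MODELS_DATA.
    input_per_million: float
    output_per_million: float


def calculate_cost(pricing: _PricingSketch, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one completion from prompt and completion token counts."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Example: 1,500 prompt tokens and 400 completion tokens at $0.05 / $0.08 per 1M tokens
print(calculate_cost(_PricingSketch(0.05, 0.08), 1_500, 400))
```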
Hey @penguine-ip, thanks for catching those! I've just pushed a commit that addresses all your points to align with the latest model standards:
Description
This PR introduces native support for Groq (`GroqModel`) to the DeepEval framework, allowing users to leverage Groq's ultra-fast LPU inference engine for high-speed LLM evaluations.

Motivation

As evaluation datasets grow larger, metric calculation speed becomes a bottleneck. Integrating Groq natively allows developers to run DeepEval metrics significantly faster using models like `llama3-8b-8192`.

Changes Made

- (`deepeval/models/llms/groq_model.py`): Implemented `GroqModel` inheriting from `DeepEvalBaseLLM`. Added support for both synchronous (`generate`) and asynchronous (`a_generate`) execution, as well as JSON mode structured outputs.
- `require_dependency` for the `groq` SDK. This ensures `groq` remains an optional dependency and does not bloat the environment for non-Groq users.
- (`deepeval/config/settings.py`): Added `GROQ_API_KEY`, `GROQ_MODEL_NAME`, and `USE_GROQ_MODEL` to the central Pydantic `Settings` class for seamless `.env` management.
- `SecretStr` unwrapping in the model loaders to ensure secure key management without causing `isinstance` type errors in the underlying Groq client.
- (`docs/`): Added comprehensive `.mdx` documentation for Groq under the integrations tab, including setup instructions and Python/ENV usage examples.
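For illustration, a hedged sketch of how a user might plug the new model into a metric once this lands; the `GroqModel` constructor arguments and import path are assumptions based on the changes described above, not the final merged API:

```python
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase
from deepeval.models.llms.groq_model import GroqModel  # path as described in this PR

# Assumes GROQ_API_KEY is set via .env or the Settings class.
model = GroqModel(model="llama3-8b-8192")
metric = AnswerRelevancyMetric(model=model)
metric.measure(
    LLMTestCase(
        input="What is the capital of France?",
        actual_output="Paris is the capital of France.",
    )
)
print(metric.score)
```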
Testing

Added comprehensive unit tests in `tests/test_core/test_models/test_groq_model.py`:

- `.env` settings
- `Settings.GROQ_API_KEY` and successfully unwraps the `SecretStr` before passing it to the client

Checklist

- `black` and linted with `ruff`
- `pyproject.toml`)