feat: add native Groq model integration for high-speed evaluations#2556

Open
Jayachander123 wants to merge 5 commits into confident-ai:main from Jayachander123:add-groq-integration

Conversation


@Jayachander123 Jayachander123 commented Mar 17, 2026

Description

This PR introduces native support for Groq (GroqModel) to the DeepEval framework, allowing users to leverage Groq's ultra-fast LPU inference engine for high-speed LLM evaluations.

Motivation

As evaluation datasets grow larger, metric calculation speed becomes a bottleneck. Integrating Groq natively allows developers to run DeepEval metrics significantly faster using models like llama3-8b-8192.

Changes Made

  • Core Model (deepeval/models/llms/groq_model.py): Implemented GroqModel inheriting from DeepEvalBaseLLM. Added support for both synchronous (generate) and asynchronous (a_generate) execution, as well as JSON mode structured outputs.
  • Lazy Loading: Utilized DeepEval's require_dependency for the groq SDK. This ensures groq remains an optional dependency and does not bloat the environment for non-Groq users.
  • Configuration (deepeval/config/settings.py): Added GROQ_API_KEY, GROQ_MODEL_NAME, and USE_GROQ_MODEL to the central Pydantic Settings class for seamless .env management.
  • Security & Type Safety: Handled Pydantic SecretStr unwrapping in the model loaders to ensure secure key management without causing isinstance type errors in the underlying Groq client.
  • Documentation (docs/): Added comprehensive .mdx documentation for Groq under the integrations tab, including setup instructions and Python/ENV usage examples.
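The sync/async split and the SecretStr unwrapping described above can be sketched roughly as follows. This is an illustrative stand-in, not the PR's actual code: `SecretStr`, `unwrap_key`, and `GroqModelSketch` are minimal stubs mimicking the pydantic and DeepEval pieces, and the real `generate` would of course call Groq's chat completions API rather than echo the prompt.

```python
import asyncio


class SecretStr:
    """Minimal stand-in for pydantic.SecretStr."""

    def __init__(self, value: str):
        self._value = value

    def get_secret_value(self) -> str:
        return self._value


def unwrap_key(key):
    # Unwrap a SecretStr into a plain str before handing it to the
    # client, so isinstance-based str checks inside the SDK don't fail.
    return key.get_secret_value() if isinstance(key, SecretStr) else key


class GroqModelSketch:
    def __init__(self, api_key, model_name="llama3-8b-8192"):
        self.api_key = unwrap_key(api_key)
        self.model_name = model_name

    def generate(self, prompt: str) -> str:
        # The real method would call Groq's chat completions endpoint.
        return f"[{self.model_name}] {prompt}"

    async def a_generate(self, prompt: str) -> str:
        # The async path mirrors the sync one (via AsyncGroq in the PR).
        return self.generate(prompt)


model = GroqModelSketch(SecretStr("gsk-demo"))
print(model.api_key)                        # plain str, not SecretStr
print(asyncio.run(model.a_generate("hi")))
```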

Testing

Added comprehensive unit tests in tests/test_core/test_models/test_groq_model.py:

  • Verified that explicit API keys passed to the constructor override .env settings.
  • Verified that if no key is provided, the model correctly falls back to Settings.GROQ_API_KEY and successfully unwraps the SecretStr before passing it to the client.
  • All tests pass locally.
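The key-resolution behavior under test can be sketched as below. `Settings` and `resolve_api_key` are simplified stand-ins for the Pydantic settings class and the constructor logic, just to show the precedence being verified:

```python
class Settings:
    # Stand-in for the .env-backed Pydantic settings class.
    GROQ_API_KEY = "env-key"


def resolve_api_key(explicit_key=None, settings=Settings):
    # An explicit constructor argument wins; otherwise fall back
    # to the centrally managed Settings value.
    return explicit_key if explicit_key is not None else settings.GROQ_API_KEY


# Explicit key overrides the .env-backed Settings value:
assert resolve_api_key("ctor-key") == "ctor-key"
# No key provided -> fall back to Settings.GROQ_API_KEY:
assert resolve_api_key() == "env-key"
```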

Checklist

  • Code is formatted with black and linted with ruff
  • New unit tests added and passing
  • Lazy loading implemented (no new required dependencies added to pyproject.toml)
  • Documentation added following existing integration formats


vercel Bot commented Mar 17, 2026

@Jayachander123 is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

@Jayachander123
Author

Hi team! Just a quick heads-up on the failing CI checks: the new code itself passes locally, but the pipeline is hitting some standard fork-related restrictions:

  1. Core & Confident Tests (test): These are failing due to missing OPENAI_API_KEY and CONFIDENT_API_KEY secrets in the GitHub Actions environment, which is the expected security behavior for PRs originating from a fork. My local tests for GroqModel pass successfully.
  2. Lint (lint): The black formatting step failed because it flagged 102 existing files across the repository that need reformatting. I already ran black and ruff locally on the 4 specific files I modified, so my additions are fully compliant with your style guide.
  3. Vercel: This is pending because Vercel requires a repository maintainer to manually authorize preview deployments from outside contributors.

Let me know if you need me to adjust anything, or if you are able to trigger the test workflows on your end with the secrets enabled!

@A-Vamshi
Collaborator

Hey @Jayachander123, thanks for raising this PR! It looks mostly good, just added some comments above, could you please resolve them? Let me know if you have any doubts, thank you! :)

@Jayachander123
Author

> Hey @Jayachander123, thanks for raising this PR! It looks mostly good, just added some comments above, could you please resolve them? Let me know if you have any doubts, thank you! :)

Hi @A-Vamshi! Thank you so much for taking the time to review this. I would love to resolve your comments, but I actually don't see any inline code comments on my end in the PR diff or the conversation timeline.

Please let me know what you'd like me to change and I'll get right on it!

Collaborator

@A-Vamshi A-Vamshi left a comment


Hey @Jayachander123, here's the comments, hope they're visible now :)

Comment thread deepeval/models/llms/groq_model.py Outdated
default_groq_model = "llama3-8b-8192"

# Use a standard string for the retry decorator if ProviderSlug.GROQ doesn't exist yet
retry_groq = create_retry_decorator("Groq")
Collaborator


Can we add the groq as enum inside the PS (ProviderSlug) itself? Here's where you can add it btw: https://github.com/confident-ai/deepeval/blob/main/deepeval/constants.py#L27

Author


Done! I've added GROQ = "groq" to the ProviderSlug enum in deepeval/constants.py and updated the retry decorator to use it.
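The change amounts to one new member on a string enum. Sketched here with a minimal `ProviderSlug` stub (the real enum in deepeval/constants.py has more members):

```python
from enum import Enum


class ProviderSlug(str, Enum):
    # Existing members elided; illustrative subset only.
    OPENAI = "openai"
    GROQ = "groq"  # new entry for the Groq integration


# A str-backed enum compares and serializes as a plain string,
# which is what the retry decorator can key off.
print(ProviderSlug.GROQ.value)
```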

Comment thread deepeval/models/llms/groq_model.py Outdated
return self._client

def load_async_model(self) -> "AsyncGroq":
"""Initializes and caches the asynchronous Groq client."""
Collaborator


It seems like most of the code here is repeated but for sync and async logic, could we combine the logic into one method load_model and separate the logic using self.async_mode? Please look at the existing examples to see how we can do that: https://github.com/confident-ai/deepeval/blob/main/deepeval/models/llms/grok_model.py#L256

Author


Done! I've removed load_async_model and combined both into a single load_model(self, async_mode: bool = False) method, following the Grok model example.
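The merged-loader pattern looks roughly like this. `Groq` and `AsyncGroq` are stubbed so the branching and caching are runnable without the SDK; the real method follows the Grok model example linked above:

```python
class Groq:
    def __init__(self, api_key):
        self.api_key = api_key


class AsyncGroq:
    def __init__(self, api_key):
        self.api_key = api_key


class ModelSketch:
    def __init__(self, api_key):
        self.api_key = api_key
        self._client = None
        self._async_client = None

    def load_model(self, async_mode: bool = False):
        # One entry point; branch on async_mode and cache each
        # client independently so repeated calls reuse the instance.
        if async_mode:
            if self._async_client is None:
                self._async_client = AsyncGroq(api_key=self.api_key)
            return self._async_client
        if self._client is None:
            self._client = Groq(api_key=self.api_key)
        return self._client


m = ModelSketch("gsk-demo")
print(type(m.load_model()).__name__)                  # Groq
print(type(m.load_model(async_mode=True)).__name__)   # AsyncGroq
print(m.load_model() is m.load_model())               # True (cached)
```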

# Generation Methods
# -------------------------------------------------------------------------
@retry_groq
def generate(
Collaborator


For the generate and a_generate method, if you look at the other model examples, we have a way to support images inside them as well, an example for this is here: https://github.com/confident-ai/deepeval/blob/main/deepeval/models/llms/openai_model.py#L154

I've looked through the official docs to see how to pass images, you can see them here: https://console.groq.com/docs/vision#how-to-pass-images-from-urls-as-input

It looks like the API is mostly similar to openai so the above shared example should help you implement this, don't worry about supporting images if it's too hard, you can safely ignore this comment :)

Author


Thank you for sharing the Groq vision docs! I will safely ignore this one for now just to keep this initial integration focused and stable, but we can definitely add multimodal support in a future PR.

Comment thread deepeval/models/llms/groq_model.py Outdated
Comment on lines +202 to +215
def supports_log_probs(self) -> Union[bool, None]:
return False

def supports_temperature(self) -> Union[bool, None]:
return True

def supports_multimodal(self) -> Union[bool, None]:
return False

def supports_structured_outputs(self) -> Union[bool, None]:
return True

def supports_json_mode(self) -> Union[bool, None]:
return True
Collaborator


For these methods, we just need to add the constants inside the self.model_data and return them here, I'm sharing some examples that would help you understand how to write this here:

Author


Done! I added GROQ_MODELS_DATA to deepeval/models/llms/constants.py using make_model_data (with verified Groq API pricing). I then updated `__init__` to load this into self.model_data and replaced the hardcoded booleans here with dynamic lookups.
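The data-driven lookup pattern can be sketched like this. The dict shape and field names below are illustrative, not make_model_data's actual schema:

```python
# Illustrative per-model capability table (schema is hypothetical).
GROQ_MODELS_DATA = {
    "llama3-8b-8192": {
        "supports_json_mode": True,
        "supports_structured_outputs": True,
        "supports_multimodal": False,
        "supports_log_probs": False,
    },
}


class ModelSketch:
    def __init__(self, model_name="llama3-8b-8192"):
        # Load this model's capability row once, then have each
        # supports_* method read from it instead of hardcoding.
        self.model_data = GROQ_MODELS_DATA[model_name]

    def supports_json_mode(self):
        return self.model_data["supports_json_mode"]

    def supports_multimodal(self):
        return self.model_data["supports_multimodal"]


m = ModelSketch()
print(m.supports_json_mode())   # True
print(m.supports_multimodal())  # False
```

Keeping capabilities in one table means adding a new Groq model is a single dict entry rather than a new set of overridden methods.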

@Jayachander123 Jayachander123 requested a review from A-Vamshi March 19, 2026 20:12
Collaborator

@A-Vamshi A-Vamshi left a comment


Hey @Jayachander123, everything looks good, would you be open to writing docs for this as well? If you're willing to write them, please follow the existing pattern used in the existing models docs in the integrations tab, thank you :) Let me know if you're not too keen on the documentation side, we can get this PR merged right away!

@Jayachander123
Author

> Hey @Jayachander123, everything looks good, would you be open to writing docs for this as well? If you're willing to write them, please follow the existing pattern used in the existing models docs in the integrations tab, thank you :) Let me know if you're not too keen on the documentation side, we can get this PR merged right away!

Hey @A-Vamshi! I'm really glad the code looks good. I went ahead and wrote the documentation as you suggested.

I added groq.mdx following the exact format of the existing Gemini docs in the integrations tab, I've pushed the commit to this PR and updated the description.

Let me know if the docs look good or if you need any formatting tweaks before we merge! Thanks again for all the help.

@Jayachander123 Jayachander123 requested a review from A-Vamshi March 23, 2026 23:43
@penguine-ip
Contributor

Hey @Jayachander123 we're missing a few things, specifically, add calculate_cost, use require_costs, add generation_kwargs, remove the os.environ fallback, add cost Settings fields, from what i can see

@Jayachander123
Author

> Hey @Jayachander123 we're missing a few things, specifically, add calculate_cost, use require_costs, add generation_kwargs, remove the os.environ fallback, add cost Settings fields, from what i can see

Hey @penguine-ip, thanks for catching those! I've just pushed a commit that addresses all your points to align with the latest model standards:

  • Added cost Settings fields: Added GROQ_COST_PER_INPUT_TOKEN and GROQ_COST_PER_OUTPUT_TOKEN to deepeval/config/settings.py.
  • Removed os.environ fallback: Updated `__init__` so API keys and costs now rely purely on the central Settings class.
  • Added generation_kwargs: Added this to `__init__` and unpacked it into chat_args in both generate and a_generate.
  • Added calculate_cost & require_costs: Implemented the calculate_cost method using the require_costs utility for strict pricing validation, and updated the generation methods to return the actual calculated cost instead of 0.0.
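The cost calculation itself is simple per-token arithmetic. A minimal sketch, with placeholder rates standing in for the GROQ_COST_PER_INPUT_TOKEN / GROQ_COST_PER_OUTPUT_TOKEN Settings fields (actual pricing comes from Groq's published rates):

```python
def calculate_cost(input_tokens: int, output_tokens: int,
                   cost_per_input: float, cost_per_output: float) -> float:
    # Total cost = tokens consumed on each side times its per-token rate.
    return input_tokens * cost_per_input + output_tokens * cost_per_output


# Placeholder rates: $0.05 / 1M input tokens, $0.08 / 1M output tokens.
cost = calculate_cost(1_000, 500, 0.05 / 1e6, 0.08 / 1e6)
print(f"{cost:.8f}")  # 0.00009000
```

With require_costs validating that both rates are set, the generation methods can return this value instead of a hardcoded 0.0.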
