Add intent extraction test suite and fix priority extraction #181
anfredette wants to merge 2 commits into llm-d-incubation:main
Conversation
Add comprehensive test suite for LLM intent extraction covering all 9 use cases, GPU/model extraction, priority enforcement, user counts, conversation history, and UI button strings. Tests require Ollama and are run explicitly via `make test-intent`. Two smoke tests run as part of `make test-integration`. Assisted-by: Claude <noreply@anthropic.com> Signed-off-by: Andre Fredette <afredette@redhat.com>
anfredette force-pushed from 49e3c52 to b452605
Priority extraction fix:
- LLM now returns *_mentioned booleans alongside *_priority values. Post-processing trusts a priority only when _mentioned=true and resets it to "medium" otherwise. This prevents the LLM (qwen2.5:7b) from inferring priorities from use-case type rather than explicit user statements. The SLO profiles already handle use-case-appropriate targets.

Prompt rewrite (for smaller LLMs):
- Replace the verbose prose prompt with a short, directive format using ordered pattern-matching rules. The prompt is now self-contained (schema embedded inline), so the INTENT_EXTRACTION_SCHEMA constant and the schema_description parameter on extract_structured_data() are removed.
- Remove experience_class, complexity_priority, and additional_context from the LLM prompt. experience_class is inferred deterministically from use_case in post-processing; complexity_priority and additional_context were never consumed downstream.

Post-processing hardening:
- Case-insensitive normalization for domain_specialization, experience_class, and *_mentioned booleans (handles string "True").
- Lowercase use_case before alias/fuzzy lookup so mixed-case LLM responses like "Text_Summarization" are handled correctly.
- Add logger.warning when an unrecognized use_case cannot be resolved by the alias map or fuzzy match.
- Apply priority value aliases (e.g. "very_high" -> "high") before validation.
- Remove stale complexity_priority from the test helper _base_intent().
- Add 4 unit tests for case-insensitive normalization.

Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Andre Fredette <afredette@redhat.com>
anfredette force-pushed from b452605 to ad8c9a7
anfredette (Author):
@amito I suggest you test this with the use cases that were giving you trouble. Better yet, we can add them to the test cases now or in a later PR.
Add 39-scenario test suite for LLM intent extraction covering all 9 use cases, GPU/model extraction, priority enforcement, user counts, conversation history, and UI button strings. Run explicitly via make test-intent (requires Ollama); 2 smoke tests run as part of make test-integration.
Rewrite the intent extraction prompt as a short, directive format with ordered pattern-matching rules, replacing the verbose prose prompt. Remove experience_class, complexity_priority, and additional_context from the prompt; these fields were either inferred deterministically in post-processing or never consumed downstream. Make the prompt self-contained (schema embedded inline), removing the INTENT_EXTRACTION_SCHEMA constant and the schema_description parameter from extract_structured_data().
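For illustration, a directive prompt in the spirit of the rewrite described above might look like the sketch below. The rule wording, field names, and inline schema here are assumptions, not the PR's actual prompt text.

```python
# Hypothetical sketch of a short, directive prompt with ordered
# pattern-matching rules and the schema embedded inline (self-contained,
# so no separate schema constant or schema_description parameter is needed).
INTENT_PROMPT = """Extract the user's deployment intent. Apply rules in order:
1. If the message names a use case (chatbot, summarization, code generation, ...),
   set use_case to it.
2. Set <topic>_mentioned=true ONLY if the user explicitly discusses that topic.
3. Set <topic>_priority ONLY when <topic>_mentioned=true; otherwise use "medium".

Return JSON matching exactly this schema:
{"use_case": str,
 "latency_priority": "low|medium|high", "latency_mentioned": bool,
 "cost_priority": "low|medium|high", "cost_mentioned": bool}
"""
```

Embedding the schema inline keeps the prompt and its expected output format in one place, which tends to help smaller models follow the contract.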
Fix unreliable priority extraction: the LLM (qwen2.5:7b) was inferring priorities from use case type rather than from explicit user statements, causing incorrect scoring weights. The LLM now returns *_mentioned booleans alongside *_priority values, and post-processing resets priority to "medium" when the topic was not explicitly mentioned.
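The reset step described above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the helper name and the topic list are assumptions, while the *_priority/*_mentioned field pairing and the "medium" default come from the PR description.

```python
DEFAULT_PRIORITY = "medium"
PRIORITY_TOPICS = ["latency", "throughput", "cost"]  # assumed topic list

def reset_unmentioned_priorities(intent: dict) -> dict:
    """Trust a *_priority value only when its *_mentioned flag is true."""
    for topic in PRIORITY_TOPICS:
        if not intent.get(f"{topic}_mentioned", False):
            # The LLM inferred this priority from the use-case type, not from
            # an explicit user statement; fall back to the neutral default.
            intent[f"{topic}_priority"] = DEFAULT_PRIORITY
    return intent
```

For example, an intent with latency_priority="high" but latency_mentioned=false would come out of this step with latency_priority="medium".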
Harden post-processing: case-insensitive normalization for domain_specialization, experience_class, use_case, and *_mentioned booleans; priority value aliases (e.g. "very_high" -> "high") applied before validation; logger.warning when an unrecognized use_case cannot be resolved by alias map or fuzzy match. Add 4 unit tests for case-insensitive normalization.
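The normalization steps above could be sketched like this. The helper names and the exact alias map are assumptions; the "very_high" -> "high" alias, the string-boolean handling, and the applied-before-validation ordering come from the PR description.

```python
PRIORITY_ALIASES = {"very_high": "high", "very_low": "low"}  # assumed alias map
VALID_PRIORITIES = {"low", "medium", "high"}

def normalize_bool(value) -> bool:
    """Handle string booleans like "True"/"FALSE" sometimes returned by LLMs."""
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)

def normalize_priority(value) -> str:
    """Case-insensitive priority normalization with aliases applied first."""
    v = str(value).strip().lower()
    v = PRIORITY_ALIASES.get(v, v)  # apply aliases before validation
    return v if v in VALID_PRIORITIES else "medium"
```

Normalizing before validation means mixed-case or aliased LLM output ("Very_High") passes cleanly instead of being rejected and silently defaulted.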