
Add intent extraction test suite and fix priority extraction #181

Open
anfredette wants to merge 2 commits into llm-d-incubation:main from anfredette:intent-extraction-test

Collaborator

@anfredette anfredette commented Apr 13, 2026

  • Add 39-scenario test suite for LLM intent extraction covering all 9 use cases, GPU/model extraction, priority enforcement, user counts, conversation history, and UI button strings. Run explicitly via make test-intent (requires Ollama); 2 smoke tests run as part of make test-integration.

  • Rewrite the intent extraction prompt as a short, directive format with ordered pattern-matching rules, replacing the verbose prose prompt. Remove experience_class, complexity_priority, and additional_context from the prompt: experience_class is inferred deterministically in post-processing, and the other two were never consumed downstream. Make the prompt self-contained (schema embedded inline), which removes the INTENT_EXTRACTION_SCHEMA constant and the schema_description parameter from extract_structured_data.

  • Fix unreliable priority extraction: the LLM (qwen2.5:7b) was inferring priorities from use case type rather than from explicit user statements, causing incorrect scoring weights. The LLM now returns *_mentioned booleans alongside *_priority values, and post-processing resets priority to "medium" when the topic was not explicitly mentioned.

  • Harden post-processing: case-insensitive normalization for domain_specialization, experience_class, use_case, and *_mentioned booleans; priority value aliases (e.g. "very_high" -> "high") applied before validation; logger.warning when an unrecognized use_case cannot be resolved by alias map or fuzzy match. Add 4 unit tests for case-insensitive normalization.
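The priority fix described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the topic names in `PRIORITY_TOPICS`, the `"low"` level, the helper names, and the fallback to "medium" for invalid values are all assumptions. Only the `*_mentioned` / `*_priority` field pattern, the "medium" reset for unmentioned topics, and the "very_high" -> "high" alias come from the description.

```python
# Hypothetical topic names; the PR only specifies the *_priority / *_mentioned pattern.
PRIORITY_TOPICS = ("accuracy", "cost", "latency")
# "low" is an assumed level; the description names only "medium" and "high".
VALID_PRIORITIES = {"low", "medium", "high"}
# The one alias the description gives as an example.
PRIORITY_ALIASES = {"very_high": "high"}


def as_bool(value):
    """Accept real booleans and string forms such as "True"."""
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)


def enforce_mentioned(intent):
    """Return a copy where unmentioned (or invalid) priorities become "medium"."""
    result = dict(intent)
    for topic in PRIORITY_TOPICS:
        raw = str(result.get(f"{topic}_priority", "medium")).strip().lower()
        # Apply aliases before validating, as the description states.
        priority = PRIORITY_ALIASES.get(raw, raw)
        mentioned = as_bool(result.get(f"{topic}_mentioned", False))
        # Trust the LLM's priority only when the user explicitly raised the topic.
        # Falling back to "medium" for invalid values is an assumption of this sketch.
        if not mentioned or priority not in VALID_PRIORITIES:
            priority = "medium"
        result[f"{topic}_priority"] = priority
        result[f"{topic}_mentioned"] = mentioned
    return result
```

With this shape, a response such as `{"latency_priority": "very_high", "latency_mentioned": "True", "cost_priority": "high", "cost_mentioned": False}` keeps latency at "high" but resets cost to "medium", since the user never mentioned cost.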

@anfredette anfredette requested a review from amito April 14, 2026 00:20
Add comprehensive test suite for LLM intent extraction covering all 9
use cases, GPU/model extraction, priority enforcement, user counts,
conversation history, and UI button strings. Tests require Ollama and
are run explicitly via `make test-intent`. Two smoke tests run as part
of `make test-integration`.

Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Andre Fredette <afredette@redhat.com>
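One way a scenario-driven suite like this could be organized is as a table of messages and expected fields, checked against whatever extractor is under test. The sketch below is illustrative only: the scenarios, the field name `user_count`, and the helpers `find_mismatches` and `run_scenarios` are hypothetical and are not taken from the PR. The real suite runs against an LLM served through Ollama; here `extract` is any callable that returns a dict.

```python
# Hypothetical scenarios: (user message, fields the extractor must return).
SCENARIOS = [
    ("Summarize support tickets for 500 users",
     {"use_case": "text_summarization", "user_count": 500}),
    ("Chatbot for 50 users, low latency is critical",
     {"user_count": 50, "latency_priority": "high"}),
]


def find_mismatches(intent, expected):
    """Return {field: (expected, actual)} for every field that differs."""
    return {
        field: (want, intent.get(field))
        for field, want in expected.items()
        if intent.get(field) != want
    }


def run_scenarios(extract, scenarios=SCENARIOS):
    """Run each scenario through `extract` and collect failures by message."""
    failures = {}
    for message, expected in scenarios:
        mismatches = find_mismatches(extract(message), expected)
        if mismatches:
            failures[message] = mismatches
    return failures
```

Checking only the fields listed for each scenario keeps a test focused on one behavior (for example, priority enforcement) without failing on unrelated fields the LLM also returns.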
@anfredette anfredette force-pushed the intent-extraction-test branch from 49e3c52 to b452605 Compare April 14, 2026 13:27
Priority extraction fix:
- LLM now returns *_mentioned booleans alongside *_priority values.
  Post-processing trusts priority only when _mentioned=true; resets to
  "medium" otherwise.  This prevents the LLM (qwen2.5:7b) from
  inferring priorities from use-case type rather than explicit user
  statements.  The SLO profiles already handle use-case-appropriate
  targets.

Prompt rewrite (for smaller LLMs):
- Replace verbose prose prompt with a short, directive format using
  ordered pattern-matching rules.  The prompt is now self-contained
  (schema embedded inline) so INTENT_EXTRACTION_SCHEMA constant and the
  schema_description parameter on extract_structured_data() are removed.
- Remove experience_class, complexity_priority, and additional_context
  from the LLM prompt.  experience_class is inferred deterministically
  from use_case in post-processing; complexity_priority and
  additional_context were never consumed downstream.

Post-processing hardening:
- Case-insensitive normalization for domain_specialization,
  experience_class, and *_mentioned booleans (handles string "True").
- Lowercase use_case before alias/fuzzy lookup so mixed-case LLM
  responses like "Text_Summarization" are handled correctly.
- Add logger.warning when an unrecognized use_case cannot be resolved
  by alias map or fuzzy match.
- Priority value aliases (e.g. "very_high" -> "high") applied before
  validation.
- Remove stale complexity_priority from test helper _base_intent().
- Add 4 unit tests for case-insensitive normalization.

Assisted-by: Claude <noreply@anthropic.com>
Signed-off-by: Andre Fredette <afredette@redhat.com>
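The use_case handling in the hardening list (lowercase, then alias map, then fuzzy match, then a warning) might look something like the sketch below. It is an assumption-laden illustration rather than the PR's code: `KNOWN_USE_CASES`, `USE_CASE_ALIASES`, the function name, the 0.8 cutoff, the `None` return value, and the use of `difflib` for fuzzy matching are all hypothetical. Only "text_summarization" and the order of the steps come from the commit message.

```python
import difflib
import logging

logger = logging.getLogger(__name__)

# Hypothetical values; only "text_summarization" appears in the commit message.
KNOWN_USE_CASES = ["text_summarization", "code_completion", "chatbot_conversational"]
USE_CASE_ALIASES = {"summarization": "text_summarization"}


def normalize_use_case(raw):
    """Resolve an LLM-provided use_case to a known value, or None if unresolved."""
    # Lowercase first so mixed-case responses like "Text_Summarization" match.
    key = str(raw).strip().lower()
    if key in KNOWN_USE_CASES:
        return key
    if key in USE_CASE_ALIASES:
        return USE_CASE_ALIASES[key]
    # Fuzzy match as a last resort; the 0.8 cutoff is an arbitrary choice here.
    matches = difflib.get_close_matches(key, KNOWN_USE_CASES, n=1, cutoff=0.8)
    if matches:
        return matches[0]
    logger.warning("Unrecognized use_case %r: no alias or fuzzy match", raw)
    return None
```

Logging instead of silently defaulting makes unresolved values visible, so new aliases can be added when a model starts returning a variant the map does not cover.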
@anfredette anfredette force-pushed the intent-extraction-test branch from b452605 to ad8c9a7 Compare April 14, 2026 20:29
@anfredette anfredette requested review from jgchn and namasl April 14, 2026 20:42
@anfredette
Collaborator Author

@amito I suggest you test this with the use cases that were giving you trouble. Better yet, we can add them to the test cases, either now or in a later PR.

