Add Coherence to the analysis workflow #766

Merged
davidgisbey merged 5 commits into main from add-coherence-to-the-analysis-workflow
Jan 12, 2026
Conversation

@davidgisbey
@davidgisbey davidgisbey commented Jan 9, 2026

Description

This PR adds the Coherence auto-evaluation to the AnswerAnalysis workflow. It:

  • adds the answer_analysis_coherence_run table, model and factory
  • includes the LLMRecordable and AutoEvaluationResultsCreatable concerns in the coherence run model
  • adds the stub_bedrock_invoke_model_openai_oss_coherence helper to StubBedrock
  • adds the AnswerAnalysis::CoherenceJob and refactors shared metric specs into a shared_example
  • calls the job after an answer has been composed and persisted
  • adds it to the admin UI

Trello card

https://trello.com/c/KJFocexA/3047-integrate-coherence-metric-into-analysis-workflow

@govuk-ci govuk-ci temporarily deployed to govuk-chat-add-coherenc-nyh3bs January 9, 2026 10:20 Inactive
This adds a migration for the table needed to store coherence
evaluations. It also adds the corresponding models and factories.

Much like with the answer relevancy models, we will use concerns to
leverage shared functionality, so I've included the
LlmCallsRecordable and AutoEvaluationResultsCreatable concerns in the
CoherenceRun model.
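
To illustrate how a run model can pick up shared behaviour from two concerns, here is a minimal plain-Ruby sketch. The module names come from the commit message, but their contents are assumptions: `record_llm_call`, `build_from_result` and the `score`, `reason` and `llm_responses` attributes are hypothetical, and a `Struct` stands in for the ActiveRecord model.

```ruby
# Hypothetical stand-ins for the concerns named in the commit message.
# The real concerns live in the Rails app and are not shown in this PR.
module LlmCallsRecordable
  # Assumed behaviour: keep raw LLM responses keyed by call name.
  def record_llm_call(name, response)
    llm_responses[name.to_s] = response
  end
end

module AutoEvaluationResultsCreatable
  # Plain-Ruby equivalent of a concern's class-level methods.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Assumed behaviour: build one run from an evaluation result hash.
    def build_from_result(result)
      run = new(score: result.fetch(:score),
                reason: result.fetch(:reason),
                llm_responses: {})
      result.fetch(:llm_responses, {}).each do |name, response|
        run.record_llm_call(name, response)
      end
      run
    end
  end
end

# A Struct stands in for the database-backed coherence run model.
CoherenceRun = Struct.new(:score, :reason, :llm_responses, keyword_init: true) do
  include LlmCallsRecordable
  include AutoEvaluationResultsCreatable
end
```

Because both modules are included rather than inherited, another metric's run model could reuse them without sharing a parent class.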
This follows the established pattern for other auto-evaluations of
adding a dedicated stub helper method for Bedrock invocations.

This will be used in the various specs (mainly system) that need to stub
out coherence calls.
This commit adds the CoherenceJob, which is essentially a copy of the
AnswerRelevancyJob but for coherence evaluation. To avoid duplication
I've added a shared example for all the tests, since they completely overlap.
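
The idea of running one set of checks against both metric jobs can be sketched in plain Ruby as below. Everything here is an assumption made for illustration: the in-memory `store`, the `evaluator` lambda, the `METRIC` constant and the `check_metric_job` helper are hypothetical, and a common base class is used only to keep the sketch short, whereas the commit describes CoherenceJob as a near copy of AnswerRelevancyJob. In the real specs the shared checks would be RSpec `shared_examples` pulled in with `it_behaves_like`.

```ruby
# Hypothetical in-memory stand-ins; the real jobs are background jobs
# backed by database models that this PR does not show.
Answer = Struct.new(:id, :message, keyword_init: true)

class MetricJob
  def initialize(evaluator:, store:)
    @evaluator = evaluator # callable returning a hash with a :score key
    @store = store         # array standing in for the runs table
  end

  def perform(answer)
    result = @evaluator.call(answer)
    @store << { answer_id: answer.id,
                metric: self.class::METRIC,
                score: result.fetch(:score) }
  end
end

class AnswerRelevancyJob < MetricJob
  METRIC = :answer_relevancy
end

class CoherenceJob < MetricJob
  METRIC = :coherence
end

# Plain-Ruby analogue of a shared example: one set of checks that is
# run once per job class and raises if any expectation fails.
def check_metric_job(job_class, metric)
  store = []
  job = job_class.new(evaluator: ->(_answer) { { score: 0.9 } }, store: store)
  job.perform(Answer.new(id: 1, message: "Hello"))
  raise "expected exactly one run" unless store.size == 1
  raise "wrong metric recorded" unless store.first[:metric] == metric
  store.first
end

check_metric_job(AnswerRelevancyJob, :answer_relevancy)
check_metric_job(CoherenceJob, :coherence)
```

Keeping the expectations in one place means a third metric job only needs one extra line to be covered by the same checks.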
@davidgisbey davidgisbey force-pushed the add-coherence-to-the-analysis-workflow branch from 9e19aee to 3af2370 Compare January 9, 2026 10:40
@govuk-ci govuk-ci temporarily deployed to govuk-chat-add-coherenc-nyh3bs January 9, 2026 10:40 Inactive
@davidgisbey davidgisbey changed the title Add coherence to the analysis workflow Add Coherence to the analysis workflow Jan 9, 2026
This updates the answer analysis job to include coherence evaluation. It
updates the relevant specs to stub coherence model invocations.
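
As a rough sketch of what fanning out to a second metric looks like, the following plain-Ruby example enqueues one job per metric for a given answer. It is an assumption-laden illustration, not the code in this PR: `ENQUEUED`, `METRIC_JOBS` and `AnswerAnalysis.analyse` are hypothetical names, and the `perform_later` class methods here only record the call instead of going through a real job queue.

```ruby
module AnswerAnalysis
  # Hypothetical queue stand-in: records [job name, answer id] pairs.
  ENQUEUED = []

  class AnswerRelevancyJob
    def self.perform_later(answer_id)
      ENQUEUED << [name, answer_id]
    end
  end

  class CoherenceJob
    def self.perform_later(answer_id)
      ENQUEUED << [name, answer_id]
    end
  end

  # Adding a metric to the workflow means adding its job to this list.
  METRIC_JOBS = [AnswerRelevancyJob, CoherenceJob].freeze

  # Assumed entry point, called once an answer has been persisted.
  def self.analyse(answer_id)
    METRIC_JOBS.each { |job| job.perform_later(answer_id) }
  end
end

AnswerAnalysis.analyse(42)
```

With this shape, the specs for the analysis step need to stub every metric's model invocation, which matches the spec updates this commit describes.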
@davidgisbey davidgisbey force-pushed the add-coherence-to-the-analysis-workflow branch from 3af2370 to 19bcfb4 Compare January 9, 2026 12:53
@govuk-ci govuk-ci temporarily deployed to govuk-chat-add-coherenc-nyh3bs January 9, 2026 12:53 Inactive
@davidgisbey davidgisbey force-pushed the add-coherence-to-the-analysis-workflow branch from 19bcfb4 to 53b7ba8 Compare January 9, 2026 12:56
@govuk-ci govuk-ci temporarily deployed to govuk-chat-add-coherenc-nyh3bs January 9, 2026 12:57 Inactive
This adds the coherence evaluation runs to the analysis tab of the question
show page in the admin interface. It mirrors the existing answer relevancy
runs display.
@davidgisbey davidgisbey force-pushed the add-coherence-to-the-analysis-workflow branch from 53b7ba8 to d4ae892 Compare January 9, 2026 13:11
@govuk-ci govuk-ci temporarily deployed to govuk-chat-add-coherenc-nyh3bs January 9, 2026 13:12 Inactive
@davidgisbey davidgisbey marked this pull request as ready for review January 9, 2026 16:15
@kevindew kevindew left a comment

Looks good to me. Nice job with the shared specs.

I can foresee us hitting some naming issues if we end up doing other things with auto eval, but that doesn't seem too big a concern, as I guess we can start inserting the "generic" name in things if we need to.

@davidgisbey davidgisbey merged commit e7113d9 into main Jan 12, 2026
12 checks passed
@davidgisbey davidgisbey deleted the add-coherence-to-the-analysis-workflow branch January 12, 2026 10:56