feat(schema): add FacialEmotionOutput for multimodal engine integration #26

Open
Kunal-Somani wants to merge 1 commit into ruxailab:main from Kunal-Somani:feat/multimodal-sentiment-engine

Summary

The facial-sentiment-analysis-api and sentiment-analysis-api currently have no shared contract: the facial API returns a GetEmotionPercentagesResponse Pydantic model, but there is no standardized way to convert it into the format that MultimodalSentimentEngine expects.

This PR adds that conversion layer.

New File — schemas/multimodal_output_schema.py

FacialEmotionOutput is a Pydantic wrapper around GetEmotionPercentagesResponse that exposes two conversion methods and one summary helper:

from_response(response)
Converts a GetEmotionPercentagesResponse into a validated FacialEmotionOutput with field-level range constraints (0.0–100.0 per emotion). Fails fast on out-of-range values rather than silently passing bad data downstream.

to_multimodal_dict()
Produces the {emotion_label: percentage} dict format expected by MultimodalSentimentEngine.analyze(facial_emotions=...) in the sentiment-analysis-api repo (PR #39).

dominant_emotion()
Returns (label, percentage) for the highest-scoring emotion — useful for logging and single-label summaries without reimplementing max() at every call site.
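
For reviewers who want to see the shape of these three methods without opening the diff, here is a minimal sketch. It is not the code in this PR: the field names (angry, happy, neutral, sad) and the stand-in GetEmotionPercentagesResponse are assumptions, and to stay dependency-free it uses a stdlib dataclass with a `__post_init__` range check in place of Pydantic field constraints such as `Field(ge=0.0, le=100.0)`.

```python
from dataclasses import dataclass, fields
from typing import Dict, Tuple


@dataclass(frozen=True)
class GetEmotionPercentagesResponse:
    """Hypothetical stand-in for the facial API response (field names assumed)."""
    angry: float
    happy: float
    neutral: float
    sad: float


@dataclass(frozen=True)
class FacialEmotionOutput:
    """Illustrative wrapper; the real class in this PR is a Pydantic model."""
    angry: float
    happy: float
    neutral: float
    sad: float

    def __post_init__(self) -> None:
        # Fail fast: reject any percentage outside 0.0-100.0 (NaN is rejected too).
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 100.0:
                raise ValueError(f"{f.name}={value!r} is outside 0.0-100.0")

    @classmethod
    def from_response(cls, response: GetEmotionPercentagesResponse) -> "FacialEmotionOutput":
        # Copy each emotion field across; validation runs in __post_init__.
        return cls(**{f.name: float(getattr(response, f.name)) for f in fields(cls)})

    def to_multimodal_dict(self) -> Dict[str, float]:
        # Capitalized labels mirror the {'Angry': 2.5, ...} format in the PR example.
        return {f.name.capitalize(): getattr(self, f.name) for f in fields(self)}

    def dominant_emotion(self) -> Tuple[str, float]:
        # Highest-scoring emotion as (label, percentage).
        scores = self.to_multimodal_dict()
        label = max(scores, key=scores.get)
        return label, scores[label]


response = GetEmotionPercentagesResponse(angry=2.5, happy=61.3, neutral=28.0, sad=8.2)
output = FacialEmotionOutput.from_response(response)
print(output.to_multimodal_dict())  # {'Angry': 2.5, 'Happy': 61.3, 'Neutral': 28.0, 'Sad': 8.2}
print(output.dominant_emotion())    # ('Happy', 61.3)
```

In this sketch an out-of-range value such as angry=120.0 raises ValueError at construction time, which is the fail-fast behavior described for from_response(). A Pydantic model would raise ValidationError instead.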

Integration Boundary

facial-sentiment-analysis-api
        |
        | GetEmotionPercentagesResponse
        v
FacialEmotionOutput.from_response()
        |
        | .to_multimodal_dict()
        v
MultimodalSentimentEngine.analyze(facial_emotions={...})
        |
        | fused prediction
        v
sentiment-analysis-api /multimodal/analyze (PR #39)

Example

from schemas.multimodal_output_schema import FacialEmotionOutput

response = emotion_analysis_service.get_emotion_percentages(video_path)
output = FacialEmotionOutput.from_response(response)

# Ready for MultimodalSentimentEngine 
facial_dict = output.to_multimodal_dict()
# {'Angry': 2.5, 'Happy': 61.3, 'Neutral': 28.0, ...}

label, pct = output.dominant_emotion()
# ('Happy', 61.3)

Relation to GSoC 2026

This is the integration boundary between the two RUXAILAB sentiment repos for the Multimodal Sentiment Analysis Engine project. PRs #21 and #22 established consistent logging across both repos. This PR establishes the data contract between them.

Adds schemas/multimodal_output_schema.py — a Pydantic wrapper around
GetEmotionPercentagesResponse that standardizes the facial API output
for consumption by the MultimodalSentimentEngine in sentiment-analysis-api.

Key additions:
- FacialEmotionOutput.from_response(): converts GetEmotionPercentagesResponse
  into a validated Pydantic model with field-level range constraints (0-100%)
- to_multimodal_dict(): produces the {emotion_label: percentage} dict
  format expected by MultimodalSentimentEngine.analyze(facial_emotions=...)
- dominant_emotion(): returns the top emotion and its confidence as a
  (label, float) tuple — useful for logging and single-label summaries

This is the integration boundary between the two RUXAILAB sentiment repos:
the facial API produces GetEmotionPercentagesResponse, FacialEmotionOutput
converts it, and the multimodal engine fuses it with text and prosody.
Closes the gap identified in PR ruxailab#21 and ruxailab#22 where the two pipelines
had no shared contract.