All resources (slides, code, etc.) for the AI_Dev Europe 2025 session: Sarcastically Speaking: Unlocking Multi-modal Sentiment Analysis with NLP and Facial Expressions
Please see the session presentation and recording for details. The short answer: this is a model for tackling a complex problem and attempting to achieve the best result possible. The real value is the session recording, as it is a demonstration from start (how to tackle a problem) to finish (building the models and doing the multimodal inference). I wrote this session because I don't think there are any good tutorials/workshops that discuss things like data grooming (at all), how to attack a problem, and the decisions to make when building a model.
IMPORTANT: It is highly recommended not to use this in production. This is closer to a research project, and both it and the original research paper have the following limitations:
- the dataset size isn't nearly sufficient for a real-world application (the original research paper is academic only)
- sarcasm is a truly complex problem that will never be detected perfectly, even as results improve (80% is good, but we talk about why in the session)
To run the demo end to end (illustrative sketches of what each step roughly involves appear after this list):

- Navigate to `demo/mp4_to_facial` and follow the `README.md` to generate the Facial Landmark data
- Navigate to `demo/facial` and follow the `README.md` to create the model and also run inference
- Navigate to `demo/mp4_to_speech` and follow the `README.md` to generate the speech data
- Navigate to `demo/speech` and follow the `README.md` to create the model and also run inference
- Navigate to `demo/multimodal` and follow the `README.md` to run inference
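
For orientation, here is a rough idea of what the `mp4_to_facial` step can look like. This is a minimal sketch, not the demo's actual code: it assumes OpenCV and MediaPipe Face Mesh are installed, that per-frame landmark coordinates are what the downstream facial model consumes, and the file name `clip.mp4` is just a placeholder.

```python
# Minimal sketch: extract facial landmarks from an mp4, frame by frame.
# Assumes `pip install opencv-python mediapipe numpy`; not the demo's actual script.
import cv2
import numpy as np
import mediapipe as mp

def extract_landmarks(video_path: str) -> np.ndarray:
    """Return an array of shape (frames, 468, 3) of face-mesh landmarks."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=False,
                                         max_num_faces=1) as face_mesh:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB; OpenCV reads BGR.
            results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.multi_face_landmarks:
                lm = results.multi_face_landmarks[0].landmark
                frames.append([[p.x, p.y, p.z] for p in lm])
    cap.release()
    return np.array(frames)

landmarks = extract_landmarks("clip.mp4")  # placeholder path
np.save("clip_landmarks.npy", landmarks)
```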
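
Similarly, the `mp4_to_speech` step boils down to pulling the audio track out of the video and turning it into features a model can learn from. The sketch below assumes ffmpeg is on the PATH and uses MFCCs as the feature; the feature set used in the demo may differ.

```python
# Minimal sketch: extract the audio track from an mp4 and compute MFCC features.
# Assumes ffmpeg is installed and `pip install librosa numpy`; not the demo's actual script.
import subprocess
import librosa
import numpy as np

def mp4_to_mfcc(video_path: str, wav_path: str = "audio.wav") -> np.ndarray:
    """Return an MFCC matrix of shape (n_mfcc, time_steps)."""
    # Strip the audio track out as 16 kHz mono WAV with ffmpeg.
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-ac", "1", "-ar", "16000", wav_path],
        check=True,
    )
    y, sr = librosa.load(wav_path, sr=16000)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

mfcc = mp4_to_mfcc("clip.mp4")  # placeholder path
np.save("clip_mfcc.npy", mfcc)
```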
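
Finally, the `multimodal` step combines the two unimodal predictions. How the demo fuses them is covered in the session; the snippet below is just one common option, a weighted late fusion of the two models' sarcasm probabilities, with the names and the 0.5 weighting chosen purely for illustration.

```python
# Minimal sketch: late fusion of the facial and speech models' outputs.
# `p_facial` and `p_speech` stand in for each model's predicted probability
# that a clip is sarcastic; the weighting here is an arbitrary example.
def fuse(p_facial: float, p_speech: float, w_facial: float = 0.5) -> float:
    """Weighted average of the two unimodal sarcasm probabilities."""
    return w_facial * p_facial + (1.0 - w_facial) * p_speech

if __name__ == "__main__":
    fused = fuse(p_facial=0.82, p_speech=0.61)
    print("sarcastic" if fused >= 0.5 else "not sarcastic", round(fused, 3))
```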