Link to Blog Post: https://medium.com/@benjamin.mingjun/data-driven-sommelier-uncovering-the-secrets-behind-wine-16c65cfcc323
Sommeliers often write detailed, insightful descriptions of each wine they try. Basic descriptors like elegant, fruity, and earthy can help a consumer better understand a potential wine. However, more specialized descriptors like boxwood, unctuous, and cassis can leave the average wine drinker confused. In this project, we wanted to explore whether machine learning and natural language processing could make sense of these reviews and help us predict meaningful features, such as wine variety and price, from a large dataset.
Using a large dataset of wine reviews, we combined structured features (e.g., country, price, ratings) with unstructured features (free-text reviews of varying length) to discover patterns and build predictive models. We also explored the dataset itself to get a more general understanding of existing trends within wine reviews.
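As a rough illustration of what combining structured and unstructured features can look like, the sketch below turns a review into a single numeric vector: a rating, a one-hot encoded country, and word counts from the description. The field names (`country`, `points`, `description`), the toy reviews, and the helper functions are assumptions made for this example; they are not the project's actual schema or pipeline.

```python
from collections import Counter

# Hypothetical toy reviews; field names are assumptions, not the project's schema.
reviews = [
    {"country": "France", "points": 92,
     "description": "Elegant and earthy with cassis notes"},
    {"country": "US", "points": 88,
     "description": "Fruity, ripe and fruity on the finish"},
]

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()

# Build the vocabulary and category list from the data so column order is fixed.
vocab = sorted({tok for r in reviews for tok in tokenize(r["description"])})
countries = sorted({r["country"] for r in reviews})

def featurize(review):
    """Return one numeric vector: [points] + one-hot country + word counts."""
    counts = Counter(tokenize(review["description"]))
    one_hot = [1 if review["country"] == c else 0 for c in countries]
    bag_of_words = [counts[w] for w in vocab]
    return [review["points"]] + one_hot + bag_of_words

feature_names = ["points"] + [f"country={c}" for c in countries] + vocab
vectors = [featurize(r) for r in reviews]
```

Each row of `vectors` lines up with `feature_names`, so a model can weigh a word such as "cassis" alongside a structured field such as the country. In practice a library vectorizer with TF-IDF weighting would likely replace the raw counts used here.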
This project follows the process of turning subjective wine data into structured insights, and it shows both the potential and the limitations of applying data science to something as nuanced as taste. We hope to become pseudo-experts in a field that is widely discussed and highly valued, but in which we started with minimal knowledge.