This workshop demonstrates how to perform data analysis on the Titanic dataset using Python, focusing on leveraging AI-assisted techniques with LangChain and GPT models. Participants will learn how to load data, explore its contents, visualize relationships, and interpret results using a combination of traditional data analysis methods and AI-powered insights.
- Basic understanding of Python
- Familiarity with data analysis concepts
- Google Colab account (optional, but recommended for easy setup)
- OpenAI API key
- Open the notebook in Google Colab or your preferred Jupyter environment.
- Upload the Data_Science_Workshop Jupyter notebook file, or use the link shared in the workshop.
- Run the first cell to install the required packages:
  - langgraph
  - langchain
  - langchain-openai
  - pandas
  - langchain_core
  - langchain_experimental
  - pydantic
- Set up your OpenAI API key when prompted.
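
The key setup step can be sketched as follows. This is a minimal illustration, not the notebook's actual cell: the helper name `ensure_openai_key` is hypothetical, and it assumes the key is read from the `OPENAI_API_KEY` environment variable, which is where `langchain-openai` looks by default.

```python
import os
from getpass import getpass


def ensure_openai_key(prompt_fn=getpass, env_var="OPENAI_API_KEY"):
    """Return the API key, prompting only if it is not already in the environment."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed value so the key does not end up in cell output.
        key = prompt_fn(f"Enter your {env_var}: ").strip()
        if not key:
            raise ValueError(f"{env_var} must not be empty")
        os.environ[env_var] = key
    return key
```

Calling `ensure_openai_key()` in a notebook cell before creating any model object prompts once per session; later calls reuse the stored value.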
- Data Acquisition
  - Download the Titanic dataset
  - Load the data into a pandas DataFrame
- Initial Data Exploration
  - Use AI to generate questions about the dataset
  - Examine basic statistics and structure of the data
- Data Visualization
  - Create scatter plots to visualize relationships between variables
  - Use AI to interpret and explain the visualizations
- Advanced Analysis
  - Perform deeper analysis on passenger demographics and survival rates
  - Use AI to generate insights and answer complex questions about the data
- AI-Assisted Exploration
  - Utilize LangChain and GPT models to create a conversational interface for data exploration
  - Demonstrate how AI can assist in formulating queries and interpreting results
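
The non-AI part of these steps (loading, basic statistics, survival rates) might look roughly like the sketch below. The rows are illustrative placeholders, not real passenger records, and the column names (`Survived`, `Pclass`, `Sex`, `Age`, `Fare`) are an assumption based on the commonly distributed Titanic CSV; the workshop's file may differ. The `summarize` helper is hypothetical.

```python
import pandas as pd

# Illustrative placeholder rows only (NOT real passenger records); they mimic
# column names commonly found in the Titanic CSV.
sample = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 0, 1, 0],
        "Pclass": [3, 1, 3, 2, 1, 3],
        "Sex": ["male", "female", "female", "male", "female", "male"],
        "Age": [22.0, 38.0, 26.0, 35.0, None, 54.0],
        "Fare": [7.25, 71.28, 7.93, 13.0, 53.1, 8.05],
    }
)


def summarize(df):
    """Collect shape, missing values per column, and survival rate by sex."""
    return {
        "shape": df.shape,
        "missing": df.isna().sum().to_dict(),
        "survival_by_sex": df.groupby("Sex")["Survived"].mean().to_dict(),
    }


summary = summarize(sample)
print(summary)

# In the workshop you would load the real file instead, for example:
# df = pd.read_csv("titanic.csv")  # hypothetical local path
# df.plot.scatter(x="Age", y="Fare", c="Survived", colormap="coolwarm")
```

A compact summary dictionary like this is also a convenient thing to hand to a language model, since it is much smaller than the full DataFrame.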
- LangChain: Used for creating AI-powered workflows
- OpenAI GPT: Provides natural language processing capabilities
- Pandas: Used for data manipulation and analysis
- Matplotlib: Used for data visualization
- Set up the AI-assisted analysis pipeline using LangChain
- Load and preprocess the Titanic dataset
- Use AI to generate initial insights and questions about the data
- Create visualizations based on AI suggestions
- Interpret results with AI assistance
- Iterate through analysis steps, asking follow-up questions and generating new visualizations as needed
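
One way to sketch the "generate initial questions" step of this workflow is shown below. It is an assumption-laden illustration rather than the notebook's pipeline: `build_question_prompt` and `ask_model` are hypothetical helpers, the prompt wording is invented for this example, and the model name `gpt-4` is taken from the acknowledgements and may need changing for your account. The only LangChain calls used are `ChatOpenAI(...)` and `.invoke(...)`.

```python
def build_question_prompt(columns, n_questions=3):
    """Turn a {column_name: dtype} mapping into a prompt asking for analysis questions."""
    schema = "\n".join(f"- {name} ({dtype})" for name, dtype in columns.items())
    return (
        "You are assisting with exploratory analysis of the Titanic dataset.\n"
        f"The DataFrame has these columns:\n{schema}\n"
        f"Suggest {n_questions} questions worth investigating, one per line."
    )


def ask_model(prompt, model="gpt-4"):
    """Send the prompt to an OpenAI chat model through LangChain (needs an API key)."""
    # Imported lazily so the prompt helper works without LangChain installed.
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model=model, temperature=0)
    return llm.invoke(prompt).content


prompt = build_question_prompt({"Survived": "int64", "Age": "float64", "Sex": "object"})
print(prompt)
# questions = ask_model(prompt)  # uncomment once OPENAI_API_KEY is set
```

Sending only the schema, rather than the rows themselves, keeps the request small; follow-up questions can be asked by extending the prompt with earlier answers.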
By the end of this workshop, participants will have gained hands-on experience in:
- Using AI to assist in data analysis tasks
- Exploring and visualizing dataset characteristics
- Interpreting complex relationships in data
- Leveraging natural language interfaces for data exploration
This workshop showcases how AI can enhance traditional data analysis techniques, providing a powerful toolset for deriving insights from complex datasets.
- Vipul Kumar - Demonstrating the use of LangGraph
- oomti - Creating this workshop notebook
- LangChain - Library for creating agentic workflows
- OpenAI - OpenAI GPT-4 LLM API
- Alex Grazer - Hosting and organizing the event
- Valerio Ficcadenti - Co-hosting and organizing the event
- LSBU - For providing the venue for our workshop