Merged
3 changes: 3 additions & 0 deletions pydata-virginia-2025/category.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
{
"title": "PyData Virginia 2025"
}
@@ -0,0 +1,28 @@
{
"description": "SciPy is a powerful library for scientific and technical computing in Python. The primary objectives of this presentation are to explore the core concepts of Responsible AI and to demonstrate these concepts with SciPy.",
"duration": 3152,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Andrea Hobby"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/W6fTFSgyhMg/maxresdefault.jpg",
"title": "Responsible AI with SciPy",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=W6fTFSgyhMg"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "Multi-armed bandits are a reinforcement learning tool often used in environments where the cost or rewards of different choices are unknown or where those functions may change over time. The good news is that as far as implementation goes, bandits are surprisingly easy to implement; however, in practice, the difficulty comes from defining a reward function that best targets your specific use case. In this talk, we will discuss how to use bandit algorithms effectively, taking note of practical strategies for experimental design and deployment of bandits in your applications.",
"duration": 1819,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Benjamin Bengfort"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/jP978VKBl-w/maxresdefault.jpg",
"title": "Practical Multi Armed Bandits",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=jP978VKBl-w"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "When Bayesian modeling scales up to large datasets, traditional MCMC methods can become impractical due to their computational demands. Variational Inference (VI) offers a scalable alternative, trading exactness for speed while retaining the essence of Bayesian inference.\n\nIn this tutorial, we\u2019ll explore how to implement and compare VI techniques in PyMC, including the Adaptive Divergence Variational Inference (ADVI) and the cutting-edge Pathfinder algorithm.\n\nStarting with simple models like linear regression, we\u2019ll gradually introduce more complex, real-world applications, comparing the performance of VI against Markov Chain Monte Carlo (MCMC) to understand the trade-offs in speed and accuracy.\n\nThis tutorial will arm participants with practical tools to deploy VI in their workflows and help answer pressing questions, like \"What do I do when MCMC is too slow?\", or \"How does VI compare to MCMC in terms of approximation quality?\".",
"duration": 5357,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Chris Fonnesbeck"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/XECLmgnS6Ng/maxresdefault.jpg",
"title": "A Beginner's Guide to Variational Inference",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=XECLmgnS6Ng"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "Geospatial data can unlock valuable insights. OpenStreetMap includes electric power and telecommunication infrastructure geospatial data, and it is already \u201copen\u201d. This presentation will demonstrate how to use Python to \u201cunlock the insights\u201d available in OSM power and telecommunications geospatial data.",
"duration": 1522,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Cory Eicher"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/kmRyFmMThVo/maxresdefault.jpg",
"title": "Using Python to Unlock Insights from OpenStreetMap Data at Scale",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=kmRyFmMThVo"
}
]
}
@@ -0,0 +1,36 @@
{
"description": "Tired of waiting for massive datasets to load on your local machine? In this beginner-friendly tutorial, we\u2019ll explore how to scale your data analysis skills from pandas to PySpark using a real-world anime dataset. We\u2019ll walk through the basics of distributed computing, discuss why Spark was created, and demonstrate the benefits of working with PySpark for big data tasks\u2014including reading, cleaning, and transforming millions of records with ease. By the end of this workshop, you\u2019ll understand how PySpark harnesses cluster computing to handle large-scale data and you\u2019ll be comfortable applying these techniques to your own projects.\n\nParticipant Requirements:\n- A laptop (any OS) with an internet connection\n- A Google account (to access Colab notebooks and slides)\n- Familiarity with Python and pandas\n\nHere's the link to the Google Colab to follow along \ud83d\udc47\ud83c\udffe\nhttps://colab.research.google.com/drive/1fi0cTQ1NIE5kDEH0ynp2sqDuVeiBJJWU?usp=sharing\n\nHere are the slides \ud83d\udc47\ud83c\udffe\nhttps://drive.google.com/file/d/11JIih1VzLxTJ9O6PeGzqD_e8vumTZQmw/view?usp=sharing",
"duration": 4399,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://colab.research.google.com/drive/1fi0cTQ1NIE5kDEH0ynp2sqDuVeiBJJWU?usp=sharing",
"url": "https://colab.research.google.com/drive/1fi0cTQ1NIE5kDEH0ynp2sqDuVeiBJJWU?usp=sharing"
},
{
"label": "https://drive.google.com/file/d/11JIih1VzLxTJ9O6PeGzqD_e8vumTZQmw/view?usp=sharing",
"url": "https://drive.google.com/file/d/11JIih1VzLxTJ9O6PeGzqD_e8vumTZQmw/view?usp=sharing"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Cynthia Ukawu"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/McbJMdcKp5c/maxresdefault.jpg",
"title": "From Pandas to PySpark",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=McbJMdcKp5c"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "The team behind DVC has spent years tackling data versioning challenges. With the rise of AI, we\u2019ve seen new complexities emerge - especially with multimodal datasets like images, video, audio, and text. This talk shows why multimodal data versioning is different and how Pydantic provides a powerful way to structure and integrate metadata.",
"duration": 1659,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Dmitry Petrov"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/yNBoJSKl49U/maxresdefault.jpg",
"title": "Versioning Multimodal Data: Metadata & Beyond",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=yNBoJSKl49U"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "Health disparities remain a critical challenge in public health, demanding innovative approaches to uncover inequities and drive actionable change. This webinar will demonstrate how Python can serve as a powerful tool for creating data visualizations that illustrate the unequal burden of HIV across different populations. Participants will learn how Python\u2019s popular libraries, such as Matplotlib, Seaborn, and Plotly, can transform complex datasets into accessible, impactful visuals.\nUsing an HIV dataset containing demographic, geographic, and clinical variables, this session will guide attendees through a series of practical examples. From creating heatmaps and geospatial maps to analyzing temporal trends, the webinar emphasizes how to identify and communicate key social determinants related to race, gender, socioeconomic status, and access to care. Through hands-on demonstrations, attendees will see how Python\u2019s capabilities streamline data analysis and visualization workflows.\nKey takeaways from the session include identifying regions and communities in Texas, disproportionately affected by HIV, uncovering intersectional factors influencing health outcomes, and leveraging visual tools to inform policy and resource allocation. Special attention will be given to designing visuals that resonate with non-technical audiences, ensuring findings are actionable for public health professionals and policymakers.",
"duration": 4007,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Kimberly Deas"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/-BA2eXBoDoc/maxresdefault.jpg",
"title": "Data Viz in Python as a Tool to Study HIV Health Disparities",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=-BA2eXBoDoc"
}
]
}
@@ -0,0 +1,29 @@
{
"description": "Where do landlords engage in more eviction actions? What characteristics of renters or landlords increase the practice of serial filing? There is widespread interest in using administrative data -- information collected by government and agencies in the implementation of public programs -- to evaluate systems and promote most just outcomes. Working with the Civil Court Data Initiative of Legal Services Corporation, we use data collected from civil court records in Virginia to analyze the behavior of landlords. Expanding on our Virginia Evictors Catalog, we use data on court evictions to build additional data tools to support the work of legal and housing advocates and model key eviction outcomes to contribute to our understanding of landlord behavior.",
"duration": 1755,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Michele Claibourn",
"Samantha Toet"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/eE0D79trL2c/maxresdefault.jpg",
"title": "Exploring Eviction Trends in Virginia",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=eE0D79trL2c"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "When every day counted during the COVID-19 pandemic, data science became an essential catalyst in accelerating the path to widespread vaccination. This talk delves into the data-driven strategies that enabled the U.S. government\u2019s vaccine trials to move faster, cutting crucial weeks\u20146 to 8, by our estimates\u2014off the timeline to deployment. Through sophisticated geospatial modeling, we identified and swiftly mobilized trial recruitment efforts in emerging hot zones, ensuring that each candidate pool was both numerically sufficient and demographically representative. Attendees will discover how advanced analytics, predictive modeling, and interdisciplinary collaboration converged to target the right communities at the right time, ultimately expediting vaccine availability. This behind-the-scenes look at rapid-response data science highlights not just the technical innovations, but the decisive cultural and operational shifts that turned real-time insights into life-saving action.",
"duration": 1733,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Greg Michaelson"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/zXKdjBv1SGc/maxresdefault.jpg",
"title": "How data science shortened the COVID-19 pandemic by 2 months",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=zXKdjBv1SGc"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "This workshop will provide a comprehensive introduction to Large Language Models (LLMs), covering their capabilities, structure, and practical applications. Participants will learn prompt engineering techniques, retrieval-augmented generation (RAG), agentic AI design, fine-tuning strategies, and model evaluation methods. The session will conclude with a discussion on the future of AI-powered reasoning machines.",
"duration": 5940,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"John Berryman"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/jmwLzX_ltbQ/maxresdefault.jpg",
"title": "Mastering LLMs: From Prompt Engineering to Agentic AI",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=jmwLzX_ltbQ"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "As businesses strive to become AI-first, the pivotal role of AI practitioners extends beyond technical implementation to encompass strategic stewardship. This transition necessitates a profound understanding of organizational goals, data governance, and ethical considerations. By aligning AI initiatives with business objectives, fostering cross-functional collaboration, and addressing challenges such as data privacy and employee adaptation, AI professionals can drive effective transformation. This keynote explores the essential competencies and approaches required for AI practitioners to lead their organizations successfully into an AI-centric future.",
"duration": 3481,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Rajkumar Venkatesan"
],
"tags": ["keynote"],
"thumbnail_url": "https://i.ytimg.com/vi/jOgUY9Rcd80/maxresdefault.jpg",
"title": "Building AI-First Organizations",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=jOgUY9Rcd80"
}
]
}
@@ -0,0 +1,28 @@
{
"description": "Traditional PDF extraction tools often struggle with complex layouts, tables, and images, Docling (an opensource Python library developed at IBM) excels at extracting structured information from these elements, enabling the creation of richer, more accurate vector databases. This hands-on tutorial will guide participants through building a Retrieval Augmented Generation (RAG) system using Docling, an open-source document processing library.\n\nParticipants will learn how to harness Docling's advanced capabilities to build superior RAG systems that can understand and retrieve information from complex document elements that traditional tools might miss. Participants will learn how to handle complex documents, extract structured information, and create an efficient vector database for semantic search. The session will cover best practices for document parsing, chunking strategies, and integration with popular LLM frameworks.",
"duration": 5414,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Krishna Rekapalli"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/41pxp4-pRmI/maxresdefault.jpg",
"title": "Unlock Information from Tables, Images and Complex Docs",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=41pxp4-pRmI"
}
]
}
@@ -0,0 +1,31 @@
{
"description": "When \u201cAI Agent\u201d became the buzz word, have you ever wondered: what exactly is an AI agent? What is the multi-agent system? And how can you use the power of AI agents in your day-to-day data science workflow? In this hands-on tutorial, I will introduce AI agents and demonstrate how to design, build, and manage a multi-agent system for your data science workflows. Participants will learn how to break down complex tasks, assign AI agents to collaborate effectively, and ensure accuracy and reliability in their outputs. We will also discuss the trade-offs, limitations, and best practices for incorporating AI agents into data science projects.",
"duration": 5337,
"language": "eng",
"recorded": "2025-04-18",
"related_urls": [
{
"label": "Conference Website",
"url": "https://pydata.org/virginia2025"
},
{
"label": "https://github.com/numfocus/YouTubeVideoTimestamps",
"url": "https://github.com/numfocus/YouTubeVideoTimestamps"
}
],
"speakers": [
"Niharika Krishnan",
"Chuxin Liu",
"Astha Puri",
"Michelle Rojas"
],
"tags": [],
"thumbnail_url": "https://i.ytimg.com/vi/s5dx_4y6Iy8/maxresdefault.jpg",
"title": "Build Your Own Data Science AI Agents",
"videos": [
{
"type": "youtube",
"url": "https://www.youtube.com/watch?v=s5dx_4y6Iy8"
}
]
}