Community Meetings

a_git_a edited this page Aug 30, 2023 · 47 revisions

Next meeting 30th of August 16:00 CET

Community meetings are held to keep all people interested in the project up to date.
Meetings are recorded and available to the public.
Please don't hesitate to start a conversation.

Invitation to have your camera on.

Agenda

  1. Release (Antoni)
  2. Productionizing Jupyter Notebooks (Duygu + Antoni)

Next meeting 27th of September 16:00 CET
Support us by ⭐

Agenda

quick intro of VDK and the team and people who have been working / using VDK

  1. Latest release (Antoni)
  2. Hugging Face + VDK to train and use LLMs (Paul). He will show how running Hugging Face on VDK augments its functionality.

Workflows:

  • Finetuning an LLM
  • Creating a dataset
  • Catching regressions in LLMs ahead of time
  • Q&A
  3. Roadmap (Antonio)
  • Q&A

28th of June - Practical Kimball Patterns - Dimensional modeling 101 Watch Recording

Agenda:

  1. VDK quick intro and latest release (Antoni)
  2. VDK team meeting/workshop (Agi)
  3. Dimensional modeling 101 - Practical Kimball data patterns (Antoni)

31st of May - Generative Data Packs and DevOps for Data Watch Recording

Agenda:

  1. VDK’s latest release (Antoni)
  2. Generative Data Packs (Iva)
  3. DevOps for Data (Agi)

26th of April - VDK UI demo Watch Recording

Agenda:

  1. Intro to the agenda and people - what are you working on lately?
  2. README updates and VDK intro (Agi)
  3. VDK’s latest release (Antoni)
  4. VDK UI demo (Paul)
  5. Next meeting topic. Date: 31st of May (Agi)

22nd of February Jupyter Integration - Watch recording

Shoutout to the recent VDK contributors and their work!

Agenda:

  1. VDK’s latest release
  2. VDK Jupyter integration
  3. FOSDEM experience and conclusions
  4. Next meeting date

11th of January Watch recording

Agenda:

  1. VDK’s latest release - Stanislav
  2. Introduction to Versatile Data Kit Control Service - Paul
  3. Demo of the current installation process - Iva
  4. Discussion on a proposal to implement the “Three Click Rule” to make the installation faster and easier for users. - Iva
  5. Decide on a date for the next community meeting (provisionally 15th of Feb) - Paul

30th of November Watch Recording

Agenda

  1. Welcome - Agi
  2. Release - Antoni
  3. The newest industry DB adoption stats suggest that PostgreSQL has been gaining traction lately. We have recently introduced embedded PostgreSQL support, so the database type that the control service deploys by default (when no external data source is set) is now a configuration option. It can be either CockroachDB (the default) or PostgreSQL - Iva
  4. We have just returned from Data Science Conference Europe 2022, and we’ll talk about our experience there - Vic, Antoni, Dimira

Discussion:

  • templates for community meetings - do we need one?
  • next community meeting x-mas/NY themed - 21st of Dec
  • YT live community meetings

26th of October Watch recording

Agenda

  1. Welcome and intro - Agi
  2. Release - Antoni
  3. Hackathon. We've applied for the Borathon, and we'll demo what we did there! - Antoni
  4. Demo of a new feature that allows skipping the remaining steps of a data job execution via the job input object - Momchil
  5. Latest articles about VDK
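
The idea behind the skipping feature in item 4 can be sketched roughly as follows. This is a simplified illustration only, not VDK's implementation: the `JobInput` class, the `skip_remaining_steps` method name, and the `run_steps` runner are all hypothetical stand-ins written for this example.

```python
class JobInput:
    """Hypothetical stand-in for a job input object (not VDK's real class)."""

    def __init__(self):
        self.skip_requested = False

    def skip_remaining_steps(self):
        # A step calls this to ask the runner to stop after the current step.
        self.skip_requested = True


def run_steps(steps):
    """Run steps in order and stop early once a step requests a skip."""
    job_input = JobInput()
    executed = []
    for step in steps:
        step(job_input)
        executed.append(step.__name__)
        if job_input.skip_requested:
            break
    return executed


def check_for_new_data(job_input):
    new_rows = 0  # assumption for the example: nothing new to process
    if new_rows == 0:
        job_input.skip_remaining_steps()


def load_data(job_input):
    print("loading data")


executed = run_steps([check_for_new_data, load_data])
print(executed)
```

In this sketch the first step finds nothing to process, so the loading step never runs and the job still ends normally rather than failing.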

28th of September: Creating First Data Job Watch recording

Agenda:

  1. Welcome and intro, if you are new to VDK I encourage you to say hi :)
  2. Quick intro to the project (Agi)
  3. Release announcement (Antoni)
  4. GitHub Star History example demo (Agi)
  5. Discussion topics:
  • Two PR reviewers
  • VDK catchphrase

VDK catchphrase (also anchor text):

  1. unique value
  2. clear
  3. short and sweet

Examples:

  • A high-performance observability data pipeline.
  • Declarative continuous deployment for Kubernetes.
  • The easiest way to coordinate your dataflow
  • A cloud-native Pipeline resource.
  • Always know what to expect from your data.
  • Data-Centric Pipelines and Data Versioning
  • An orchestration platform for the development, production, and observation of data assets.
  • Build powerful pipelines in any programming language.
  • Build data pipelines, the easy way
  • Machine Learning Pipelines for Kubeflow

I have included data pipelines and other tools that have more than 2000 stars. Only Airbyte has a longer message:

  • Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.

I think the difference is felt in the readability of the message. So, for this catchphrase, it would be nice to come up with something that is #uniquevalue.

Ideas:

  • Building and Managing Data Pipelines with SQL or Python
  • Data Pipelines covering full DataOps lifecycle
  • Building and managing your data pipelines with Python or SQL on the cloud (or Kubernetes)
  • Build, run and manage your data jobs
  • Build, run and manage your data pipelines
  • Develop, run and manage your data pipelines on the cloud
  • Automate and abstract the Data and DevOps cycle
  • Automate and abstract the Data Journey and the DevOps cycle
  • Orchestrate

A bit more abstract and unclear ideas:

  • Efficient data engineering
  • Enable everyone to focus on work that requires their core skills

(because SQL or Python is maybe not our unique value prop)

Questions:

  • Add cloud or Kubernetes?
  • Data Pipelines OR DataOps pipelines?

Helpful questions:

  • What do you think is the unique value of VDK?
  • How would you google to find this framework? (if you don't know it exists)

"VDK I think rather has a lot of possibilities in the “T” part - templates (kimball or generic), managed connection plugins enable quality, lineage (when implemented). And also in the abstracting DevOps part - though we need to do more around testing."

Action items: create a form where we can rank the catchphrase

24th of August: VDK Templates https://youtu.be/HIRt4bX4ddk

Attendees:

Agenda:

  1. Welcome and team (Agi)
  2. Intro to the project (Agi)
  3. Momchil Zhivkov about templates:
    Templates are reusable code in the context of data jobs. They are intended to solve a common use case among different users. A template is executed through a data job. An example of a common use case is loading data into a data warehouse.

This presentation will demo:

  • what is a template
  • how does it look
  • the purpose of templates
  • using and developing templates
  • our already existing templates that can be reused
  4. Duygu - csv-export
    A new feature was added to the already existing CSV plugin, which allows people to export the result of a SQL query to a CSV file.
  5. Toni - VDK release v0.6
  6. Open discussion
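
To illustrate what the csv-export item above describes (exporting the result of a SQL query to a CSV file), here is a minimal sketch using only the Python standard library. It does not use the VDK CSV plugin or its command-line interface. The `export_query_to_csv` helper, the in-memory SQLite database, and the `repos` table are assumptions made up for this example.

```python
import csv
import io
import sqlite3


def export_query_to_csv(connection, query, out_file):
    """Run `query` and write a header row plus all result rows as CSV."""
    cursor = connection.execute(query)
    writer = csv.writer(out_file)
    # cursor.description has one entry per result column; item 0 is its name.
    writer.writerow([column[0] for column in cursor.description])
    rows = cursor.fetchall()
    writer.writerows(rows)
    return len(rows)


# Hypothetical sample data, held in memory so the sketch is self-contained.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE repos (name TEXT, stars INTEGER)")
connection.executemany(
    "INSERT INTO repos VALUES (?, ?)", [("vdk", 10), ("other", 20)]
)

buffer = io.StringIO()
exported = export_query_to_csv(
    connection, "SELECT name, stars FROM repos ORDER BY stars", buffer
)
print(buffer.getvalue())
```

To write to disk instead of a string buffer, pass a file opened with `open(path, "w", newline="")` as `out_file`.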

20th of July: How to promote an open source project https://youtu.be/wmdx7ngocr4

15:00 (GMT+01:00) - Add to Google calendar

Attendees:

Agenda:

  1. Welcome and team (Agi)
  2. Michael Gasch on how to promote an open source project: tips and questions

22nd of June: Airflow integration https://youtu.be/c3j1aOALjVU

11:00 (GMT+01:00) - Add to Google calendar

Attendees:

  • Agita Jaunzeme aka Agi (VDK Community Manager)
  • Gabriel Georgiev
  • Antoni Ivanov
  • Dimira Petrova

Agenda:

  1. Welcome and team (Agi)
  2. Intro to the project (Agi)
  3. Announcement of recent changes (Antoni)
  4. Airflow Provider Demo by Gabriel
  5. Discussion:
  • VDK community update (Agi)
  • how to find community meeting links
  • Community and Resources page
  • ODSC Europe conference: volunteering and speakers (Jacob Tomlinson, Guglielmo Iozzia, Carl Osipov, Shawn Kyzer on Data Mesh)
  • Invitation to be DataOps community lead for Techies of Baltics - devops.lv; also James Craig from CDK
  • For the next meeting, possibly someone will join to tell us their story of growing an OSS community (govmomi or rapids)
  • Next meeting date
  • Next meeting time (Let’s make next community meeting during US friendly time zone)

Useful links:

May 25, 2022 : KubeCon https://youtu.be/w0teqOw9qjc

Attendees:

  • Agita Jaunzeme aka Agi (VDK Community Manager)

Agenda:

  1. Welcome and team (Agi)
    / intro to the project
  2. VDK community update (Agi):
  3. Latest release
  4. Roadmap (Dako)
  • Apache Airflow integration
  • Security Improvements
  • Provide users with better notifications/information about non-gracefully failed data job execution
  5. Open questions about KubeCon – discussion
  6. Conclusion and relevant links (Twitter / Slack / YT / blogs etc.)

Discussion Topics:

Useful links:

Attendees:

  • Agita Jaunzeme
  • Dimira Petrova
  • Dako Dakov
  • Antoni Ivanov
  • Gabriel Georgiev

Agenda:

  • Welcome (Agi)
  • Intro of the team (all)
  • Intro of the project (Agi)
  • What are we doing lately (Antoni)
  • What are we planning to do in the near future (Dako)
  • Discussion
  • Conclusion (Agi)

Discussion Topics:

  • Kubernetes / ..
  • Meeting frequency / next meeting - the week of 23rd of May
  • Agenda for the next meeting to be more specific

Useful links:
