paalvibe/databricks-workshops

Databricks Workshops

Descriptions of Databricks workshops and learning material we have developed at Knowit.

Workshops (Knowit toppturer)

Each workshop is a 2.5-hour hands-on session covering an important aspect of Databricks.

At Knowit we call these workshops Toppturer. They give quick but meaningful experience with a technology, tool, or framework.

Available workshops:

  • Workshop: BotOps/AgentOps/LLMOps on Databricks
  • Workshop: DataOps on Databricks, using git and versioning of tables, jobs and code
  • Workshop: DataOps on Databricks part 2: DLT, Data Quality Checks, DQX, Data contracts
  • Workshop: Data engineering on Databricks
  • Workshop: Using LangChain and open LLM-models on Databricks
  • Workshop: LLM Adaptation on Databricks

Workshop: BotOps/AgentOps/LLMOps on Databricks, using git and versioning of models, agents and code

Link: https://github.com/brickops/databricks-botops-course

For: Data Engineers, Full stack data scientists, ML Engineers, Data Platform Engineers

Learn how to build git-based LLM apps, with proper environment separation (dev, staging, prod).

  • How to build LLM Apps on Databricks
  • Git-based LLM App Operations
  • Deploy development versions of LLM Apps
  • Deploy production versions of LLM Apps
  • Evaluation driven development of LLM Apps
  • How app versioning relates to test data

LLM apps could also be called bots, or agents if they can take autonomous actions.
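
To make "evaluation driven development" concrete, here is a minimal sketch in plain Python. It is not taken from the course material: `EvalCase`, `evaluate`, `stub_app`, and the 0.8 threshold are all hypothetical names and values chosen for illustration, and `stub_app` stands in for a real LLM app call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    """One test-data item: a prompt and a substring the answer must contain."""
    prompt: str
    expected_substring: str


def evaluate(app: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose answer contains the expected text."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    passed = sum(
        1
        for case in cases
        if case.expected_substring.lower() in app(case.prompt).lower()
    )
    return passed / len(cases)


def stub_app(prompt: str) -> str:
    """Hypothetical stand-in for a deployed LLM app; replace with a real call."""
    canned = {
        "What is Delta Lake?": "Delta Lake is an open table format.",
        "What is Unity Catalog?": "I do not know.",
    }
    return canned.get(prompt, "I do not know.")


# Versioned test data: in practice this would live in git next to the app code.
CASES = [
    EvalCase("What is Delta Lake?", "table format"),
    EvalCase("What is Unity Catalog?", "governance"),
]

score = evaluate(stub_app, CASES)  # one of two cases passes
# Gate promotion from dev to prod on an assumed, hypothetical threshold.
ready_for_prod = score >= 0.8
```

Keeping the cases and the app in the same repository means a given commit pins both the app version and the test data it was evaluated against.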

Prerequisites: Some Python knowledge

Workshop: DataOps on Databricks, using git and versioning of tables, jobs and code

Link: https://github.com/brickops/databricks-dataops-course

For: Data Engineers, Full stack data scientists, ML Engineers, Data Platform Engineers

Topics:

  • Opinionated git-based approach to DataOps
  • Structure your environments to allow for dev runs of data pipelines
  • Move data pipelines from dev to prod
  • Using git branches and commits to name and manage data and jobs responsibly
  • Does not cover GitHub Actions, but uses the processes that such automation would need
  • Does not cover data quality or pipeline management
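
One way to picture naming data after git branches and commits is the sketch below. The naming scheme is an assumption made for illustration, not necessarily the one the course uses; `sanitize` and `table_name` are hypothetical helpers.

```python
import re


def sanitize(name: str) -> str:
    """Lowercase a git ref and replace characters unsafe in identifiers."""
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def table_name(catalog: str, schema: str, table: str,
               env: str, branch: str, commit: str) -> str:
    """Build a three-level table name; dev names carry branch and short commit."""
    if env == "prod":
        # Production tables keep the plain, stable name.
        return f"{catalog}.{schema}.{table}"
    if env != "dev":
        raise ValueError(f"unknown environment: {env!r}")
    # Dev runs write to a schema prefixed with the branch and short commit hash,
    # so parallel branches do not overwrite each other's data.
    prefix = f"dev_{sanitize(branch)}_{commit[:8]}"
    return f"{catalog}.{prefix}_{schema}.{table}"


dev_name = table_name("sales", "orders", "daily",
                      env="dev", branch="feature/add-tax", commit="a1b2c3d4e5f6")
prod_name = table_name("sales", "orders", "daily",
                       env="prod", branch="main", commit="a1b2c3d4e5f6")
```

Here `dev_name` is `sales.dev_feature_add_tax_a1b2c3d4_orders.daily` and `prod_name` is `sales.orders.daily`, so the same pipeline code can run in either environment without touching production data.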

Prerequisites: Some Python knowledge

Workshop: DataOps on Databricks part 2

Link: https://github.com/brickops/databricks-dataops-course

For: Data Engineers, Full stack data scientists, ML Engineers, Data Platform Engineers

  • How to enable data contracts and data quality checks in pipelines
  • Difference between Delta Live Tables and regular Databricks notebooks
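
As a rough illustration of what a data contract with quality checks does, the sketch below applies named rules to rows and quarantines the failures. It is plain Python and deliberately does not use the Delta Live Tables or DQX APIs; the rule names, the `CONTRACT` dictionary, and `split_by_contract` are hypothetical.

```python
from typing import Any, Callable

Row = dict[str, Any]
Rule = Callable[[Row], bool]


def trip_id_not_null(row: Row) -> bool:
    """The key column must be present and non-null."""
    return row.get("trip_id") is not None


def fare_non_negative(row: Row) -> bool:
    """The fare must be a number greater than or equal to zero."""
    fare = row.get("fare")
    return isinstance(fare, (int, float)) and fare >= 0


# Hypothetical contract: rule name -> predicate every row must satisfy.
CONTRACT: dict[str, Rule] = {
    "trip_id_not_null": trip_id_not_null,
    "fare_non_negative": fare_non_negative,
}


def split_by_contract(
    rows: list[Row], contract: dict[str, Rule]
) -> tuple[list[Row], list[tuple[Row, list[str]]]]:
    """Return (valid rows, quarantined rows paired with the rules they failed)."""
    valid: list[Row] = []
    quarantined: list[tuple[Row, list[str]]] = []
    for row in rows:
        failed = [name for name, rule in contract.items() if not rule(row)]
        if failed:
            quarantined.append((row, failed))
        else:
            valid.append(row)
    return valid, quarantined


rows = [
    {"trip_id": 1, "fare": 12.5},
    {"trip_id": None, "fare": 3.0},
    {"trip_id": 2, "fare": -1},
]
valid, quarantined = split_by_contract(rows, CONTRACT)
```

Only the first row passes; the other two are quarantined together with the name of the rule each one violated, which is the kind of record a pipeline can log or route to a separate table.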

Prerequisites: Some Python knowledge

Workshop: Data engineering on Databricks

Link: https://github.com/knowit/AWS-Databricks-NYC-Taxi-Workshop

For: Developers, analysts, data scientists, data engineers.

Prerequisites: Some Python knowledge

Topics:

  • Basic understanding of components and tools in Databricks
  • Perform data transformation in Spark SQL and PySpark
  • Use Databricks Repos for git-versioned Data Engineering
  • Deploy a Spark job with Databricks Workflows
  • Write ETL code and data quality checks in Delta Live Tables


Workshop: Using LangChain and open LLM-models on Databricks

Link: https://github.com/paalvibe/llm-langchain-course

For: Anybody

Topics:

  • Setup and use of LLMs in Databricks
  • Use of the LangChain framework for:
    • LLM-wrapping
    • LLM-serving
    • Summarizing
  • Context embedding with chromadb
  • Reformatting
  • Multi query retrieval
  • Prompt engineering
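
The idea behind multi query retrieval can be sketched without any framework: several rephrasings of a question are each run against the document store, and the results are merged without duplicates. The code below is a toy under stated assumptions. It does not use LangChain or chromadb, the query variants are hard-coded where an LLM would normally generate them, and word overlap replaces embedding similarity. `DOCS`, `retrieve`, and `multi_query_retrieve` are hypothetical names.

```python
DOCS = [
    "Delta tables support time travel",
    "Workflows schedule jobs on clusters",
    "Unity Catalog governs access to tables",
]


def tokenize(text: str) -> set[str]:
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return up to k docs ranked by shared-word count, skipping zero overlap."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), i) for i, doc in enumerate(docs)]
    ranked = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: (-item[0], item[1]),  # best score first, then doc order
    )
    return [docs[i] for _, i in ranked[:k]]


def multi_query_retrieve(variants: list[str], docs: list[str], k: int = 1) -> list[str]:
    """Run every query variant and merge the hits, keeping first-seen order."""
    merged: list[str] = []
    for variant in variants:
        for doc in retrieve(variant, docs, k):
            if doc not in merged:
                merged.append(doc)
    return merged


# In a real setup an LLM would produce these rephrasings of the user question.
variants = ["how do I schedule jobs", "who governs access to tables"]
hits = multi_query_retrieve(variants, DOCS)
```

With these inputs `hits` contains the Workflows document followed by the Unity Catalog document, showing how different phrasings surface different passages that a single query might miss.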

Workshop: LLM Adaptation on Databricks

Link: https://github.com/paalvibe/llm-tune-course

For: Anybody

Topics:

  • What is an LLM (Large Language Model)?
  • Tuning LLMs on Databricks
  • Different modes of adapting LLMs
  • When and when not to train your own LLM?
