Commit ab84a8f

Authored by lidavidm, ianmcook, and amoeba
Add blog post describing optimized drivers (#20)
Co-authored-by: Ian Cook <ianmcook@gmail.com>
Co-authored-by: Bryce Mecum <petridish@gmail.com>
1 parent 620d312 commit ab84a8f

2 files changed: +86 -0 lines changed

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
---
layout: blog
title: Faster ADBC drivers for BigQuery, MySQL, SQL Server, and Trino
author: ADBC Drivers Contributors
---

Today the ADBC Drivers Contributors released updated drivers for four database systems, available immediately via [dbc](https://docs.columnar.tech/dbc/). To update, run `dbc install <driver>` to get the latest version.

## Updated Drivers

The main highlight for all drivers is improved performance. Each newly released driver is now significantly faster than its previous release and is typically faster than or on par with its reference non-ADBC driver, with some variation depending on workload and data types such as strings.

### Google BigQuery driver version 1.11.0

- Improved query performance by identifying and patching an issue in the Google BigQuery SDK for Go. [#102](https://github.com/adbc-drivers/bigquery/pull/102)
- Added experimental support for bulk ingest via the Storage Write API, instead of by uploading Parquet files. This is still a work in progress and is not recommended for production use. [#105](https://github.com/adbc-drivers/bigquery/pull/105)

### MySQL driver version 0.3.0

- Improved bulk ingest performance. [#66](https://github.com/adbc-drivers/mysql/pull/66)

### Microsoft SQL Server driver version 1.3.0

- Improved query performance by identifying and submitting fixes for [microsoft/go-mssqldb](https://github.com/microsoft/go-mssqldb).

### Trino driver version 0.3.0

- Improved bulk ingest performance. [#58](https://github.com/adbc-drivers/trino/pull/58)

## Benchmarks

Query benchmarks measure the time to retrieve a PyArrow Table, while ingest benchmarks measure the time to write a PyArrow Table. The benchmark harness in all cases uses Python.
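
As a rough illustration of how figures in this form can be produced, the sketch below times repeated runs of a callable and reports the mean and sample standard deviation. It is not the harness used for these benchmarks: `time_runs` and `fake_query` are hypothetical names, the stand-in workload never touches a database, and treating the ± values as standard deviations over repeated runs is an assumption that the post does not confirm.

```python
import statistics
import time
from typing import Callable


def time_runs(workload: Callable[[], object], repeats: int = 5) -> tuple[float, float]:
    """Run `workload` `repeats` times; return (mean, sample stdev) in seconds."""
    if repeats < 2:
        raise ValueError("need at least two runs to compute a standard deviation")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()  # a real benchmark would fetch or write a PyArrow Table here
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


def fake_query() -> list[int]:
    # Hypothetical stand-in for a query; it only burns a little CPU time.
    return [i * i for i in range(100_000)]


mean_s, stdev_s = time_runs(fake_query, repeats=5)
print(f"{mean_s:.3f} ± {stdev_s:.3f} s")
```

In a real measurement, the callable passed to `time_runs` would execute the query or ingest step against the database under test.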

### Querying Data

| Database | Data Size | ADBC Before (s) | ADBC After (s) | Competitor (s) | Relative Time[^speedup] |
| :-- | :-- | --: | --: | --: | --: |
| BigQuery | 6 million rows | 95.2 ± 2.34 | **31.0 ± 2.1** | 56.4 ± 1.3[^bigquery] | 0.55x |
| SQL Server | 1.2 million rows | 9.6 ± 0.1 | 3.3 ± 0.0 | **3.0 ± 0.2**[^mssql] | 1.1x |

[^bigquery]: python-bigquery-sqlalchemy with Storage Read API enabled
[^mssql]: turbodbc 5.1.2 + msodbcsql 18.6.1.1-1
[^speedup]: Versus competitor. Lower is better.

### Ingesting Data

| Database | Data Size | ADBC Before (s) | ADBC After (s) | Competitor (s) | Relative Time[^speedup] |
| :-- | :-- | --: | --: | --: | --: |
| MySQL | 600k rows | 467.8 ± 17.9 | **6.7 ± 0.3** | 13.1 ± 0.7[^mysql] | 0.51x |
| Trino | 60k rows | 1962.8 ± 182.2 | **52.2 ± 8.2** | 2061.8 ± 15.1[^trino] | 0.03x |

[^mysql]: mysql-connector-python using DuckDB to convert Arrow data to Python objects; using PyArrow `Table.to_pylist` instead, timing was 15.31 ± 0.59s
[^trino]: trino-python-client
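
The Relative Time column in both tables appears to be the ADBC After time divided by the competitor time; recomputing it from the rounded means in the tables reproduces the published values. The snippet below is only a worked check of that arithmetic. The `results` dictionary and the `relative_time` helper are names introduced here for illustration.

```python
# Mean times in seconds, copied from the tables above: (ADBC after, competitor).
results = {
    "BigQuery": (31.0, 56.4),
    "SQL Server": (3.3, 3.0),
    "MySQL": (6.7, 13.1),
    "Trino": (52.2, 2061.8),
}


def relative_time(adbc_after: float, competitor: float) -> float:
    """Ratio of ADBC time to competitor time; below 1.0 means ADBC is faster."""
    return adbc_after / competitor


for database, (adbc_after, competitor) in results.items():
    print(f"{database}: {relative_time(adbc_after, competitor):.2f}x")
```

This prints 0.55x, 1.10x, 0.51x, and 0.03x, which agrees with the tables up to rounding (the SQL Server row is shown there as 1.1x).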

assets/css/styles.css

Lines changed: 33 additions & 0 deletions
@@ -34,6 +34,7 @@
 :root {
   --primary-bg: rgb(255, 255, 255);
   --secondary-bg: rgb(255, 255, 255);
+  --highlight-bg: rgba(0, 0, 0, 0.05);
   --text-primary-color: rgb(65, 65, 65);
   --text-primary-color-dim: rgba(50, 50, 50, 0.80);
   --text-secondary-color: rgb(50, 50, 50);
@@ -46,6 +47,7 @@
 :root {
   --primary-bg: rgb(30, 30, 30);
   --secondary-bg: rgb(30, 30, 30);
+  --highlight-bg: rgba(255, 255, 255, 0.05);
   --text-primary-color: rgb(205, 205, 205);
   --text-primary-color-dim: rgba(225, 225, 225, 0.80);
   --text-secondary-color: rgb(225, 225, 225);
@@ -59,6 +61,7 @@
 html[data-theme="dark"] {
   --primary-bg: rgb(30, 30, 30);
   --secondary-bg: rgb(30, 30, 30);
+  --highlight-bg: rgba(255, 255, 255, 0.05);
   --text-primary-color: rgb(205, 205, 205);
   --text-primary-color-dim: rgba(225, 225, 225, 0.80);
   --text-secondary-color: rgb(225, 225, 225);
@@ -70,6 +73,7 @@ html[data-theme="dark"] {
 html[data-theme="light"] {
   --primary-bg: rgb(255, 255, 255);
   --secondary-bg: rgb(255, 255, 255);
+  --highlight-bg: rgba(0, 0, 0, 0.05);
   --text-primary-color: rgb(65, 65, 65);
   --text-primary-color-dim: rgba(50, 50, 50, 0.80);
   --text-secondary-color: rgb(50, 50, 50);
@@ -110,6 +114,35 @@ code {
   font-weight: 600;
 }

+table {
+  border-collapse: collapse;
+  width: 100%;
+}
+
+table thead tr:last-child {
+  border-bottom: 0.1em solid var(--text-primary-color);
+}
+
+table tr td, table tr th {
+  /* makes numbers align nicely in tables */
+  font-variant-numeric: tabular-nums;
+  padding: 0 0.25em;
+  /* stops footnotes from misaligning the text vertically */
+  vertical-align: bottom;
+}
+
+table tr td:first-child, table tr th:first-child {
+  padding-left: 0;
+}
+
+table tr td:last-child, table tr th:last-child {
+  padding-right: 0;
+}
+
+table tbody tr:nth-child(even) {
+  background-color: var(--highlight-bg);
+}
+
 .text-muted {
   color: rgb(140, 140, 140);
 }
