@@ -53,7 +53,7 @@ This distribution uses the [cstore_fdw](https://github.com/citusdata/cstore_fdw)
 into a column-oriented database. This means that you get the rich featureset of Postgres,
 but with a huge improvement in speed and disk usage. To install and run the database server:
 
-`docker run -d -p 5433:5432 -v ~/boxball/postgres-cstore-fdw:/var/lib/postgresql/data doublewick/boxball:postgres-cstore-fdw-0.0.2`
+`docker run --name postgres-cstore-fdw -d -p 5433:5432 -v ~/boxball/postgres-cstore-fdw:/var/lib/postgresql/data doublewick/boxball:postgres-cstore-fdw-0.0.2`
 
 Roughly an hour after the image is downloaded, the data will be fully loaded into the database, and you can connect to it on port `5433`
 (either using the `psql` command line tool or a database client of your choice). The data will be persisted on your machine in
@@ -65,7 +65,7 @@ when you turn it back on.
 disk space than Postgres cstore_fdw, but significantly more RAM (~ 5GB). I've yet to run any query performance comparisons.
 To install and run the database server:
 
-`docker run -d -p 8123:8123 -v ~/boxball/clickhouse:/var/lib/clickhouse doublewick/boxball:clickhouse-0.0.2`
+`docker run --name clickhouse -d -p 8123:8123 -v ~/boxball/clickhouse:/var/lib/clickhouse doublewick/boxball:clickhouse-0.0.2`
 
 15-30 minutes after the image is downloaded, the data will be fully loaded into the database, and you can connect to it either by attaching the
 container and using the `clickhouse-client` CLI or by using a local database client on port `8123`.
@@ -77,7 +77,7 @@ when you turn it back on.
 [Drill](https://drill.apache.org/) is a framework that allows for SQL queries directly on files, without having to declare any schema.
 It is usually used on a computing cluster with massive datasets, but we use a single-node setup. To install and run:
 
-`docker run -id -p 8047:8047 -p 31010:31010 -v ~/boxball/clickhouse:/data doublewick/boxball:drill-0.0.2`
+`docker run --name drill -id -p 8047:8047 -p 31010:31010 -v ~/boxball/drill:/data doublewick/boxball:drill-0.0.2`
 
 Data will be immediately available to query after the image is downloaded. Use port `8047` to access the Web UI
 (which includes a SQL runner) and port `31010` to connect via a database client.
@@ -94,7 +94,7 @@ more disk space than their columnar counterparts.
 #### Postgres
 Similar configuration to the cstore_fdw extended version above, but stored in the conventional way.
 
-`docker run -d -p 5432:5432 -v ~/boxball/postgres:/var/lib/postgresql/data doublewick/boxball:postgres-0.0.2`
+`docker run --name postgres -d -p 5432:5432 -v ~/boxball/postgres:/var/lib/postgresql/data doublewick/boxball:postgres-0.0.2`
 
 Roughly 90 minutes after the image is downloaded, the data will be fully loaded into the database,
 and you can connect to it on port `5432`
@@ -105,7 +105,7 @@ when you turn it back on.
 #### MySQL
 To install and run:
 
-`docker run -d -p 3306:3306 -v ~/boxball/mysql:/var/lib/mysql doublewick/boxball:mysql-0.0.2`
+`docker run --name mysql -d -p 3306:3306 -v ~/boxball/mysql:/var/lib/mysql doublewick/boxball:mysql-0.0.2`
 
 Roughly two hours after the image is downloaded, the data will be fully loaded into the database,
 and you can connect to it on port `3306`. The data will be persisted on your machine in
@@ -115,7 +115,7 @@ when you turn it back on.
 #### SQLite (with web UI)
 To install and run:
 
-`docker run -d -p 8080:8080 -v ~/boxball/sqlite:/db doublewick/boxball:sqlite-0.0.2`
+`docker run --name sqlite -d -p 8080:8080 -v ~/boxball/sqlite:/db doublewick/boxball:sqlite-0.0.2`
 
 Roughly two minutes after the image is downloaded, the data will be fully loaded into the database. `localhost:8080`
 will provide a [web UI](https://github.com/coleifer/sqlite-web) where you can write queries and perform schema exploration.
@@ -125,11 +125,11 @@ will provide a [web UI](https://github.com/coleifer/sqlite-web) where you can wr
 #### Parquet
 Parquet is a columnar data format originally developed for the Hadoop ecosystem. It has solid support in Spark, Pandas,
 and many other frameworks.
-[OneDrive](https://1drv.ms/u/s!AtpEocFNRNBWgyfyRX793TQmzj49?e=iJ4PQQ)
+[OneDrive](https://1drv.ms/u/s!AtpEocFNRNBWg1eR5L-U7bupJqyt?e=RbxuMp)
 
 #### CSV
 The original CSVs from the extract step (folder stored as `.tar.gz`).
-[OneDrive](https://1drv.ms/u/s!AtpEocFNRNBWg1a6aEsv0qDwtYJt?e=Timw8N)
+[OneDrive](https://1drv.ms/u/s!AtpEocFNRNBWhAb_gwNbBLPB1pDv?e=qyrU3L)
 
 ### Interactive Data Exploration
 