Commit 33a351d

alamb and claude committed
refactor: use 'trino' (not 'java') consistently for the Trino TPC-DS port
'java' is ambiguous: there may be multiple Java TPC-DS implementations. The reference we target is specifically the Trino library, so name everything after it for clarity:

- tests/fixtures/scale-N-java/ -> tests/fixtures/scale-N-trino/
- scripts/bootstrap-java.sh -> scripts/bootstrap-trino.sh
- TPCDS_JAVA_REPO env var -> TPCDS_TRINO_REPO
- JAVA_DIR / JAVA_REPO_URL vars -> TRINO_DIR / TRINO_REPO_URL
- find_java_jar / clone_java_repo / build_java / test_java -> find_trino_jar / clone_trino_repo / build_trino / test_trino
- CI artifact `test-fixtures-java` -> `test-fixtures-trino`
- "Java fixture" log labels -> "Trino fixture"
- Doc references throughout READMEs and script headers updated.

Kept as-is: `actions/setup-java@v5`, `Java 11+` requirement, `java -jar` / `java -version` invocations, and `mvn`/`openjdk` references; those refer to the Java language/runtime, not the Trino implementation.

The CLI flag and Rust `CompatMode::Trino` were already named `trino`; this commit aligns the rest of the codebase.

Verified: `./scripts/test-all-tables.sh` passes 24/24 vs Trino, and `./scripts/test-all-tables.sh --compat c` passes 23/23 vs C dsdgen (customer.dat still skipped pending alamb/tpcds-data regeneration).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent d1d9643 commit 33a351d
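
The renames in the commit message have two practical consequences for an existing local checkout: fixtures generated earlier still sit in `scale-N-java` directories (the fixtures are gitignored, so the commit cannot move them), and an override exported as `TPCDS_JAVA_REPO` is no longer read. A minimal sketch of how both could be handled, assuming the directory layout and default URL shown in this commit. The function names `migrate_fixtures` and `resolve_trino_repo_url`, and the deprecation warning, are hypothetical and not part of the commit:

```bash
#!/usr/bin/env bash
# Hypothetical post-rename helpers; nothing here is part of commit 33a351d.
set -euo pipefail

# Rename leftover scale-N-java fixture directories to scale-N-trino.
migrate_fixtures() {
    local fixtures_dir="$1" old new
    for old in "$fixtures_dir"/scale-*-java; do
        [[ -d "$old" ]] || continue   # glob matched nothing, or not a directory
        new="${old%-java}-trino"
        if [[ -e "$new" ]]; then
            echo "skip: $new already exists" >&2
            continue
        fi
        mv "$old" "$new"
        echo "renamed: $old -> $new"
    done
}

# Resolve the repo URL: new variable first, then the legacy name, then the
# default URL that bootstrap-trino.sh uses.
resolve_trino_repo_url() {
    if [[ -n "${TPCDS_TRINO_REPO:-}" ]]; then
        echo "$TPCDS_TRINO_REPO"
    elif [[ -n "${TPCDS_JAVA_REPO:-}" ]]; then
        echo "warning: TPCDS_JAVA_REPO is deprecated, use TPCDS_TRINO_REPO" >&2
        echo "$TPCDS_JAVA_REPO"
    else
        echo "https://github.com/trinodb/tpcds.git"
    fi
}
```

From the `tpcdsgen/` directory this might be used as `migrate_fixtures tests/fixtures`, and a script could set `TRINO_REPO_URL="$(resolve_trino_repo_url)"`. Regenerating the fixtures with `./scripts/generate-fixtures.sh` is the simpler alternative when disk space and time allow.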

10 files changed

Lines changed: 95 additions & 100 deletions

File tree

.github/workflows/tpcdsgen-conformance.yml

Lines changed: 2 additions & 2 deletions
@@ -51,7 +51,7 @@ jobs:
  - name: Bootstrap Java TPC-DS implementation
    run: |
      cd tpcdsgen
-     ./scripts/bootstrap-java.sh
+     ./scripts/bootstrap-trino.sh

  - name: Build Rust table generators
    run: |
@@ -71,7 +71,7 @@ jobs:
    if: failure() # Upload fixtures if tests fail for debugging
    uses: actions/upload-artifact@v7
    with:
-     name: test-fixtures-java
+     name: test-fixtures-trino
      path: tpcdsgen/tests/fixtures/
      retention-days: 7

tpcdsgen/README.md

Lines changed: 6 additions & 6 deletions
@@ -29,9 +29,9 @@ Fixtures are pre-generated TPC-DS data files used for conformance testing.

  ```
  tests/fixtures/
- ├── scale-1-java/ # Java reference fixtures (`--compat trino`)
+ ├── scale-1-trino/ # Java reference fixtures (`--compat trino`)
  ├── scale-1-c/ # C dsdgen reference fixtures (`--compat c`)
- └── scale-10-java/ # higher scale factors as needed
+ └── scale-10-trino/ # higher scale factors as needed
  ```

  ### Conformance Testing
@@ -44,9 +44,9 @@ scripts that do byte-for-byte (MD5) comparison of `.dat` output. See

  ```bash
  # One-time: clone & build the Java TPC-DS implementation.
- ./scripts/bootstrap-java.sh
+ ./scripts/bootstrap-trino.sh

- # Generate Java reference fixtures into tests/fixtures/scale-N-java/.
+ # Generate Java reference fixtures into tests/fixtures/scale-N-trino/.
  ./scripts/generate-fixtures.sh

  # Compare Rust output byte-for-byte against the Java fixtures.
@@ -77,13 +77,13 @@ Each fixture directory contains an `MD5SUMS` file for verification.

  **On Linux:**
  ```bash
- cd tests/fixtures/scale-1-java
+ cd tests/fixtures/scale-1-trino
  md5sum -c MD5SUMS
  ```

  **On macOS:**
  ```bash
- cd tests/fixtures/scale-1-java
+ cd tests/fixtures/scale-1-trino
  while read hash file; do
      [[ $(md5 -q "$file") == "$hash" ]] && echo "$file: OK" || echo "$file: FAILED"
  done < MD5SUMS

tpcdsgen/scripts/README.md

Lines changed: 8 additions & 8 deletions
@@ -20,7 +20,7 @@ MD5/`diff` comparison.
  tpcdsgen/
  ├── tests/
  │ └── fixtures/ # Reference data (gitignored)
- │ ├── scale-1-java/ # Java reference (`--compat trino`)
+ │ ├── scale-1-trino/ # Java reference (`--compat trino`)
  │ │ ├── call_center.dat
  │ │ ├── warehouse.dat
  │ │ └── ... (all 25 tables)
@@ -29,7 +29,7 @@ tpcdsgen/
  │ ├── warehouse.dat
  │ └── ... (all 25 tables)
  └── scripts/
- ├── bootstrap-java.sh # Clone + build the Java TPC-DS impl
+ ├── bootstrap-trino.sh # Clone + build the Java TPC-DS impl
  ├── generate-fixtures.sh # Generate/download reference fixtures
  │ # (Java via --compat trino; C via --compat c)
  ├── compare-table.sh # Compare one table
@@ -42,9 +42,9 @@ tpcdsgen/

  ```bash
  # 1. Bootstrap Java implementation (first time only)
- ./scripts/bootstrap-java.sh
+ ./scripts/bootstrap-trino.sh

- # 2. Generate Java reference fixtures into tests/fixtures/scale-N-java/.
+ # 2. Generate Java reference fixtures into tests/fixtures/scale-N-trino/.
  ./scripts/generate-fixtures.sh

  # 3. Test all ported tables against the Java reference.
@@ -93,8 +93,8 @@ table below is just a roadmap.

  | Script | Purpose |
  |---------------------------|---------------------------------------------------------------------------------------------------------------------------------|
- | `bootstrap-java.sh` | Clone and build the Java / Trino reference implementation into `../tpcds/`. Run once before Java conformance. |
- | `generate-fixtures.sh` | Populate `tests/fixtures/scale-N-{java,c}/` with reference data. `--compat trino` (default) runs the Java impl; `--compat c` downloads pre-generated C `dsdgen` data from [alamb/tpcds-data](https://github.com/alamb/tpcds-data). |
+ | `bootstrap-trino.sh` | Clone and build the Java / Trino reference implementation into `../tpcds/`. Run once before Java conformance. |
+ | `generate-fixtures.sh` | Populate `tests/fixtures/scale-N-{trino,c}/` with reference data. `--compat trino` (default) runs the Java impl; `--compat c` downloads pre-generated C `dsdgen` data from [alamb/tpcds-data](https://github.com/alamb/tpcds-data). |
  | `compare-table.sh` | Compare one table's Rust output against the selected reference (`--compat trino` or `--compat c`) via MD5 + diff. |
  | `test-all-tables.sh` | Run the full conformance suite for one compat mode (the main CI entry point). Honors per-mode skip lists at the top of the script. |
  | `clean-fixtures.sh` | Remove all generated fixtures under `tests/fixtures/`. |
@@ -134,7 +134,7 @@ Run any script with `--help` to print its usage block.

  ## Requirements

- - **Java:** Maven-built TPC-DS JAR at `../tpcds/target/tpcds-*-jar-with-dependencies.jar` (`bootstrap-java.sh` handles this).
+ - **Java:** Maven-built TPC-DS JAR at `../tpcds/target/tpcds-*-jar-with-dependencies.jar` (`bootstrap-trino.sh` handles this).
  - **C dsdgen reference:** `git`, `tar`, `bzip2` for `generate-fixtures.sh --compat c`. No C compiler required — data is pre-generated.
  - **Rust:** Cargo-built `tpcdsgen` binary at `target/debug/tpcdsgen` or `target/release/tpcdsgen`.
  - **Disk space:** ~1 GB for SF1 Java fixtures; ~2.4 GB for SF1 C fixtures.
@@ -178,7 +178,7 @@ These scripts are designed to be CI-friendly:

  ```yaml
  # Java conformance
- - run: ./scripts/bootstrap-java.sh
+ - run: ./scripts/bootstrap-trino.sh
  - run: ./scripts/generate-fixtures.sh --quiet
  - run: ./scripts/test-all-tables.sh --quiet

tpcdsgen/scripts/bootstrap-trino.sh (renamed from tpcdsgen/scripts/bootstrap-java.sh)

Lines changed: 58 additions & 63 deletions
@@ -1,6 +1,6 @@
  #!/usr/bin/env bash
  #
- # bootstrap-java.sh — Set up the Java / Trino TPC-DS reference
+ # bootstrap-trino.sh — Set up the Trino TPC-DS Java reference
  # implementation used by `--compat trino` conformance testing.
  #
  # Please see print_usage() below for details.
@@ -9,25 +9,25 @@ set -euo pipefail

  print_usage() {
      cat << 'EOF'
- bootstrap-java.sh — Set up the Java / Trino TPC-DS reference implementation.
+ bootstrap-trino.sh — Set up the Trino TPC-DS Java reference implementation.

  What it does:
  1. Checks that Java 11+ and Maven are installed.
- 2. Clones the Java TPC-DS repository into ../tpcds/ (if not present).
- 3. Builds the Java implementation with `mvn clean package -DskipTests`.
+ 2. Clones the Trino TPC-DS repository into ../tpcds/ (if not present).
+ 3. Builds the implementation with `mvn clean package -DskipTests`.
  4. Runs a small smoke test to confirm the JAR works.

  Usage:
- bootstrap-java.sh [OPTIONS]
+ bootstrap-trino.sh [OPTIONS]

  Options:
  --rebuild Force rebuild even if the JAR already exists.
  --verify Only verify the existing installation; do not clone/build.
  --help Show this help message.

  Environment variables:
- TPCDS_JAVA_REPO Git URL for Java TPC-DS repo.
- Default: https://github.com/trinodb/tpcds.git
+ TPCDS_TRINO_REPO Git URL for the Trino TPC-DS repo.
+ Default: https://github.com/trinodb/tpcds.git

  Requirements: Java 11+, Maven, git.
@@ -36,9 +36,9 @@ Output:
  ../tpcds/target/tpcds-*-jar-with-dependencies.jar.

  Examples:
- bootstrap-java.sh # Clone and build if needed.
- bootstrap-java.sh --rebuild # Force clean rebuild.
- bootstrap-java.sh --verify # Just check existing install.
+ bootstrap-trino.sh # Clone and build if needed.
+ bootstrap-trino.sh --rebuild # Force clean rebuild.
+ bootstrap-trino.sh --verify # Just check existing install.

  See scripts/README.md for the full conformance-testing workflow.
  EOF
@@ -56,10 +56,10 @@ NC='\033[0m' # No Color
  SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
  PROJECT_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
  TPCDS_ROOT="$(cd "$PROJECT_ROOT/.." && pwd)"
- JAVA_DIR="$TPCDS_ROOT/tpcds"
+ TRINO_DIR="$TPCDS_ROOT/tpcds"

  # Configuration
- JAVA_REPO_URL="${TPCDS_JAVA_REPO:-https://github.com/trinodb/tpcds.git}"
+ TRINO_REPO_URL="${TPCDS_TRINO_REPO:-https://github.com/trinodb/tpcds.git}"
  FORCE_REBUILD=0
  VERIFY_ONLY=0
@@ -109,64 +109,59 @@ check_prerequisites() {
      return 0
  }

- # Find Java JAR file
- find_java_jar() {
-     local jar_pattern="$JAVA_DIR/target/tpcds-*-jar-with-dependencies.jar"
+ # Find the built Trino TPC-DS JAR
+ find_trino_jar() {
      local jar_file
-
-     jar_file=$(find "$JAVA_DIR/target" -name "tpcds-*-jar-with-dependencies.jar" 2>/dev/null | head -1)
-
+     jar_file=$(find "$TRINO_DIR/target" -name "tpcds-*-jar-with-dependencies.jar" 2>/dev/null | head -1)
      if [[ -z "$jar_file" ]]; then
          return 1
      fi
-
      echo "$jar_file"
-     return 0
  }

- # Clone the Java repository
- clone_java_repo() {
-     log_info "Cloning Java TPC-DS repository..."
-     log_info "Source: $JAVA_REPO_URL"
-     log_info "Target: $JAVA_DIR"
+ # Clone the Trino TPC-DS repository
+ clone_trino_repo() {
+     log_info "Cloning Trino TPC-DS repository..."
+     log_info "Source: $TRINO_REPO_URL"
+     log_info "Target: $TRINO_DIR"

-     if [[ -d "$JAVA_DIR" ]]; then
-         log_warn "Directory already exists: $JAVA_DIR"
+     if [[ -d "$TRINO_DIR" ]]; then
+         log_warn "Directory already exists: $TRINO_DIR"

          # Check if it's a git repo
-         if [[ -d "$JAVA_DIR/.git" ]]; then
+         if [[ -d "$TRINO_DIR/.git" ]]; then
              log_info "Existing git repository found, pulling latest changes..."
-             cd "$JAVA_DIR"
+             cd "$TRINO_DIR"
              git pull || log_warn "Failed to pull latest changes"
              cd - >/dev/null
              return 0
          else
              log_error "Directory exists but is not a git repository"
-             log_error "Please remove $JAVA_DIR and try again"
+             log_error "Please remove $TRINO_DIR and try again"
              return 1
          fi
      fi

      # Clone the repository
-     if ! git clone "$JAVA_REPO_URL" "$JAVA_DIR"; then
-         log_error "Failed to clone Java repository"
+     if ! git clone "$TRINO_REPO_URL" "$TRINO_DIR"; then
+         log_error "Failed to clone Trino TPC-DS repository"
          return 1
      fi

-     log_success "Successfully cloned Java TPC-DS repository"
+     log_success "Successfully cloned Trino TPC-DS repository"
      return 0
  }

- # Build the Java implementation
- build_java() {
-     log_info "Building Java TPC-DS implementation..."
+ # Build the Trino TPC-DS JAR
+ build_trino() {
+     log_info "Building Trino TPC-DS implementation..."

-     if [[ ! -d "$JAVA_DIR" ]]; then
-         log_error "Java directory does not exist: $JAVA_DIR"
+     if [[ ! -d "$TRINO_DIR" ]]; then
+         log_error "Trino directory does not exist: $TRINO_DIR"
          return 1
      fi

-     cd "$JAVA_DIR"
+     cd "$TRINO_DIR"

      # Clean build
      log_info "Running: mvn clean package -DskipTests"
@@ -180,7 +175,7 @@ build_java() {

      # Verify JAR was created
      local jar_file
-     if jar_file=$(find_java_jar); then
+     if jar_file=$(find_trino_jar); then
          local jar_size
          jar_size=$(du -h "$jar_file" | cut -f1)
          log_success "Build complete: $jar_file ($jar_size)"
@@ -191,12 +186,12 @@ build_java() {
      fi
  }

- # Test the Java implementation
- test_java() {
-     log_info "Testing Java TPC-DS implementation..."
+ # Smoke-test the built JAR
+ test_trino() {
+     log_info "Testing Trino TPC-DS JAR..."

      local jar_file
-     if ! jar_file=$(find_java_jar); then
+     if ! jar_file=$(find_trino_jar); then
          log_error "JAR file not found"
          return 1
      fi
@@ -233,32 +228,32 @@ test_java() {
      fi
  }

- # Verify installation
+ # Verify the installation
  verify_installation() {
-     log_info "Verifying Java TPC-DS installation..."
+     log_info "Verifying Trino TPC-DS installation..."

      # Check directory exists
-     if [[ ! -d "$JAVA_DIR" ]]; then
-         log_error "Java directory does not exist: $JAVA_DIR"
+     if [[ ! -d "$TRINO_DIR" ]]; then
+         log_error "Trino directory does not exist: $TRINO_DIR"
          return 1
      fi

      # Check JAR exists
      local jar_file
-     if ! jar_file=$(find_java_jar); then
-         log_error "JAR file not found in $JAVA_DIR/target/"
+     if ! jar_file=$(find_trino_jar); then
+         log_error "JAR file not found in $TRINO_DIR/target/"
          log_error "Run without --verify to build"
          return 1
      fi

      log_success "Found JAR: $jar_file"

      # Test it works
-     if ! test_java; then
+     if ! test_trino; then
          return 1
      fi

-     log_success "Java TPC-DS installation verified"
+     log_success "Trino TPC-DS installation verified"
      return 0
  }
@@ -291,10 +286,10 @@ main() {
      done

      log_info "========================================="
-     log_info "Java TPC-DS Bootstrap"
+     log_info "Trino TPC-DS Bootstrap"
      log_info "========================================="
-     log_info "Java directory: $JAVA_DIR"
-     log_info "Repository: $JAVA_REPO_URL"
+     log_info "Trino directory: $TRINO_DIR"
+     log_info "Repository: $TRINO_REPO_URL"
      log_info "========================================="

      start_time=$(date +%s)
@@ -314,26 +309,26 @@ main() {
      fi

      # Clone repository if needed
-     if [[ ! -d "$JAVA_DIR" ]]; then
-         if ! clone_java_repo; then
+     if [[ ! -d "$TRINO_DIR" ]]; then
+         if ! clone_trino_repo; then
              exit 1
          fi
      else
-         log_success "Java repository already exists"
+         log_success "Trino repository already exists"
      fi

      # Build if needed or forced
      local jar_file
-     if [[ $FORCE_REBUILD -eq 1 ]] || ! find_java_jar >/dev/null 2>&1; then
-         if ! build_java; then
+     if [[ $FORCE_REBUILD -eq 1 ]] || ! find_trino_jar >/dev/null 2>&1; then
+         if ! build_trino; then
              exit 1
          fi
      else
-         log_success "JAR already built: $(find_java_jar)"
+         log_success "JAR already built: $(find_trino_jar)"
      fi

      # Test the installation
-     if ! test_java; then
+     if ! test_trino; then
          exit 1
      fi
@@ -344,7 +339,7 @@ main() {
      log_info "========================================="
      log_info "Bootstrap Complete"
      log_info "========================================="
-     log_success "Java TPC-DS is ready for conformance testing"
+     log_success "Trino TPC-DS is ready for conformance testing"
      log_info "Time: ${duration}s"
      log_info ""
      log_info "Next steps:"
