Merged
Changes from 2 commits
17 changes: 17 additions & 0 deletions .claude/commands/build.md
@@ -0,0 +1,17 @@
Build the project.

If the argument is "quick" or "fast", skip tests:
```
mvn clean package -DskipTests
```

Otherwise run the full build with tests:
```
mvn clean package
```

Rules:
1. Before building, run `mvn spotless:check`. If formatting fails, run `mvn spotless:apply` and report which files were fixed
2. If the build fails, read the error output and provide a clear summary of what went wrong
3. Do NOT automatically fix build errors — report them and let the user decide
4. Show the final artifact path on success (target/*.jar)
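
As an illustration for readers of this diff (not part of the committed file), the argument handling that build.md describes could be sketched in shell as below. `build_command` is a hypothetical helper name; only the two `mvn` invocations come from the file itself.

```bash
#!/bin/bash
# Sketch: choose the Maven invocation from the slash-command argument.
# build_command is a hypothetical helper, not part of the repository.
build_command() {
  case "$1" in
    quick|fast) echo "mvn clean package -DskipTests" ;;  # skip tests
    *)          echo "mvn clean package" ;;              # full build with tests
  esac
}
```

Calling `build_command quick` prints the `-DskipTests` variant; any other argument, or none, falls through to the full build.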
26 changes: 26 additions & 0 deletions .claude/commands/ci-local.md
@@ -0,0 +1,26 @@
Run the full CI pipeline locally to verify everything passes before pushing.

This mirrors what GitHub Actions runs. Execute these steps in order, stopping on first failure:

1. **Lint**: `mvn spotless:check`
   - If it fails, ask the user if they want to auto-fix with `mvn spotless:apply`

2. **Compile**: `mvn compile`
   - Report any compilation errors clearly

3. **Test**: `mvn test`
   - Summarize test results (total, passed, failed, skipped)

4. **Package**: `mvn package -DskipTests`
   - Confirm the JAR was built successfully

Report a final summary:
```
CI Local Results:
Lint: PASS/FAIL
Compile: PASS/FAIL
Test: PASS/FAIL (X passed, Y failed)
Package: PASS/FAIL
```

Do NOT continue to the next step if any step fails — report the failure and stop.
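
The stop-on-first-failure behaviour described in ci-local.md can be sketched as a small shell runner. This is an illustration only, not part of the committed file: `CI_STEPS`, `run_cmd`, and `run_pipeline` are hypothetical names, and the sketch omits the per-test counts from the summary format.

```bash
#!/bin/bash
# Sketch: run the four steps in order and stop at the first failure.
# CI_STEPS, run_cmd and run_pipeline are hypothetical names.
CI_STEPS=(
  "Lint:mvn spotless:check"
  "Compile:mvn compile"
  "Test:mvn test"
  "Package:mvn package -DskipTests"
)

# Default runner: execute the command string, discarding its output.
run_cmd() { bash -c "$1" >/dev/null 2>&1; }

# Usage: run_pipeline [runner-function]
run_pipeline() {
  local runner=${1:-run_cmd} step name cmd
  echo "CI Local Results:"
  for step in "${CI_STEPS[@]}"; do
    name=${step%%:*}   # text before the first colon
    cmd=${step#*:}     # text after the first colon
    if "$runner" "$cmd"; then
      echo "$name: PASS"
    else
      echo "$name: FAIL"
      return 1         # do not continue to later steps
    fi
  done
}
```

Passing a runner function as the first argument makes the control flow easy to try without invoking Maven, for example with a stub that fails only on `mvn test`.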
29 changes: 29 additions & 0 deletions .claude/commands/commit.md
@@ -0,0 +1,29 @@
Create a git commit for the current staged/unstaged changes.

Rules:
1. Run `git status` and `git diff` to understand what changed
2. Stage relevant files (prefer specific files over `git add -A`)
3. Write a SHORT commit message (max 50 chars for subject line, imperative mood)
4. Use `--signoff` to sign off using the committer's git config (do NOT hardcode any name/email)
5. Do NOT add Co-Authored-By or any other trailers
6. If there are no changes, say so and stop

Commit format:
```
git commit --signoff -m "short imperative message"
```

The `--signoff` flag automatically uses the name and email from `git config user.name` and `git config user.email`, so each collaborator's own identity is used.

Examples of good messages:
- "add user and permission models"
- "implement executor allocation logic"
- "fix gang scheduling annotation prefix"
- "add table-driven tests for config parsing"

Do NOT:
- Use long descriptive messages
- Add Co-Authored-By trailers
- Use past tense ("added", "fixed")
- Prefix with type tags ("feat:", "fix:") unless asked
- Hardcode any author name or email
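
For illustration only (this is not part of the committed file), the subject-line rules above could be approximated by a shell check like the following. `check_subject` is a hypothetical helper, and the past-tense test is a crude heuristic that simply flags a first word ending in "ed".

```bash
#!/bin/bash
# Sketch: rough checks for the commit subject rules listed above.
# check_subject is a hypothetical helper; the past-tense test is a heuristic.
check_subject() {
  local subject=$1
  local first_word=${subject%% *}
  local prefix_re='^[a-z]+(\([^)]*\))?: '   # e.g. "feat: " or "fix(core): "

  if [ "${#subject}" -gt 50 ]; then
    echo "subject longer than 50 characters"
    return 1
  fi
  if [[ $subject =~ $prefix_re ]]; then
    echo "type prefix not allowed unless requested"
    return 1
  fi
  if [[ $first_word == *ed ]]; then
    echo "looks like past tense: $first_word"
    return 1
  fi
  echo "ok"
}
```

The heuristic will misfire on imperative verbs that happen to end in "ed", so a result from this sketch is a hint rather than a verdict.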
11 changes: 11 additions & 0 deletions .claude/commands/lint.md
@@ -0,0 +1,11 @@
Check and fix code formatting using Spotless/Scalafmt.

Steps:
1. Run `mvn spotless:check` to see if there are formatting violations
2. If violations are found, run `mvn spotless:apply` to auto-fix them
3. After applying, run `git diff --stat` to show what files were reformatted
4. Summarize the changes (which files, what kind of formatting was fixed)

If no violations are found, say so and stop.

Do NOT commit the formatting changes — just apply and report.
27 changes: 27 additions & 0 deletions .claude/commands/summary.md
@@ -0,0 +1,27 @@
Generate a concise implementation summary for a PR description.

Steps:
1. Run `git diff master...HEAD --stat` and `git log master..HEAD --oneline` to understand all changes on this branch
2. Read the changed files to understand what was implemented and why
3. Write a summary suitable for a GitHub PR description

Summary format rules:
- Start with a one-line "What" statement explaining the change
- Follow with a "Why" section (2-3 sentences max) explaining the motivation
- List the key changes as plain bullet points (no nested bullets)
- If there are new tests, mention what they cover in one line
- End with a "How to verify" section with concrete steps if applicable
- Keep the total summary under 30 lines
- Use plain text with minimal markdown (no tables, no headers larger than ##, no code blocks unless showing a command)
- Do not repeat file paths or class names unnecessarily
- Focus on behavior changes, not implementation details
- Write in present tense, active voice

Do NOT:
- Use heavy markdown formatting (no bold, no tables, no badges)
- List every single file changed
- Include generic boilerplate like "This PR adds..."
- Add emojis
- Over-explain things that are obvious from the diff

Output the summary directly so the user can copy-paste it into the PR description.
18 changes: 18 additions & 0 deletions .claude/hooks/spotless-apply.sh
@@ -0,0 +1,18 @@
#!/bin/bash
# Auto-format Scala files after Edit/Write using Spotless
INPUT=$(cat)
FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')

# Only run for Scala source files
if [[ "$FILE_PATH" != *.scala ]]; then
  exit 0
fi

RESULT=$(cd "$CLAUDE_PROJECT_DIR" && mvn spotless:apply -q 2>&1)
EXIT_CODE=$?

if [ $EXIT_CODE -eq 0 ]; then
  echo '{"systemMessage": "Spotless formatting applied successfully"}'
else
  # Build the JSON with jq so quotes and newlines in the Maven output are escaped
  jq -cn --arg msg "Spotless formatting failed: $RESULT" '{systemMessage: $msg}'
fi
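
The filter at the top of this hook can be tried in isolation by piping a sample payload through the same `jq` expression. The sketch below is an illustration, not part of the committed script: `should_format` is a hypothetical helper, and the sample payloads assume only the `.tool_input.file_path` field that the script itself reads.

```bash
#!/bin/bash
# Sketch: the hook's "Scala files only" filter as a standalone function.
# should_format is a hypothetical helper; it reads the hook payload on stdin.
should_format() {
  local file_path
  file_path=$(jq -r '.tool_input.file_path // empty')
  # Succeeds only when the edited path ends in .scala
  [[ "$file_path" == *.scala ]]
}
```

For example, `echo '{"tool_input":{"file_path":"src/Foo.scala"}}' | should_format` succeeds, while a payload naming `README.md`, or one with no `file_path` at all, does not.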
23 changes: 23 additions & 0 deletions .claude/hooks/verify-build.sh
@@ -0,0 +1,23 @@
#!/bin/bash
# Verify the project compiles before Claude stops
INPUT=$(cat)
STOP_HOOK_ACTIVE=$(echo "$INPUT" | jq -r '.stop_hook_active // false')

# Prevent infinite loops - skip if already triggered by a Stop hook
if [ "$STOP_HOOK_ACTIVE" = "true" ]; then
  echo '{"systemMessage": "Skipped build verification (stop hook already active)"}'
  exit 0
fi

cd "$CLAUDE_PROJECT_DIR" || exit 1
RESULT=$(mvn compile -q 2>&1)
EXIT_CODE=$?

if [ $EXIT_CODE -ne 0 ]; then
  # Let jq build the whole object so the Maven output is escaped exactly once
  echo "$RESULT" | head -20 |
    jq -Rsc '{systemMessage: ("Build failed. Fix compilation errors before finishing: " + .)}' >&2
  exit 2
fi

echo '{"systemMessage": "Build verification passed"}'
exit 0
93 changes: 93 additions & 0 deletions .claude/settings.json
@@ -0,0 +1,93 @@
{
  "permissions": {
    "allow": [
      "Read",
      "Edit",
      "Write",
      "Glob",
      "Grep",
      "Bash(mvn *)",
      "Bash(mvn)",
      "Bash(./scripts/*)",
      "Bash(git status)",
      "Bash(git status *)",
      "Bash(git diff)",
      "Bash(git diff *)",
      "Bash(git log *)",
      "Bash(git log)",
      "Bash(git add *)",
      "Bash(git commit *)",
      "Bash(git checkout *)",
      "Bash(git branch *)",
      "Bash(git branch)",
      "Bash(git switch *)",
      "Bash(git merge *)",
      "Bash(git stash *)",
      "Bash(git stash)",
      "Bash(git tag *)",
      "Bash(git push *)",
      "Bash(git push)",
      "Bash(git pull *)",
      "Bash(git pull)",
      "Bash(git fetch *)",
      "Bash(git fetch)",
      "Bash(git remote *)",
      "Bash(git config *)",
      "Bash(gh *)",
      "Bash(ls *)",
      "Bash(ls)",
      "Bash(mkdir *)",
      "Bash(which *)",
      "Bash(java *)",
      "Bash(scala *)",
      "Bash(docker *)"
    ],
    "deny": [
      "Bash(rm -rf *)",
      "Bash(git reset --hard *)",
      "Bash(git push --force *)",
      "Bash(git push -f *)",
      "Bash(git clean -f *)"
    ]
  },
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/spotless-apply.sh",
            "timeout": 120,
            "statusMessage": "Formatting with Spotless...",
            "async": true
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "\"$CLAUDE_PROJECT_DIR\"/.claude/hooks/verify-build.sh",
            "timeout": 300,
            "statusMessage": "Verifying build compiles..."
          }
        ]
      }
    ]
  },
  "enabledPlugins": {
    "backend-development@claude-code-workflows": true,
    "code-review@claude-plugins-official": true,
    "code-simplifier@claude-plugins-official": true,
    "cicd-automation@claude-code-workflows": true,
    "code-documentation@claude-code-workflows": true,
    "documentation-generation@claude-code-workflows": true,
    "unit-testing@claude-code-workflows": true,
    "debugging-toolkit@claude-code-workflows": true,
    "error-debugging@claude-code-workflows": true,
    "dependency-management@claude-code-workflows": true
  }
}
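
To reason about how the `allow` and `deny` lists interact, it can help to model them as plain shell globs with deny checked first. The sketch below is an approximation for readers of this diff, not a description of Claude Code's actual matcher, which may treat patterns differently. `decide` is a hypothetical helper, the rule lists are a small excerpt of the ones above, and the "ask" fallback for unlisted commands is an assumption.

```bash
#!/bin/bash
# Sketch only: models permission rules as shell globs, deny checked before allow.
# The real matcher may differ; decide() is a hypothetical helper.
ALLOW=("git push *" "git push" "git status" "mvn *")
DENY=("git push --force *" "git push -f *" "rm -rf *")

decide() {
  local cmd=$1 pattern
  for pattern in "${DENY[@]}"; do
    # An unquoted right-hand side makes [[ ]] treat the pattern as a glob.
    if [[ $cmd == $pattern ]]; then echo "deny"; return; fi
  done
  for pattern in "${ALLOW[@]}"; do
    if [[ $cmd == $pattern ]]; then echo "allow"; return; fi
  done
  echo "ask"   # assumed fallback: neither list matched
}
```

Under this glob-only model, `git push --force origin main` is denied, but a bare `git push --force` matches no deny pattern (each expects a trailing argument) and does match the allow pattern `git push *`. Whether the real matcher behaves the same way is worth confirming before relying on the deny list.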
3 changes: 3 additions & 0 deletions .gitignore
@@ -83,3 +83,6 @@ scripts/.tmp/

# Jupyter
example/jupyter/workspace/

# Claude Code local settings (personal per-user config)
.claude/settings.local.json
132 changes: 132 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,132 @@
# CLAUDE.md - armada-spark

## Project Overview

Apache Spark plugin that integrates with [Armada](https://armadaproject.io/), a multi-cluster Kubernetes batch scheduler. Implements Spark's `ExternalClusterManager` SPI to submit and manage Spark jobs via Armada's gRPC API.

## Build & Run

```bash
# Build
mvn clean package

# Run tests
mvn test

# Lint check / auto-fix
mvn spotless:check
mvn spotless:apply

# Set Spark/Scala versions (e.g., Spark 3.5.5, Scala 2.13.8)
./scripts/set-version.sh 3 5 5 2 13 8
```

**Stack:** Scala 2.13 | Maven | Spark 3.5 | Java 17 | Fabric8 Kubernetes Client | gRPC/Protobuf (via armada-scala-client)

## Project Structure

```
src/main/scala/org/apache/spark/
├── deploy/armada/                        # Configuration & job submission
│   ├── Config.scala                      # All spark.armada.* config entries (ConfigBuilder API)
│   ├── DeploymentModeHelper.scala
│   ├── submit/                           # Job submission pipeline
│   │   ├── ArmadaClientApplication.scala # Main submission logic
│   │   ├── PodSpecConverter.scala        # Fabric8 <-> Protobuf conversion
│   │   ├── PodMerger.scala               # JSON deep merge for pod specs
│   │   └── ...
│   └── validators/K8sValidator.scala
└── scheduler/cluster/armada/             # Cluster manager & scheduling
    ├── ArmadaClusterManager.scala        # ExternalClusterManager SPI entry point
    ├── ArmadaClusterManagerBackend.scala # Executor lifecycle management
    ├── ArmadaEventWatcher.scala          # gRPC event stream processing
    └── ArmadaExecutorAllocator.scala     # Dynamic allocation
```

Version-specific sources live in `src/main/scala-spark-{version}/`.

## Code Style

- **Formatter:** Scalafmt 3.9.5 (enforced by Spotless Maven plugin)
- **Max line length:** 100 columns
- **Alignment:** `align.preset = more`
- **Dialect:** scala213
- Always run `mvn spotless:apply` before committing

### Naming Conventions

- Classes/traits: `PascalCase` (e.g., `ArmadaClusterManager`)
- Methods/variables: `camelCase`
- Config constants: `UPPER_SNAKE_CASE` (e.g., `ARMADA_JOB_QUEUE`)
- Test files: `{ClassName}Suite.scala`

### Scala Patterns Used

- **Scoped visibility:** `private[spark]` for package-private classes, `private[submit]` / `private[armada]` for internal APIs
- **Case classes** for data types (e.g., `ClientArguments`, `CLIConfig`, `ResourceConfig`)
- **Companion objects** for factory methods and constants
- **Option/Try monads** over null/exceptions; `NonFatal` for catch blocks
- **For-comprehensions** for chained Option/Try operations
- **Call-by-name parameters** (`=> Option[T]`) for lazy evaluation
- **`scala.jdk.CollectionConverters._`** for Java/Scala interop (`.asScala` / `.asJava`)
- **Spark's `Logging` trait** for all logging (`logInfo`, `logWarning`, `logDebug`)
- **Spark's `ConfigBuilder` API** for all configuration entries in `Config.scala`

### Import Order

1. Java/javax imports
2. Scala stdlib imports
3. Third-party imports (io.armadaproject, io.fabric8, com.fasterxml)
4. Spark imports (org.apache.spark)

### License Header

All source files must include the Apache 2.0 license header (see any existing file).

## Testing Standards

- **Framework:** ScalaTest 3.2.16 (`AnyFunSuite` style exclusively)
- **Mocking:** Mockito 5.12 (`mock(classOf[...])`, `when(...).thenReturn(...)`)

Review comment from @GeorgeJahad (Collaborator), Feb 11, 2026:

There are a lot of hard-coded dependency version numbers in this file that have been copied over from the pom file.

Wouldn't it be better to tell Claude to read the pom for the versions of these dependencies?

Reply from the Collaborator Author:

Yeah, referred to pom.xml. If it drifts too much, we can always ask Claude to update the CLAUDE.md.

- **Assertions:** ScalaTest matchers (`shouldBe`, `shouldEqual`, `should contain`)

### Test Patterns

```scala
import org.scalatest.BeforeAndAfter
import org.scalatest.funsuite.AnyFunSuite
import org.scalatest.matchers.should.Matchers
import org.scalatest.prop.TableDrivenPropertyChecks

// Standard test class structure
class FooSuite extends AnyFunSuite with BeforeAndAfter with Matchers {
  before { /* setup */ }
  after { /* cleanup */ }
  test("description of behavior") { /* assertions */ }
}

// Table-driven property tests (preferred for parameterized cases)
// `validate` stands in for the function under test
class BarSuite extends AnyFunSuite with TableDrivenPropertyChecks with Matchers {
  test("validates multiple inputs") {
    val testCases = Table(("input", "expected"), ("a", true), ("", false))
    forAll(testCases) { (input, expected) =>
      validate(input) shouldBe expected
    }
  }
}
```

- Use `BeforeAndAfter` or `BeforeAndAfterEach` for fixtures (temp files, mocks)
- Use `TableDrivenPropertyChecks` for parameterized/data-driven tests
- Mock SparkContext/SparkConf rather than creating real Spark sessions
- Clean up temp files in `after` blocks
- No shared base test class; use trait composition
- E2E tests tagged with custom `E2ETest` ScalaTest tag (excluded from `mvn test`)

## Agent Workflow

**The main agent must act as an orchestrator.** Never do work inline that can be delegated to a subagent.

- **Delegate everything:** Use the Task tool with specialized subagents for all research, code exploration, code writing, testing, and analysis. The main agent should plan, coordinate, and summarize — not do the work itself.
- **Maximize parallelism:** Launch multiple subagents concurrently whenever their tasks are independent. For example, when exploring code patterns AND analyzing tests AND checking dependencies, spawn all three agents in a single message rather than sequentially. Always send independent Task calls in a **single message** with multiple tool-use blocks.
- **Use the right agent type:** Pick `Explore` for codebase search/understanding, `Plan` for architecture decisions, `Bash` for commands, and specialized agents (e.g., `code-reviewer`, `test-automator`, `debugger`) when they match the task.
- **Keep the main context clean:** Offload large file reads, multi-file searches, and deep analysis to subagents so the main conversation stays focused on coordination and user communication.
- **Hooks run automatically — use subagents to respond:** When a hook (Spotless, build verification, code review, or simplification) reports an issue, delegate the fix to a subagent rather than doing it inline. If multiple hooks fail simultaneously, spawn parallel subagents to address each issue concurrently.

## CI/CD

GitHub Actions with matrix builds across Spark 3.3/3.5/4.1 and Scala 2.12/2.13. Pipeline: lint -> build -> e2e tests.