Optimize eval runtime for precompiled tuples and structured input#399
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 53ec0cb301
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Pull request overview
This PR refactors Jawk’s eval/interpret runtime to reduce per-invocation overhead for precompiled AwkTuples, adds first-class structured input support (record text and/or pre-split fields), and serializes tuple execution metadata so deserialized tuples are execution-ready.
Changes:
- Introduces
AwkTuples.ExecutionProfile(serialized) to drive optimized eval preparation and to omit redundantSET_NUM_GLOBALSfor simple eval expressions. - Updates
InputSourceto supportgetRecordText()(nullable) alongsidegetFields()for lazy$0/field synchronization and fields-only sources. - Splits/cleans up AVM/JRT responsibilities (runtime preparation, IO/resource lifecycle, output sinks) and expands tests + docs for the new eval and structured-input flows.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/main/java/org/metricshub/jawk/Awk.java | Reworks eval(...) APIs to use cached settings snapshots + fresh AVM per call and supports InputSource directly. |
| src/main/java/org/metricshub/jawk/util/AwkSettings.java | Adds modificationCount to support cache invalidation for derived runtime snapshots. |
| src/main/java/org/metricshub/jawk/intermediate/AwkTuples.java | Adds serialized ExecutionProfile and removes redundant eval SET_NUM_GLOBALS when safe. |
| src/main/java/org/metricshub/jawk/jrt/InputSource.java | Renames getRecord() → getRecordText() and clarifies nullable record-text contract. |
| src/main/java/org/metricshub/jawk/jrt/JRT.java | Introduces lazy record-state model + output sink abstraction and refactors IO state handling. |
| src/main/java/org/metricshub/jawk/backend/AVM.java | Aligns eval/interpret preparation around tuple metadata, resets execution state per run, and routes print/printf through JRT sinks. |
| src/main/java/org/metricshub/jawk/backend/RuntimeStack.java | Switches to ArrayDeque, adds reset support, and avoids null elements via sentinel. |
| src/main/java/org/metricshub/jawk/backend/RegexRuntimeSupport.java | Extracts regex replacement helpers from JRT for AVM usage. |
| src/main/java/org/metricshub/jawk/jrt/StreamInputSource.java | Updates implementation to getRecordText(). |
| src/main/java/org/metricshub/jawk/SandboxedAwk.java | Updates AVM creation hook to match new Awk#createAvm(AwkSettings) shape. |
| src/test/java/org/metricshub/jawk/AwkEvalTest.java | Adds coverage for eval execution paths, tuple metadata behavior, and reuse/reset semantics. |
| src/test/java/org/metricshub/jawk/InputSourceTest.java | Adds coverage for fields-only sources and eval with InputSource (text-only and fields-only). |
| src/test/java/org/metricshub/jawk/ListInputSourceTest.java | Updates tests to new getRecordText() API. |
| src/test/java/org/metricshub/jawk/RegexpTupleAndCachingTest.java | Adds serialization test for execution profile metadata. |
| src/site/markdown/java.md | Updates docs for eval(...) / InputSource contract and adds AVM advanced reuse section. |
| src/site/markdown/index.md | Adds quick-start docs for precompiled eval and structured sources. |
| README.md | Updates README examples for eval(...), tuple reuse, and structured input. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/main/java/org/metricshub/jawk/backend/RegexRuntimeSupport.java
Outdated
Show resolved
Hide resolved
src/test/java/org/metricshub/jawk/RegexpTupleAndCachingTest.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
src/main/java/org/metricshub/jawk/backend/AVM.java:2402
applySpecialVariablestreatsARGCas a JRT-managed special variable and callsvm.assignVariable("ARGC", ...)duringprepareExecution(), beforeSET_NUM_GLOBALShas allocated the runtime stack globals. WhenglobalVariableOffsetsis already installed,assignVariablewill try to write intoruntimeStackwhileglobalsis still null, which can trigger an assertion/NPE. Consider deferring assignments until globals are allocated (e.g., treatruntimeStack.getNumGlobals()==nullthe same as missing offsets and store into the baseline initial-variable snapshot), or exclude ARGC from the early special-variable application path.
public final void assignVariable(String name, Object obj) {
// When offsets are not available yet, treat the assignment as part of this
// AVM's baseline initial-variable snapshot.
if (globalVariableOffsets == null || globalVariableArrays == null) {
baseInitialVariables.put(name, obj);
if (JRT.isJrtManagedSpecialVariable(name)) {
baseSpecialVariables.put(name, obj);
}
return;
}
// make sure we're not receiving funcname=value assignments
if (functionNames != null && functionNames.contains(name)) {
throw new IllegalArgumentException("Cannot assign a scalar to a function name (" + name + ").");
}
Integer offsetObj = globalVariableOffsets.get(name);
Boolean arrayObj = globalVariableArrays.get(name);
if (offsetObj != null) {
assert arrayObj != null;
if (arrayObj.booleanValue()) {
throw new IllegalArgumentException("Cannot assign a scalar to a non-scalar variable (" + name + ").");
} else {
runtimeStack.setFilelistVariable(offsetObj.intValue(), obj);
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
SET_NUM_GLOBALSfor simple eval expressionsTesting
mvn formatter:formatmvn testmvn verify(still hits the existing Windows compatibility-suite failures inBwkMiscTest,BwkTTest, andGawkTest)Closes #398