Notes for Artifact Evaluation

This document summarizes the main evaluation claims of the SpecFS artifact and gives reviewers practical guidance for validating the results reported in the paper. Each claim below corresponds directly to an evaluation section of the paper and is intended to be reproducible with the provided artifact.


Main Claims

Accuracy Evaluation (Section 6.1)

Claim.
The SpecFS filesystem generated by our framework accurately implements the given specifications.

Expected Results.
When running the full generation and validation pipeline, the generated filesystem should pass all validation tests. Upon successful completion, the pipeline will print the following message:

✅ All tests completed successfully!

This indicates that the generated implementation conforms to the specification and satisfies all correctness checks.
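
If the pipeline output is captured to a file, the check for the success message can be automated. The following is a minimal sketch, not part of the artifact: the function names and the idea of a saved log file are assumptions made for illustration, and only the message text is taken from these notes.

```python
from pathlib import Path

# Message text quoted in these notes; the leading emoji is left out of the
# match so that terminal encoding differences do not cause a false negative.
SUCCESS_MARKER = "All tests completed successfully!"


def pipeline_succeeded(log_text: str) -> bool:
    """Return True if any line of the captured output contains the marker."""
    return any(SUCCESS_MARKER in line for line in log_text.splitlines())


def check_log_file(path: str) -> bool:
    """Read a saved log (hypothetical path chosen by the reviewer) and check it."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return pipeline_succeeded(text)
```

A reviewer could, for example, redirect the pipeline output to a file of their choosing and pass that path to `check_log_file`.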

Productivity Evaluation (Section 6.4)

Claim.
The specification descriptions consistently require fewer lines than their corresponding generated C source code, demonstrating improved developer productivity.

Expected Results.
The figures generated at:

  • result/eval-loc-atomfs.pdf
  • result/eval-loc-feat.pdf

should show that, for all evaluated modules, the number of lines of code (LOC) in the specifications is lower than that of the generated C source code.
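
The comparison behind these figures can also be spot-checked by hand. The sketch below counts non-blank lines and tests whether each specification is shorter than its generated C counterpart. It is an illustration only: the counting rule, the function names, and the module names in the usage note are assumptions, and the paper's exact LOC metric may differ.

```python
from __future__ import annotations


def count_loc(text: str) -> int:
    """Count non-blank lines (an assumed LOC metric, not necessarily the paper's)."""
    return sum(1 for line in text.splitlines() if line.strip())


def spec_is_shorter(loc_by_module: dict[str, tuple[int, int]]) -> dict[str, bool]:
    """Map each module name to whether its spec LOC is below its generated C LOC.

    Each value in `loc_by_module` is a (spec_loc, generated_c_loc) pair.
    """
    return {name: spec < c for name, (spec, c) in loc_by_module.items()}


def claim_holds(loc_by_module: dict[str, tuple[int, int]]) -> bool:
    """True only if the spec is shorter for every module supplied."""
    return all(spec_is_shorter(loc_by_module).values())
```

For instance, `claim_holds({"module_a": (120, 340), "module_b": (80, 95)})` returns `True`; these module names and counts are made up for illustration and are not results from the artifact.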

Performance Optimizations (Section 6.5)

Claims.

  • Inline Data reduces the number of allocated blocks for the QEMU and Linux source code modules.
  • Preallocation improves contiguous block allocation in read/write microbenchmarks.
  • rbtree-based allocation enhances preallocation pool access efficiency in the write microbenchmark.
  • Extent-based optimization improves read and/or write performance for the filesystem test suite in certain scenarios.
  • Delayed Allocation improves read and/or write performance for the filesystem test suite in certain scenarios.

The filesystem test suite includes both real-world workloads (e.g., xv6 compilation and QEMU copy) and stress tests (e.g., large-file and small-file workloads).

Expected Results.
The performance plots generated under the result/ directory should qualitatively match the trends reported in the paper, showing measurable performance improvements when comparing optimized versions of the filesystem against their unoptimized counterparts.

For reference, example plots corresponding to the paper are provided in the docs/ directory.
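
When comparing an optimized configuration against its unoptimized counterpart, a relative improvement figure can make the trend easier to judge than raw numbers. The following is a minimal sketch under stated assumptions: the function name, the `higher_is_better` parameter, and the sample values in the usage note are hypothetical and do not come from the artifact or the paper.

```python
def relative_improvement(
    baseline: float, optimized: float, higher_is_better: bool = True
) -> float:
    """Return the fractional improvement of `optimized` over `baseline`.

    Use higher_is_better=True for metrics such as throughput, and False for
    metrics such as latency or number of allocated blocks.
    A positive result means the optimized run is better.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if higher_is_better:
        return (optimized - baseline) / baseline
    return (baseline - optimized) / baseline
```

With illustrative values, `relative_improvement(100.0, 125.0)` gives `0.25` for a throughput-style metric, and `relative_improvement(8.0, 6.0, higher_is_better=False)` gives `0.25` for a lower-is-better metric. Since the claims are qualitative, the sign of the result matters more than its exact magnitude.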

Prepared Environment

  • We have prepared a dedicated Huawei Cloud ECS instance with all required dependencies pre-installed.
    Each reviewer will receive access to an individual account and can reproduce the results by following the provided instructions.
    The instance IP address will be shared via HotCRP. Reviewers are kindly asked to provide their SSH public key for access.

  • Each reviewer will also be provided with a Google API key to use Google Gemini as the LLM backend.
    We allocate USD 10 in API credits per reviewer, which is sufficient to complete the entire generation and evaluation pipeline.

Contact

If you encounter any issues or have questions during the evaluation, please contact the authors via HotCRP.