Restructured Evaluation in Readme

AryazE · AryazE · commit 820506e53200 · 2025-04-08T19:47:32.000+02:00
diff --git a/README.md b/README.md
@@ -22,23 +22,26 @@ pip install -r requirements.txt
 The checkers are implemented in `src/analyses`.
 
 ## Evaluation
-All experiments are self-contained, i.e. they download the required repositories, source code, and data.
-Run the micro-benchmark (RQ1):
+All experiments, except for the micro-benchmark, run inside Docker containers.
+The provided scripts are self-contained, i.e., they build the required Docker container and download the required repositories, source code, and data.
+
+### RQ1: Effectiveness
+Run DyLin on the micro-benchmark:
 ```bash
+python -m venv venv
+source venv/bin/activate
 pip install -r requirements-tests.txt
 pytest tests
 ```
 
-Run DyLin on GitHub repositories (RQ1, RQ4, and RQ5):
+Run DyLin on the GitHub repositories:
 ```bash
 bash build_projects.sh
-bash run_all_with_cov.sh # to collect analysis coverage
-bash run_all_no_cov.sh # no analysis coverage
+bash run_all_no_cov.sh
 ```
 Results will be in `project_results`.
-For each repository a directory is created with two sub-directories and a text file.
-The first sub-directory is named `dynapyt_coverage-<unique id of the run>`, which contains a json file with detailed analysis coverage data.
-The second sub-directory is named `dynapyt_output-<unique id of the run>`, with the following contents:
+For each repository a directory is created with one sub-directory and a text file.
+The sub-directory is named `dynapyt_output-<unique id of the run>`, with the following contents:
 - A `findings.csv` file, which summarizes the findings.
 - An `output.json` file, which contains the details of the findings.
 The text file contains the name of the project, the instrumentation duration in seconds, and the analysis time in seconds.
@@ -52,21 +55,22 @@ This will generate a text file with the format
 <file>:<line>:<column>: <issue code> <issue message>
 ```
 
-Run the GitHub project's test suites without DyLin to get test suite coverage (used in RQ4):
+Run DyLin on a Kaggle competition:
 ```bash
-bash build_testcov.sh
-bash run_all_testcov.sh
+bash build_kaggle.sh <kaggle competition id: e.g. titanic>
+bash run_kaggle.sh
 ```
-Results will be in `project_testcovs`.
-For each repository a directory is created with a json file containing the detailed test coverage data.
+Results will be in `kaggle_results`.
+For each competition a directory is created with 3 subdirectories:
+- `coverage`, which contains analysis coverage information in a json file.
+- `submissions`, which contains the submissions analyzed by DyLin.
+- `table`, which contains the findings in a json file.
 
-To calculate the ratio of analysis coverage to test coverage you can run
-```bash
-python scripts/coverage_report.py coverage_comparison --analysis_dir <path to the subdirectory in project_results> --test_dir <path to the subdirectory in project_testcovs>
-```
-This generates a csv file with a summary of analysis and test coverage similar to `Supplementary_Material_FSE2025/DyLin - FSE 2025 Artifact.pdf` page 1.
+### RQ2: Severity of Detected Issues
+The submitted pull requests and issues are available in `Supplementary_Material_FSE2025/DyLin Issues - *.pdf`.
 
-Run static linters on GitHub projects (RQ3):
+### RQ3: Comparison with Existing Tools
+Run static linters on GitHub projects:
 ```bash
 bash build_lint.sh
 bash run_all_linters.sh
@@ -79,13 +83,22 @@ python scripts/compare_static_dynamic_linters.py --static_dir <path to the direc
 ```
 This will output all lines that both approaches have warned about.
 
-Run DyLin on a Kaggle competition (RQ1):
+### RQ4: Analysis Coverage
+Run DyLin with analysis coverage on the GitHub repositories:
 ```bash
-bash build_kaggle.sh <kaggle competition id: e.g. titanic>
-bash run_kaggle.sh
+bash build_projects.sh
+bash run_all_with_cov.sh
 ```
-Results will be in `kaggle_results`.
-For each competition a directory is created with 3 subdirectories:
-- `coverage`, which contains analysis coverage information in a json file.
-- `submissions`, which contains the submissions analyzed by DyLin.
-- `table`, which contains the findings in a json file.
+Run the GitHub projects' test suites without DyLin to get test suite coverage:
+```bash
+bash build_testcov.sh
+bash run_all_testcov.sh
+```
+Results will be in `project_testcovs`.
+For each repository a directory is created with a json file containing the detailed test coverage data.
+
+To calculate the ratio of analysis coverage to test coverage you can run
+```bash
+python scripts/coverage_report.py coverage_comparison --analysis_dir <path to the subdirectory in project_results> --test_dir <path to the subdirectory in project_testcovs>
+```
+This generates a csv file with a summary of analysis and test coverage similar to `Supplementary_Material_FSE2025/DyLin - FSE 2025 Artifact.pdf` page 1.
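
Note on the lint-style text output: the README gives the format `<file>:<line>:<column>: <issue code> <issue message>` for the generated text file. A minimal sketch of how such lines could be parsed for post-processing follows. The `Finding` type, the `parse_finding` helper, and the regular expression are hypothetical and not part of DyLin. The sketch assumes the issue code is a single token without whitespace.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern for lines of the form
#   <file>:<line>:<column>: <issue code> <issue message>
# The issue code is assumed to be one whitespace-free token.
LINE_RE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<code>\S+) (?P<message>.*)$"
)


class Finding(NamedTuple):
    """Hypothetical container for one parsed warning line."""
    file: str
    line: int
    column: int
    code: str
    message: str


def parse_finding(text: str) -> Optional[Finding]:
    """Parse one line; return None if it does not match the assumed format."""
    match = LINE_RE.match(text.strip())
    if match is None:
        return None
    return Finding(
        file=match["file"],
        line=int(match["line"]),
        column=int(match["column"]),
        code=match["code"],
        message=match["message"],
    )
```

Lines that do not fit the pattern yield `None`, so blank lines or unrelated log output can be skipped rather than raising an error.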