
Output visualizer #296


Open · wants to merge 16 commits into base: develop
Conversation

elzorroartico

No description provided.

@Matistjati (Contributor) commented Apr 11, 2025

This is not ready to be merged. Known issues:

  • Incorrect arguments when calling the visualizer
  • The output visualizer uses Pillow, which is not available to problemtools (not sure; too tired to check right now). I don't think we want it as a dependency. The alternative is to include the library manually, but that is on the order of thousands of lines of source code, which is not nice either. Perhaps we could generate raw RGB images and use ImageMagick to convert them to PNG (see the sketch after this list)? I haven't yet investigated whether ImageMagick is sufficient for our SVG sanitization demands; I will take a look tomorrow.
  • SVGs are not sanitized. We may want to disable them until we have SVG sanitization up and running.
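
For the raw-image idea in the last bullet, something along these lines could work without Pillow: write a binary PPM by hand and let ImageMagick do the conversion. A rough sketch only, assuming ImageMagick's convert is on PATH; the file names and the gradient test pattern are made up for illustration and are not what the PR currently does.

#!/usr/bin/env python3
# Rough sketch: write a binary PPM (P6) with no imaging library, then shell
# out to ImageMagick to convert it to PNG. File names are illustrative.
import subprocess
import sys

def write_ppm(path, width, height, pixels):
    # pixels: bytes of length width*height*3, RGB, row-major
    with open(path, 'wb') as f:
        f.write(f'P6\n{width} {height}\n255\n'.encode())
        f.write(pixels)

def main():
    feedback_dir = sys.argv[1] if len(sys.argv) > 1 else '.'
    w, h = 64, 64
    # Simple gradient standing in for real visualization output
    pixels = bytes(v for y in range(h) for x in range(w)
                   for v in (x * 255 // (w - 1), y * 255 // (h - 1), 0))
    write_ppm(f'{feedback_dir}/vis.ppm', w, h, pixels)
    subprocess.run(['convert', f'{feedback_dir}/vis.ppm',
                    f'{feedback_dir}/vis.png'], check=True)

if __name__ == '__main__':
    main()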

Some added thoughts:

  • Before this PR, the feedback dir did not persist outside of the function outputvalidator.validate. The current system is that you can optionally pass a feedback directory to it, which it won't delete. Each submission on each testcase gets its own feedback dir in the root of the "problem verification directory". On new submissions, we delete old ones if they exist. The reason for this architecture is that it's useful for both multipass and output visualizers.
  • What mem/time limits should we place on it? Currently, the spec doesn't comment on this. IMO, reasonable defaults are enough. Given that we produce PNGs, maybe a 5-10s time limit and 1 GB of RAM? IMO, 1 GB of RAM suffices; a raw 1920x1080 image is ~6 MB.
  • Currently, we don't run the output visualizer on the secret test data. It feels unnecessary, but may be nice for debugging?
  • We always generate the images, even if the flag to save them to disk is not passed, in order to sanity check them (check for a proper header and ensure the output visualizer returns exit code 0; a rough sketch follows this list). I think this is good.
  • The folder structure when it outputs the images is arbitrary and there may be a better one
  • We don't delete old folder structures of outputted images, because that feels scary and may not be what the user wants
  • There's also some discussion over at Output visualizer on interactive/multipass problem problem-package-format#437
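
On the memory figure above: 1920 x 1080 pixels at 3 bytes per pixel is roughly 6.2 MB, hence the ~6 MB estimate. And to make the sanity check concrete, the idea is roughly the following (a sketch, not the code in this PR; names are made up):

# Sketch of the sanity check: the visualizer must exit with code 0 and the
# produced file must start with the PNG signature. Names are illustrative.
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def looks_like_png(path):
    with open(path, 'rb') as f:
        return f.read(8) == PNG_SIGNATURE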

@gkreitz (Contributor) left a comment


I took a quick look before I noticed there was already feedback from @Matistjati that needs to be addressed, so I'm holding off on looking in more detail. Luckily, I think the comments I added do not overlap with his.


def main():
    if len(sys.argv) != 3:
        print("Usage: output_visualizer.py <submission_output> <feedback_dir>")

Example of created file structure when run on different:
different_images
├── different.c

What happens if I have different.c both as an accepted and as a wrong_answer submission?

self.warning(f'Wrong number of visualizers.\nExpected: 1\nActual: {len(self._visualizer)}')

# Checks if a file's extension is allowed, and if so validates its header
def check_is_valid_image(self, file) -> bool:

There are libraries and standard tools to do this type of file type detection. I'd prefer we use them, if possible, instead of re-implementing that logic ourselves. There's a Python package called python-magic. (An uglier option would be to just use file --mime-type from the shell, but that feels hacky).
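
For reference, a minimal sketch of the python-magic route (the allow-list of MIME types here is an assumption for illustration, not something the format prescribes):

import magic  # python-magic, a wrapper around libmagic

# Hypothetical allow-list; which image types to permit is a policy decision,
# not something libmagic decides.
ALLOWED_MIME_TYPES = {'image/png', 'image/jpeg', 'image/svg+xml'}

def check_is_valid_image(path) -> bool:
    # Detect the type from the file contents rather than from its extension
    return magic.from_file(path, mime=True) in ALLOWED_MIME_TYPES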

@gkreitz (Contributor) commented Apr 12, 2025

The output visualizer uses Pillow, which is not available to problemtools (not sure; too tired to check right now). I don't think we want it as a dependency. The alternative is to include the library manually, but that is on the order of thousands of lines of source code, which is not nice either.

This feels like a problem in the standard to me. How is one even intended to sanely write an output visualizer? The only option I can see is to dump a huge dependency in the validator directory. And, as you point out, we definitely do not want to do that in our examples folder.

Perhaps we could generate raw RGB images and use ImageMagick to convert them to PNG?

To do this safely, you'd need to include the ImageMagick source (i.e., the same issue as above with using PIL). Sure, problemtools may ensure it's there as a dependency, but there's no such guarantee for sandboxes on judge systems (and at some point, problemtools should also sandbox its runs).

Examples included in problemtools should be good examples for users to base stuff off of, so it's not a good sign for the standard if we can't come up with a good example for an output visualizer.

@Matistjati (Contributor)

To do this safely, you'd need to include the ImageMagick source (i.e., the same issue as above with using PIL). Sure, problemtools may ensure it's there as a dependency, but there's no such guarantee for sandboxes on judge systems (and at some point, problemtools should also sandbox its runs).

Good point, the examples should actually work outside of problemtools.

@elzorroartico I've thought more about it, and I think we do want to run the output visualizers on the secret data. Currently, the structure is basically {shortname}_images/{submission_name}/{testcase}/res.png. It's pretty natural then to run the visualizer on the data in secret and sample, and replace {submission_name} with either secret or sample. So for example, we could have hello_images/secret/01/vis.png.

I propose a change to the {submission_name} part. As Gunnar pointed out, we get problems if multiple submissions are named the same thing. I think the most reasonable solution is to include the submission folder it's in, so that, for example, accepted.joshua.cpp and wrong_answer.joshua.cpp would create unique folders.

In general, I also think we need to be more careful with filenames. Some annoying examples:

  • Test data with deep nesting: test cases can be nested arbitrarily deeply in folders
  • Test cases with the same name across different subgroups
  • Submissions consisting of a folder, especially one named secret or sample...

Perhaps the solution is to prepend a unique counter to all filenames? Although at that point, the order is going to be nondeterministic due to multithreading. I can't think of a better solution than to keep {submission_name} as I proposed above, and change {testcase} to the full path relative to either sample or secret. I think Fredrik considered requiring all testcase names to be unique; I'll make a ticket later today. But I don't think we can count on that.
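
To illustrate the naming scheme I have in mind (the function and folder names below are hypothetical, not final; for the secret/sample runs the middle component would simply be secret or sample):

from pathlib import Path

def image_output_dir(shortname, submission_dir, submission_name, testcase_rel):
    # submission_dir: e.g. 'accepted' or 'wrong_answer'; testcase_rel: the test
    # case path relative to sample/ or secret/, so nesting and duplicate names
    # across groups stay distinguishable.
    return Path(f'{shortname}_images') / f'{submission_dir}.{submission_name}' / testcase_rel

# e.g. hello_images/accepted.joshua.cpp/group1/01
print(image_output_dir('hello', 'accepted', 'joshua.cpp', Path('group1/01')))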
