
Profiling #260


Open
wants to merge 6 commits into
base: main

Conversation

jhiemstrawisc
Collaborator

No description provided.

This commit adds what the main Python process needs to create a peer
cgroup (Linux only), so that when profiling is enabled the PRM containers
run under this cgroup with the controllers behind `memory.peak` and
`cpu.stat` enabled.

Unfortunately we can't just point Python at a PID, because the PRM
containers launch various processes without reporting their PIDs back to
the originating process. This rules out regular inline monitoring.
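
For readers unfamiliar with the cgroup approach, here is a minimal sketch of how usage could be read back once the containers have finished. It assumes a cgroup v2 hierarchy where `cpu.stat` holds flat `key value` lines and `memory.peak` holds a single byte count. The function names and the `peak_memory_bytes` key are hypothetical and are not taken from this PR.

```python
from pathlib import Path
from typing import Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse the flat 'key value' lines of a cgroup v2 cpu.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines rather than failing on them.
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def read_cgroup_usage(cgroup_dir: Path) -> Dict[str, int]:
    """Collect CPU time (microseconds) and peak memory (bytes) for one cgroup.

    `cgroup_dir` is assumed to be the directory of the peer cgroup,
    e.g. somewhere under /sys/fs/cgroup (hypothetical location).
    """
    usage = parse_cpu_stat((cgroup_dir / "cpu.stat").read_text())
    # memory.peak is a single integer: the high-water mark in bytes.
    usage["peak_memory_bytes"] = int((cgroup_dir / "memory.peak").read_text().strip())
    return usage
```

Because every process the container spawns stays inside the cgroup, these totals cover child processes whose PIDs the parent never sees.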
@jhiemstrawisc requested a review from agitter June 10, 2025 17:37
Collaborator

@tristan-f-r left a comment


Looks good! Almost every comment is a nitpick, except for the one about testing: a test that verifies profiling doesn't suddenly error (not that its output is correct) would be reassuring.


# Write the contents of the file
write_header = not os.path.exists(profile_path) or os.path.getsize(profile_path) == 0
with open(profile_path, "a", newline="") as out_f:
Collaborator


A test under Apptainer that checks that profile_path exists would be good.
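
A rough sketch of what such a smoke test could look like. It does not launch a container: `append_profile_row` is a hypothetical stand-in that reuses the header logic from the lines above, and the file name and column names are made up for illustration. A real test would presumably call the Apptainer runner with profiling enabled and then make the same existence check.

```python
import csv
import os
import tempfile


def append_profile_row(profile_path: str, row: dict) -> None:
    """Append one row, writing the header only if the file is new or empty."""
    write_header = not os.path.exists(profile_path) or os.path.getsize(profile_path) == 0
    with open(profile_path, "a", newline="") as out_f:
        writer = csv.DictWriter(out_f, fieldnames=list(row.keys()))
        if write_header:
            writer.writeheader()
        writer.writerow(row)


def test_profile_file_is_created():
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name and columns, for illustration only.
        profile_path = os.path.join(tmp, "usage-profile.csv")
        append_profile_row(profile_path, {"peak_memory": 4096, "cpu_usec": 1500})
        append_profile_row(profile_path, {"peak_memory": 8192, "cpu_usec": 2500})

        # The check asked for: profiling produced a file at all.
        assert os.path.exists(profile_path)

        with open(profile_path, newline="") as in_f:
            rows = list(csv.reader(in_f))
        # One header plus two data rows: the header is not repeated on append.
        assert rows[0] == ["peak_memory", "cpu_usec"]
        assert len(rows) == 3
```

This only guards against profiling erroring or silently writing nothing. It does not verify the measured values.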

@@ -240,14 +242,98 @@ def run_container_docker(container: str, command: List[str], volumes: List[Tuple
return out


def run_container_singularity(container: str, command: List[str], volumes: List[Tuple[PurePath, PurePath]], working_dir: str, environment: str = 'SPRAS=True'):
def create_cgroup() -> str:
Collaborator

@tristan-f-r Jun 10, 2025


[I'm generally uncomfortable with including such a specific function in this general containers file, and I would like to see profiling moved to a separate file (especially if we decide to add profiling for other platforms or container frameworks). Understandably, though, that is a little much.]

path_elements = container.split("/")

# Get the last element, which will indicate the base container name
Collaborator


I do quite miss this comment, as base_cont isn't a descriptive name on its own. [It could be worthwhile to specify why the last element is the base container name.]
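
To illustrate the "why": an image reference generally has the shape `[registry/][namespace/]name[:tag]`, so splitting on "/" leaves the image name (plus any tag) in the last element. A small sketch follows. The helper name `base_container_name` and the example reference are hypothetical, and the PR may process the result further.

```python
def base_container_name(container: str) -> str:
    """Return the last '/'-separated element of an image reference."""
    path_elements = container.split("/")

    # Get the last element, which indicates the base container name.
    # Registry and namespace, if present, come before it and are dropped here.
    return path_elements[-1]


# Hypothetical image reference used purely as an example.
print(base_container_name("docker.io/example-org/example-image:v1"))
```

A reference with no "/" at all is returned unchanged, since the split yields a single element.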

Comment on lines -283 to -289

# Adding 'docker://' to the container indicates this is a Docker image Singularity must convert
image_path = Client.pull('docker://' + container, name=sif_file)

# Check if the directory for base_cont already exists. When running concurrent jobs, it's possible
# something else has already pulled/unpacked the container.
# Here, we expand the sif image from `image_path` to a directory indicated by `base_cont`
Collaborator

@tristan-f-r Jun 10, 2025


(And these comments too: the former clarifies that Apptainer needs the hint that this is a Docker image it should convert.)


2 participants