peak memory calculation

I saw these lines in the source code:
https://github.com/interestingLSY/swiftLLM/blob/682cf9a28f97f7490409981a2f181528f377eb5d/swiftllm/worker/model.py#L116-L122

By the time `forward` returns, some memory has already been released, for example the memory for intermediate activations and for the input ids.
So could this calculation produce a higher block count than can actually fit in memory?
Thanks.

	_ = self.forward(input_ids, seq_ids, [], ignore_kvcache=True)
	torch.cuda.synchronize()

	# peak_memory = torch.cuda.max_memory_allocated()
	# total_memory = torch.cuda.get_device_properties(0).total_memory
	free_memory, total_memory = torch.cuda.mem_get_info()
	peak_memory = total_memory - free_memory
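
To make the concern concrete, here is a minimal arithmetic sketch. It is not swiftLLM's actual formula: the helper `num_kv_blocks`, the 24 GiB device, the 2 MiB block size, and both usage figures are hypothetical numbers chosen only to show how an underestimated peak turns into extra blocks.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2

def num_kv_blocks(total_bytes: int, reserved_bytes: int, block_bytes: int) -> int:
    """Number of KV-cache blocks that fit in the memory left after the reservation."""
    return max((total_bytes - reserved_bytes) // block_bytes, 0)

# Hypothetical figures, for illustration only.
total_memory = 24 * GiB         # device capacity
block_size = 2 * MiB            # bytes per KV-cache block
peak_during_forward = 16 * GiB  # weights + activations at the high-water mark
used_after_forward = 14 * GiB   # what remains in use once activations are freed

blocks_from_peak = num_kv_blocks(total_memory, peak_during_forward, block_size)
blocks_from_after = num_kv_blocks(total_memory, used_after_forward, block_size)

# Blocks that would be allocated on top of what the true peak allows.
overcommitted_blocks = blocks_from_after - blocks_from_peak
print(blocks_from_peak, blocks_from_after, overcommitted_blocks)
```

In this sketch, measuring after the activations are gone would hand out 1024 blocks (2 GiB) that the next real forward pass still needs. Whether that gap shows up in practice probably depends on the allocator: `torch.cuda.mem_get_info()` reports free memory at the device level, so blocks that PyTorch's caching allocator has freed but still holds are counted as used. The commented-out `torch.cuda.max_memory_allocated()` would instead track the high-water mark of PyTorch's own allocations.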
