With mixed precision, the model outputs NaN embeddings during the evaluation step. #83

@DevKiHyun

Hi,

Thanks for sharing your repository.

I ran into a strange issue when using mixed precision (the autocast() function) with your code.

I added a simple mixed-precision branch to your training loop, as shown below:

for num, (data, labels) in enumerate(loader, start = 1):
	self.zero_grad()
	labels            = torch.LongTensor(labels).cuda()
	# speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
	# nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)			
	# nloss.backward()
	# self.optim.step()

	if self.mixedprec:
		with autocast():
			speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
			nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)			
		self.scaler.scale(nloss).backward()
		self.scaler.step(self.optim)
		self.scaler.update()
	else:
		speaker_embedding = self.speaker_encoder.forward(data.cuda(), aug = True)
		nloss, prec       = self.speaker_loss.forward(speaker_embedding, labels)
		nloss.backward()
		self.optim.step()

When I train ECAPA-TDNN with mixed precision, your ecapa_tdnn model produces NaN embeddings, which in turn puts NaN values into the score variable.
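One way to narrow down where the non-finite values first appear is to attach forward hooks to every submodule and record the ones whose output contains NaN or Inf. The following is only a minimal sketch of that idea, not something taken from the repository: `attach_nan_hooks`, `SqrtLayer`, and the toy `nn.Sequential` model are hypothetical names used for illustration, and the toy model merely stands in for the real speaker encoder.

```python
import torch
import torch.nn as nn


def attach_nan_hooks(model: nn.Module, report: list) -> list:
    """Record the name of every submodule whose output contains NaN or Inf.

    Hypothetical helper for debugging; returns the hook handles so they
    can be removed afterwards.
    """
    handles = []
    for name, module in model.named_modules():
        if name == "":
            continue  # skip the root module itself

        def hook(mod, inputs, output, name=name):
            # Only plain tensor outputs are checked in this sketch.
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                report.append(name)

        handles.append(module.register_forward_hook(hook))
    return handles


class SqrtLayer(nn.Module):
    """Toy stand-in for a layer that can emit NaN (sqrt of a negative value)."""

    def forward(self, x):
        return torch.sqrt(x)


# Toy model standing in for the real encoder (assumption, for illustration only).
toy = nn.Sequential(nn.Identity(), SqrtLayer(), nn.Identity())
bad_layers = []
handles = attach_nan_hooks(toy, bad_layers)
with torch.no_grad():
    toy(torch.tensor([[-1.0, 4.0]]))
for h in handles:
    h.remove()
print(bad_layers)  # the first entry is the layer where NaN first appeared
```

Because hooks fire in forward order, the first recorded name points at the earliest layer with a non-finite output; later entries only show the NaN propagating downstream.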

As a result, the evaluation code cannot compute the EER and minDCF scores.
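To illustrate why a single NaN embedding is enough to break scoring, here is a small self-contained sketch. It assumes cosine-style scoring between two embeddings, which may differ from what the evaluation code actually does; `cosine_score`, `non_finite_keys`, and the utterance names are hypothetical and exist only for this example.

```python
import math


def cosine_score(a, b):
    """Cosine similarity between two embeddings given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def non_finite_keys(embeddings):
    """Return the keys whose embedding contains NaN or Inf."""
    return [
        key
        for key, vec in embeddings.items()
        if not all(math.isfinite(x) for x in vec)
    ]


# Made-up embeddings; "utt_b" mimics the NaN output described above.
embeddings = {
    "utt_a": [0.6, 0.8],
    "utt_b": [float("nan"), 1.0],
    "utt_c": [1.0, 0.0],
}

score_ok = cosine_score(embeddings["utt_a"], embeddings["utt_c"])
score_bad = cosine_score(embeddings["utt_a"], embeddings["utt_b"])
bad = non_finite_keys(embeddings)

print(score_ok, score_bad, bad)
```

Any trial that involves the NaN embedding gets a NaN score, and a list of scores containing NaN cannot be meaningfully thresholded or sorted, so metrics such as EER and minDCF cannot be computed from it.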

Can I discuss this issue with you?

I would appreciate any hints from you, as the author of this code.

Thanks
Best regards
