As shown here, the two label-creation methods only mask the starting special tokens up to the beginning of the user content, instead of masking the whole query input up to the start of the assistant content, as regular SFT does.
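For reference, a minimal sketch of the difference (this is not KBLaM's actual code; the function name, argument names, and boundary indices are hypothetical, assuming a single pre-tokenized example with known segment boundaries):

```python
import torch

IGNORE_INDEX = -100  # ignored by torch.nn.CrossEntropyLoss

def make_labels(input_ids: torch.Tensor,
                user_start: int,
                assistant_start: int,
                mask_whole_query: bool) -> torch.Tensor:
    """Build SFT labels for one tokenized example.

    input_ids       : 1-D tensor of token ids
    user_start      : index where the user content begins
    assistant_start : index where the assistant content begins
    """
    labels = input_ids.clone()
    if mask_whole_query:
        # Regular SFT: no loss on anything before the assistant's answer.
        labels[:assistant_start] = IGNORE_INDEX
    else:
        # What this issue describes: only the leading special tokens are
        # masked, so the user query itself still contributes to the loss.
        labels[:user_start] = IGNORE_INDEX
    return labels
```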
Does this make sense — training on not just the answer but the query content as well? Or is the intent not merely to train instruction-following and answer-generation ability, but also something like pretraining, to align the latent semantic space for the adapters?
And what do these code comments mean: "Not sure... Possibly what they want is..." and "...Not 100% this is correct"?
Didn't the authors of KBLaM provide this code for reproducing the paper? What labels did they use for training, exactly? Does anyone have any ideas?