I am using OLMoTrace's implementation of the API for a study, and I realized that since the attributions get filtered, it is very hard to get stage specific traces. I think this would be extremely useful for data-centeric interpretability studies, investigating instruction-training, post-training etc.
I can look to implementing it in this code with the maintainers.
I am using OLMoTrace's implementation of the API for a study, and I realized that since the attributions get filtered, it is very hard to get stage specific traces. I think this would be extremely useful for data-centeric interpretability studies, investigating instruction-training, post-training etc.
I can look to implementing it in this code with the maintainers.