Log regression task metrics in multitask model#3648
ntravis22 wants to merge 7 commits into flairNLP:master from
Conversation
MattGPT-ai
left a comment
Are there metrics for regression-type models that we could put into scores, or do those perhaps go into the "classification report" — e.g. Pearson, Spearman? In the regression models, the result is written as:
eval_metrics = {
"loss": eval_loss.item(),
"mse": metric.mean_squared_error(),
"mae": metric.mean_absolute_error(),
"pearson": metric.pearsonr(),
"spearman": metric.spearmanr(),
}
So maybe we could either check for the base model class that defines evaluate, or just check for the keys. Then maybe we could write e.g. scores[(task_id, 'mse')]. What do you think?
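The key-check approach suggested here can be sketched as follows. This is a minimal illustration, not the actual flair implementation: `add_regression_scores`, `scores`, and `task_id` are hypothetical names chosen to mirror the snippets in this thread.

```python
# Regression metric keys mentioned in this thread.
REGRESSION_METRICS = ("mse", "mae", "pearson", "spearman")


def add_regression_scores(scores: dict, task_id: str, eval_metrics: dict) -> dict:
    """Copy any regression metrics present in a task's eval_metrics dict
    into the shared scores dict, keyed by (task_id, metric_name)."""
    for metric in REGRESSION_METRICS:
        # Check for the key rather than assuming every task is a
        # regression task; missing keys are simply skipped, so no
        # KeyError is raised for classification tasks.
        if metric in eval_metrics:
            scores[(task_id, metric)] = eval_metrics[metric]
    return scores
```

Checking for the keys (rather than for the base model class) keeps the logic agnostic to the task type: classification tasks simply contribute no regression entries.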
Ok, I added mse.
Oh actually, can we just add all four of the metrics?
Done
@ntravis22 @MattGPT-ai Could you paste a script to test this PR?
@alanakbik Here is a script to test. Running it with the changes in this PR, you can see lines printed like:
Per-task metrics were recently added to multitask_model; however, none were included for regression tasks, and the metric keys were not checked for presence, which could raise an error. This PR addresses both concerns.