
Defining new/custom labels at inference time (zero-shot?) #3

@Taytay

Description

First, this is great! Thank you for publishing the results and code!

This is my favorite part of the paper:

During inference, when provided with a new text,
we classify it to the most similar class
with respect to a similarity metric S. This method
draws inspiration from the way inference is conducted in retrieval systems, eliminating the need
for a classification head and aligning the training and inference objectives.

I love that this approach doesn't require a predetermined classification head!

As a result, would I be right to presume that I could provide new labels at inference time? If those labels resemble my training set, I'd expect it to do quite well. If they don't, it would "revert" to picking the most similar label, which should still work, right? Doesn't that make this a capable zero-shot classifier as well?

Here is my initial experiment with it. It appears to "work", although of course its confidence isn't nearly as high when the new labels don't overlap semantically with the original banking labels. I presume you could improve this by training a more generalized FastFit model?

Am I understanding this correctly?

from typing import List

from fastfit import FastFit
from transformers import AutoTokenizer, pipeline
from transformers.pipelines.base import Pipeline

# Assuming we did the example where we pretrained a model on banking-77 and saved it:

model = FastFit.from_pretrained("fast-fit")
tokenizer = AutoTokenizer.from_pretrained("roberta-large")

classifier: Pipeline = pipeline("text-classification", model=model, tokenizer=tokenizer, device="cuda")

print("\n\nOriginal classifier:")

# A few queries from the banking-77 domain
inputs = ["I need to pay off my card", "What is my PIN?", "I have a pending top up"]
outputs = classifier(inputs)
# print the inputs and outputs formatted together:
for inp, out in zip(inputs, outputs):
    print(f"Input: {inp}\nOutput: {out}")


def configure_model_with_new_labels(model, new_labels: List[str]):
    # Tokenize the labels ("documents" is FastFit's term for them)
    tokenized_labels = tokenizer(new_labels, padding=True, truncation=True, return_tensors="pt")
    input_ids = tokenized_labels["input_ids"]
    attention_mask = tokenized_labels["attention_mask"]

    # Set the tokenized documents on the model
    # (sic: the method name is spelled this way in the FastFit source)
    model.set_documetns((input_ids, attention_mask))

    # Create and update label mappings
    label_to_id = {label: idx for idx, label in enumerate(new_labels)}
    id_to_label = {idx: label for label, idx in label_to_id.items()}

    # Update model configuration for label mappings
    model.config.label2id = label_to_id
    model.config.id2label = id_to_label
    model.config.num_labels = len(new_labels)

    return model


def test_model_with_labels_and_input(classifier, new_labels, inputs):
    print("\n************")
    print("Configuring with new labels: ", new_labels)
    # Configure the model with new labels
    configure_model_with_new_labels(classifier.model, new_labels)

    # Run the model with the new labels
    outputs = classifier(inputs)

    # Print the inputs and outputs formatted together
    for inp, out in zip(inputs, outputs):
        print(f"Input: {inp}\nOutput: {out}")


test_model_with_labels_and_input(
    classifier,
    # New labels are really close to two of the original labels
    new_labels=["I have a pending card payment", "my pin is blocked"],
    inputs=[
        "I need to pay off my card",
        "What is my PIN?",
        "I have a pending top up",  # this last one is not in the new labels, but is very close to an original label. Let's see if it works too.
    ],
)

# Now some very novel labels:
test_model_with_labels_and_input(classifier, ["positive", "negative"], ["I love you", "I hate it."])

test_model_with_labels_and_input(classifier, ["sports", "politics"], ["Hockey is just the best", "I need to vote", "Vote on the new team captain"])

# Prints:

# Original classifier:
# Input: I need to pay off my card
# Output: {'label': 'card payment not recognised', 'score': 0.637493908405304}
# Input: What is my PIN?
# Output: {'label': 'get physical card', 'score': 0.29754528403282166}
# Input: I have a pending top up
# Output: {'label': 'pending top up', 'score': 0.8865770101547241}

# ************
# Configuring with new labels:  ['I have a pending card payment', 'my pin is blocked']
# Input: I need to pay off my card
# Output: {'label': 'I have a pending card payment', 'score': 0.9976436495780945}
# Input: What is my PIN?
# Output: {'label': 'my pin is blocked', 'score': 0.9877238273620605}
# Input: I have a pending top up
# Output: {'label': 'I have a pending card payment', 'score': 0.9031936526298523}

# ************
# Configuring with new labels:  ['positive', 'negative']
# Input: I love you
# Output: {'label': 'positive', 'score': 0.5160248875617981}
# Input: I hate it.
# Output: {'label': 'negative', 'score': 0.5863736271858215}

# ************
# Configuring with new labels:  ['sports', 'politics']
# Input: Hockey is just the best
# Output: {'label': 'sports', 'score': 0.7964810729026794}
# Input: I need to vote
# Output: {'label': 'politics', 'score': 0.6664907336235046}
# Input: Vote on the new team captain
# Output: {'label': 'politics', 'score': 0.7508028149604797}
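One thing I noticed: the near-0.5 scores for the novel positive/negative labels are exactly what you'd expect if the pipeline softmaxes the per-label similarity scores (as text-classification pipelines typically do — I'm assuming that's what happens here). When the labels are semantically distant from the training domain, the similarities nearly tie, so the softmax hovers near 1/num_labels. A toy illustration:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# In-domain labels: similarities spread apart, so one label dominates
confident = softmax([5.0, 1.0])   # roughly [0.98, 0.02]

# Novel labels: similarities nearly tie, so confidence sits near 0.5
uncertain = softmax([1.1, 1.0])   # roughly [0.52, 0.48]
```

So the low confidence on novel labels may just reflect a flat similarity distribution rather than an outright failure, which is why I'd hope a more generally trained model would sharpen it.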
