TFSimilarity.models.ContrastiveModel

Model groups layers into an object with training and inference features.

TFSimilarity.models.ContrastiveModel(
    backbone: tf.keras.Model,
    projector: tf.keras.Model,
    predictor: Optional[tf.keras.Model] = None,
    algorithm: str = 'simsiam',
    **kwargs
) -> None

Args

inputs The input(s) of the model: a keras.Input object or list of keras.Input objects.
outputs The output(s) of the model. See Functional API example below.
name String, the name of the model.

There are two ways to instantiate a Model:

1 - With the "Functional API", where you start from Input, you chain layer calls to specify the model's forward pass, and finally you create your model from inputs and outputs:

import tensorflow as tf

inputs = tf.keras.Input(shape=(3,))
x = tf.keras.layers.Dense(4, activation=tf.nn.relu)(inputs)
outputs = tf.keras.layers.Dense(5, activation=tf.nn.softmax)(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

Note: Only dicts, lists, and tuples of input tensors are supported. Nested inputs are not supported (e.g. lists of list or dicts of dict).

A new Functional API model can also be created by using the intermediate tensors. This enables you to quickly extract sub-components of the model.

Example:

from tensorflow import keras

inputs = keras.Input(shape=(None, None, 3))
processed = keras.layers.RandomCrop(width=32, height=32)(inputs)
conv = keras.layers.Conv2D(filters=2, kernel_size=3)(processed)
pooling = keras.layers.GlobalAveragePooling2D()(conv)
feature = keras.layers.Dense(10)(pooling)

full_model = keras.Model(inputs, feature)
backbone = keras.Model(processed, conv)
activations = keras.Model(conv, feature)

Note that the backbone and activations models are not created with keras.Input objects, but with tensors that originate from keras.Input objects. Under the hood, the layers and weights are shared across these models, so you can train the full_model and use backbone or activations for feature extraction. The inputs and outputs of the model can also be nested structures of tensors, and the created models are standard Functional API models that support all the existing APIs.

2 - By subclassing the Model class: in that case, you should define your layers in __init__() and implement the model's forward pass in call().

import tensorflow as tf

class MyModel(tf.keras.Model):

  def __init__(self):
    super().__init__()
    self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
    self.dense2 = tf.keras.layers.Dense(5, activation=tf.nn.softmax)

  def call(self, inputs):
    x = self.dense1(inputs)
    return self.dense2(x)

model = MyModel()

If you subclass Model, you can optionally have a training argument (boolean) in call(), which you can use to specify a different behavior in training and inference:

import tensorflow as tf

class MyModel(tf.keras.Model):

  def __init__(self):
    super().__init__()
    self.dense1 = tf.keras.layers.Dense(4, activation=tf.nn.relu)
    self.dense2 = tf.keras.layers.Dense(5, activation=tf.nn.softmax)
    self.dropout = tf.keras.layers.Dropout(0.5)

  def call(self, inputs, training=False):
    x = self.dense1(inputs)
    if training:
      x = self.dropout(x, training=training)
    return self.dense2(x)

model = MyModel()

Once the model is created, you can configure it with losses and metrics using model.compile(), train it with model.fit(), or use it to make predictions with model.predict().

Methods

calibrate

calibrate(
    x: <a href="../../TFSimilarity/callbacks/FloatTensor.md">TFSimilarity.callbacks.FloatTensor</a>,
    y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a>,
    thresholds_targets: MutableMapping[str, float] = {},
    k: int = 1,
    calibration_metric: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMetric.md">TFSimilarity.callbacks.ClassificationMetric</a>] = 'f1',
    matcher: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMatch.md">TFSimilarity.callbacks.ClassificationMatch</a>] = 'match_nearest',
    extra_metrics: MutableSequence[Union[str, ClassificationMetric]] = ['precision', 'recall'],
    rounding: int = 2,
    verbose: int = 1
) -> <a href="../../TFSimilarity/indexer/CalibrationResults.md">TFSimilarity.indexer.CalibrationResults</a>

Calibrate model thresholds using a test dataset.

Args
x examples to use for the calibration.
y labels associated with the calibration examples.
thresholds_targets Dict of performance targets to (if possible) meet with respect to the calibration_metric.
calibration_metric [ClassificationMetric()](classification_metrics/overview.md) used to evaluate the performance of the index. Defaults to 'f1'.
k How many neighbors to use during the calibration. Defaults to 1.
matcher 'match_nearest', 'match_majority_vote' or ClassificationMatch object. Defines the classification matching, e.g., match_nearest will count a True Positive if the query_label is equal to the label of the nearest neighbor and the distance is less than or equal to the distance threshold. Defaults to 'match_nearest'.
extra_metrics List of additional tf.similarity.classification_metrics.ClassificationMetric() to compute and report. Defaults to ['precision', 'recall'].
rounding Metric rounding. Defaults to 2 digits.
verbose Be verbose and display calibration results. Defaults to 1.
Returns
CalibrationResults containing the thresholds and cutpoints Dicts.
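
As a rough illustration of what calibrating a distance threshold means under the match_nearest rule described above, the sketch below sweeps a few candidate thresholds and keeps the one with the best F1. It is plain Python, not the library's implementation: `f1_at_threshold` and `pick_threshold` are hypothetical helper names, and counting a rejected query as a false negative is an assumption made for the example.

```python
from typing import Sequence, Tuple


def f1_at_threshold(
    query_labels: Sequence[int],
    nn_labels: Sequence[int],
    nn_distances: Sequence[float],
    threshold: float,
) -> float:
    """F1 score when a nearest neighbor is accepted at distance <= threshold."""
    tp = fp = fn = 0
    for query, neighbor, dist in zip(query_labels, nn_labels, nn_distances):
        if dist <= threshold:
            if query == neighbor:
                tp += 1  # accepted and labels agree
            else:
                fp += 1  # accepted but labels disagree
        else:
            fn += 1  # assumption: a rejected query counts as a missed match
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_threshold(
    query_labels: Sequence[int],
    nn_labels: Sequence[int],
    nn_distances: Sequence[float],
    candidates: Sequence[float],
    rounding: int = 2,
) -> Tuple[float, float]:
    """Return (threshold, rounded F1) for the best-scoring candidate."""
    scored = [
        (round(f1_at_threshold(query_labels, nn_labels, nn_distances, t), rounding), t)
        for t in candidates
    ]
    best_f1, best_threshold = max(scored, key=lambda pair: pair[0])
    return best_threshold, best_f1


# Four queries, each with the label and distance of its nearest neighbor.
best = pick_threshold(
    query_labels=[0, 0, 1, 1],
    nn_labels=[0, 0, 1, 0],
    nn_distances=[0.1, 0.2, 0.3, 0.25],
    candidates=[0.15, 0.22, 0.35],
)
```

A looser threshold accepts more neighbors, which raises recall but can admit wrong labels; the sweep makes that trade-off explicit for each candidate.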

create_index

create_index(
    distance: Union[<a href="../../TFSimilarity/distances/Distance.md">TFSimilarity.distances.Distance</a>, str] = 'cosine',
    search: Union[<a href="../../TFSimilarity/indexer/Search.md">TFSimilarity.indexer.Search</a>, str] = 'nmslib',
    kv_store: Union[<a href="../../TFSimilarity/indexer/Store.md">TFSimilarity.indexer.Store</a>, str] = 'memory',
    evaluator: Union[<a href="../../TFSimilarity/callbacks/Evaluator.md">TFSimilarity.callbacks.Evaluator</a>, str] = 'memory',
    embedding_output: Optional[int] = None,
    stat_buffer_size: int = 1000
) -> None

Create the model index to make embeddings searchable via KNN.

This method is normally called as part of SimilarityModel.compile(). However, this method is provided if users want to define a custom index outside of the compile() method.

NOTE: This method sets SimilarityModel._index and will replace any existing index.

Args
distance Distance used to compute embeddings proximity. Defaults to 'cosine'.
kv_store How to store the indexed records. Defaults to 'memory'.
search Which Search() framework to use to perform KNN search. Defaults to 'nmslib'.
evaluator What type of Evaluator() to use to evaluate index performance. Defaults to in-memory one.
embedding_output Which model output head predicts the embeddings that should be indexed. Defaults to None, which is for single-output models. For multi-head models, the caller, usually the SimilarityModel() class, is responsible for passing the correct one.
stat_buffer_size Size of the sliding window buffer used to compute index performance. Defaults to 1000.
Raises
ValueError Invalid search framework or key value store.
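
To make the stat_buffer_size idea concrete, here is a minimal sketch of a sliding window buffer built on `collections.deque`. It only illustrates the general technique: the class name `StatBuffer`, its methods, and the choice of lookup timings as the recorded statistic are all assumptions, not the library's internals.

```python
from collections import deque


class StatBuffer:
    """Keeps only the most recent `size` measurements (hypothetical helper)."""

    def __init__(self, size: int = 1000) -> None:
        # A deque with maxlen silently drops the oldest entry once full.
        self._values = deque(maxlen=size)

    def record(self, value: float) -> None:
        self._values.append(value)

    def mean(self) -> float:
        if not self._values:
            return 0.0
        return sum(self._values) / len(self._values)


# With a window of 3, the first measurement (10 ms) is evicted by the fourth.
buffer = StatBuffer(size=3)
for lookup_ms in [10, 20, 30, 40]:
    buffer.record(lookup_ms)
window_mean = buffer.mean()
```

A larger window smooths the reported statistics at the cost of reacting more slowly to recent changes.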

evaluate_classification

evaluate_classification(
    x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
    y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a>,
    k: int = 1,
    extra_metrics: MutableSequence[Union[str, ClassificationMetric]] = ['precision', 'recall'],
    matcher: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMatch.md">TFSimilarity.callbacks.ClassificationMatch</a>] = 'match_nearest',
    verbose: int = 1
) -> DefaultDict[str, Dict[str, Union[str, np.ndarray]]]

Evaluate model classification matching on a given evaluation dataset.

Args
x Examples to be matched against the index.
y Label associated with the examples supplied.
k How many neighbors to use to perform the evaluation. Defaults to 1.
extra_metrics List of additional tf.similarity.classification_metrics.ClassificationMetric() to compute and report. Defaults to ['precision', 'recall'].
matcher 'match_nearest', 'match_majority_vote' or ClassificationMatch object. Defines the classification matching, e.g., match_nearest will count a True Positive if the query_label is equal to the label of the nearest neighbor and the distance is less than or equal to the distance threshold. Defaults to 'match_nearest'.
verbose Display results if set to 1, otherwise results are returned silently. Defaults to 1.
Returns
Dictionary of [evaluation metrics](distance_metrics.md)
Raises
IndexError Index must contain embeddings but is currently empty.
ValueError Uncalibrated model: run model.calibrate() first.

evaluate_retrieval

evaluate_retrieval(
    x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
    y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a>,
    retrieval_metrics: Sequence[<a href="../../TFSimilarity/indexer/RetrievalMetric.md">TFSimilarity.indexer.RetrievalMetric</a>],
    verbose: int = 1
) -> Dict[str, np.ndarray]

Evaluate the quality of the index against a test dataset.

Args
x Examples to be matched against the index.
y Label associated with the examples supplied.
retrieval_metrics List of [RetrievalMetric()](retrieval_metrics/overview.md) to compute.
verbose Display results if set to 1, otherwise results are returned silently. Defaults to 1.

Returns
Dictionary of metric results where keys are the metric names and values are the metrics values.
Raises
IndexError Index must contain embeddings but is currently empty.
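
The shape of the returned dictionary (metric name mapped to metric value) can be sketched with a simple recall@k computation. This is an illustrative stand-in in plain Python, not one of the library's RetrievalMetric classes; `recall_at_k` and `evaluate` are hypothetical names, and the neighbor labels are assumed to be sorted from nearest to farthest.

```python
from typing import Dict, Sequence


def recall_at_k(
    query_labels: Sequence[int],
    neighbor_labels: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Fraction of queries whose label appears among their k nearest neighbors."""
    if not query_labels:
        return 0.0
    hits = sum(
        1 for label, neighbors in zip(query_labels, neighbor_labels)
        if label in neighbors[:k]
    )
    return hits / len(query_labels)


def evaluate(
    query_labels: Sequence[int],
    neighbor_labels: Sequence[Sequence[int]],
    ks: Sequence[int],
) -> Dict[str, float]:
    """Return a dict keyed by metric name, mirroring the return shape above."""
    return {f"recall@{k}": recall_at_k(query_labels, neighbor_labels, k) for k in ks}


# Each row holds the labels of one query's neighbors, nearest first.
results = evaluate(
    query_labels=[0, 1, 2, 3],
    neighbor_labels=[[0, 1, 1], [2, 1, 0], [0, 0, 1], [3, 0, 0]],
    ks=[1, 3],
)
```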

index

index(
    x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
    y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a> = None,
    data: Optional[<a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>] = None,
    build: bool = True,
    verbose: int = 1
)

Index data.

Args
x Samples to index.
y Class ids associated with the data, if any. Defaults to None.
data Data associated with the samples to store in the key value store. Defaults to None.
build Rebuild the index after indexing. This is needed to make the new samples searchable. Set it to False to save processing time when indexing repeatedly without needing to search between the indexing requests. Defaults to True.
verbose Output indexing progress info. Defaults to 1.
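
The effect of the build flag can be pictured with a toy index that separates adding samples from making them searchable. Everything here is a hypothetical stand-in (`ToyIndex`, `add`, `build`, `size`); it assumes that a rebuild is the expensive step, which is the situation the build argument is meant to address.

```python
from typing import List


class ToyIndex:
    """Toy index where newly added items become searchable only after build()."""

    def __init__(self) -> None:
        self._pending: List[List[float]] = []
        self._searchable: List[List[float]] = []
        self.build_count = 0

    def add(self, embeddings: List[List[float]], build: bool = True) -> None:
        self._pending.extend(embeddings)
        if build:
            self.build()

    def build(self) -> None:
        # Stand-in for an expensive rebuild of the search structure.
        self._searchable.extend(self._pending)
        self._pending.clear()
        self.build_count += 1

    def size(self) -> int:
        return len(self._searchable)


batches = [[[0.1, 0.2]], [[0.3, 0.4]], [[0.5, 0.6]]]
toy = ToyIndex()
for batch in batches[:-1]:
    toy.add(batch, build=False)  # defer the rebuild while still loading
toy.add(batches[-1], build=True)  # rebuild once, after the last batch
```

Deferring the rebuild until the final batch means the costly step runs once instead of once per call.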

index_single

index_single(
    x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
    y: <a href="../../TFSimilarity/callbacks/IntTensor.md">TFSimilarity.callbacks.IntTensor</a> = None,
    data: Optional[<a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>] = None,
    build: bool = True,
    verbose: int = 1
)

Index a single sample.

Args
x Sample to index.
y Class id associated with the data, if any. Defaults to None.
data Data associated with the sample to store in the key value store. Defaults to None.
build Rebuild the index after indexing. This is needed to make the new samples searchable. Set it to False to save processing time when indexing repeatedly without needing to search between the indexing requests. Defaults to True.
verbose Output indexing progress info. Defaults to 1.

index_size

index_size() -> int

Return the index size

index_summary

index_summary()

Display index info summary.

load_index

load_index(
    filepath: str
)

Load Index data from a checkpoint and initialize underlying structure with the reloaded data.

Args
filepath Directory where the checkpoint is located.
verbose Be verbose. Defaults to 1.

lookup

lookup(
    x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
    k: int = 5,
    verbose: int = 1
) -> List[List[Lookup]]

Find the k closest matches in the index for a set of samples.

Args
x Samples to match.
k Number of nearest neighbors to look up. Defaults to 5.
verbose Display progress. Defaults to 1.

Returns
List of lists of the k nearest neighbors: List[List[Lookup]]
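
The nested return shape (one list of k neighbors per query) can be seen in a brute-force sketch using cosine distance. This is a plain-Python illustration, not how the library searches: the function names are hypothetical, each neighbor is reduced to a (label, distance) tuple rather than a Lookup object, and all vectors are assumed to be non-zero.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_lookup(
    queries: Sequence[Sequence[float]],
    indexed: Sequence[Tuple[Sequence[float], int]],
    k: int = 5,
) -> List[List[Tuple[int, float]]]:
    """For each query, return its k closest (label, distance) pairs, nearest first."""
    results = []
    for query in queries:
        scored = [(label, cosine_distance(query, emb)) for emb, label in indexed]
        scored.sort(key=lambda pair: pair[1])
        results.append(scored[:k])
    return results


indexed = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 1.0], 2)]
neighbors = brute_force_lookup([[1.0, 0.1], [0.1, 1.0]], indexed, k=2)
neighbor_labels = [[label for label, _ in row] for row in neighbors]
```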

match

match(
    x: <a href="../../TFSimilarity/callbacks/FloatTensor.md">TFSimilarity.callbacks.FloatTensor</a>,
    cutpoint='optimal',
    no_match_label=-1,
    k=1,
    matcher: Union[str, <a href="../../TFSimilarity/callbacks/ClassificationMatch.md">TFSimilarity.callbacks.ClassificationMatch</a>] = 'match_nearest',
    verbose=0
)

Match a set of examples against the calibrated index

For the match function to work, the index must be calibrated using calibrate().

Args
x Batch of examples to be matched against the index.
cutpoint Which calibration threshold to use. Defaults to 'optimal' which is the optimal F1 threshold computed using calibrate().
no_match_label Which label value to assign when there is no match. Defaults to -1.
k How many neighbors to use during matching. Defaults to 1.
matcher 'match_nearest', 'match_majority_vote' or ClassificationMatch object. Defines the classification matching, e.g., match_nearest will count a True Positive if the query_label is equal to the label of the nearest neighbor and the distance is less than or equal to the distance threshold. Defaults to 'match_nearest'.
verbose Be verbose. Defaults to 0.

Returns
List of matched class ids, one for each supplied example.

Notes:

This function matches all the cutpoints at once internally, as there is little performance downside to doing so and it allows the evaluation to be done in a single pass.
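
A minimal sketch of matching against several cutpoints in one pass, assuming the nearest neighbor's label and distance are already known for each example. The helper name `match_all_cutpoints` and the cutpoint name 'strict' are hypothetical; only 'optimal' and no_match_label come from the description above.

```python
from typing import Dict, List, Sequence


def match_all_cutpoints(
    nn_labels: Sequence[int],
    nn_distances: Sequence[float],
    cutpoints: Dict[str, float],
    no_match_label: int = -1,
) -> Dict[str, List[int]]:
    """Apply every cutpoint threshold to the same nearest-neighbor results."""
    return {
        name: [
            label if dist <= threshold else no_match_label
            for label, dist in zip(nn_labels, nn_distances)
        ]
        for name, threshold in cutpoints.items()
    }


matches = match_all_cutpoints(
    nn_labels=[3, 7, 7],
    nn_distances=[0.1, 0.4, 0.9],
    cutpoints={"optimal": 0.5, "strict": 0.2},
)
```

Because the neighbor search is shared, adding another cutpoint only costs one more comparison per example.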

reset_index

reset_index()

Reinitialize the index

save_index

save_index(
    filepath, compression=True
)

Save the index to disk

Args
filepath Directory where the index is saved.
compression Store index data compressed. Defaults to True.
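
The compression flag can be illustrated with a generic save and reload round trip using only the standard library. This is not the library's on-disk format: the file names, the JSON encoding, and the helpers `save_records` and `load_records` are assumptions chosen to keep the example self-contained.

```python
import gzip
import json
import tempfile
from pathlib import Path


def save_records(records: dict, filepath: str, compression: bool = True) -> Path:
    """Write records under the directory `filepath`, gzip-compressed if requested."""
    directory = Path(filepath)
    directory.mkdir(parents=True, exist_ok=True)
    payload = json.dumps(records).encode("utf-8")
    if compression:
        target = directory / "records.json.gz"
        with gzip.open(target, "wb") as handle:
            handle.write(payload)
    else:
        target = directory / "records.json"
        target.write_bytes(payload)
    return target


def load_records(target: Path) -> dict:
    """Read records back, decompressing when the file ends in .gz."""
    if target.suffix == ".gz":
        with gzip.open(target, "rb") as handle:
            raw = handle.read()
    else:
        raw = target.read_bytes()
    return json.loads(raw.decode("utf-8"))


records = {"labels": [0, 1], "embeddings": [[0.1, 0.2], [0.3, 0.4]]}
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_records(records, tmp_dir, compression=True)
    restored = load_records(saved_path)
```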

single_lookup

single_lookup(
    x: <a href="../../TFSimilarity/callbacks/Tensor.md">TFSimilarity.callbacks.Tensor</a>,
    k: int = 5
) -> List[<a href="../../TFSimilarity/indexer/Lookup.md">TFSimilarity.indexer.Lookup</a>]

Find the k closest matches in the index for a given sample.

Args
x Sample to match.
k Number of nearest neighbors to look up. Defaults to 5.

Returns
List of the k nearest neighbors' info: List[Lookup]

to_data_frame

to_data_frame(
    num_items: int = 0
) -> <a href="../../TFSimilarity/indexer/PandasDataFrame.md">TFSimilarity.indexer.PandasDataFrame</a>

Export data as pandas dataframe

Args
num_items Number of items to export to the dataframe. Defaults to 0 (unlimited).
Returns
pd.DataFrame a pandas dataframe.