I am interested in evaluating the text2earth model for text-to-image retrieval and want to compare it to CLIP-based models.
My assumption was that text2earth is a text encoder that maps text into the same embedding space as the Clay image embeddings, which would let me do the following (sketched in code after the list):
- Use the Clay v1 model to create embeddings for some chips
- Find a text2earth model compatible with the v1 model
- Use it to embed natural-language text queries like "running track", "house with swimming pool", etc.
- Compute similarity scores between the text embedding and the chip embeddings
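
For concreteness, here is a minimal sketch of the retrieval step I have in mind. Everything here is an assumption on my part: the embedding dimension, the idea that the two encoders share a space, and the placeholder random arrays standing in for real Clay v1 chip embeddings and a text2earth text embedding.

```python
import numpy as np

def cosine_similarity(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of corpus vectors."""
    query = query / np.linalg.norm(query)
    corpus = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return corpus @ query

# Placeholder for chip embeddings produced by the Clay v1 model
# (1000 chips, assumed 768-dim; the real dimension may differ).
chip_embeddings = np.random.randn(1000, 768).astype(np.float32)

# Placeholder for a text2earth embedding of a query such as
# "house with swimming pool" (assumed to live in the same space).
text_embedding = np.random.randn(768).astype(np.float32)

# Rank chips by similarity to the text query; top indices are best matches.
scores = cosine_similarity(text_embedding, chip_embeddings)
top_k = np.argsort(scores)[::-1][:10]
print(top_k, scores[top_k])
```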
But I am a little confused by the example notebooks (such as this one).
Questions:
- Is the workflow described above currently supported?
- Is there a text2earth model compatible with the v1 model?