I am interested in evaluating the text2earth model for text-to-image retrieval and want to compare it to CLIP-based models.
My assumption was that text2earth is a text encoder that maps text into the same embedding space as the Clay image embeddings, which would let me do the following (sketched in code after the list):
- Use the Clay v1 model to create embeddings for some chips
- Find a text2earth model compatible with the v1 model
- Use it to embed natural-language text queries like "running track", "house with swimming pool", etc.
- Compute similarity scores between the text embedding and the chip embeddings
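
For concreteness, here is a minimal sketch of the retrieval step I have in mind. Everything here is an assumption on my part: the embedding dimension, the idea that the two encoders share a space, and the placeholder random arrays standing in for real Clay v1 chip embeddings and a text2earth text embedding.

```python
import numpy as np

def cosine_similarity(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of corpus vectors."""
    query = query / np.linalg.norm(query)
    corpus = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return corpus @ query

# Placeholder for chip embeddings produced by the Clay v1 model
# (1000 chips, assumed 768-dim; the real dimension may differ).
chip_embeddings = np.random.randn(1000, 768).astype(np.float32)

# Placeholder for a text2earth embedding of a query such as
# "house with swimming pool" (assumed to live in the same space).
text_embedding = np.random.randn(768).astype(np.float32)

# Rank chips by similarity to the text query; top indices are best matches.
scores = cosine_similarity(text_embedding, chip_embeddings)
top_k = np.argsort(scores)[::-1][:10]
print(top_k, scores[top_k])
```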
But I am a little confused by the example notebooks (such as this one).
Questions:
- Is the workflow described above currently supported?
- Is there a text2earth model compatible with the v1 model?