empty db, curious errors, empty output, long gen time, outputs noise

I am writing here because the Discord invite in the README.md is invalid.

I am not sure I am doing this "right". Using the dataset provided on Google Drive and the prompt "violins playing Tchaikovsky", it takes 10 minutes on an RTX 4070Ti to generate tokens and create a 4-second clip of chaotic humming sounds. When I make a 30-second clip, token generation takes over an hour and the result is a 3 MB file that sounds like car horns under water :/

Is there a preferred prompt to use with the test data? What sounds were sampled to make the test data?

When I tried to sample my own sounds, the semantic encoding was less than 10% finished after 24 hours. Is it "normal" that it should take 10 days to sample a clip?
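For reference, this is the arithmetic behind the 10-day figure, as a minimal sketch. It assumes progress is roughly linear, which may not hold for this pipeline; `estimate_total_hours` is a hypothetical helper, not something from the repository.

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Extrapolate total run time, assuming progress is linear in time."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# 24 hours elapsed at (at most) 10% done gives a lower bound on the total.
total_hours = estimate_total_hours(24, 0.10)
print(f"estimated total: at least {total_hours / 24:.1f} days")
```

Since the run was under 10% complete, the real total would be longer than this estimate.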


Also, using the Google Drive data and `--model_config ./model/musiclm_large_small_context.json`, I get the following errors:

`Some weights of RobertaModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']`
`You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.`

`You are using a model of type mert_model to instantiate a model of type hubert. This is not supported for all configurations of models and can yield errors.`

What are the correct settings for using the Google Drive data?

My current command is:
```
python scripts/infer_top_match.py \
    "violins playing Tchaikovsky" \
    --num_samples 4 \
    --num_top_matches 1 \
    --semantic_path   ./model/semantic.transformer.14000.pt \
    --coarse_path     ./model/coarse.transformer.18000.pt \
    --fine_path       ./model/fine.transformer.24000.pt \
    --rvq_path        ./model/clap.rvq.950_no_fusion.pt \
    --kmeans_path     ./model/kmeans_10s_no_fusion.joblib \
    --model_config    ./model/musiclm_large_small_context.json \
    --duration 4

```
I had to use the Google Drive data because the code, while not reporting any errors, generated a 0-byte `preprocessed.db` file in the semantic section, which caused errors in the generation section.
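In case it helps anyone reproduce this, here is a minimal sketch of a pre-flight check that flags missing or zero-byte files (such as the empty `preprocessed.db`) before starting a long generation run. `find_bad_files` is a hypothetical helper and not part of the repository. The paths are passed on the command line because I am not assuming where the scripts write their output.

```python
import os
import sys


def find_bad_files(paths):
    """Return (path, reason) pairs for files that are missing or zero bytes."""
    problems = []
    for path in paths:
        if not os.path.isfile(path):
            problems.append((path, "missing"))
        elif os.path.getsize(path) == 0:
            problems.append((path, "empty (0 bytes)"))
    return problems


if __name__ == "__main__":
    # Pass the checkpoint and database paths to check as arguments.
    for path, reason in find_bad_files(sys.argv[1:]):
        print(f"{path}: {reason}")
```

Saved under any name, it can be run with the `.pt`, `.joblib`, `.json` and `preprocessed.db` paths as arguments. It prints nothing when every file exists and is non-empty.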

Is there a working example of this code somewhere with proper checkpoints?  

Thanks 
