Description
Hello,
I've run the pipeline without HTCondor up to the results-processing step (which I assume is not currently possible without running the pipeline in HTCondor, unless I write a custom script that takes the non-HTCondor energize_output and packages it into a database that metl can read).
From my understanding, it's infeasible to generate a good enough training set without parallelizing the computation of Rosetta's energy parameters for all variants. I've set up my own HTCondor instance, to which I'm able to connect a few execute nodes, and would like to run metl-sim on this cluster. The part that I don't understand is: do I really need to upload rosetta and python to osdf/squid if I'm running the algorithm only on my own machines? Or is there another way (such as adding the rosetta and python env to all execute nodes through my docker-compose)?
I might be wrong, but it seems like I would only need to upload to squid if I were connecting to a highly distributed HTCondor cluster on which I don't have admin privileges, right?
Where in the scripts are the osdf python/rosetta envs being accessed? Is there a workaround to skip that step and use a local install instead?