I ran the following experiment:
- Domain: GB1 (56AA)
- Variants: 1000
- Cores: 12 (all being used)
- Environment: local macOS
The runtime was 0.87 hours.
That's a rate of 0.01 CPU hours / variant (or 0.6 CPU minutes / variant), which I would consider a best-case scenario, since 56AA is quite a small protein.
You guys generated 20M variants each for 148 proteins = 3B total variants. At that rate, we're looking at at least 30M CPU hours... It seems impractical, right? Or does this line up with your runtimes?
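For reference, here is the back-of-the-envelope arithmetic as a small Python sketch. It assumes the 20M figure is per protein (which is what the 3B total implies) and that my GB1 rate applies unchanged to every protein, which is optimistic since most are larger than 56AA. The function and variable names are just mine for illustration.

```python
def cpu_hours_per_variant(wall_hours: float, cores: int, variants: int) -> float:
    """CPU hours spent per variant, assuming all cores were busy the whole run."""
    return wall_hours * cores / variants


# Numbers from my GB1 run
rate = cpu_hours_per_variant(wall_hours=0.87, cores=12, variants=1000)

# Assumption: 20M variants for each of the 148 proteins
total_variants = 20_000_000 * 148

# Assumption: the GB1 rate holds for every protein (a best case)
total_cpu_hours = rate * total_variants

print(f"{rate:.4f} CPU hours / variant ({rate * 60:.2f} CPU minutes / variant)")
print(f"{total_variants:,} variants -> {total_cpu_hours / 1e6:.1f}M CPU hours")
```

This works out to roughly 0.0104 CPU hours per variant and about 31M CPU hours in total, so the 30M figure above is a slight round-down.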
A few considerations:
- I'm using my own re-implementation of this workflow, since I found this codebase hard to plug into. Happy to share my code if you're interested.
- One of the biggest differences is that I'm running Rosetta through their `linux/amd64` Docker image. But since I'm on an M3 Mac chip, maybe the emulation is slow?