Description
While testing the provided demo code, I observed that the algorithm works well for the prompt "The object is flying", producing a smooth and visually plausible animation. However, when I replace "flying" with other action prompts such as "walking" or "roaring", the results become significantly worse: the motion is unnatural, the object deforms incorrectly, or it barely moves at all.
Expected Behavior
The algorithm should generalize to different action prompts beyond "flying". At minimum, "walking" or "roaring" should produce recognizable motion patterns, even if not perfectly realistic.
Questions
- Is the current model overfitted to the "flying" action used in the demo?
- Would fine-tuning on other actions help, or is the model architecture fundamentally limited to certain motion types?
Any insights or suggestions would be greatly appreciated! Happy to provide more details if needed.