
Conversation

@baolef (Contributor) commented Nov 19, 2025

This should be a trivial fix based on #76 (comment). Could you take a look, @pjannaty @aryamancodes?

@baolef (Contributor, Author) commented Nov 20, 2025

However, one thing to note: although the commit fixes the last_frames bug, it leads to inconsistent output between chunks. I'm not sure if this is expected, since chunks should remain consistent throughout the whole video. Maybe we should always use num_conditional_frames=1 for chunks other than the first one?

Also, I think what users expect from num_conditional_frames=0 is to prevent the output from looking the same as the input video #3 (comment), but it seems that in the first chunk, num_conditional_frames and prev_output are always zero. Does that mean the first chunk always conditions on nothing?

(attached video: robot_depth.mp4)
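
To make my question concrete, here is a minimal sketch of how I understand the chunkwise conditioning loop. This is not the actual Cosmos pipeline; generate_chunk and the other names are placeholders for illustration only. It shows why the first chunk conditions on nothing (no prev_output yet) and why num_conditional_frames=0 removes the link that keeps consecutive chunks consistent:

```python
# Minimal sketch of chunkwise autoregressive generation, NOT the Cosmos
# implementation: generate_chunk() and the variable names are placeholders.
import numpy as np

def generate_video(num_chunks, num_conditional_frames, generate_chunk):
    """Generate num_chunks chunks, each conditioned on the previous chunk's tail."""
    chunks = []
    prev_output = None  # the first chunk has no previous output to condition on
    for _ in range(num_chunks):
        if prev_output is None or num_conditional_frames == 0:
            # Unconditioned chunk: nothing ties it to the previous chunk,
            # which is why consecutive chunks can look inconsistent.
            cond_frames = None
        else:
            # Conditioning on the last N generated frames is what keeps
            # chunk boundaries visually consistent.
            cond_frames = prev_output[-num_conditional_frames:]
        prev_output = generate_chunk(cond_frames)
        chunks.append(prev_output)
    return np.concatenate(chunks, axis=0)

# Example with a dummy generator producing 8 frames of 64x64 RGB per chunk:
# video = generate_video(3, 1, lambda cond: np.random.rand(8, 64, 64, 3))
```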

@baolef (Contributor, Author) commented Nov 20, 2025

I'm not sure whether num_conditional_frames is designed to behave like this or not.

@aryamancodes (Contributor) commented:
While this PR does allow 0 conditional frames, that results in a lack of chunkwise consistency. Let me discuss internally whether that's useful; maybe it's better to only allow num_conditional_frames > 1 for Transfer.
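
For illustration only, a guard of the kind being discussed might look like the sketch below. The function name, argument name, and the exact lower bound are assumptions, not the real Cosmos API; the bound would depend on what we decide internally:

```python
# Illustrative only: a possible validation guard, not the actual Cosmos code.
MIN_CONDITIONAL_FRAMES = 1  # assumed; could end up higher per the discussion above

def validate_num_conditional_frames(num_conditional_frames: int) -> None:
    # Reject values that would break chunkwise consistency in Transfer.
    if num_conditional_frames < MIN_CONDITIONAL_FRAMES:
        raise ValueError(
            f"Transfer requires num_conditional_frames >= {MIN_CONDITIONAL_FRAMES} "
            f"to keep chunks consistent; got {num_conditional_frames}"
        )
```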

@aryamancodes (Contributor) commented Dec 3, 2025

I updated the documentation internally to better explain that num_conditional_frames is the number of frames to condition on in chunkwise generation. We're also introducing params to better condition on the input video or an input image. Both of these will land in the next release. With all these changes, I think it makes sense to merge your MR.

Could you resolve the conflict and run linting via pre-commit? Happy to merge as soon as that's done, and sorry about the delay!

@KieranRatcliffeInvertedAI commented:

I just noticed this PR and wanted to mention that allowing 0 conditional frames is VERY useful to my application! It is the only way I have found to run Cosmos2.5 Transfer with only HDMap input, as we do not have access to RGB input for our application.

If there's another way to handle missing RGB input, I'd be glad to hear it, since the performance is not as good as Cosmos2.5 should be; so far, this is the only workaround I've found that allows me to run Cosmos2.5 Transfer at all.

Merge commit: resolved conflicts in cosmos_transfer2/_src/transfer2/inference/inference_pipeline.py
@baolef (Contributor, Author) commented Dec 4, 2025

> I updated the documentation internally to better explain that num_conditional_frames is the number of frames to condition on in chunkwise generation. We're also introducing params to better condition on the input video or an input image. Both of these will land in the next release. With all these changes, I think it makes sense to merge your MR.
>
> Could you resolve the conflict and run linting via pre-commit? Happy to merge as soon as that's done, and sorry about the delay!

I've just resolved the merge conflict and run pre-commit. The behavior is the same as in the previous version, before the conflict.

