I've found that retrying the same codegen prompt can produce a diff that applies cleanly, even when an earlier attempt did not.
For fully automated tasks, such as PR creation, it's worth retrying until we get a clean diff.
I feel this warrants a new flag for all-or-nothing code application: either the whole diff applies cleanly or nothing is written.
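
As a rough illustration of the behavior I have in mind, here is a minimal sketch. Everything in it is hypothetical: `apply_all_or_nothing`, `ApplyResult`, the `generate_diff` and `apply_diff` callables, and the `max_attempts` limit are placeholder names for this example, not existing APIs in this project.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ApplyResult:
    """Hypothetical outcome of applying one generated diff."""
    clean: bool    # True only if every hunk applied
    new_text: str  # resulting contents (may be partial when not clean)


def apply_all_or_nothing(
    original: str,
    generate_diff: Callable[[], str],
    apply_diff: Callable[[str, str], ApplyResult],
    max_attempts: int = 3,
) -> Optional[str]:
    """Retry the same codegen prompt until a diff applies cleanly.

    Returns the patched text on the first clean application, or None if
    no attempt was clean. Partial results are always discarded.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for _ in range(max_attempts):
        diff = generate_diff()                # re-run the same prompt
        result = apply_diff(original, diff)   # always apply against the original
        if result.clean:
            return result.new_text            # all: keep the fully applied result
    return None                               # nothing: caller leaves files untouched
```

Each attempt is applied against the original text rather than on top of a previous partial result, so a failed attempt cannot leak into the next one. The attempt limit is an assumption added to keep automated runs from looping forever.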