Hello, I retested the Qwen-Image metrics following your method and found a significant discrepancy from the numbers you reported. What could be the reason?

| Model | Cultural | Time | Space | Biology | Physics | Chemistry | Overall |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen (your results) | 0.62 | 0.63 | 0.77 | 0.57 | 0.75 | 0.4 | 0.62 |
| Qwen (my retest) | 0.56 | 0.57 | 0.65 | 0.48 | 0.55 | 0.32 | 0.54 |

The first row shows the metrics you tested; the second row shows the metrics I actually measured. I ran my tests through the internal API.