Hi,
Is this the same script as Moses's multi-bleu.perl? I see that some modifications have been made to the original version. I've been investigating why my baseline model's (Google NIC with VGG-E) BLEU-2/3/4 scores are really low, and what I've found is that we are not using the same evaluation scripts. I know that this task is different from machine translation, though. So, my questions are:
- What's the intention behind the BLEU evaluation script modification?
- Does everyone working on captioning evaluate their models with this approach?
Thanks in advance.