Without evaluation data, it is easy to misuse autonima with incorrect prompting and get a misleading result.
To prevent that, let's design user-friendly tools for researcher-in-the-loop prompt engineering and fine-tuning of the meta-analysis config file, along with standard procedures for optimizing prompts that need relatively little manual annotation and verification.
One idea: users review a random sample of accepted and rejected studies and revise their prompt based on what they find.
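
As a rough illustration of the sampling idea, here is a minimal sketch. It assumes screening decisions are available as a plain mapping from study ID to `"accepted"` or `"rejected"`. The function names `sample_for_review` and `agreement` are hypothetical and are not part of autonima's API.

```python
import random


def sample_for_review(decisions, n_per_group=5, seed=0):
    """Draw a random sample of study IDs from each decision group.

    `decisions` maps study ID -> "accepted" or "rejected" (assumed format).
    Sampling per group keeps rejected studies in view even when they are rare.
    """
    rng = random.Random(seed)  # fixed seed so the review set is reproducible
    sample = {}
    for label in ("accepted", "rejected"):
        ids = sorted(sid for sid, d in decisions.items() if d == label)
        sample[label] = rng.sample(ids, min(n_per_group, len(ids)))
    return sample


def agreement(decisions, manual_labels):
    """Fraction of manually reviewed studies where the researcher agrees."""
    if not manual_labels:
        raise ValueError("manual_labels is empty")
    hits = sum(decisions[sid] == label for sid, label in manual_labels.items())
    return hits / len(manual_labels)


# Toy data: 20 studies, alternating accepted/rejected.
decisions = {
    f"study_{i}": ("accepted" if i % 2 == 0 else "rejected") for i in range(20)
}
sample = sample_for_review(decisions, n_per_group=3)

# Pretend the researcher agrees with every sampled decision except one.
manual = {sid: label for label, ids in sample.items() for sid in ids}
manual[sample["accepted"][0]] = "rejected"

print(sample)
print(f"agreement: {agreement(decisions, manual):.2f}")
```

A low agreement rate on the sample would be the signal to revise the prompt and re-run screening. Disagreements on accepted versus rejected studies could also be tallied separately to tell whether the prompt is too permissive or too strict.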