A Survey on Automatic Generation of Figurative Language: From Rule-based Systems to Large Language Models (ACM Computing Surveys)
| Figure of speech | Task | Dataset | Train | Valid | Test | Lang | Parallel |
|---|---|---|---|---|---|---|---|
| Simile | Literal↔Simile | Data | 82,687 | 5,145 | 150 | en | ✓ |
| | Simile↔Context | Data | 5.4M | 2,500 | 2,500 | zh | ✓ |
| | Narrative+Simile→Text | Data | 3,100 | 376 | 1,520 | en | ✓ |
| | Concept→Analogy + Explanation | Data | - | - | 148 | en | ✓ |
| Metaphor | Literal↔Metaphor | Data | 260k | 15,833 | 250 | en | ✓ |
| | | Data | 90k | 3,498 | 150 | en | ✓ |
| | | Data | 248k | - | 150 | en | ✓ |
| | | Data | - | - | 171 | en | ✓ |
| | | CMC | 3,554/2,703 | - | - | zh | ✗ |
| Hyperbole | Literal↔Hyperbole | Paper | 709 | - | - | en | ✓ |
| | | HYPO-cn | 2,082/2,680 | - | - | zh | ✗ |
| | | HYPO-red | 2,163/1,167 | - | - | en | ✗ |
| | | HYPO-XL | -/17,862 | - | - | en | ✗ |
| Idiom | Idiom↔Literal | Paper | 88 | - | 84 | en | ✓ |
| | Idiom (en)↔Literal (de) | Data | 1,998 | - | 1,500 | en/de | ✓ |
| | Idiom (de)↔Literal (en) | | 1,848 | - | 1,500 | de/en | ✓ |
| | Literal↔Idiom | PIE | 3,784 | 876 | 876 | en | ✓ |
| | Narrative+Idiom→Text | Data | 3,204 | 355 | 1,542 | en | ✓ |
| Irony (Sarcasm) | Literal↔Irony (Sarcasm) | Data | 2,400 | 300 | 300 | en | ✓ |
| | | Data | - | - | 203 | en | ✓ |
| | | Data | 112k/262k | - | - | en | ✗ |
| | | Data | 4,762 | - | - | en | ✓ |
| Pun | Word senses→Pun | Data | 1,274 | - | - | en | ✓ |
| | Context→Pun | Data | 2,753 | - | - | en | ✓ |
| Personification | Topic→Personification | Data | 67,441 | 3,747 | 3,747 | zh | ✓ |
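Most of the parallel corpora above pair a literal sentence with its figurative counterpart and come with train/valid/test splits. A minimal sketch of working with such data, assuming a hypothetical tab-separated format with one `literal<TAB>figurative` pair per line (the actual datasets each use their own formats):

```python
import csv
import io
import random

def load_parallel_pairs(tsv_text):
    """Parse tab-separated literal/figurative sentence pairs."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [(literal, figurative) for literal, figurative in reader]

def split_pairs(pairs, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle and carve out train/valid/test portions, mirroring the
    split columns in the table above."""
    pairs = pairs[:]
    random.Random(seed).shuffle(pairs)
    n = len(pairs)
    n_valid, n_test = int(n * valid_frac), int(n * test_frac)
    valid = pairs[:n_valid]
    test = pairs[n_valid:n_valid + n_test]
    train = pairs[n_valid + n_test:]
    return train, valid, test
```

The fixed seed keeps splits reproducible across runs, which matters when reporting results on self-made splits.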
We review the modelling approaches, from traditional to state-of-the-art, dividing them into two categories: knowledge-based and neural-based.
| Subcategory | Paper | Code | Form | Venue | Pros and Cons |
|---|---|---|---|---|---|
| Rule and template | Abe et al. | - | Metaphor | CSS 2006 | Pros: intuitive and simple; tailored to specific forms. Cons: poor flexibility and diversity. |
| | Terai et al. | - | Metaphor | ICANN 2010 | |
| | Joshi et al. | Code | Sarcasm | WISDOM 2015 | |
| | Veale et al. | - | Metaphor | Metaphor WS 2016 | |
| Knowledge resource | Pereira et al. | - | Metaphor | AAAI WS 2006 | Pros: exploits knowledge resources; high interpretability. Cons: requires prior linguistic knowledge; the desired resources must be constructed. |
| | Veale et al. | - | Metaphor | COLING 2008 | |
| | Petrović et al. | - | Pun | ACL 2013 | |
| | Hong et al. | - | Pun | CALC 2009 | |
| | Shutova et al. | - | Metaphor | NAACL 2010 | |
| | Valitutti et al. | - | Pun | ACL 2013 | |
| | Liu et al. | - | Idiom | NAACL 2016 | |
| | Gero et al. | - | Metaphor | CHI 2019 | |
| | Stowe et al. | - | Metaphor | ACL 2021 | |
| | Hervás et al. | - | Metaphor | MICAI 2007 | |
| | Ovchinnikova et al. | - | Metaphor | arXiv 2014 | |
| | Harmon et al. | - | Simile | ICCC 2015 | |
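The rule-and-template and knowledge-resource lines of work share a common recipe: retrieve a stereotypical association from a lexical resource and slot it into a fixed surface template. A toy sketch of that recipe for similes (the property→vehicle lexicon below is a hand-picked illustration, not any published resource):

```python
# Toy stereotype lexicon: property -> conventional vehicle.
# Real systems mine such associations from corpora or resources
# like WordNet rather than hand-coding them.
STEREOTYPES = {
    "brave": "lion",
    "sly": "fox",
    "slow": "snail",
    "busy": "bee",
}

def make_simile(topic, prop):
    """Fill the template '<topic> is as <prop> as a <vehicle>'.

    Returns None when the lexicon has no stereotypical vehicle
    for the property -- the coverage limit of such approaches.
    """
    vehicle = STEREOTYPES.get(prop)
    if vehicle is None:
        return None
    return f"{topic} is as {prop} as a {vehicle}"
```

The `None` branch illustrates the "poor flexibility" drawback noted in the table: output quality is bounded by the hand-built resource and the fixed template.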
| Subcategory | Paper | Code | Form | Venue | Pros and Cons |
|---|---|---|---|---|---|
| Training from scratch | Peled et al. | Code | Sarcasm | ACL 2017 | Pros: straightforward; can be combined with retrieval approaches. Cons: needs large-scale training data; large computational resources. |
| | Fadaee et al. | Code | Idiom | LREC 2018 | |
| | Liu et al. | Code | Metaphor/Personification | ACL 2019 | |
| | Stowe et al. | Code | Metaphor | CoNLL 2021 | |
| | Yu et al. | - | Pun | ACL 2018 | |
| | Yu et al. | Code | Metaphor | NAACL 2019 | |
| | Li et al. | Code | Metaphor | INLG 2022 | |
| | He et al. | Code | Pun | NAACL 2019 | |
| | Yu et al. | Code | Pun | EMNLP 2020 | |
| | Zhou et al. | Code | Idiom | arXiv 2021 | |
| | Zhu et al. | Code | Irony | arXiv 2019 | |
| | Luo et al. | Code | Pun | EMNLP 2019 | |
| | Mishra et al. | Code | Sarcasm | EMNLP 2019 | |
| Fine-tuning PLMs | Zhang et al. | Code | Simile | AAAI 2021 | Pros: straightforward; pre-trained knowledge; state-of-the-art results. Cons: large computational resources. |
| | Zhou et al. | Code | Idiom | AAAI 2022 | |
| | Zhang et al. | Code | Hyperbole | NAACL 2022 | |
| | Chakrabarty et al. | Code | Simile | EMNLP 2020 | |
| | Stowe et al. | Code | Metaphor | ACL 2021 | |
| | Chakrabarty et al. | Code | Metaphor | NAACL 2021 | |
| | Stowe et al. | Code | Metaphor | CoNLL 2021 | |
| | Tian et al. | Code | Hyperbole | EMNLP 2021 | |
| | Chakrabarty et al. | Code | Sarcasm | ACL 2020 | |
| | Mittal et al. | Code | Pun | NAACL 2022 | |
| | Chakrabarty et al. | Code | Idiom/Simile | TACL 2022 | |
| | Tian et al. | Code | Pun | EMNLP 2022 | |
| | Lai et al. | Code | Hyperbole/Sarcasm/Idiom/Metaphor/Simile | COLING 2022 | |
| Prompt learning | Chakrabarty et al. | Code | Idiom/Simile | TACL 2022 | Pros: straightforward; needs few or no labelled samples. Cons: prompt engineering; large computational resources. |
| | Reif et al. | - | Metaphor | ACL 2022 | |
| | Mittal et al. | Code | Pun | NAACL 2022 | |
| | Bhavya et al. | Code | Analogy (Simile) | INLG 2022 | |
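The prompt-learning row comes down to prompt construction: a task instruction plus a few labelled demonstrations, with the target input appended for the model to complete. A minimal sketch of assembling such a few-shot prompt for metaphor generation (the instruction wording and example pairs are illustrative, not taken from any of the cited papers):

```python
def build_fewshot_prompt(literal, examples):
    """Assemble a few-shot prompt asking an LLM to rewrite a literal
    sentence metaphorically.

    `examples` is a list of (literal, metaphor) demonstration pairs;
    the model is expected to continue after the final 'Metaphor:' cue.
    """
    lines = ["Rewrite each literal sentence as a metaphor.", ""]
    for lit, met in examples:
        lines.append(f"Literal: {lit}")
        lines.append(f"Metaphor: {met}")
        lines.append("")
    lines.append(f"Literal: {literal}")
    lines.append("Metaphor:")
    return "\n".join(lines)
```

The number and choice of demonstrations is exactly the "prompt engineering" cost flagged in the table: zero-shot variants simply pass an empty `examples` list and rely on the instruction alone.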
@article{lai-etal-2024-agfl,
  title     = {A Survey on Automatic Generation of Figurative Language: From Rule-based Systems to Large Language Models},
  author    = {Lai, Huiyuan and Nissim, Malvina},
  journal   = {ACM Computing Surveys},
  year      = {2024},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
}

