This project presents a head-to-head comparison of two state-of-the-art abstractive summarization models, BART and PEGASUS, both fine-tuned on the CNN/Daily Mail dataset. The models were evaluated intrinsically with ROUGE metrics and extrinsically through manual review of the generated summaries. Results show minimal differences between the two models, with BART slightly outperforming PEGASUS in their default configurations, which contradicts the original PEGASUS publication.

An ablation study across several hyperparameter configurations found negligible improvements overall, with minor gains from high-probability sampling (top-p = 0.9) in BART and aggressive beam search (num_beams = 16) in PEGASUS. Consistent with the original PEGASUS findings, PEGASUS with aggressive beam search was the top performer. However, the improvements were largely limited to phrasing and structure, with only minor changes to content. These results suggest that, for real-world applications, model choice and configuration should be guided by personal preference and convenience. Given their near-identical architectures and comparable performance on the CNN/Daily Mail dataset, users are advised to simply pick one.
For full project details, please read the Just Pick One PDF.
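
The two tuned configurations mentioned above can be reproduced with the Hugging Face `transformers` generation API. The snippet below is a minimal sketch rather than the project's actual evaluation script: the checkpoint names (`facebook/bart-large-cnn`, `google/pegasus-cnn_dailymail`) and the reading of "high-probability sampling" as nucleus sampling with `top_p=0.9` are assumptions.

```python
# Minimal sketch of the two tuned decoding configurations.
# Checkpoint names and the top_p interpretation are assumptions,
# not necessarily what the report used.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

def summarize(model_name: str, article: str, **gen_kwargs) -> str:
    """Summarize `article` with the given decoding settings."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    inputs = tokenizer(article, truncation=True, max_length=1024, return_tensors="pt")
    ids = model.generate(**inputs, **gen_kwargs)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

article = "..."  # a CNN/Daily Mail article

# BART with high-probability (nucleus) sampling, top_p = 0.9 (assumed mapping).
bart_summary = summarize(
    "facebook/bart-large-cnn", article,
    do_sample=True, top_p=0.9,
)

# PEGASUS with aggressive beam search, num_beams = 16.
pegasus_summary = summarize(
    "google/pegasus-cnn_dailymail", article,
    num_beams=16,
)
```

For the intrinsic evaluation, the resulting summaries can be scored against the reference highlights with any standard ROUGE implementation (for example the `evaluate` or `rouge_score` packages).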