The PDF of [[2024] UKFTT 31 (TC)](https://assets.caselaw.nationalarchives.gov.uk/ukftt/tc/2024/31/ukftt_tc_2024_31.pdf) contains a number of instances of the "ﬂ" ligature (U+FB02) in place of the two letters "fl". This is seen repeatedly in the phrase "potato flour":

[Screencast from 17-01-24 08:49:19.webm](https://github.com/nationalarchives/ds-caselaw-pdf-conversion/assets/837136/49bb70b1-f862-4e6a-8085-392680056b2c)

I do not have access to the original DOCX, although I note the ligature is also present in the [PDF judgment on the official Tribunals website](https://financeandtax.decisions.tribunals.gov.uk/Aspx/view.aspx?id=12932).

The ligature is also present in the [HTML version](https://caselaw.nationalarchives.gov.uk/ukftt/tc/2024/31) but *not* in the [XML version](https://caselaw.nationalarchives.gov.uk/ukftt/tc/2024/31/data.xml).

I suggest that the text undergoes [Unicode Normalisation](https://unicode.org/reports/tr15/) before a PDF is created. A compatibility form such as NFKC would be needed, since the canonical forms (NFC and NFD) leave U+FB02 unchanged.

(Apologies if this isn't the correct repo. Feel free to move it somewhere more suitable.)
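To illustrate the suggestion, here is a minimal sketch in Python. It assumes the text can be processed as a string before the PDF is generated; I have not checked how this repo's pipeline actually works, and the function names `expand_ligatures` and `normalise_all` are made up for the example. It uses the standard library's `unicodedata.normalize` with the NFKC form, which maps U+FB02 to the letters "fl".

```python
import unicodedata

# Latin presentation-form ligatures: U+FB00 (ff) through U+FB06 (st).
LATIN_LIGATURES = {chr(cp) for cp in range(0xFB00, 0xFB07)}


def expand_ligatures(text: str) -> str:
    """Expand only the Latin ligature code points, leaving all other text as-is.

    Hypothetical helper: a narrower alternative to normalising everything.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if ch in LATIN_LIGATURES else ch
        for ch in text
    )


def normalise_all(text: str) -> str:
    """Apply NFKC to the whole string (hypothetical helper).

    Note: this also rewrites other compatibility characters,
    for example superscript digits become plain digits.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    sample = "potato \ufb02our"  # contains U+FB02
    print(expand_ligatures(sample))
    print(normalise_all(sample))
```

Whole-text NFKC is the simplest option, but it changes more than ligatures, which may not be wanted in a judgment. Restricting the expansion to U+FB00 through U+FB06 is a more conservative choice if only the ligatures are a problem.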