Building AI Editors (NICAR 2023)

Jonathan Soma, Columbia University, [email protected]

The presentation as a PDF

Code snippets with spaCy and OpenAI

Every single link

| Title | URL |
| --- | --- |
| Grammarly | https://www.grammarly.com/ |
| BBC News style guide | https://www.bbc.co.uk/newsstyleguide/ |
| An update on AP style on Kyiv | https://blog.ap.org/announcements/an-update-on-ap-style-on-kyiv |
| Why There’s No Such Thing as a ‘Car Accident’ | https://usa.streetsblog.org/2022/02/15/the-brake-why-theres-no-such-thing-as-a-car-accident/ |
| Stop using ‘officer-involved shooting’ | https://www.cjr.org/analysis/officer-involved-shooting.php |
| 'Tragic': Teen apparently killed by stray police bullet in LA Burlington dressing room identified | https://abcnews.go.com/US/14-year-girl-dressing-room-killed-stray-bullet/story?id=81919639 |
| Source of the famous “Now you have two problems” quote | http://regex.info/blog/2006-09-15/247 |
| Women in Media Gender Scorecard | https://womeninmedia.com.au/wp-content/uploads/2023/02/Women-in-Media-Gender-Scorecard_01.02.23_final.pdf |
| A "Shortage" of Punishment Bureaucrats | https://equalityalec.substack.com/p/a-shortage-of-punishment-bureaucrats |
| spaCy (NLP tool) | https://spacy.io/ |
| spaCy's named entity recognition (NER) | https://spacy.io/usage/linguistic-features#named-entities |
| Source Matters | https://sourcematters.com/ |
| Gender detection tools (don't use them!!) | https://www.npmjs.com/package/gender-detection-from-name, https://pypi.org/project/gender-detector/ |
| OpenAI text completion API | https://platform.openai.com/docs/guides/completion |
| OpenAI API example gallery | https://github.com/openai/openai-cookbook/ |
| Vision Zero Reporting | https://visionzeroreporting.com/ |
| Editorial Patterns in Bicyclist and Pedestrian Crash Reporting | https://www.researchgate.net/publication/330975590_Editorial_Patterns_in_Bicyclist_and_Pedestrian_Crash_Reporting |
| Guardian/JournalismAI quote extraction writeup | https://explosion.ai/blog/guardian |
| Guardian/JournalismAI quote extraction GitHub repo | https://github.com/JournalismAI-2021-Quotes/quote-extraction |
| Prodigy demo with entity recognition | https://demo.prodi.gy/?=null&view_id=ner_manual |
| Fine-tuning a pretrained model on Hugging Face | https://huggingface.co/docs/transformers/training |
| Full Fact AI | https://fullfact.org/ |
| Lessons learned from Squash | https://www.poynter.org/fact-checking/2021/the-lessons-of-squash-the-first-automated-fact-checking-platform/ |
| ClaimHunter: An Unattended Tool for Automated Claim Detection on Twitter | https://www.semanticscholar.org/paper/ClaimHunter%3A-An-Unattended-Tool-for-Automated-Claim-Beltr%C3%A1n-M%C3%ADguez/764335faadf30051dea7f6b3747498035f91d0f8 |
| News outlets criticized for using Chinatown photos in coronavirus articles | https://www.nbcnews.com/news/asian-america/news-outlets-criticized-using-chinatown-photos-coronavirus-articles-n1150626 |
| NY Post tweet | https://twitter.com/nypost/status/1234288662856245248 |
| Visual Question Answering models on Hugging Face | https://huggingface.co/datasets?task_categories=task_categories:visual-question-answering&sort=downloads |
| Usage and comparison of visual question answering models | https://huggingface.co/spaces/nielsr/comparing-VQA-models |
| Image segmentation example using Prodigy | https://demo.prodi.gy/?=null&view_id=imageseg |