2 changes: 1 addition & 1 deletion .github/workflows/workflow.yml
@@ -33,6 +33,6 @@ jobs:

       - name: Build docs
         run: |
-          mkdocs build --strict
+          make build

       # TODO: Internal link check
3 changes: 3 additions & 0 deletions .readthedocs.yaml
@@ -6,6 +6,9 @@ build:
   os: ubuntu-24.04
   tools:
     python: "3"
+  jobs:
+    pre_build:
+      - make pre-build

 mkdocs:
   configuration: mkdocs.yml
14 changes: 14 additions & 0 deletions Makefile
@@ -0,0 +1,14 @@

+pre-build:
+	python ./plugins/create_category_pages.py
+
+build: pre-build
+	mkdocs build --strict
+
+serve: pre-build
+	mkdocs serve
+
+.PHONY: clean
+clean:
+	rm -f docs/categories/*
+	rm -rf site/*
3 changes: 2 additions & 1 deletion README.md
@@ -1,5 +1,6 @@
 # UK TRE Glossary

-[![Build](https://github.com/manics/uktre-glossary-rtd/actions/workflows/workflow.yaml/badge.svg)](https://github.com/manics/uktre-glossary-rtd/actions/workflows/workflow.yaml)
+[![Build](https://github.com/manics/uktre-glossary/actions/workflows/workflow.yml/badge.svg)](https://github.com/manics/uktre-glossary/actions/workflows/workflow.yml)
+[![readthedocs](https://app.readthedocs.org/projects/uktre-glossary/badge/?version=latest)](https://uktre-glossary.readthedocs.io/)

 **⚠️⚠️⚠️⚠️⚠️ Under development ⚠️⚠️⚠️⚠️⚠️**
23 changes: 2 additions & 21 deletions assets/uktre-glossary.yaml
@@ -1,22 +1,3 @@
-categories:
-- Analysis
-- Computing
-- Data Management
-- Data in General
-- Data in general
-- Health Research
-- Health Services & Health Data
-- Identifiability
-- Management
-- Other
-- Processes
-- Research Management
-- Risk Management
-- Running and Overseeing Research
-- Running and overseeing research
-- Security Management
-- Special aspects in the NHS Context
-- UK law and rules
 glossary:
 - term: AAI
   tags:
@@ -397,7 +378,7 @@ glossary:
     A person’s health records that are held digitally on a computer (as opposed to on paper). Also known as an electronic patient record (EPR).
 - term: Ethical approvals
   tags:
-  - Running and Overseeing Research
+  - Running and overseeing research
   definition: |-
     Ethical approvals are like getting the green light from a group of experts who make sure that research is done in a proper and respectful way. They ensure that participants' rights are protected and everything is conducted responsibly. It's like having a permission slip before starting the research to ensure everything is fair and safe.
 - term: European Union (EU) General Data Protection Regulation (GDPR)
@@ -546,7 +527,7 @@ glossary:
     For example: joining a health dataset with an employment dataset using a common key based on individual names and addresses.
 - term: Longitudinal Dataset
   tags:
-  - Data in General
+  - Data in general
   definition: |-
     A collection of data related to the same group of people over a long time to see how things change. This may involve asking the same questions at different ages.
 - term: Machine Learning (ML)
1 change: 1 addition & 0 deletions docs/categories/.gitignore
@@ -0,0 +1 @@
+*
33 changes: 33 additions & 0 deletions plugins/create_category_pages.py
@@ -0,0 +1,33 @@
+#!/usr/bin/env python
+from pathlib import Path
+import re
+import yaml
+
+
+def _slugify(s: str):
+    return re.sub(r"\W", "-", s.lower())
+
+ROOT = Path(__file__).parent / ".."
+
+TEMPLATE = """
+# %CATEGORY%
+
+{{ read_yaml("uktre-glossary.yaml", record_path="glossary", category="%CATEGORY%") }}
+"""
+
+with open(ROOT / "assets" / "uktre-glossary.yaml") as f:
+    data = yaml.safe_load(f)
+categories = set(t for term in data["glossary"] for t in term["tags"])
+
+# Check categories don't have inconsistent names
+slugs = {}
+for category in categories:
+    slug = _slugify(category)
+    if slug in slugs:
+        raise ValueError(f"Category has inconsistent naming: '{slugs[slug]}' '{category}'")
+    slugs[slug] = category
+
+for slug, category in sorted(slugs.items()):
+    print(f"{category:<40} {slug}")
+    with open(ROOT / "docs" / "categories" / f"{slug}.md", "w") as f:
+        f.write(TEMPLATE.replace("%CATEGORY%", category))
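The consistency check in `create_category_pages.py` is what forces the tag renames in `assets/uktre-glossary.yaml` ("Data in General" to "Data in general", and so on): two spellings that differ only in case produce the same slug, so the script refuses to continue. A minimal sketch of that behaviour follows. `build_slug_map` is a hypothetical helper written for illustration and is not part of the PR; it also sorts its input so the error message is deterministic, whereas the script iterates over an unordered set.

```python
import re


def slugify(s: str) -> str:
    # Same rule as _slugify in the script: lowercase, then replace each
    # non-word character with a hyphen.
    return re.sub(r"\W", "-", s.lower())


def build_slug_map(categories):
    """Hypothetical helper: map each slug to its category, rejecting collisions."""
    slugs = {}
    # Sorting is an addition for this sketch so the error text is stable.
    for category in sorted(categories):
        slug = slugify(category)
        if slug in slugs:
            raise ValueError(
                f"Category has inconsistent naming: '{slugs[slug]}' '{category}'"
            )
        slugs[slug] = category
    return slugs


# Distinct categories map to distinct slugs.
ok = build_slug_map({"Data in general", "UK law and rules"})

# Two spellings of the same category collide on one slug.
try:
    build_slug_map({"Data in General", "Data in general"})
    collision = None
except ValueError as exc:
    collision = str(exc)
```

Note that every non-word character becomes its own hyphen, so a tag such as "Health Services & Health Data" yields a slug with three consecutive hyphens.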
@@ -23,45 +23,67 @@ def link_urls(s: str):
     s = re.sub(rf"(https?://[\S]+[^{trailing_punctuation}])", r"[\1](\1)", s)
     return s

-def _crossref_terms(text):
+def _crossref_terms(text, parent):
     # Find [...] but not [...](...)
     matches = re.findall(r"(\[[^]]+\])([^(]|$)", text)
     # Get the first capture group
     crossrefs = set(m[0] for m in matches)

     for crossref in crossrefs:
         target_term = crossref[1:-1]
-        link_target = "#term-" + _slugify(target_term)
+        link_target = f"{parent}#term-{_slugify(target_term)}"
         link_md = f"[{target_term}]({link_target})"
         text = text.replace(crossref, link_md)
     return text


-def to_glossary_html(df, **kwargs):
+def to_glossary_html(df, category="", **kwargs):
     """
     df.to_markdown() escapes some HTML, so create a HTML table ourselves
     """
     if kwargs:
         raise ValueError(f"Unsupported kwargs: {kwargs}")

-    out = """
+    # Don't show tags column if this is a single category
+    th_tags = "" if category else "<th>Tags</th>"
+    out = f"""
 <table>
 <tr>
 <th>Term</th>
-<th>Tags</th>
+{th_tags}
 <th>Definition</th>
 </tr>
 """
-    for row in df.itertuples(index=False):
+
+    if category:
+        # Duplicate rows with multiple tags, one per tag
+        selected = df.explode("tags")
+        selected = selected[selected["tags"] == category]
+    else:
+        selected = df
+
+    for row in selected.itertuples(index=False):
         anchor = "term-" + _slugify(row.term)
-        crossreferenced = _crossref_terms(link_urls(row.definition))
+        term = escape(row.term)
+
+        if category:
+            # Don't show tags column if this is a single category
+            tags = ""
+            # Need to link to top-level glossary since terms may not be in this category
+            parent = "../../"
+        else:
+            tags = "".join(markdown(escape(f"[{c}](categories/{_slugify(c)})")) for c in row.tags)
+            tags = f"<td>{tags}</td>"
+            parent = ""
+
+        crossreferenced = _crossref_terms(link_urls(row.definition), parent)
         definition = markdown(escape(crossreferenced))

         row = f"""
 <tr>
-<td id="{anchor}"><a href="#{anchor}">{escape(row.term)}</a></td>
-<td>{escape(", ".join(row.tags))}</td>
-<td>{markdown(definition)}</td>
+<td id="{anchor}"><a href="#{anchor}">{term}</a></td>
+{tags}
+<td>{definition}</td>
 </tr>
 """
         out += row
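The new `parent` argument to `_crossref_terms` lets the same definition text link to `#term-...` on the top-level glossary page and to `../../#term-...` from a generated category page. The sketch below reproduces the function from the hunk above to show both outputs. The plugin's own `_slugify` is not shown in this diff, so the `slugify` here is an assumption that mirrors the one in `create_category_pages.py`; the sample definition and URL are made up for illustration.

```python
import re


def slugify(s: str) -> str:
    # Assumption: stand-in for the plugin's _slugify, which this hunk does
    # not show; mirrors the helper in create_category_pages.py.
    return re.sub(r"\W", "-", s.lower())


def crossref_terms(text: str, parent: str) -> str:
    # Find [...] but not [...](...), i.e. skip existing Markdown links.
    matches = re.findall(r"(\[[^]]+\])([^(]|$)", text)
    # Keep only the first capture group (the bracketed term).
    crossrefs = set(m[0] for m in matches)
    for crossref in crossrefs:
        target_term = crossref[1:-1]
        link_target = f"{parent}#term-{slugify(target_term)}"
        text = text.replace(crossref, f"[{target_term}]({link_target})")
    return text


# Illustrative input: one bare cross-reference and one existing link.
definition = "A [TRE] stores data; see [docs](https://example.org/docs)."

# Top-level glossary page: anchors are on the same page.
top_level = crossref_terms(definition, parent="")

# Category page: link back to the top-level glossary, as in the PR.
category_page = crossref_terms(definition, parent="../../")
```

In both cases the existing `[docs](...)` link is left untouched, because the regular expression only matches a bracketed term that is not immediately followed by `(`.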
2 changes: 1 addition & 1 deletion requirements.in
@@ -1,3 +1,3 @@
-file:plugins/mkdocs-uktre-glossary-plugin#egg=mkdocs-uktre-glossary-plugin
+-e file:plugins/mkdocs-uktre-glossary-plugin#egg=mkdocs-uktre-glossary-plugin
 mkdocs
 mkdocs-material
4 changes: 2 additions & 2 deletions requirements.txt
@@ -4,6 +4,8 @@
 #
 #    pip-compile
 #
+-e file:plugins/mkdocs-uktre-glossary-plugin#egg=mkdocs-uktre-glossary-plugin
+    # via -r requirements.in
 babel==2.17.0
     # via mkdocs-material
 backrefs==5.8
@@ -51,8 +53,6 @@ mkdocs-material-extensions==1.3.1
     # via mkdocs-material
 mkdocs-table-reader-plugin==3.1.0
     # via mkdocs-uktre-glossary-plugin
-file:plugins/mkdocs-uktre-glossary-plugin#egg=mkdocs-uktre-glossary-plugin
-    # via -r requirements.in
 numpy==2.2.4
     # via pandas
 packaging==24.2