First off, thank you for taking the time to contribute! This project relies on the community to keep the global university data accurate and up-to-date.
All university data is stored in world_universities_and_domains.json. When adding new entries:
- Complete Schema: Every entry MUST have the following fields:
  - `name`: Official name of the university (see naming rules below).
  - `country`: Full country name in English, following ISO 3166-1 English short names (e.g., "Germany", not "Deutschland").
  - `alpha_two_code`: Standard ISO 3166-1 alpha-2 code (e.g., "US", "TR").
  - `domains`: An array of the university's primary registered domains, with no department or service prefixes (see critical note below).
  - `web_pages`: An array of URL strings (even if there is only one). Each URL:
    - Must be the root URL of the university's website, with no path beyond `/` (e.g., `https://www.university.edu/`, not `https://www.university.edu/admissions/`).
    - May use a subdomain (e.g., `https://newsite.university.edu/`).
    - Must begin with `https://` (preferred). Only use `http://` if the university's site does not support HTTPS.
    - Must end with a trailing slash.
  - `state-province`: State or province name in English. Use `null` if not applicable.
- Accuracy: Ensure the domains and web pages are currently active.
- No Duplicates: Check if the university already exists under a different name or variation.
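The field rules above can be sketched as a small local pre-check before opening a PR. This is only an illustration, not the project's actual CI check: the function name `check_entry` and the requirement that arrays be non-empty are assumptions made for this example.

```python
from urllib.parse import urlsplit

# Field names taken from the schema list above.
REQUIRED_FIELDS = {
    "name", "country", "alpha_two_code",
    "domains", "web_pages", "state-province",
}


def check_entry(entry: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    missing = REQUIRED_FIELDS - entry.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")

    code = entry.get("alpha_two_code")
    if not (isinstance(code, str) and len(code) == 2
            and code.isascii() and code.isalpha() and code.isupper()):
        problems.append("alpha_two_code must be two uppercase letters")

    # Assumption: an empty array is treated as a problem.
    for field in ("domains", "web_pages"):
        value = entry.get(field)
        if not isinstance(value, list) or not value:
            problems.append(f"{field} must be a non-empty array")

    state = entry.get("state-province")
    if state is not None and not isinstance(state, str):
        problems.append("state-province must be a string or null")

    pages = entry.get("web_pages")
    for url in pages if isinstance(pages, list) else []:
        if not isinstance(url, str):
            problems.append(f"{url!r}: web_pages items must be strings")
            continue
        parts = urlsplit(url)
        if parts.scheme not in ("https", "http") or not parts.netloc:
            problems.append(f"{url}: must begin with https:// (or http://)")
        elif parts.path != "/" or parts.query or parts.fragment:
            problems.append(f"{url}: must be a root URL ending in '/'")

    return problems
```

Calling `check_entry` on a parsed entry returns an empty list when none of these checks trip. It does not verify that the domains and web pages are active, which still needs a manual look.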
Latin-alphabet languages (French, German, Spanish, Turkish, Portuguese, etc.): Use the official name in the original language, including native characters.
"Boğaziçi University" ✓ (not "Bogazici University")
"Université Paris-Saclay" ✓ (not "University of Paris-Saclay")
"Technische Universität Wien" ✓ (not "Vienna University of Technology")
Non-Latin-script languages (Arabic, Chinese, Japanese, Korean, languages written in Cyrillic, etc.): Use the official English name. If no official English name exists, use a standard romanized transliteration.
"Peking University" ✓ (not "北京大学")
"King Abdulaziz University" ✓ (not "جامعة الملك عبدالعزيز")
"Moscow State University" ✓ (not "Московский государственный университет")
Note on consistency: If you are fixing an existing entry that uses ASCII instead of the correct native characters (e.g., "Ataturk University" instead of "Atatürk University"), correcting it is welcome; just note the fix in your PR description.
Note on character encoding: Native characters (ü, ö, ğ, é, etc.) are valid UTF-8 and work fine in JSON. However, applications consuming this data that do exact string matching may miss results if they search with ASCII equivalents (e.g., searching "Ataturk" won't match "Atatürk"). API consumers should normalize both the query and the stored name on their side when doing name lookups, for example by applying Unicode decomposition and then stripping diacritics.
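One way a consumer might do this is sketched below, using only Python's standard `unicodedata` module. The helper names `fold_name` and `name_matches` are hypothetical, and this is one possible approach rather than anything the project prescribes.

```python
import unicodedata


def fold_name(text: str) -> str:
    """Lowercase the text and drop combining marks, so "Atatürk" becomes "ataturk"."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text.casefold())
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_matches(query: str, name: str) -> bool:
    """Accent-insensitive, case-insensitive substring match."""
    return fold_name(query) in fold_name(name)
```

Letters that have no Unicode decomposition, such as "ı" or "ø", pass through unchanged, so a consumer that needs those folded would have to add its own mapping.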
{
  "name": "Boğaziçi University",
  "domains": ["boun.edu.tr"],
  "web_pages": ["https://www.boun.edu.tr/"],
  "country": "Turkey",
  "alpha_two_code": "TR",
  "state-province": null
}
⚠️ CRITICAL: NO DEPARTMENT OR SERVICE PREFIXES IN THE `domains` FIELD
The `domains` field is used for email address matching. It must contain the university's primary registered domain, not a department, portal, or service subdomain added on top of it.
- Correct: `usc.edu`, `itu.edu.tr`, `ox.ac.uk`
- Incorrect: `cs.usc.edu`, `ogr.itu.edu.tr`, `mail.ox.ac.uk`
This restriction applies only to `domains`. The `web_pages` field may include subdomains, but must still be a root URL (no path beyond `/`).
Pull Requests containing department or service subdomains in `domains` will automatically fail the CI/CD checks.
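To illustrate why the primary domain matters for email matching, here is a minimal sketch of how a consumer of the data might compare an address against a `domains` value. The function `email_matches_domain` is hypothetical and the suffix-matching strategy is an assumption; it does not describe how this project's CI or any particular consumer works.

```python
def email_matches_domain(email: str, registered_domain: str) -> bool:
    """True if the email's host is the registered domain or a subdomain of it."""
    if "@" not in email:
        return False
    host = email.rsplit("@", 1)[1].lower()
    domain = registered_domain.lower()
    # The leading dot stops "notitu.edu.tr" from matching "itu.edu.tr".
    return host == domain or host.endswith("." + domain)
```

Under this approach, storing `itu.edu.tr` matches both `staff@itu.edu.tr` and `student@ogr.itu.edu.tr`, whereas storing `ogr.itu.edu.tr` would miss every address outside that one subdomain.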
- Fork the repository and create your branch from `master`.
- Update the JSON file.
- Ensure your JSON is valid.
- Submit the PR using the provided template.
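For the JSON validity step, a quick local check could look like the sketch below. It uses Python's standard `json` module; the helper name `find_json_error` is hypothetical, and the project may rely on a different validator in CI.

```python
import json


def find_json_error(text: str):
    """Return a description of the first syntax error, or None if the text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Example usage against the data file (read as UTF-8 so native characters survive):
# with open("world_universities_and_domains.json", encoding="utf-8") as handle:
#     print(find_json_error(handle.read()) or "JSON is valid")
```

A trailing comma after the last field of an entry is a common mistake that this kind of check catches.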
Thank you for keeping the data clean!