|
| 1 | +--- |
| 2 | +layout: post |
| 3 | +title: "Using compact identifiers in project reports" |
| 4 | +date: 2026-03-29 |
| 5 | +doi: 10.59350/re9j2-hk972 |
| 6 | +tags: identifier semweb cito cito:usesMethodIn,includesQuotationFrom:10.1038/sdata.2018.29 |
| 7 | + cito:obtainsBackgroundFrom:10.1007/s12021-015-9284-3 cito:usesMethodIn:10.1093/bioinformatics/btaa864 |
| 8 | + cito:usesMethodIn:10.1038/s41597-022-01807-3 cito:obtainsBackgroundFrom:10.1038/sdata.2016.18 |
| 9 | + cito:includesQuotationFrom:10.1186/s13321-022-00614-7 cito:includesQuotationFrom:10.1186/s13321-020-00448-1 |
| 10 | +grants: |
| 11 | + - grant: |
| 12 | + title: "FAIR4ChemNL: Accelerating the adoption of universal data standards in chemistry" |
| 13 | + acronym: "FAIR4ChemNL" |
| 14 | + id: doi:10.61686/XVYQV45374 |
| 15 | + funder: |
| 16 | + name: "Dutch Research Council" |
| 17 | + ror: 04jsz6e67 |
| 18 | +#comments: |
| 19 | +# host: social.edu.nl |
| 20 | +# username: egonw |
| 21 | +# id: ... |
| 22 | +--- |
| 23 | + |
| 24 | +This document describes how you can improve the FAIR-ness of your project report by using |
| 25 | +compact identifiers. Of course, it can be applied to any other document too, and has been used |
| 26 | +in, for example, journal articles and online documentation already. |
| 27 | + |
| 28 | +A compact identifier balances compactness in writing with being a persistent, unique, |
| 29 | +and global identifier. It "is a string constructed by concatenating a namespace prefix, a separating colon, |
| 30 | +and a locally unique identifier (LUI)" (doi:[10.1038/sdata.2018.29](https://doi.org/10.1038/sdata.2018.29)). |
| 31 | +For example, for proteins it can represent the PDB structure [2gc4](https://bioregistry.io/pdb:2gc4) as |
| 32 | +*pdb:2gc4*. There is a clear similarity with the SciCrunch [Research Resource Identifiers](https://rrid.site/) |
| 33 | +(RRIDs) as used by several journals, like |
| 34 | +[eLife](https://elifesciences.org/inside-elife/ff683ecc/rrids-how-did-we-get-here-and-where-are-we-going) |
| 35 | +(doi:[10.1007/s12021-015-9284-3](https://doi.org/10.1007/s12021-015-9284-3)). |
| 36 | + |
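The quoted definition can be sketched in a few lines of Python. This is a minimal illustration, not part of any registry's tooling: the function name `split_compact_identifier` is hypothetical, and splitting at the first colon is an assumption made here so that a locally unique identifier containing a colon stays intact.

```python
def split_compact_identifier(compact_id: str) -> tuple[str, str]:
    """Split 'prefix:LUI' into (prefix, LUI) at the first colon.

    Hypothetical helper for illustration; it does not check whether
    the prefix is registered anywhere.
    """
    prefix, sep, lui = compact_id.partition(":")
    if not sep or not prefix or not lui:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return prefix, lui


print(split_compact_identifier("pdb:2gc4"))  # ('pdb', '2gc4')
```
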
| 37 | +When the prefixes are defined by community standards, then a compact identifier can be resolved. |
| 38 | +There currently are multiple providers of prefix files (doi:[10.1038/sdata.2018.29](https://doi.org/10.1038/sdata.2018.29)), |
| 39 | +including Identifiers.org (doi:[10.1093/bioinformatics/btaa864](https://doi.org/10.1093/bioinformatics/btaa864)) |
| 40 | +and Bioregistry (doi:[10.1038/s41597-022-01807-3](https://doi.org/10.1038/s41597-022-01807-3)). |
| 41 | +The Bioregistry has an overview of more than twenty registries of prefixes and their metadata |
| 42 | +(doi:[10.1038/s41597-022-01807-3](https://doi.org/10.1038/s41597-022-01807-3)). The metadata commonly |
| 43 | +includes information on the URL pattern for each identifier. Often this is more than one pattern, as |
| 44 | +there may be several databases with information for the same identifier. |
| 45 | + |
| 46 | +It is the URL pattern in the database that allows services to *resolve* the compact identifier |
| 47 | +into a link to a database. The above registries correspond to three existing *resolvers* that will take a compact |
| 48 | +identifier as part of a resolver URL and redirect to the database with the record matching |
| 49 | +that identifier: |
| 50 | + |
| 51 | +* Name-to-Thing (N2T): [https://n2t.net/](https://n2t.net/) |
| 52 | +* Identifiers.org: [https://identifiers.org/](https://identifiers.org/) |
| 53 | +* The Bioregistry: [https://bioregistry.io/](https://bioregistry.io/) |
| 54 | + |
| 55 | +Each of these URLs can be extended with a compact identifier. For example, the PDB entry |
| 56 | +mentioned earlier or a taxon record from the Catalogue of Life: |
| 57 | + |
| 58 | +* [https://bioregistry.io/pdb:2gc4](https://bioregistry.io/pdb:2gc4) |
| 59 | +* [https://identifiers.org/col:6MB3T](https://identifiers.org/col:6MB3T) (`col` is the prefix for the Catalogue of Life) |
| 60 | + |
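Extending a resolver URL with a compact identifier amounts to simple string concatenation, as the following sketch shows. The base URLs are the three listed above; the dictionary keys and the function name `resolver_url` are hypothetical names chosen for this example. The function only builds the link and does not check that the prefix is known to the chosen resolver.

```python
# Base URLs of the three resolvers; the short keys are illustrative only.
RESOLVERS = {
    "n2t": "https://n2t.net/",
    "identifiers.org": "https://identifiers.org/",
    "bioregistry": "https://bioregistry.io/",
}


def resolver_url(compact_id: str, resolver: str = "bioregistry") -> str:
    """Return the resolver URL for a compact identifier such as 'pdb:2gc4'."""
    if ":" not in compact_id:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return RESOLVERS[resolver] + compact_id


print(resolver_url("pdb:2gc4"))                      # https://bioregistry.io/pdb:2gc4
print(resolver_url("col:6MB3T", "identifiers.org"))  # https://identifiers.org/col:6MB3T
```

Whether a given prefix actually resolves depends on the registry behind each resolver, so a generated link is best checked once before it goes into a report.
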
| 61 | +## Why use them in reports? |
| 62 | + |
| 63 | +Using persistent identifiers is generally accepted as a good practice that benefits science |
| 64 | +and has been part of the ideas of FAIR data (doi:[10.1038/sdata.2016.18](https://doi.org/10.1038/sdata.2016.18)) |
| 65 | +and of Open Science. Compact |
| 66 | +identifiers make it easy to be precise in reports about what things the reports talk about: they |
| 67 | +are relatively short but very precise at the same time. They are also much easier to reuse |
| 68 | +than labels of things and concepts, which intrinsically have a certain |
| 69 | +level of uncertainty; a database entry commonly has a very specific meaning. |
| 70 | + |
| 71 | +## Example uses |
| 72 | + |
| 73 | +Compact identifiers can be used in two ways. The simplest is to just put the |
| 74 | +compact identifier as plain text in the document, possibly in parentheses |
| 75 | +(with the compact identifier highlighted here in bold): |
| 76 | + |
| 77 | +<ul> |
| 78 | + <i>This report is only about the experimental data of the human (<b>NCBITaxon:9606</b>) cell lines.</i> |
| 79 | +</ul> |
| 80 | + |
| 81 | +Or: |
| 82 | + |
| 83 | +<ul> |
| 84 | + <i>We found that BRCA1 (<b>ensembl:ENSG00000012048</b>) played an important role.</i> |
| 85 | +</ul> |
| 86 | + |
| 87 | +Alternatively, you can add a hyperlink with one of the resolvers, for example, Identifiers.org: |
| 88 | + |
| 89 | +<ul> |
| 90 | + <i>We found that BRCA1 (<b><a href="https://identifiers.org/ensembl:ENSG00000012048">ensembl:ENSG00000012048</a></b>) played an important role.</i> |
| 91 | +</ul> |
| 92 | + |
| 93 | +### Compact identifiers for material identifiers |
| 94 | + |
| 95 | +The European Registry of Materials proposes to use the compact identifier for their |
| 96 | +ERM identifiers (doi:[10.1186/s13321-022-00614-7](https://doi.org/10.1186/s13321-022-00614-7)): |
| 97 | + |
| 98 | +<ul> |
| 99 | + <i> |
| 100 | + For example, the NanoSolveIT project registered a material with the ERM00000001 identifier. |
| 101 | + The full Uniform Resource Identifier (URI) for this compound is |
| 102 | + https://nanocommons.github.io/identifiers/registry#ERM00000001 which is too long to be used |
| 103 | + in documentation. The corresponding compact identifier <b>erm:ERM00000001</b> is easy to use in written |
| 104 | + material, analogous to the use of Protein Data Bank (PDB) identifiers for proteins in journals. |
| 105 | + </i> |
| 106 | +</ul> |
| 107 | + |
| 108 | +### Compact identifiers for citation intent annotations |
| 109 | + |
| 110 | +The compact identifier has also been used as the method to include citation intentions in journal |
| 111 | +articles (doi:[10.1186/s13321-020-00448-1](https://doi.org/10.1186/s13321-020-00448-1), |
| 112 | +compact identifier here highlighted in bold): |
| 113 | + |
| 114 | +<ul> |
| 115 | + <i> |
| 116 | + We take advantage here of the ability to add notes to full form [..] references in bibliographies. |
| 117 | + These are referred to as bibnotes. The content of the note will be strictly formatted: it will use |
| 118 | + the syntax [<b>cito:usesMethodIn</b>] and formatted in bold. That is, the bibnote starts with the |
| 119 | + [ character, followed by one of the CiTO types, and ends with the ] character. If you wish to |
| 120 | + provide more than one annotation, you can repeat this syntax, separated by one or more spaces, |
| 121 | + for example: [<b>cito:usesMethodIn</b>] [<b>cito:citeAsAuthority</b>]. |
| 122 | + </i> |
| 123 | +</ul> |
| 124 | + |
| 125 | +Note that in this use, the square brackets and bold typeface make the annotations easier to |
| 126 | +recognize. Also, note that this document uses this approach to indicate |
| 127 | +why the cited articles are cited. |
| 128 | + |
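Because the bibnote syntax is strictly formatted, the annotations are easy to pick up by machine. The sketch below is one possible way to do that with a regular expression. The function name `extract_cito_annotations` is hypothetical, and the pattern assumes that CiTO type names consist of ASCII letters only.

```python
import re

# Matches "[cito:<type>]"; letters-only type names are an assumption.
CITO_NOTE = re.compile(r"\[(cito:[A-Za-z]+)\]")


def extract_cito_annotations(bibnote: str) -> list[str]:
    """Return all compact identifiers of the form cito:<type> found in a bibnote."""
    return CITO_NOTE.findall(bibnote)


print(extract_cito_annotations("[cito:usesMethodIn] [cito:citeAsAuthority]"))
# ['cito:usesMethodIn', 'cito:citeAsAuthority']
```
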
| 129 | +## Conclusion |
| 130 | + |
| 131 | +This document described what a compact identifier is, how it helps link to online |
| 132 | +databases, and how it can be used in written reports as plain text, optionally |
| 133 | +hyperlinked with one of the compact identifier resolvers. |
| 134 | + |
| 135 | +### Acknowledgments |
| 136 | + |
| 137 | +I thank [github:tabbassidaloii](https://n2t.net/github:tabbassidaloii), |
| 138 | +[github:cthoyt](https://n2t.net/github:cthoyt), and |
| 139 | +[github:larsgw](https://n2t.net/github:larsgw) for their comments on |
| 140 | +[this GitHub repo](https://github.com/egonw/compact-ids-in-reports). |