@@ -6,6 +6,9 @@ context-specific rewrite rules, you can wrap a :class:`curies.Converter` and
66preprocessing rules encoded in an instance of :class:`curies.PreprocessingRules` inside
77a :class:`curies.PreprocessingConverter`.
88
9+ Rewrites
10+ --------
11+
912For example, you always want to fix legacy references to the ``OBO_REL`` namespace:
1013
1114.. code-block:: python
@@ -15,21 +18,21 @@ For example, you always want to fix legacy references to the ``OBO_REL`` namespa
1518
1619 rules = PreprocessingRules(
1720 rewrites=PreprocessingRewrites(
18- full={"OBO_REL:is_a": "rdfs:subClassOf"}
19- )
21+ full={"OBO_REL:is_a": "rdfs:subClassOf"},
22+ ),
2023 )
2124
2225 converter = curies.get_obo_converter()
2326 converter = PreprocessingConverter.from_converter(
24- converter, rules=rules
27+ converter, rules=rules,
2528 )
2629
2730 >>> converter.parse_curie("OBO_REL:is_a")
2831 ReferenceTuple('rdfs', 'subClassOf')
2932
30- Similarly, there may be a whole class of references that need to be fixed
31- based on their prefix, such as the ``APOLLO:SV_`` references that are mangled
32- by the OWLAPI due to the OBO Foundry's PURL rules
33+ Similarly, there may be a whole class of references that need to be fixed based on their
34+ prefix, such as the ``APOLLO:SV_`` references that are mangled by the OWLAPI due to the
35+ OBO Foundry's PURL rules:
3336
3437.. code-block:: python
3538
@@ -38,18 +41,75 @@ by the OWLAPI due to the OBO Foundry's PURL rules
3841
3942 rules = PreprocessingRules(
4043 rewrites=PreprocessingRewrites(
41- prefix={"APOLLO:SV_": "APOLLO_SV:"}
44+ prefix={"APOLLO:SV_": "APOLLO_SV:"},
4245 )
4346 )
4447
4548 converter = curies.get_obo_converter()
4649 converter = PreprocessingConverter.from_converter(
47- converter, rules=rules
50+ converter, rules=rules,
4851 )
4952
5053 >>> converter.parse_curie("APOLLO:SV_1234567")
5154 ReferenceTuple('APOLLO_SV', '1234567')
5255
53- Some rewrite rules only apply to a specific resource, because of its own quirks
54- in curation or encoding. For example, CHMO encodes OrangeBook entries with ``orange``
55- as a prefix, which is not typically specific enough to
56+ Some rewrite rules only apply to a specific resource because of its own quirks in
57+ curation or encoding. For example, CHMO encodes OrangeBook entries with ``orange`` as a
58+ prefix, which is typically not specific enough to warrant curating ``orange`` as a
59+ prefix in a registry such as the Bioregistry:
60+
61+ .. code-block:: python
62+
63+ import curies
64+ from curies import PreprocessingRules, PreprocessingConverter, PreprocessingRewrites
65+
66+ rules = PreprocessingRules(
67+ rewrites=PreprocessingRewrites(
68+ resource_prefix={
69+ "CHMO": {"orange:": "orangebook:"},
70+ },
71+ ),
72+ )
73+
74+ converter = curies.get_obo_converter()
75+ converter.add_prefix("orangebook", "https://bioregistry.io/orangebook:")
76+ converter = PreprocessingConverter.from_converter(
77+ converter, rules=rules,
78+ )
79+
80+ >>> converter.parse_curie("orange:10.2.1.1.3")
81+ ReferenceTuple('orangebook', '10.2.1.1.3')
82+
83+ Similarly, this can be used to inject knowledge about resources, such as MCRO, that
84+ improperly import EDAM sub-trees. MCRO uses ``format`` as a prefix where it means
85+ ``edam.format``.
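
To make the idea of a resource-specific prefix rewrite concrete, here is a minimal
standalone sketch in plain Python. It is not how ``curies`` implements rewrites. The
function name ``rewrite_for_resource``, the rule table, and the identifier
``format:1234`` are all hypothetical. The sketch only illustrates that a rule registered
for MCRO changes a string parsed in the MCRO context and leaves the same string
untouched in any other context.

```python
from typing import Dict, Optional

# Hypothetical rule table, shaped like a ``resource_prefix`` mapping:
# resource -> {old CURIE prefix -> new CURIE prefix}
RESOURCE_PREFIX_RULES: Dict[str, Dict[str, str]] = {
    "MCRO": {"format:": "edam.format:"},
}


def rewrite_for_resource(
    curie: str,
    resource: Optional[str],
    rules: Dict[str, Dict[str, str]] = RESOURCE_PREFIX_RULES,
) -> str:
    """Rewrite the start of ``curie`` using only the rules registered for ``resource``."""
    for old, new in rules.get(resource or "", {}).items():
        if curie.startswith(old):
            # Swap the matched leading text and keep the remainder as-is
            return new + curie[len(old):]
    # No rule for this resource (or no rule matched): return the input unchanged
    return curie


in_mcro = rewrite_for_resource("format:1234", "MCRO")
in_chmo = rewrite_for_resource("format:1234", "CHMO")
no_context = rewrite_for_resource("format:1234", None)
```

Keying the rules by resource keeps an ambiguous prefix such as ``format`` from being
rewritten globally, where it might legitimately mean something else.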
86+
87+ Blocks
88+ ------
89+
90+ Some references are *never* informative and can be configured to be thrown away, such
91+ as ``Bgee:curators``, ``BioGRID:curators``, ``GROUP:OBI``, and similar group curation
92+ flags.
93+
94+ .. code-block:: python
95+
96+ import curies
97+ from curies import PreprocessingRules, PreprocessingConverter, PreprocessingBlocklists
98+
99+ rules = PreprocessingRules(
100+ blocklists=PreprocessingBlocklists(
101+ full=["Bgee:curators", "BioGRID:curators", "GROUP:OBI"],
102+ ),
103+ )
104+
105+ converter = curies.get_obo_converter()
106+ converter = PreprocessingConverter.from_converter(
107+ converter, rules=rules,
108+ )
109+
110+ # raises a BlocklistError
111+ >>> converter.parse_curie("GROUP:OBI")
112+
113+ A blocked reference raises an exception that downstream code can handle, for example by
114+ returning ``None``. An exception is used because it is sometimes useful to distinguish
115+ ``None`` being returned when parsing fails from a reference being actively blocked.
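
The distinction between a blocked reference and a failed parse can be sketched with a
toy example. Everything here is a stand-in written for illustration and none of it is
the ``curies`` API: ``BlockedError`` plays the role of the blocklist exception,
``toy_parse`` is a deliberately naive parser, and ``classify`` shows how a caller might
tell the three outcomes apart.

```python
from typing import Optional, Tuple


class BlockedError(ValueError):
    """Hypothetical stand-in for the exception raised on a blocked reference."""


# Same entries as the blocklist in the example above
BLOCKED = {"Bgee:curators", "BioGRID:curators", "GROUP:OBI"}


def toy_parse(curie: str) -> Optional[Tuple[str, str]]:
    """Raise if blocked, return None if unparsable, else (prefix, identifier)."""
    if curie in BLOCKED:
        raise BlockedError(curie)
    prefix, sep, identifier = curie.partition(":")
    if not sep or not prefix or not identifier:
        return None
    return prefix, identifier


def classify(curie: str) -> str:
    """Tell apart the three outcomes a caller may care about."""
    try:
        result = toy_parse(curie)
    except BlockedError:
        return "blocked"
    return "unparsable" if result is None else "parsed"


outcomes = [classify(c) for c in ("GROUP:OBI", "not a curie", "rdfs:subClassOf")]
```

If blocked references simply returned ``None``, the first two cases would be
indistinguishable to the caller.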