Commit e65ba74

committed
update docs
1 parent aced5c9 commit e65ba74

4 files changed, with 36 additions and 35 deletions

docs/api.rst

Lines changed: 17 additions & 17 deletions
@@ -83,7 +83,7 @@ minLength ``1`` ``0`` When all pre-processing steps are done, tokens w
 Fetching
 --------

-The fetching parameters are implemented in `PubFetcher <https://github.com/edamontology/pubfetcher>`_ and thus are described in its documentation: `Fetching parameters <https://pubfetcher.readthedocs.io/en/latest/cli.html#fetching>`_.
+The fetching parameters are implemented in `PubFetcher <https://github.com/edamontology/pubfetcher>`_ and thus are described in its documentation: `Fetching parameters <https://pubfetcher.readthedocs.io/en/stable/cli.html#fetching>`_.

 .. _mapping:

@@ -108,7 +108,7 @@ Mapping algorithm
 ==================== ============= ======= ======= ===========
 Parameter Default Min Max Description
 ==================== ============= ======= ======= ===========
-compoundWords ``1`` ``0`` Try to match words that have accidentally been made compound (given number is maximum number of words in an accidental compound minus one). Not done for tokens from `fulltext <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#fulltext>`_, `doc <https://pubfetcher.readthedocs.io/en/latest/output.html#content-of-docs>`_ and `webpage <https://pubfetcher.readthedocs.io/en/latest/output.html#content-of-webpages>`_. Set to 0 to disable (for a slight speed increase with only slight changes to the results).
+compoundWords ``1`` ``0`` Try to match words that have accidentally been made compound (given number is maximum number of words in an accidental compound minus one). Not done for tokens from `fulltext <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#fulltext>`_, `doc <https://pubfetcher.readthedocs.io/en/stable/output.html#content-of-docs>`_ and `webpage <https://pubfetcher.readthedocs.io/en/stable/output.html#content-of-webpages>`_. Set to 0 to disable (for a slight speed increase with only slight changes to the results).
 mismatchMultiplier ``2.0`` ``0.0`` Multiplier for score decrease caused by mismatch
 matchMinimum ``1.0`` ``0.0`` ``1.0`` Minimum score allowed for approximate match. Not done for tokens from fulltext_, doc_ and webpage_. Set to ``1`` to disable approximate matching.
 positionOffBy1 ``0.35`` ``0.0`` ``1.0`` Multiplier of a position score component for the case when a word is inserted between matched words or matched words are switched
@@ -164,14 +164,14 @@ Parameter Default Min Max Description
 nameNormaliser ``0.81`` ``0.0`` ``1.0`` Score normaliser for matching a query name. Set to ``0`` to disable matching of names.
 keywordNormaliser ``0.77`` ``0.0`` ``1.0`` Score normaliser for matching a query keyword. Set to ``0`` to disable matching of keywords.
 descriptionNormaliser ``0.92`` ``0.0`` ``1.0`` Score normaliser for matching a query description. Set to ``0`` to disable matching of descriptions.
-publicationTitleNormaliser ``0.91`` ``0.0`` ``1.0`` Score normaliser for matching a publication `title <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#title>`_. Set to ``0`` to disable matching of titles.
-publicationKeywordNormaliser ``0.77`` ``0.0`` ``1.0`` Score normaliser for matching a publication `keyword <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#keywords>`_. Set to ``0`` to disable matching of keywords.
-publicationMeshNormaliser ``0.75`` ``0.0`` ``1.0`` Score normaliser for matching a publication `MeSH term <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#mesh>`_. Set to ``0`` to disable matching of MeSH terms.
-publicationMinedTermNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a publication mined term (`EFO <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#efo>`_, `GO <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#go>`_). Set to ``0`` to disable matching of mined terms.
-publicationAbstractNormaliser ``0.985`` ``0.0`` ``1.0`` Score normaliser for matching a publication `abstract <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#theabstract>`_. Set to ``0`` to disable matching of abstracts.
-publicationFulltextNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a publication `fulltext <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#fulltext>`_. Set to ``0`` to disable matching of fulltexts.
-docNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a query `doc <https://pubfetcher.readthedocs.io/en/latest/output.html#content-of-docs>`_. Set to ``0`` to disable matching of docs.
-webpageNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a query `webpage <https://pubfetcher.readthedocs.io/en/latest/output.html#content-of-webpages>`_. Set to ``0`` to disable matching of webpages.
+publicationTitleNormaliser ``0.91`` ``0.0`` ``1.0`` Score normaliser for matching a publication `title <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#title>`_. Set to ``0`` to disable matching of titles.
+publicationKeywordNormaliser ``0.77`` ``0.0`` ``1.0`` Score normaliser for matching a publication `keyword <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#keywords>`_. Set to ``0`` to disable matching of keywords.
+publicationMeshNormaliser ``0.75`` ``0.0`` ``1.0`` Score normaliser for matching a publication `MeSH term <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#mesh>`_. Set to ``0`` to disable matching of MeSH terms.
+publicationMinedTermNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a publication mined term (`EFO <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#efo>`_, `GO <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#go>`_). Set to ``0`` to disable matching of mined terms.
+publicationAbstractNormaliser ``0.985`` ``0.0`` ``1.0`` Score normaliser for matching a publication `abstract <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#theabstract>`_. Set to ``0`` to disable matching of abstracts.
+publicationFulltextNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a publication `fulltext <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#fulltext>`_. Set to ``0`` to disable matching of fulltexts.
+docNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a query `doc <https://pubfetcher.readthedocs.io/en/stable/output.html#content-of-docs>`_. Set to ``0`` to disable matching of docs.
+webpageNormaliser ``1.0`` ``0.0`` ``1.0`` Score normaliser for matching a query `webpage <https://pubfetcher.readthedocs.io/en/stable/output.html#content-of-webpages>`_. Set to ``0`` to disable matching of webpages.
 ============================== ========= ======= ======= ===========

 .. _query_weights:
@@ -371,7 +371,7 @@ _`args`
 fetching
 Always ``true``
 db
-Name of the used `database <https://pubfetcher.readthedocs.io/en/latest/output.html#database>`_ file
+Name of the used `database <https://pubfetcher.readthedocs.io/en/stable/output.html#database>`_ file
 idf
 Name of the used :ref:`IDF <idf>` file
 idfStemmed
@@ -406,11 +406,11 @@ The type_ ``"full"`` includes everything from core_, plus the following:
 mapping
 queryFetched
 _`webpages`
-Array of metadata objects corresponding to webpageUrls_ in query_. Webpages are implemented in PubFetcher_ and thus are described in its documentation: `Content of webpages <https://pubfetcher.readthedocs.io/en/latest/output.html#content-of-webpages>`_. The structure of webpages here will be the same as described in PubFetcher, except for `content <https://pubfetcher.readthedocs.io/en/latest/output.html#webpage-content>`_ which will be missing. The values of `startUrl <https://pubfetcher.readthedocs.io/en/latest/output.html#starturl>`_ of webpages will be the URLs given in webpageUrls_ in query_.
+Array of metadata objects corresponding to webpageUrls_ in query_. Webpages are implemented in PubFetcher_ and thus are described in its documentation: `Content of webpages <https://pubfetcher.readthedocs.io/en/stable/output.html#content-of-webpages>`_. The structure of webpages here will be the same as described in PubFetcher, except for `content <https://pubfetcher.readthedocs.io/en/stable/output.html#webpage-content>`_ which will be missing. The values of `startUrl <https://pubfetcher.readthedocs.io/en/stable/output.html#starturl>`_ of webpages will be the URLs given in webpageUrls_ in query_.
 _`docs`
 Array of metadata objects corresponding to docUrls_ in query_. Structure of objects same as in webpages_.
 _`publications`
-Array of metadata objects corresponding to publicationIds_ in query_. Publications are implemented in PubFetcher_ and thus are described in its documentation: `Content of publications <https://pubfetcher.readthedocs.io/en/latest/output.html#content-of-publications>`_. The structure of publications here will be the same as described in PubFetcher, except for fulltext_ which will be missing.
+Array of metadata objects corresponding to publicationIds_ in query_. Publications are implemented in PubFetcher_ and thus are described in its documentation: `Content of publications <https://pubfetcher.readthedocs.io/en/stable/output.html#content-of-publications>`_. The structure of publications here will be the same as described in PubFetcher, except for fulltext_ which will be missing.
 results
 topic/operation/data/format
 Array of objects defined in topic_, i.e. the same content as in core_, plus the field parts_ defined below.
@@ -625,7 +625,7 @@ To supply the same data (except the "keywords") as `bio.tools input`_, the follo
 Prefetching
 ***********

-Once a query has been received by the API, content corresponding to webpageUrls_, docUrls_ and publicationIds_ has to be `fetched <https://pubfetcher.readthedocs.io/en/latest/fetcher.html>`_ (unless it has been fetched and stored in some previous occurrence), before mapping can take place.
+Once a query has been received by the API, content corresponding to webpageUrls_, docUrls_ and publicationIds_ has to be `fetched <https://pubfetcher.readthedocs.io/en/stable/fetcher.html>`_ (unless it has been fetched and stored in some previous occurrence), before mapping can take place.

 This content could be prefetched and prestored in the database_ as a separate step, before the mapping query is sent. This is useful in the web application, where content can be fetched as soon as the user has entered the corresponding query details, and thus mapping time could be less when the entire query form is finally submitted. It might be of less use in the API, but has been included nevertheless.

@@ -652,7 +652,7 @@ webpageUrls
 id
 A webpage URL specified in the request
 status
-The status of that webpage. One of "`broken <https://pubfetcher.readthedocs.io/en/latest/output.html#broken>`_", "`empty <https://pubfetcher.readthedocs.io/en/latest/output.html#webpage-empty>`_", "non-`usable <https://pubfetcher.readthedocs.io/en/latest/output.html#webpage-usable>`_", "non-`final <https://pubfetcher.readthedocs.io/en/latest/output.html#webpage-final>`_", "`final <https://pubfetcher.readthedocs.io/en/latest/output.html#webpage-final>`_".
+The status of that webpage. One of "`broken <https://pubfetcher.readthedocs.io/en/stable/output.html#broken>`_", "`empty <https://pubfetcher.readthedocs.io/en/stable/output.html#webpage-empty>`_", "non-`usable <https://pubfetcher.readthedocs.io/en/stable/output.html#webpage-usable>`_", "non-`final <https://pubfetcher.readthedocs.io/en/stable/output.html#webpage-final>`_", "`final <https://pubfetcher.readthedocs.io/en/stable/output.html#webpage-final>`_".

 /api/doc
 ========
@@ -689,12 +689,12 @@ publicationIds
 doi
 The DOI of the publication
 status
-The status of that publication. One of `"empty" <https://pubfetcher.readthedocs.io/en/latest/output.html#publication-empty>`_, "non-`usable" <https://pubfetcher.readthedocs.io/en/latest/output.html#publication-usable>`_, "non-`final" <https://pubfetcher.readthedocs.io/en/latest/output.html#publication-final>`_, `"final" <https://pubfetcher.readthedocs.io/en/latest/output.html#publication-final>`_, `"totally final" <https://pubfetcher.readthedocs.io/en/latest/output.html#totallyfinal>`_.
+The status of that publication. One of `"empty" <https://pubfetcher.readthedocs.io/en/stable/output.html#publication-empty>`_, "non-`usable" <https://pubfetcher.readthedocs.io/en/stable/output.html#publication-usable>`_, "non-`final" <https://pubfetcher.readthedocs.io/en/stable/output.html#publication-final>`_, `"final" <https://pubfetcher.readthedocs.io/en/stable/output.html#publication-final>`_, `"totally final" <https://pubfetcher.readthedocs.io/en/stable/output.html#totallyfinal>`_.

 Example
 =======

-Try to prefetch the publication with PMID "23479348" and PMCID "PMC3654706", increasing connect and read `timeout <https://pubfetcher.readthedocs.io/en/latest/cli.html#timeout>`_ to give the server more time to fetch the whole publication:
+Try to prefetch the publication with PMID "23479348" and PMCID "PMC3654706", increasing connect and read `timeout <https://pubfetcher.readthedocs.io/en/stable/cli.html#timeout>`_ to give the server more time to fetch the whole publication:

 .. code-block:: bash

docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@

 project = 'EDAMmap'
 author = 'Erik Jaaniso'
-copyright = '2016-2019, Erik Jaaniso'
+copyright = '2016-2020, Erik Jaaniso'
 version = '1.0.1-SNAPSHOT'
 release = '1.0.1-SNAPSHOT'

docs/future.rst

Lines changed: 3 additions & 2 deletions
@@ -23,7 +23,7 @@ Algorithm
 *********

 * Currently, scores are not totally comparable across queries. Try to make a score in one query mean the same thing in another query as exactly as possible.
-* An extra query part could be tags present in some web pages, like software registries or code repositories. This would require `changes in PubFetcher <https://pubfetcher.readthedocs.io/en/latest/future.html#structure-changes>`_.
+* An extra query part could be tags present in some web pages, like software registries or code repositories. This would require `changes in PubFetcher <https://pubfetcher.readthedocs.io/en/stable/future.html#structure-changes>`_.
 * Maybe `WordNet <https://wordnet.princeton.edu/>`_ could be used as part of the mapping algorithm. For example use lemmatisation instead of stemming.
 * In results got from running EDAMmap against existing entries of bio.tools, look at FNs and see if anything can be done to increase their score.

@@ -76,7 +76,8 @@ Server
 Maintenance
 ***********

-* Update PubFetcher's `scraping rules <https://pubfetcher.readthedocs.io/en/latest/scraping.html#scraping-rules>`_, by `testing the rules <https://pubfetcher.readthedocs.io/en/latest/scraping.html#testing-of-rules>`_ and modifying outdated rules in `journals.yaml <https://github.com/edamontology/pubfetcher/blob/master/core/src/main/resources/scrape/journals.yaml>`_, `webpages.yaml <https://github.com/edamontology/pubfetcher/blob/master/core/src/main/resources/scrape/webpages.yaml>`_ and most importantly the hardcoded rules for `Europe PMC <https://europepmc.org/>`_ and other built-in `resources <https://pubfetcher.readthedocs.io/en/latest/fetcher.html#resources>`_.
+* Update PubFetcher's `scraping rules <https://pubfetcher.readthedocs.io/en/stable/scraping.html#scraping-rules>`_, by `testing the rules <https://pubfetcher.readthedocs.io/en/stable/scraping.html#testing-of-rules>`_ and modifying outdated rules in `journals.yaml <https://github.com/edamontology/pubfetcher/blob/master/core/src/main/resources/scrape/journals.yaml>`_, `webpages.yaml <https://github.com/edamontology/pubfetcher/blob/master/core/src/main/resources/scrape/webpages.yaml>`_ and most importantly the hardcoded rules for `Europe PMC <https://europepmc.org/>`_ and other built-in `resources <https://pubfetcher.readthedocs.io/en/stable/fetcher.html#resources>`_.
 * Update dependencies in `pom.xml <https://github.com/edamontology/edammap/blob/master/pom.xml>`_ (but care should be taken to not cause regressions).
+* Check for broken links in the documentation using ``make linkcheck``.
 * When a new `biotoolsSchema <https://github.com/bio-tools/biotoolsSchema>`_ is released, some code modifications might be necessary to adhere to it.
 * Also, when a new `EDAM ontology <https://github.com/edamontology/edamontology>`_ is released, some modifications might be necessary (for example in `blacklist.txt <https://github.com/edamontology/edammap/blob/master/core/src/main/resources/edam/blacklist.txt>`_ and `blacklist_synonyms.txt <https://github.com/edamontology/edammap/blob/master/core/src/main/resources/edam/blacklist_synonyms.txt>`_; also, any running :ref:`server` instances could be restarted to use the new ontology version).
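
The link changes shown in docs/api.rst and docs/future.rst all make the same substitution: links to the PubFetcher documentation move from the ``latest`` Read the Docs version to ``stable``. A minimal Python sketch of that substitution follows, assuming a plain regular-expression rewrite. The function name ``pin_to_stable`` and the sample text are hypothetical, and nothing in the commit indicates that the change was produced this way.

```python
import re

# Assumption: only links to the PubFetcher Read the Docs site are rewritten;
# other hosts (GitHub, WordNet, Europe PMC) are left untouched.
LATEST_LINK = re.compile(r"(https://pubfetcher\.readthedocs\.io/en/)latest(/)")


def pin_to_stable(text: str) -> tuple[str, int]:
    """Return the text with 'latest' links changed to 'stable', plus the number of links changed."""
    return LATEST_LINK.subn(r"\1stable\2", text)


# Hypothetical sample combining one PubFetcher link and one unrelated link.
sample = (
    "`Fetching parameters "
    "<https://pubfetcher.readthedocs.io/en/latest/cli.html#fetching>`_ and "
    "`WordNet <https://wordnet.princeton.edu/>`_"
)
updated, count = pin_to_stable(sample)
print(count)
print(updated)
```

Because the pattern is anchored on the Read the Docs host, the ``blob/master`` GitHub links in docs/future.rst would not be affected by such a rewrite.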
