Sort concepts by (ES hits * ES score per concept) / PITs per concept: #10

Currently, Histograph does the following:
- API queries Elasticsearch (e.g. `q=utrecht`), ES returns list of PITs
- List of PITs probably contains many forms and spellings of _Utrecht_, and maybe some results like _Abcoude bij Utrecht_
- Those PITs are sent to the Neo4j Plugin, which runs a [BFS](https://en.wikipedia.org/wiki/Breadth-first_search) from each PIT and returns the resulting subgraphs (concepts, or _klonten_, "clumps"), ordered by number of PITs per concept
- Because this ordering looks only at concept size and ignores how well each concept matched the query, _Abcoude_ may show up first in the list of results.
- **This is wrong!**
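
To make the problem concrete, here is a minimal sketch of the current ordering. The concept names, PIT ids and PIT counts are made up for illustration, and `order_by_pit_count` is a hypothetical helper, not the plugin's actual code.

```python
# Hypothetical concepts as returned by the graph traversal:
# concept name -> ids of all PITs in that concept.
concepts = {
    "Utrecht": ["u1", "u2", "u3", "u4", "u5"],
    "Abcoude": ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"],
}


def order_by_pit_count(concepts):
    """Order concept names by number of PITs, largest concept first."""
    return sorted(concepts, key=lambda name: len(concepts[name]), reverse=True)


current_order = order_by_pit_count(concepts)
print(current_order)
```

With these made-up sizes the larger _Abcoude_ concept is listed before _Utrecht_, no matter how many of its PITs actually matched `q=utrecht`.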

Possible solution:
- API queries Elasticsearch (e.g. `q=utrecht`), ES returns list of PITs
- List of PITs probably contains many forms and spellings of _Utrecht_, and maybe some results like _Abcoude bij Utrecht_
- Those PITs are sent to the Neo4j Plugin, together with their respective [Elasticsearch score](https://www.elastic.co/guide/en/elasticsearch/guide/current/practical-scoring-function.html)
- BFSs are computed for each PIT, just like before, but the Neo4j Plugin now orders the resulting concepts by (ES hits \* ES score per concept) / PITs per concept
- The concept of _Utrecht_ will contain many ES hits, with high ES scores, relative to its number of PITs. The concept of _Abcoude_ will contain at least one ES hit (_Abcoude bij Utrecht_) but probably not many more, so it ranks lower and is no longer returned first.
- **This is better!**
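
The proposed ordering could be sketched as follows. This is only an illustration under stated assumptions: "ES score per concept" is taken to be the mean score of the ES hits inside a concept (a sum or maximum would be other readings), and the `Concept` class, `concept_rank`, `order_concepts`, the PIT ids and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    name: str
    pit_ids: list  # ids of all PITs in the concept, hits and non-hits


def concept_rank(concept, es_scores):
    """(ES hits * ES score per concept) / PITs per concept.

    es_scores maps the id of each PIT returned by Elasticsearch to its score.
    Assumption: "ES score per concept" is the mean score of the concept's hits.
    """
    hit_scores = [es_scores[p] for p in concept.pit_ids if p in es_scores]
    if not hit_scores:
        return 0.0
    hits = len(hit_scores)
    mean_score = sum(hit_scores) / hits
    return hits * mean_score / len(concept.pit_ids)


def order_concepts(concepts, es_scores):
    """Order concepts by rank, best match first."""
    return sorted(concepts, key=lambda c: concept_rank(c, es_scores), reverse=True)


# Made-up data: four of Utrecht's five PITs are ES hits, one of Abcoude's eight.
es_scores = {"u1": 2.0, "u2": 2.0, "u3": 1.5, "u4": 1.5, "a1": 0.8}
utrecht = Concept("Utrecht", ["u1", "u2", "u3", "u4", "u5"])
abcoude = Concept("Abcoude", ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"])

ordered = [c.name for c in order_concepts([abcoude, utrecht], es_scores)]
print(ordered)
```

Under these assumptions _Utrecht_ ranks at about 1.4 and _Abcoude_ at about 0.1, so _Utrecht_ is returned first even though its concept is smaller.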

