Open
Labels
enhancement (New feature or request)
Description
A page can contain four different kinds of ID:
- Definition id (in the `article` tag)
- Meaning id (in the `li` of the first `ol` within an article)
- Locution Name id (locutions appear in an `h3`)
- Locution Meaning id (in the `li` of a locution's `ol`)
We are currently working with the first, second and fourth types. The third type is not being extracted, since most redirections to locutions point to one of their meanings. However, ~700 redirections (~0.5% of the total number of meanings) point to the locution name instead. At this point we treat them as exceptions. It is not a big deal, since locutions are rarely asked about in the show, but we could probably handle them by extracting the `h3` IDs and building a dict that stores the relations (key: locution name id, value: locution meaning id) for use in the next stage.
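
A minimal sketch of that idea, using only the standard library `html.parser`. The names `LocutionIdParser`, `extract_locution_relations` and `resolve_redirect` are hypothetical, as are the sample markup and IDs. It also assumes that each locution's `h3` is followed by its `ol`, and that a redirection to a locution name should resolve to the locution's first meaning; the issue does not say which meaning to pick.

```python
from html.parser import HTMLParser


class LocutionIdParser(HTMLParser):
    """Collect {locution name id: first locution meaning id} for one page."""

    def __init__(self):
        super().__init__()
        self.relations = {}
        self._pending_h3_id = None  # h3 id still waiting for its first meaning

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if tag == "article":
            # New definition: drop any locution left without meanings.
            self._pending_h3_id = None
        elif tag == "h3":
            # A locution starts here; remember its name id (may be None).
            self._pending_h3_id = element_id
        elif tag == "li" and element_id and self._pending_h3_id:
            # Assumption: the first li with an id after the h3 is the
            # locution's first meaning.
            self.relations[self._pending_h3_id] = element_id
            self._pending_h3_id = None


def extract_locution_relations(html):
    parser = LocutionIdParser()
    parser.feed(html)
    return parser.relations


def resolve_redirect(target_id, relations):
    # Locution name ids are swapped for a meaning id; others pass through.
    return relations.get(target_id, target_id)


# Hypothetical markup and ids, for illustration only.
sample = """
<article id="def1">
  <ol><li id="m1">meaning</li></ol>
  <h3 id="loc1">some locution</h3>
  <ol><li id="lm1">first</li><li id="lm2">second</li></ol>
  <h3 id="loc2">another locution</h3>
  <ol><li id="lm3">only</li></ol>
</article>
"""
relations = extract_locution_relations(sample)
```

In the next stage, a redirection target could then be passed through `resolve_redirect` before the lookup, so that the ~700 exceptions land on a meaning id like every other redirection.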