Think about adding a pre-pipeline coding step that geocodes complete articles (rather than sentences) to the country. This would be useful for two things:
- Associating actors that don't have country codes with the correct country. E.g., "Egyptian police fired on protesters" -> EGYCOP, ~CVL. If the document is geocoded to one country, we could say with enough confidence that ~CVL = EGYCVL. This process would happen in the pipeline rather than in Petrarch.
- Mordecai's country coding works much better at the document level than the sentence level, and its place function can take in a country limit as an argument. Doing a first-pass whole article country-level coding could help the sentence level geocoding.
Because the pipeline operates at the sentence level, actually geocoding the articles would have to happen outside the pipeline. Changes to the pipeline would just be in order to use the new info.
Think about adding a pre-pipeline coding step that geocodes complete articles (rather than sentences) to the country. This would be useful for two things:
Because the pipeline operates at the sentence level, actually geocoding the articles would have to happen outside the pipeline. Changes to the pipeline would just be in order to use the new info.