Description
The prototype crawler from #19 only captures a few basic project details. Code for Kenya/HURUmap provides a good example of a record with all fields filled in well, and its history already shows some natural changes to them.
What are some more details a v2 should capture? Any thoughts on how we should organize it? (TOML has great support for grouping things any number of levels deep)
I don't think we want to capture any details in the index that routinely change day-by-day in the life of a project (e.g. number of open issues, number of contributors), BUT maybe we do capture things like that as binary or tiered buckets (e.g. has-issue=true or contributors=5-10)
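A rough sketch of the bucketing idea, so the index only changes when a project crosses a threshold rather than on every new contributor or issue. The tier boundaries and the function names `contributor_bucket` and `has_issues` are assumptions for illustration; only the `5-10` tier comes from the example above.

```python
# Hypothetical tiers: (low, high, label). Only "5-10" is taken from the
# discussion; the other boundaries are placeholders to be decided.
TIERS = [
    (0, 0, "0"),
    (1, 1, "1"),
    (2, 4, "2-4"),
    (5, 10, "5-10"),
    (11, 50, "11-50"),
]


def contributor_bucket(count: int) -> str:
    """Map an exact contributor count to a coarse, slow-changing label."""
    if count < 0:
        raise ValueError("contributor count cannot be negative")
    for low, high, label in TIERS:
        if low <= count <= high:
            return label
    return "51+"  # anything above the last explicit tier


def has_issues(open_issue_count: int) -> bool:
    """Collapse the open-issue count into a binary flag."""
    return open_issue_count > 0


print(contributor_bucket(7))
print(has_issues(0))
```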
I think we should pull in the GitHub description and/or opening paragraph of the README directly, and then for other big wordy things record their presence, link to them, and maybe measure their health or summarize them if there's a valuable way to do so. (e.g. we can record which license is used and link to the license, we can record which of GitHub's standard community health files are present and link to them)
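As a sketch of the "record presence and link" approach for community health files: given the list of file paths in a repository, the function below records which files exist and where. The function name `health_file_links`, the `base_url` parameter, and the subset of files checked are assumptions; matching is exact and case-sensitive, which is a simplification.

```python
# A subset of GitHub's community health files, chosen for illustration.
HEALTH_FILES = ("CODE_OF_CONDUCT.md", "CONTRIBUTING.md", "SECURITY.md", "SUPPORT.md")

# GitHub recognizes these files in .github/, the repository root, or docs/.
SEARCH_DIRS = (".github", "", "docs")


def health_file_links(paths, base_url):
    """Return {file name: link} for each health file present in `paths`."""
    available = set(paths)
    found = {}
    for name in HEALTH_FILES:
        for directory in SEARCH_DIRS:
            candidate = f"{directory}/{name}" if directory else name
            if candidate in available:
                found[name] = f"{base_url}/{candidate}"
                break  # keep the first location that matches
    return found


# Hypothetical repository listing and base URL.
links = health_file_links(
    ["README.md", ".github/CONTRIBUTING.md", "SECURITY.md"],
    "https://example.org/repo/blob/main",
)
print(links)
```

Files that are absent are simply missing from the result, so presence and link are captured in one structure.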
We should also record the presence of any civic.json or publiccode.yaml file, and pull in some or all of their contents into a normalized form.
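One possible shape for the normalization step, assuming the two files have already been fetched and parsed into dictionaries (or `None` when absent). The function name `normalize_metadata`, the output keys, and the source keys `status` and `developmentStatus` are assumptions to check against each format's specification before relying on them.

```python
def normalize_metadata(civic=None, publiccode=None):
    """Merge parsed civic.json / publiccode contents into one flat record.

    Both arguments are already-parsed dictionaries, or None if the file
    was not found in the repository.
    """
    record = {
        "has_civic_json": civic is not None,
        "has_publiccode": publiccode is not None,
        "status": None,
    }
    # Assumed key names; prefer civic.json, fall back to publiccode.
    if civic:
        record["status"] = civic.get("status")
    if record["status"] is None and publiccode:
        record["status"] = publiccode.get("developmentStatus")
    return record


# Hypothetical parsed contents.
print(normalize_metadata(civic={"status": "Beta"}))
print(normalize_metadata(publiccode={"developmentStatus": "stable"}))
print(normalize_metadata())
```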