Today we are duplicating the data format used by NVD in the nvd-data-overrides repo. This data format is less than ideal.
We should discuss some goals and ideas for how to best enrich this data in the future.
Here are some high-level goals for enriching data:
- Do not create new IDs, only enrich existing IDs (this avoids trying to figure out a new ID format)
- Defer to upstream data whenever possible
- The idea here isn't to overrule upstream data, but to add things they cannot. For example: A GitHub ID that affects an ecosystem they do not currently cover.
- If an upstream data source has an error, try to submit fixes there first
- Allow anyone to submit modifications to the data. Those modifications should be reviewed by a trusted project member before being accepted, just as in any open source project
- Have the ability to output the enriched data in multiple formats. For example, we could publish cve5, OSV, and NVD formats
- Make sure the data is future proof to a degree. By capturing more details than the existing formats need today, we raise our chances of not needing to overhaul everything in the future
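
To make the goals above a bit more concrete, here is a rough sketch of what "enrich, don't overrule" plus "output multiple formats" could look like. Everything in it is an assumption for discussion: the `Enrichment` and `AffectedPackage` types, the internal record layout, and the `merge` and `to_osv_like` helpers are all hypothetical, and the OSV output is a simplified subset of the schema (it omits required fields such as `modified`), not a valid OSV record.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class AffectedPackage:
    """Hypothetical internal shape for one affected package."""
    ecosystem: str
    name: str
    introduced: str = "0"
    fixed: Optional[str] = None


@dataclass
class Enrichment:
    """Additions keyed by an existing upstream ID; it never mints a new ID."""
    id: str
    affected: list = field(default_factory=list)


def merge(upstream: dict, extra: Enrichment) -> dict:
    """Return a new record: upstream entries kept as-is, enrichment only added."""
    if upstream["id"] != extra.id:
        raise ValueError("enrichment must target an existing upstream ID")
    existing = list(upstream.get("affected", []))
    known = {(a["ecosystem"], a["name"]) for a in existing}
    # Defer to upstream: skip any package the upstream record already covers.
    added = [asdict(a) for a in extra.affected
             if (a.ecosystem, a.name) not in known]
    return {**upstream, "affected": existing + added}


def to_osv_like(record: dict) -> dict:
    """Render the internal record as a simplified, OSV-shaped dict."""
    affected = []
    for a in record.get("affected", []):
        events = [{"introduced": a.get("introduced", "0")}]
        if a.get("fixed"):
            events.append({"fixed": a["fixed"]})
        affected.append({
            "package": {"ecosystem": a["ecosystem"], "name": a["name"]},
            "ranges": [{"type": "ECOSYSTEM", "events": events}],
        })
    return {"id": record["id"], "affected": affected}


# Placeholder data, not a real advisory.
upstream = {"id": "EXAMPLE-0001", "affected": [
    {"ecosystem": "PyPI", "name": "pkg-a", "introduced": "0", "fixed": "2.0.0"},
]}
extra = Enrichment("EXAMPLE-0001", [AffectedPackage("npm", "pkg-b", fixed="1.2.3")])
print(to_osv_like(merge(upstream, extra)))
```

The point of keeping a richer internal record and rendering it per format is that a cve5 or NVD renderer could sit next to `to_osv_like` without touching the stored data.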
Two data format examples that are pretty good:
- cve5: https://github.com/CVEProject/cvelistV5
- OSV: https://ossf.github.io/osv-schema/