Labels
fips: Related to FIPS 140 certification
Description
Initial description by @J08nY
The data in the FIPS algorithm dataset is not fully mined. We can follow the links to the individual algorithm pages and extract more data. This can help with cert ID cleanup by getting rid of the algo references.
Details
Currently, the FIPSAlgorithm object is built from rows of a pandas DataFrame constructed solely from the list of algorithms (see below):
df = pd.read_html(html_path)[0]
This table does not include valuable attributes found on the individual algorithm pages. The proposed enhancement should:
- Track the URL for each of the algorithms, e.g., https://csrc.nist.gov/projects/Cryptographic-Algorithm-Validation-Program/details?product=3989
- Scrape the contents of the corresponding HTML page and extract the values of the following attributes:
  - algorithm type
  - description
  - version
  - algorithm capabilities
- The FIPSAlgorithm object (see below) should be enriched with the attributes mentioned above.
class FIPSAlgorithm(PandasSerializableType, ComplexSerializableType):
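To make the proposed enrichment concrete, here is a minimal sketch of what the extra attributes could look like. It is not the real FIPSAlgorithm class: the class name `EnrichedAlgorithmSketch`, the field names, and the `listing_fields` placeholder are all hypothetical, and the actual class would keep its existing base classes and fields.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class EnrichedAlgorithmSketch:
    """Hypothetical stand-in for an enriched FIPSAlgorithm (not the sec-certs class)."""

    # Placeholder for whatever already comes from the listing table.
    listing_fields: Dict[str, str] = field(default_factory=dict)
    # Attributes proposed in this issue; None means "not scraped yet".
    details_url: Optional[str] = None
    algorithm_type: Optional[str] = None
    description: Optional[str] = None
    version: Optional[str] = None
    capabilities: Optional[str] = None

    def to_dict(self) -> dict:
        # Plain dict, convenient for JSON or DataFrame round-trips.
        return asdict(self)
```

Keeping the new attributes optional means algorithms whose detail page could not be fetched or parsed can still be represented.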
Further guidance
One can isolate the pipeline stage that processes the algorithm dataset simply by running:
from sec_certs.dataset.fips_algorithm import FIPSAlgorithmDataset
alg_dset = FIPSAlgorithmDataset.from_web()
alg_dset.to_json("/path/to/some/file.json")
The PR implementing this enhancement should modify the parse_algorithms_from_html method.
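As a rough illustration of the scraping step, the sketch below pulls the requested attributes out of a detail page using only the standard library. It rests on an assumption that has not been checked against the real NIST pages: that each attribute is rendered as a two-cell table row (label, value). The names `parse_algorithm_details`, `WANTED`, and `SAMPLE` are hypothetical, and the real change would live in parse_algorithms_from_html.

```python
from html.parser import HTMLParser
from typing import Dict, List

# Labels requested in this issue, mapped to hypothetical attribute names.
WANTED = {
    "algorithm type": "algorithm_type",
    "description": "description",
    "version": "version",
    "algorithm capabilities": "capabilities",
}


class _RowCollector(HTMLParser):
    """Collects the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
            self._in_cell = False
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def parse_algorithm_details(html: str) -> Dict[str, str]:
    """Return the wanted attributes found in label/value rows (assumed layout)."""
    collector = _RowCollector()
    collector.feed(html)
    found: Dict[str, str] = {}
    for row in collector.rows:
        if len(row) != 2:
            continue  # only plain label/value rows are considered
        label = row[0].strip().rstrip(":").lower()
        if label in WANTED:
            found[WANTED[label]] = " ".join(row[1].split())
    return found


# Made-up fragment in the assumed layout, not copied from a real page.
SAMPLE = """
<table>
  <tr><td>Algorithm Type:</td><td>AES</td></tr>
  <tr><td>Version</td><td> 1.0 </td></tr>
  <tr><td>Unrelated</td><td>ignored</td></tr>
</table>
"""
details = parse_algorithm_details(SAMPLE)
```

If the real pages use a different structure, only `_RowCollector` would need to change; the label-to-attribute mapping stays the same.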