Commit 016139a

Merge pull request #228 from stonezdj/add_cve_search
Add proposal for Security Hub feature
2 parents 1feecdb + 0e1f7fe commit 016139a

4 files changed

+278
-0
lines changed

proposals/new/securityhub.md

Lines changed: 278 additions & 0 deletions
@@ -0,0 +1,278 @@
1+
# Proposal: Security Hub
2+
3+
Author: stonezdj
4+
5+
## Discussion
6+
7+
[#10496](https://github.com/goharbor/harbor/issues/10496)
8+
[#13](https://github.com/goharbor/pluggable-scanner-spec/issues/13)
9+
10+
## Abstract
11+
12+
The Security Hub feature gives administrators a flexible way to search and report vulnerability information for a project or application.
13+
14+
## Background
15+
16+
Harbor 2.7.0 provides a way to scan images and export CVE information by project, but it has several gaps. It cannot search for the affected images by CVE ID. It cannot review all CVEs at the system level or project level. It does not provide summary information about the CVEs that currently exist. It cannot present vulnerabilities in a holistic view at different levels, such as system level, project level, or any freestyle (label) level. With the CVE search and report feature, administrators can search and report vulnerability information in a flexible way.
17+
18+
## User stories
19+
20+
1. As an administrator, he/she can check the total number of vulnerabilities scanned in the current view (system/project/label). The report includes the vulnerability count by severity level (Critical, High, Medium, Low, None and Unknown), the most dangerous vulnerabilities, and the most dangerous artifacts in the current report.
21+
1. As an administrator, he/she can filter vulnerabilities by score range in the current scope; the scope is either project level or system level.
22+
1. As an administrator, he/she can filter vulnerabilities by image tag in the current scope.
23+
1. As an administrator, he/she can filter vulnerabilities by package in the current scope.
24+
1. As an administrator, he/she can filter vulnerabilities by CVE ID in the current scope.
25+
26+
## Personas
27+
28+
Only system administrators can access the Security Hub feature.
29+
30+
## Non-Goals
31+
32+
This proposal does not cover the following items:
33+
34+
1. The security hub feature does not provide a way to search and report the vulnerability information by image layer.
35+
2. It doesn't provide a way to fix a vulnerability or take action on it.
36+
37+
## Compatibilities
38+
39+
The implementation of this feature should be compatible with the [pluggable scanner spec v1.0](https://github.com/goharbor/pluggable-scanner-spec) and only supports the Trivy adapter.
40+
41+
Some table schemas need to change to enable this feature.
42+
43+
The following columns need to be added to the scan_report table:
44+
45+
| Column Name | Description |
| ------------- | ------------- |
| critical_cnt | Number of CVEs with Critical severity in the report |
| high_cnt | Number of CVEs with High severity in the report |
| medium_cnt | Number of CVEs with Medium severity in the report |
| low_cnt | Number of CVEs with Low severity in the report |
| none_cnt | Number of CVEs with None severity in the report |
| unknown_cnt | Number of CVEs with Unknown severity in the report |
| fixable_cnt | Number of fixable CVEs in the report |
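
As a rough illustration of how these counters could be derived from one scan report, here is a minimal Python sketch. The input record shape (`severity`, `fixed_version` keys) and the rule that a vulnerability counts as fixable when a fixed version is recorded are assumptions made for the example, not something this proposal specifies.

```python
from collections import Counter

# Severity names come from the proposal; the record shape below is hypothetical.
SEVERITIES = ("Critical", "High", "Medium", "Low", "None", "Unknown")


def summarize_report(vulnerabilities):
    """Return the per-report counters for a list of vulnerability dicts."""
    by_severity = Counter(v.get("severity", "Unknown") for v in vulnerabilities)
    summary = {f"{s.lower()}_cnt": by_severity.get(s, 0) for s in SEVERITIES}
    # Assumption: a vulnerability is fixable when a non-empty fixed version exists.
    summary["fixable_cnt"] = sum(1 for v in vulnerabilities if v.get("fixed_version"))
    return summary


# Placeholder records, not real scan output.
report = [
    {"cve_id": "CVE-2022-32221", "severity": "Critical", "fixed_version": "2.3.1"},
    {"cve_id": "CVE-0000-0001", "severity": "High", "fixed_version": ""},
    {"cve_id": "CVE-0000-0002", "severity": "High"},
]
print(summarize_report(report))
```

Storing these values once per report is what lets the summary query sum columns instead of joining the vulnerability tables.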
54+
55+
56+
## Implementation
57+
58+
The following REST APIs are provided:
59+
60+
1. Retrieve the vulnerability summary of the system
61+
```
GET /api/v2.0/security/summary?q=xxx&with_cve=true&with_artifact=true
```
64+
Response data
65+
```
{
  "total_count": 240,
  "high_count": 20,
  "critical_count": 35,
  "medium_count": 90,
  "low_count": 23,
  "none_count": 0,
  "fixable_count": 103,
  "scanned": 1032,
  "not_scanned": 59,
  "most_dangerous_cve": [
    {"cve_id": "CVE-2022-32221", "package": "curl", "version": "2.3.2", "cvss_score_v3": 9.8},
    ...
  ],
  "most_dangerous_artifact": [
    {"artifact_id": 2377, "artifact_repository": "library/nuxas", "digest": "sha256:7027e69a2172e38cef8ac2cb1f046025895c9fcf3160e8f70ffb26446f680e4d", "severity_above_high": 23},
    ...
  ]
}
```
88+
89+
To query the security summary of a project
90+
91+
```
GET /api/v2.0/projects/{project_name_or_id}/security/summary?with_cve=true&with_artifact=true
```
94+
The response data format is the same as for the system-level query.
95+
96+
If `with_cve` and `with_artifact` are true, the response data includes the "most_dangerous_cve" and "most_dangerous_artifact" information.
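
To make the effect of the two flags concrete, here is a minimal Python sketch of assembling the summary payload. The function name, its arguments, the `unknown_count` key, and the rule `not_scanned = total - scanned` are assumptions for illustration; the other key names follow the example response.

```python
# Counter names follow the scan_report columns proposed above.
COUNTERS = ("critical", "high", "medium", "low", "none", "unknown", "fixable")


def build_summary(report_rows, total_artifacts, scanned_artifacts,
                  top_cves=(), top_artifacts=(),
                  with_cve=False, with_artifact=False):
    """Sum the per-report counters and attach the optional sections."""
    summary = {
        # NULL counters (None) are treated as zero.
        f"{name}_count": sum(row.get(f"{name}_cnt") or 0 for row in report_rows)
        for name in COUNTERS
    }
    summary["scanned"] = scanned_artifacts
    # Assumption: every countable artifact is either scanned or not scanned.
    summary["not_scanned"] = total_artifacts - scanned_artifacts
    if with_cve:
        summary["most_dangerous_cve"] = list(top_cves)
    if with_artifact:
        summary["most_dangerous_artifact"] = list(top_artifacts)
    return summary


rows = [{"critical_cnt": 1, "high_cnt": 2}, {"critical_cnt": 3, "high_cnt": None}]
print(build_summary(rows, total_artifacts=10, scanned_artifacts=7, with_cve=True))
```

Leaving the two lists out by default keeps the common dashboard call cheap, since the top-5 queries are only run when a caller asks for them.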
97+
98+
99+
The SQL queries for this API:
100+
101+
```
-- query all the vulnerability count by severity level
select sum(s.critical_cnt) critical_cnt,
       sum(s.high_cnt)     high_cnt,
       sum(s.medium_cnt)   medium_cnt,
       sum(s.low_cnt)      low_cnt,
       sum(s.none_cnt)     none_cnt,
       sum(s.unknown_cnt)  unknown_cnt,
       sum(s.fixable_cnt)  fixable_cnt
from artifact a
         left join scan_report s on a.digest = s.digest
where s.registration_uuid = ?;

-- query total artifact count
SELECT COUNT(1)
FROM artifact a
WHERE NOT EXISTS (SELECT 1 FROM artifact_accessory acc WHERE acc.artifact_id = a.id)
  AND (EXISTS (SELECT 1 FROM tag WHERE tag.artifact_id = a.id)
    OR NOT EXISTS (SELECT 1 FROM artifact_reference ref WHERE ref.child_id = a.id));

-- query scanned count
SELECT COUNT(1)
FROM artifact a
WHERE EXISTS (SELECT 1
              FROM scan_report s
              WHERE a.digest = s.digest
                AND s.registration_uuid = ?)
  -- exclude artifact accessory
  AND NOT EXISTS (SELECT 1 FROM artifact_accessory acc WHERE acc.artifact_id = a.id)
  -- exclude artifact without tag and part of the image index
  AND EXISTS (SELECT 1
              FROM tag
              WHERE tag.artifact_id = a.id
                 OR (NOT EXISTS (SELECT 1 FROM artifact_reference ref WHERE ref.child_id = a.id)))
  -- include image index which is scanned
  OR EXISTS (SELECT 1
             FROM scan_report s,
                  artifact_reference ref
             WHERE s.digest = ref.child_digest
               AND ref.parent_id = a.id
               AND s.registration_uuid = ?
               AND NOT EXISTS (SELECT 1
                               FROM scan_report s
                               WHERE s.digest = a.digest
                                 AND s.registration_uuid = ?));

-- query top 5 of the most dangerous cve
SELECT vr.id,
       vr.cve_id,
       vr.package,
       vr.cvss_score_v3,
       vr.description,
       vr.fixed_version,
       vr.severity,
       CASE vr.severity
           WHEN 'Critical' THEN 5
           WHEN 'High' THEN 4
           WHEN 'Medium' THEN 3
           WHEN 'Low' THEN 2
           WHEN 'None' THEN 1
           WHEN 'Unknown' THEN 0 END AS severity_level
FROM vulnerability_record vr
WHERE EXISTS (SELECT 1 FROM report_vulnerability_record WHERE vuln_record_id = vr.id)
  AND vr.cvss_score_v3 IS NOT NULL
  AND vr.registration_uuid = ?
ORDER BY vr.cvss_score_v3 DESC, severity_level DESC
LIMIT 5;

-- query top 5 of the most dangerous artifact
select a.project_id project, a.repository_name repository, a.digest, s.critical_cnt, s.high_cnt, s.medium_cnt, s.low_cnt
from artifact a,
     scan_report s
where a.digest = s.digest
  and s.registration_uuid = ?
order by s.critical_cnt desc, s.high_cnt desc, s.medium_cnt desc, s.low_cnt desc
limit 5;
```
177+
If a label is specified in the query condition, the label can be translated into a filter condition on artifact ID in the SQL query.
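
One way to picture the label translation is sketched below in Python. It assumes the label has already been resolved to a list of artifact IDs by some lookup that is not shown; the helper name and the `a.id` alias are illustrative only.

```python
def artifact_id_filter(artifact_ids):
    """Return a SQL fragment and parameters restricting rows to the given artifacts."""
    ids = list(artifact_ids)
    if not ids:
        # No artifact carries the label, so the filter must match nothing.
        return " AND 1 = 0", []
    placeholders = ", ".join("?" for _ in ids)
    return f" AND a.id IN ({placeholders})", ids


fragment, params = artifact_id_filter([3, 5, 8])
print(fragment, params)
```

Using placeholders rather than formatting the IDs into the string keeps the fragment safe to append to the parameterized queries above.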
178+
179+
2. Search Vulnerability information
180+
181+
A vulnerability here is a security issue found in an artifact. It includes the package information and the CVE information, but it is not the CVE itself.
182+
```
GET /api/v2.0/security/vul?q=xxx&tune_count=true
```
185+
Response data
186+
```
[
  {
    "project": "library",
    "repository": "library/nuxas",
    "digest": "sha256:7027e69a2172e38cef8ac2cb1f046025895c9fcf3160e8f70ffb26446f680e4d",
    "tags": ["v2.3.0", "latest"],
    "cvss_v3_score": 8.9,
    "cve_id": "CVE-2022-32221",
    "package": "nfs-utils",
    "package_version": "v3.1.0",
    "fix_version": "2.3.1",
    "description": "The package nuxas before 2.3.0 for Python allows Directory Traversal via a crafted tar file.",
    "urls": "https://nvd.nist.gov/vuln/detail/CVE-2022-32221"
  },
  {
    //another cve record
  }
]
```
205+
206+
The `tune_count` option tunes how the total count is computed. If the total count is greater than 1000, the query reports that the total is more than 1000 and sets `x-total-count` to -1. The response body is the same as for a query without the `tune_count` option.
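
The counting rule can be sketched in a few lines of Python. The idea of stopping after 1001 rows is an assumption about how the count might be bounded; the proposal only states the threshold and the -1 result.

```python
from itertools import islice

MAX_EXACT_COUNT = 1000  # threshold named in the proposal


def x_total_count(rows, tune_count):
    """Return the value for the x-total-count header for an iterable of rows."""
    if not tune_count:
        # Exact count: every matching row is visited.
        return sum(1 for _ in rows)
    # Bounded count: stop as soon as one row past the threshold has been seen.
    seen = sum(1 for _ in islice(rows, MAX_EXACT_COUNT + 1))
    return -1 if seen > MAX_EXACT_COUNT else seen


print(x_total_count(iter(range(5000)), tune_count=True))
print(x_total_count(iter(range(5000)), tune_count=False))
```

The benefit is that a very broad search never has to count millions of rows just to fill in a header.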
207+
208+
The `q` parameter follows the query syntax of lib/q and accepts the following conditions:
209+
210+
| Query condition | Description |
| ------------- | ------------- |
| cve_id | Search vulnerability information by CVE ID, supports exact match |
| severity | Search vulnerability information by severity level |
| cvss_v3_score | Search vulnerability information by CVSS v3 score |
| project_id | Search vulnerability information by project ID |
| digest | Search vulnerability information by artifact digest, supports exact match |
| repository | Search vulnerability information by repository name, supports exact match |
| package | Search vulnerability information by package name, supports exact match |
| tag | Search vulnerability information by tag name, supports exact match |
220+
221+
An example of the query condition:
222+
223+
```
GET /api/v2.0/security/vul?q=cve_id=CVE-2023-12345,cvss_v3_score=[7.0~10.0],severity=Critical,project_id=1,repository=library/nuxas,package=nfs-utils,tag=v2.3.0
```
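
To show how such a `q` string breaks down, here is a simplified Python parser. It is not Harbor's lib/q implementation: it only handles the two forms used in the example above, exact matches (`key=value`) and closed numeric ranges (`key=[low~high]`), and the function name is hypothetical.

```python
import re

# Matches a closed range such as [7.0~10.0].
RANGE = re.compile(r"^\[(?P<low>[^~\]]+)~(?P<high>[^~\]]+)\]$")


def parse_q(q):
    """Parse a comma-separated q string into a dict of conditions."""
    conditions = {}
    for part in q.split(","):
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed condition: {part!r}")
        match = RANGE.match(value)
        if match:
            # Ranges become a (low, high) tuple of floats.
            conditions[key] = (float(match["low"]), float(match["high"]))
        else:
            conditions[key] = value
    return conditions


example = parse_q("cve_id=CVE-2023-12345,cvss_v3_score=[7.0~10.0],severity=Critical")
print(example)
```

Each parsed condition would then map to one additional predicate in the search query.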
226+
227+
The SQL query for this API:
228+
229+
```
select vr.cve_id,
       vr.cvss_score_v3,
       vr.package,
       vr.package_version,
       vr.severity,
       vr.fixed_version,
       vr.description,
       vr.urls,
       a.repository_name,
       a.id artifact_id,
       a.digest,
       a.project_id
from artifact a,
     scan_report s,
     report_vulnerability_record rvr,
     vulnerability_record vr
where a.digest = s.digest
  and s.uuid = rvr.report_uuid
  and rvr.vuln_record_id = vr.id
  and rvr.report_uuid is not null
  and vr.registration_uuid = ?
```
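
The join can be tried out on a toy dataset. The sketch below uses Python's `sqlite3` as a stand-in for Harbor's database, with a heavily reduced, hypothetical version of the four tables and made-up rows; it rewrites the join with explicit `JOIN` clauses and adds a `cve_id` condition to show how one search filter narrows the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical schema: only the columns the query needs.
conn.executescript("""
CREATE TABLE artifact (id INTEGER PRIMARY KEY, project_id INTEGER,
                       repository_name TEXT, digest TEXT);
CREATE TABLE scan_report (uuid TEXT PRIMARY KEY, digest TEXT, registration_uuid TEXT);
CREATE TABLE vulnerability_record (id INTEGER PRIMARY KEY, cve_id TEXT,
                                   registration_uuid TEXT, package TEXT,
                                   severity TEXT, cvss_score_v3 REAL);
CREATE TABLE report_vulnerability_record (report_uuid TEXT, vuln_record_id INTEGER);
INSERT INTO artifact VALUES (1, 1, 'library/nuxas', 'sha256:aaa');
INSERT INTO artifact VALUES (2, 1, 'library/other', 'sha256:bbb');
INSERT INTO scan_report VALUES ('r1', 'sha256:aaa', 'scanner-1');
INSERT INTO scan_report VALUES ('r2', 'sha256:bbb', 'scanner-1');
INSERT INTO vulnerability_record VALUES (10, 'CVE-2022-32221', 'scanner-1', 'curl', 'Critical', 9.8);
INSERT INTO vulnerability_record VALUES (11, 'CVE-0000-0001', 'scanner-1', 'nfs-utils', 'Low', 2.0);
INSERT INTO report_vulnerability_record VALUES ('r1', 10);
INSERT INTO report_vulnerability_record VALUES ('r2', 10);
INSERT INTO report_vulnerability_record VALUES ('r2', 11);
""")

QUERY = """
SELECT vr.cve_id, vr.cvss_score_v3, vr.package, a.repository_name, a.digest
FROM artifact a
JOIN scan_report s ON a.digest = s.digest
JOIN report_vulnerability_record rvr ON s.uuid = rvr.report_uuid
JOIN vulnerability_record vr ON rvr.vuln_record_id = vr.id
WHERE vr.registration_uuid = ?
  AND vr.cve_id = ?
ORDER BY a.repository_name
"""

# Find every artifact affected by one CVE for one scanner registration.
rows = conn.execute(QUERY, ("scanner-1", "CVE-2022-32221")).fetchall()
for row in rows:
    print(row)
```

This is the "search involved images by CVE ID" case from the Background section: one CVE record fans out to every artifact whose report references it.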
242+
243+
Database schema change:
244+
245+
246+
scan_report:
247+
```
alter table scan_report add column IF NOT EXISTS critical_cnt int;
alter table scan_report add column IF NOT EXISTS high_cnt int;
alter table scan_report add column IF NOT EXISTS medium_cnt int;
alter table scan_report add column IF NOT EXISTS low_cnt int;
alter table scan_report add column IF NOT EXISTS none_cnt int;
alter table scan_report add column IF NOT EXISTS unknown_cnt int;
alter table scan_report add column IF NOT EXISTS fixable_cnt int;
```
256+
257+
Besides the APIs above, some additional refactoring work is needed.
258+
259+
1. To improve performance, refactor the scan report to store summary information (such as the total, high, medium, low and fixable counts) with each scan report, so that the summary query can aggregate these data without joining other tables.
260+
2. Refactor the process that inserts CVEs from a scan report and normalize the data inserted into the table. Currently cvss_v3_score is empty; we need to extract this information from the vendor attribute data and store it in the cvss_v3_score column.
261+
3. The previous scan report table doesn't contain any critical_cnt, high_cnt, medium_cnt, low_cnt, none_cnt, unknown_cnt or fixable_cnt information; we need to extract this information from the vendor attribute data and store it in the vendor_attribute column.
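
The score extraction mentioned in item 2 might look roughly like the Python sketch below. The layout of the vendor attribute data (a `CVSS` object keyed by vendor, each with a `V3Score` field) and the vendor names `nvd` and `redhat` are assumptions about the scanner report, not something this proposal defines.

```python
def cvss_v3_score(vendor_attributes, vendors=("nvd", "redhat")):
    """Return the first CVSS v3 score found for the preferred vendors, or None."""
    # Assumed layout: {"CVSS": {"<vendor>": {"V3Score": <number>, ...}, ...}}
    cvss = (vendor_attributes or {}).get("CVSS") or {}
    for vendor in vendors:
        score = (cvss.get(vendor) or {}).get("V3Score")
        if isinstance(score, (int, float)) and not isinstance(score, bool):
            return float(score)
    return None


print(cvss_v3_score({"CVSS": {"nvd": {"V3Score": 9.8}}}))
print(cvss_v3_score({"CVSS": {"redhat": {"V3Score": 7}}}))
print(cvss_v3_score({}))
```

Returning `None` when no score is present matters because the top-5 CVE query above filters on a non-null score.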
262+
263+
## UI work
264+
265+
The draft UI of the Security Hub:
266+
Summary:
267+
![image](../images/securityhub/securityhub-1.png)
268+
Search vulnerability:
269+
![image](../images/securityhub/securityhub-2.png)
270+
271+
272+
## Open Questions
273+
274+
1. The current Trivy adapter report doesn't contain the `preferred_cvss` attribute. As a workaround, until the Trivy adapter provides this information in the scan report, we need to extract the score from the vendor attribute data and store it in `cvss_v3_score`. The final solution is to update the pluggable-scanner-spec to add the `cvss_v3_score` attribute. There may be score information from other vendors, but only these two vendors' score information is supported when searching. The score information will be stored in the `vulnerability_record` table's vendor_attribute column.
275+
276+
2. Performance considerations: a typical registry might have 10000+ artifacts, and each artifact might have 1000+ CVEs, so the report_vulnerability_record table will have 10000000+ records. Query performance is a big concern; we need to refactor the SQL queries for better performance and add indexes to the table. Furthermore, we will limit the records returned by a query to 100 and add the total count to the response header. All queries should return within 1 minute.
277+
278+
3. The current implementation is based on the database. It is possible to use other storage in the future, such as Elasticsearch. If we use Elasticsearch, we need to add Elasticsearch support to the post-scan job to index each CVE record, and add Elasticsearch support to the query API.
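
The 100-record limit from question 2 can be sketched as a small paging helper in Python. The function name and the in-memory list are illustrative; a real implementation would push the limit and offset into the SQL query.

```python
MAX_PAGE_SIZE = 100  # per-query record limit named in question 2


def paginate(rows, page=1, page_size=MAX_PAGE_SIZE):
    """Return one page of rows plus the total-count response header."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Requests for more than the limit are clamped rather than rejected.
    size = min(page_size, MAX_PAGE_SIZE)
    start = (page - 1) * size
    headers = {"x-total-count": str(len(rows))}
    return rows[start:start + size], headers


items, headers = paginate(list(range(250)), page=3, page_size=500)
print(len(items), headers)
```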
