Merged
28 changes: 14 additions & 14 deletions .ci/benchmark.txt
@@ -225,26 +225,26 @@ FileType FileNumber ValidLines Positives Negatives Templat
.zsh 6 872 12
.zsh-theme 1 97 1
TOTAL: 10343 16352935 12112 46625 4907
-credsweeper result_cnt : 11860, lost_cnt : 0, true_cnt : 11648, false_cnt : 212
+credsweeper result_cnt : 11901, lost_cnt : 0, true_cnt : 11687, false_cnt : 214
Rules Positives Negatives Templates Reported TP FP TN FN FPR FNR ACC PRC RCL F1
------------------------------ ----------- ----------- ----------- ---------- ----- ---- ----- ---- -------- -------- -------- -------- -------- --------
API 125 3172 187 122 122 0 3359 3 0.000000 0.024000 0.999139 1.000000 0.976000 0.987854
AWS Client ID 170 19 0 162 162 0 19 8 0.000000 0.047059 0.957672 1.000000 0.952941 0.975904
AWS Multi 82 10 0 84 82 1 9 0 0.100000 0.000000 0.989130 0.987952 1.000000 0.993939
AWS S3 Bucket 67 23 0 92 67 23 0 0 1.000000 0.000000 0.744444 0.744444 1.000000 0.853503
Atlassian Old PAT token 3 7 0 10 3 7 0 0 1.000000 0.000000 0.300000 0.300000 1.000000 0.461538
-Auth 417 2744 81 396 395 1 2824 22 0.000354 0.052758 0.992906 0.997475 0.947242 0.971710
+Auth 417 2744 81 400 399 1 2824 18 0.000354 0.043165 0.994139 0.997500 0.956835 0.976744
Azure Access Token 19 0 0 12 12 0 0 7 0.368421 0.631579 1.000000 0.631579 0.774194
BASE64 Private Key 12 4 0 12 12 0 4 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
BASE64 encoded PEM Private Key 7 0 0 5 5 0 0 2 0.285714 0.714286 1.000000 0.714286 0.833333
Bitbucket Client ID 19 52 0 72 17 52 0 2 1.000000 0.105263 0.239437 0.246377 0.894737 0.386364
Bitbucket Client Secret 29 75 1 104 27 75 1 2 0.986842 0.068966 0.266667 0.264706 0.931034 0.412214
CMD ConvertTo-SecureString 13 4 0 13 13 0 4 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-CMD Password 27 128 0 25 25 0 128 2 0.000000 0.074074 0.987097 1.000000 0.925926 0.961538
+CMD Password 27 128 0 27 27 0 128 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
CMD Secret 1 1 0 1 1 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-CMD Token 6 0 0 6 6 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
-Certificate 24 471 0 19 19 0 471 5 0.000000 0.208333 0.989899 1.000000 0.791667 0.883721
-Credential 91 422 76 90 90 0 498 1 0.000000 0.010989 0.998302 1.000000 0.989011 0.994475
+CMD Token 6 0 0 5 5 0 0 1 0.166667 0.833333 1.000000 0.833333 0.909091
+Certificate 24 471 0 20 20 0 471 4 0.000000 0.166667 0.991919 1.000000 0.833333 0.909091
+Credential 91 422 76 92 91 1 497 0 0.002008 0.000000 0.998302 0.989130 1.000000 0.994536
Docker Swarm Token 2 0 0 1 1 0 0 1 0.500000 0.500000 1.000000 0.500000 0.666667
Dropbox App secret 59 141 0 42 31 10 131 28 0.070922 0.474576 0.810000 0.756098 0.525424 0.620000
Facebook Access Token 0 1 0 0 0 1 0 0.000000 1.000000
@@ -259,21 +259,21 @@ Grafana Provisioned API Key 22 1 0
JSON Web Token 170 61 0 131 131 0 61 39 0.000000 0.229412 0.831169 1.000000 0.770588 0.870432
Jira / Confluence PAT token 0 4 0 0 0 4 0 0.000000 1.000000
Jira 2FA 21 0 1 15 15 0 1 6 0.000000 0.285714 0.727273 1.000000 0.714286 0.833333
-Key 3916 15714 482 3921 3902 19 16177 14 0.001173 0.003575 0.998359 0.995154 0.996425 0.995789
-Nonce 93 49 0 93 92 1 48 1 0.020408 0.010753 0.985915 0.989247 0.989247 0.989247
+Key 3916 15714 482 3931 3910 21 16175 6 0.001297 0.001532 0.998658 0.994658 0.998468 0.996559
+Nonce 93 49 0 92 92 0 49 1 0.000000 0.010753 0.992958 1.000000 0.989247 0.994595
Other 9 7450 5 0 0 7455 9 0.000000 1.000000 0.998794 0.000000
PEM Private Key 1019 1483 0 1023 1019 4 1479 0 0.002697 0.000000 0.998401 0.996090 1.000000 0.998041
-Password 2032 7527 2539 1951 1946 5 10061 86 0.000497 0.042323 0.992478 0.997437 0.957677 0.977153
+Password 2032 7527 2539 1970 1965 5 10061 67 0.000497 0.032972 0.994049 0.997462 0.967028 0.982009
SQL Password 44 13 0 41 41 0 13 3 0.000000 0.068182 0.947368 1.000000 0.931818 0.964706
Salesforce Credentials 2 0 0 2 2 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
-Salt 49 74 1 46 46 0 75 3 0.000000 0.061224 0.975806 1.000000 0.938776 0.968421
-Secret 1310 1567 799 1303 1303 0 2366 7 0.000000 0.005344 0.998096 1.000000 0.994656 0.997321
+Salt 49 74 1 47 47 0 75 2 0.000000 0.040816 0.983871 1.000000 0.959184 0.979167
+Secret 1310 1567 799 1304 1304 0 2366 6 0.000000 0.004580 0.998368 1.000000 0.995420 0.997705
Seed 1 6 0 0 0 6 1 0.000000 1.000000 0.857143 0.000000
Slack Token 4 1 0 4 4 0 1 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
Stripe Credentials 2 0 0 2 2 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
Tencent WeChat API App ID 6 0 0 6 6 0 0 0 0.000000 1.000000 1.000000 1.000000 1.000000
-Token 647 4169 453 626 626 0 4622 21 0.000000 0.032457 0.996014 1.000000 0.967543 0.983504
+Token 647 4169 453 628 628 0 4622 19 0.000000 0.029366 0.996394 1.000000 0.970634 0.985098
Twilio Credentials 30 39 0 30 30 0 39 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
-URL Credentials 224 168 197 223 223 0 365 1 0.000000 0.004464 0.998302 1.000000 0.995536 0.997763
+URL Credentials 224 168 197 224 224 0 365 0 0.000000 0.000000 1.000000 1.000000 1.000000 1.000000
UUID 1075 265 0 1074 1073 1 264 2 0.003774 0.001860 0.997761 0.999069 0.998140 0.998604
-12112 46625 4907 11872 11648 212 46413 464 0.004547 0.038309 0.988491 0.982125 0.961691 0.971800
+12112 46625 4907 11913 11687 214 46411 425 0.004590 0.035089 0.989121 0.982018 0.964911 0.973389
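
For reference, the rate columns in the table above can be reproduced from the TP/FP/TN/FN counts. The sketch below uses the standard confusion-matrix definitions; it is not the benchmark's own scoring code, and the `Counts` class, the `_ratio` helper, and the choice to return `None` when a denominator is zero are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


def _ratio(num: int, den: int) -> Optional[float]:
    """Return num/den, or None when the denominator is zero (assumed convention)."""
    return num / den if den else None


@dataclass(frozen=True)
class Counts:
    """Hypothetical container for one rule's confusion-matrix counts."""
    tp: int
    fp: int
    tn: int
    fn: int

    def metrics(self) -> Dict[str, Optional[float]]:
        tp, fp, tn, fn = self.tp, self.fp, self.tn, self.fn
        return {
            "FPR": _ratio(fp, fp + tn),                 # false positive rate
            "FNR": _ratio(fn, fn + tp),                 # false negative rate
            "ACC": _ratio(tp + tn, tp + fp + tn + fn),  # accuracy
            "PRC": _ratio(tp, tp + fp),                 # precision
            "RCL": _ratio(tp, tp + fn),                 # recall
            "F1": _ratio(2 * tp, 2 * tp + fp + fn),     # harmonic mean of PRC and RCL
        }


# Counts taken from the updated "Auth" row: TP=399, FP=1, TN=2824, FN=18.
auth = Counts(tp=399, fp=1, tn=2824, fn=18).metrics()
for name, value in auth.items():
    print(f"{name} {value:.6f}")
```

With those counts the sketch gives FNR 0.043165 and F1 0.976744, which agrees with the updated Auth row.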
4 changes: 2 additions & 2 deletions .github/workflows/check.yml
@@ -38,8 +38,8 @@ jobs:
     - name: Check ml_config.json and ml_model.onnx integrity
       if: ${{ always() && steps.code_checkout.conclusion == 'success' }}
       run: |
-        md5sum --binary credsweeper/ml_model/ml_config.json | grep 3a4bfcd6f3ea74461b158d4ec073cc06
-        md5sum --binary credsweeper/ml_model/ml_model.onnx | grep 9725b166e07e60f94929fea986f84ae2
+        md5sum --binary credsweeper/ml_model/ml_config.json | grep a412732e34e61dfd0128044c759a8ea7
+        md5sum --binary credsweeper/ml_model/ml_model.onnx | grep f3a6fd6ba6b3440e280eaf268a7ec204

# # # line ending

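The workflow step above pins both model files to an MD5 digest. A roughly equivalent local check could look like the sketch below. The helper names `md5_hex` and `verify` are hypothetical, the paths and digests are the updated ones from this diff, and unlike `grep` on the `md5sum` output the sketch compares the whole digest for equality.

```python
import hashlib
from pathlib import Path

# Expected digests copied from the updated workflow step.
EXPECTED = {
    "credsweeper/ml_model/ml_config.json": "a412732e34e61dfd0128044c759a8ea7",
    "credsweeper/ml_model/ml_model.onnx": "f3a6fd6ba6b3440e280eaf268a7ec204",
}


def md5_hex(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in binary mode, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path: Path, expected: str) -> bool:
    """Return True when the file's MD5 digest equals the expected hex string."""
    return md5_hex(path) == expected.lower()


# Only files that exist in the current working directory are checked.
for name, expected in EXPECTED.items():
    target = Path(name)
    if target.exists():
        print(name, "OK" if verify(target, expected) else "MISMATCH")
    else:
        print(name, "not found, skipped")
```

Run from the repository root, this reports a mismatch whenever the model or its config changes without the pinned digest being updated, which is the situation the CI step is meant to catch.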
4 changes: 2 additions & 2 deletions SECURITY.md
@@ -4,8 +4,8 @@

 | Version | Supported |
 |---------|--------------------|
-| 1.10.x | :white_check_mark: |
-| <1.10.x | :x: |
+| 1.11.x | :white_check_mark: |
+| <1.11.x | :x: |

## Reporting a Vulnerability

2 changes: 1 addition & 1 deletion credsweeper/__init__.py
@@ -18,4 +18,4 @@
     '__version__'
 ]

-__version__ = "1.10.8"
+__version__ = "1.11.0"
2 changes: 2 additions & 0 deletions credsweeper/common/morpheme_checklist.txt
@@ -960,6 +960,7 @@ nish
 nism
 node
 non
+nope
 norm
 not
 nsive
@@ -1529,6 +1530,7 @@ warn
 watch
 wave
 way
+weak
 web
 week
 weight
4 changes: 3 additions & 1 deletion credsweeper/ml_model/features/__init__.py
@@ -6,7 +6,9 @@
 from credsweeper.ml_model.features.morpheme_dense import MorphemeDense
 from credsweeper.ml_model.features.rule_name import RuleName
 from credsweeper.ml_model.features.search_in_attribute import SearchInAttribute
-from credsweeper.ml_model.features.word_in_line import WordInLine
 from credsweeper.ml_model.features.word_in_path import WordInPath
+from credsweeper.ml_model.features.word_in_postamble import WordInPostamble
+from credsweeper.ml_model.features.word_in_preamble import WordInPreamble
+from credsweeper.ml_model.features.word_in_transition import WordInTransition
 from credsweeper.ml_model.features.word_in_value import WordInValue
 from credsweeper.ml_model.features.word_in_variable import WordInVariable
32 changes: 32 additions & 0 deletions credsweeper/ml_model/features/word_in_postamble.py
@@ -0,0 +1,32 @@
+from typing import List
+
+import numpy as np
+
+from credsweeper.common.constants import ML_HUNK
+from credsweeper.credentials import Candidate
+from credsweeper.ml_model.features.word_in import WordIn
+
+
+class WordInPostamble(WordIn):
+    """Feature is true if line contains at least one word from predefined list."""
+
+    def __init__(self, words: List[str]) -> None:
+        """Feature returns array of matching words
+
+        Args:
+            words: list of predefined words - MUST BE IN LOWER CASE
+
+        """
+        super().__init__(words)
+
+    def extract(self, candidate: Candidate) -> np.ndarray:
+        """Returns true if any words in a part of line after value"""
+        postamble_end = len(candidate.line_data_list[0].line) \
+            if len(candidate.line_data_list[0].line) < candidate.line_data_list[0].value_end + ML_HUNK \
+            else candidate.line_data_list[0].value_end + ML_HUNK
+        postamble = candidate.line_data_list[0].line[candidate.line_data_list[0].value_end:postamble_end].strip()
+
+        if postamble:
+            return self.word_in_str(postamble.lower())
+        else:
+            return np.array([np.zeros(shape=[self.dimension], dtype=np.int8)])
37 changes: 37 additions & 0 deletions credsweeper/ml_model/features/word_in_preamble.py
@@ -0,0 +1,37 @@
+from typing import List
+
+import numpy as np
+
+from credsweeper.common.constants import ML_HUNK
+from credsweeper.credentials import Candidate
+from credsweeper.ml_model.features.word_in import WordIn
+
+
+class WordInPreamble(WordIn):
+    """Feature is true if line contains at least one word from predefined list."""
+
+    def __init__(self, words: List[str]) -> None:
+        """Feature returns array of matching words
+
+        Args:
+            words: list of predefined words - MUST BE IN LOWER CASE
+
+        """
+        super().__init__(words)
+
+    def extract(self, candidate: Candidate) -> np.ndarray:
+        """Returns true if any words in line before variable or value"""
+        if 0 <= candidate.line_data_list[0].variable_start:
+            preamble_start = 0 if ML_HUNK >= candidate.line_data_list[0].variable_start \
+                else candidate.line_data_list[0].variable_start - ML_HUNK
+            preamble = candidate.line_data_list[0].line[preamble_start:candidate.line_data_list[0].
+                                                        variable_start].strip()
+        else:
+            preamble_start = 0 if ML_HUNK >= candidate.line_data_list[0].value_start \
+                else candidate.line_data_list[0].value_start - ML_HUNK
+            preamble = candidate.line_data_list[0].line[preamble_start:candidate.line_data_list[0].value_start].strip()
+
+        if preamble:
+            return self.word_in_str(preamble.lower())
+        else:
+            return np.array([np.zeros(shape=[self.dimension], dtype=np.int8)])
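
The two new features each cut a bounded window out of the candidate's line: up to `ML_HUNK` characters before the variable (or before the value when no variable was found), and up to `ML_HUNK` characters after the value. A minimal standalone sketch of that clamping follows. The value `80` for `ML_HUNK` is a placeholder, since the real constant is not shown in this diff, and the function names and the sample line are made up for illustration.

```python
ML_HUNK = 80  # placeholder; the real value lives in credsweeper.common.constants


def preamble(line: str, variable_start: int, value_start: int, hunk: int = ML_HUNK) -> str:
    """Text of at most `hunk` characters before the variable, or before the value
    when variable_start is negative (no variable found)."""
    anchor = variable_start if variable_start >= 0 else value_start
    start = max(0, anchor - hunk)  # same as: 0 if hunk >= anchor else anchor - hunk
    return line[start:anchor].strip()


def postamble(line: str, value_end: int, hunk: int = ML_HUNK) -> str:
    """Text of at most `hunk` characters after the value."""
    end = min(len(line), value_end + hunk)  # clamp the window to the line length
    return line[value_end:end].strip()


line = 'export PASSWORD = "s3cr3t"  # weak example'
variable_start = line.index("PASSWORD")
value_start = line.index("s3cr3t")
value_end = value_start + len("s3cr3t")

print(repr(preamble(line, variable_start, value_start)))
print(repr(postamble(line, value_end)))
```

The `max`/`min` forms are behaviorally the same as the conditional expressions in the diff; they are used here only because they make the clamping easier to read.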
@@ -2,13 +2,11 @@

 import numpy as np

-from credsweeper.common.constants import CHUNK_SIZE
 from credsweeper.credentials import Candidate
 from credsweeper.ml_model.features.word_in import WordIn
-from credsweeper.utils import Util


-class WordInLine(WordIn):
+class WordInTransition(WordIn):
     """Feature is true if line contains at least one word from predefined list."""

     def __init__(self, words: List[str]) -> None:
@@ -21,9 +19,14 @@ def __init__(self, words: List[str]) -> None:
         super().__init__(words)

     def extract(self, candidate: Candidate) -> np.ndarray:
-        """Returns true if any words in first line"""
-        subtext = Util.subtext(candidate.line_data_list[0].line, candidate.line_data_list[0].value_start, CHUNK_SIZE)
-        if subtext:
-            return self.word_in_str(subtext.lower())
+        """Returns true if any words between variable and value"""
+        if 0 <= candidate.line_data_list[0].variable_end < candidate.line_data_list[0].value_start:
+            transition = candidate.line_data_list[0].line[candidate.line_data_list[0].variable_end:candidate.
+                                                          line_data_list[0].value_start].strip()
+        else:
+            transition = ''
+
+        if transition:
+            return self.word_in_str(transition.lower())
         else:
             return np.array([np.zeros(shape=[self.dimension], dtype=np.int8)])
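
The renamed feature now looks only at the text between the end of the variable and the start of the value, and falls back to an all-zero vector when that span is empty or no variable was found. The sketch below illustrates the idea with plain lists. `word_vector` is a hypothetical stand-in for `WordIn.word_in_str`, whose implementation is not part of this diff, and substring matching is an assumption.

```python
from typing import List


def transition(line: str, variable_end: int, value_start: int) -> str:
    """Text between variable and value; empty when there is no variable
    (negative variable_end) or the variable does not precede the value."""
    if 0 <= variable_end < value_start:
        return line[variable_end:value_start].strip()
    return ""


def word_vector(words: List[str], text: str) -> List[int]:
    """One slot per predefined word: 1 if the word occurs in the text, else 0.
    Assumed behavior; an empty text yields the all-zero vector."""
    lowered = text.lower()
    return [1 if word in lowered else 0 for word in words]


line = 'export PASSWORD = "s3cr3t"  # weak example'
variable_end = line.index("PASSWORD") + len("PASSWORD")
value_start = line.index("s3cr3t")

words = ["=", ":", "->"]  # hypothetical word list, lower case as the docstring requires
print(repr(transition(line, variable_end, value_start)))
print(word_vector(words, transition(line, variable_end, value_start)))
print(word_vector(words, transition(line, -1, value_start)))
```

Restricting the span this way keeps separators such as `=` or `:` apart from words that appear elsewhere on the line, which the preamble and postamble features cover.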