Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

0.2.27 (2026-03-05)

Features

  • add CatBoost .cbm scanner support (#627) (9138066)
  • add CNTK scanner support (#629) (74a60b9)
  • add CoreML .mlmodel scanner support (#635) (4e24291)
  • add llamafile executable scanner support (#634) (8d2c37d)
  • add Model Metadata Extractor feature (#383) (ff66f33)
  • add native LightGBM scanner support (#633) (d3aca64)
  • add R serialized scanner support (#628) (e27667c)
  • add RKNN scanner support (#631) (f1bbfb7)
  • add standalone compressed wrapper scanner (#630) (c5f0dba)
  • add TensorFlow MetaGraph scanner support (#637) (7c3c25d)
  • add Torch7 scanner support (#632) (2e6f2c4)
  • security: add CVE-2019-6446 attribution for NumPy object dtype RCE (#610) (5d707b5)
  • security: add CVE-2022-25882 attribution to ONNX external_data path traversal (#606) (4d69e83)
  • security: add CVE-2024-3660 Lambda code injection attribution (#604) (60ca40f)
  • security: add NeMo scanner for CVE-2025-23304 Hydra target injection (#609) (6d2dee3)
  • security: detect 3 PyTorch CVEs (JIT eval, RPC injection, RemoteModule RCE) (#611) (98f2af6)
  • security: detect 4 PyTorch CVEs via static scanning (#595) (024f583)
  • security: detect CVE-2024-27318 ONNX nested path traversal bypass (#607) (fe8837c)
  • security: detect CVE-2025-10155 pickle protocol 0/1 bypass via .bin extension (#605) (88a5901)
  • security: detect CVE-2025-1550 Keras safe_mode bypass (#599) (432c383)
  • security: detect CVE-2025-1716 pickle bypass via pip.main() (#598) (2f2ae20)
  • security: detect CVE-2025-49655 TorchModuleWrapper RCE (#600) (0c12d2d)
  • security: detect CVE-2025-51480 ONNX save_external_data file overwrite (#608) (fe04271)
  • security: detect CVE-2025-8747 get_file gadget bypass (#602) (16308d0)
  • security: detect CVE-2025-9905 H5 safe_mode bypass (#603) (1676693)
  • security: detect CVE-2025-9906 Keras enable_unsafe_deserialization config bypass (#601) (b493806)

Bug Fixes

  • block joblib.load pickle trampoline (#626) (966c223)
  • ci: resolve 4 release pipeline failures (#572) (7e2e7ed)
  • ci: resolve Ruff failures on main (#621) (bd186f0)
  • cli: surface operational scan error status in text output (#578) (ddbbec6)
  • close pickle EXT opcode bypass (#623) (ffb5ec1)
  • deps: promote msgpack to core dependency for Flax scanner (#583) (ebba6b2)
  • detect proto0/1 pickles inside zip entries (#624) (2bce49d)
  • downgrade non-traversal ONNX external data refs to WARNING (#642) (44eb3ab)
  • eliminate false positive in skops Unsafe Joblib Fallback Detection (#584) (c1dd2a6)
  • handle MemoryError gracefully for joblib/sklearn pickle files (#645) (f8599fe)
  • pickle-scanner: three targeted false-positive reductions (#591) (7a5567e)
  • preserve opcode analysis on malformed pickle tails (#625) (4fe4dee)
  • prevent false positives in TF SavedModel scanner (#588) (89282e2)
  • report actual file size in scan summary when scanner exits early (#587) (7d066fb)
  • resolve false positive for .keras ZIP files (Keras 3.x) (#582) (f575769)
  • resolve ONNX weight extraction failure (#589) (3f54602)
  • security: close scanner RCE bypasses and add regressions (#518) (e736ebb)
  • security: harden pickle scanner blocklist and multi-stream analysis (#581) (f0c7246)
  • stabilize nightly performance CI and optimize pickle opcode analysis (#619) (e5dcec5)
  • suppress false positives in PaddlePaddle scanner (#586) (ec7fc48)
  • tests: prevent multiple_stream_attack fixture rewrites (#580) (0eb47c9)
  • tests: resolve 3 nightly CI failures across Linux and Windows (#576) (dd115d1)
  • tests: resolve nightly CI failures on Linux and Windows (#597) (7f88c52)
  • tflite: recognize .tflite format without tflite package installed (#585) (8276184)
  • tighten metadata URL hostname matching (#617) (c2af8c1)

Documentation

  • add CVE detection checklist from 13 CVE implementation learnings (#612) (7ea1869)
  • audit and refresh README, user docs, and maintainer guides (#643) (015acdc)
  • rewrite SECURITY.md with comprehensive vulnerability policy (#594) (968a2c2)
  • update scanner architecture example (#579) (20de35d)

Added

  • tests: enable existing PaddlePaddle scanner tests in CI by adding test_paddle_scanner.py to the allowed test files list (Python 3.10/3.12/3.13)
  • security: detect CVE-2026-24747 PyTorch weights_only=True bypass via SETITEM/SETITEMS abuse and tensor metadata mismatch detection
  • security: detect CVE-2022-45907 PyTorch torch.jit.annotations.parse_type_line unsafe eval() injection (CVSS 9.8)

Fixed

  • security: treat legacy httplib pickle globals the same as http.client, including import-only and REDUCE findings in standalone and archived payloads
  • security: detect import-only pickle GLOBAL/STACK_GLOBAL references while preserving safe constructor imports and avoiding mislabeling executed call chains as import-only
  • security: harden TensorFlow weight extraction limits to bound actual tensor payload materialization, including malformed tensor_content and string-backed tensors, and continue scanning past oversized Const nodes
  • security: stream TAR members to temp files under size limits instead of buffering whole entries in memory during scan
  • security: inspect TensorFlow SavedModel function definitions when scanning for dangerous ops and protobuf string abuse, with function-aware finding locations
  • cli: include streamed artifacts as SBOM components when scan --stream --sbom is used
  • cli: exclude HuggingFace download cache bookkeeping files from remote SBOMs and asset lists
  • security: require official or explicitly allowlisted JFrog hosts before treating /artifactory/ URLs as authenticated JFrog endpoints
  • security: detect CVE-2024-5480 PyTorch torch.distributed.rpc arbitrary function execution via PythonUDF (CVSS 10.0)
  • security: detect CVE-2024-48063 PyTorch torch.distributed.rpc.RemoteModule deserialization RCE via pickle (CVSS 9.8)
  • security: detect CVE-2019-6446 in NumPy scanner when object-dtype arrays are found, with warning-level attribution (CVSS 9.8) due to potential pickle deserialization via allow_pickle=True
  • security: new NeMo scanner detecting CVE-2025-23304 Hydra _target_ injection in .nemo model files (CVSS 7.6), with recursive config inspection and dangerous callable blocklist
  • security: detect CVE-2025-51480 ONNX save_external_data arbitrary file overwrite via external_data path traversal (CVSS 8.8)
  • security: detect CVE-2025-49655 TorchModuleWrapper deserialization RCE (CVSS 9.8)
  • security: add CatBoost .cbm scanner with strict CBM1 format validation, bounded parsing, and suspicious command/network/script indicator checks
  • security: add dedicated scanner support for R serialized artifacts (.rds, .rda, .rdata) with bounded decompression and static detection of executable symbol/payload indicators
  • security: add CNTK .dnn/.cmf scanner with strict signature validation, bounded reads, and multi-signal suspicious content correlation
  • feat: add standalone compressed-wrapper scanner support for .gz, .bz2, .xz, .lz4, and .zlib with strict signature validation, decompression size/ratio safeguards, and inner-payload scanner routing
  • security: add RKNN .rknn scanner with strict RKNN signature detection, bounded metadata parsing, and contextual command/network/obfuscation checks
  • security: add Torch7 (.t7, .th, .net) scanner with strict signature heuristics plus Lua execution primitive and dynamic module-loading detection
  • security: add native LightGBM scanner for .lgb/.lightgbm and signature-validated .model artifacts with strict XGBoost collision disambiguation and static command/network/path indicator checks
  • feat: add Llamafile executable scanner with bounded runtime-string analysis and embedded GGUF payload carving/forwarding
  • feat: add CoreML .mlmodel scanner with strict protobuf structure validation, custom layer/custom model detection, metadata abuse checks, and linked-model path safety checks
  • feat: add MXNet scanner support for paired *-symbol.json and *-NNNN.params artifacts with strict contract validation, companion-file checks, and suspicious reference/payload detection
  • security: add TensorFlow MetaGraph (.meta) scanner support with strict protobuf can_handle(), bounded MetaGraph parsing, unsafe op detection (PyFunc/PyCall/LoadLibrary), executable-context string checks, and payload-stuffing anomaly controls
  • security: add dedicated TorchServe .mar scanner with strict archive validation, bounded manifest/member reads, manifest policy checks, and recursive embedded payload scanning
  • security: detect CVE-2025-1716 pickle bypass via pip.main() as dangerous callable (CVSS 9.8)
  • keras: detect CVE-2025-9906 enable_unsafe_deserialization config bypass in .keras archives (CVSS 8.6, safe_mode bypass)
  • security: detect CVE-2025-8747 Keras get_file gadget safe_mode bypass
  • keras: detect CVE-2025-9905 H5 safe_mode bypass for Lambda layers (CVSS 7.3)
  • keras: add CVE-2024-3660 attribution to Lambda layer detection in .keras and .h5 scanners (CVSS 9.8)
  • keras: recursively inspect H5 training_config and .keras compile_config for custom losses and metrics, while allowlisting standard aliases and built-in preprocessing layers to reduce false positives
  • security: detect CVE-2025-10155 pickle protocol 0/1 payloads disguised as .bin files by extending detect_file_format() to recognize GLOBAL opcode patterns and adding posix/nt internal module names to binary code pattern blocklist
  • security: detect CVE-2022-25882 ONNX external_data path traversal with CVE attribution, CVSS score, and CWE classification in scan results
  • security: detect CVE-2024-27318 ONNX nested external_data path traversal bypass via path segment sanitization evasion

Security

  • keras: detect CVE-2025-1550 arbitrary module references in .keras config.json (CVSS 9.8, safe_mode bypass)

Fixed

  • security: scan OCI layer members based on registered file extensions so embedded ONNX, Keras H5, and other real-path scanners are no longer skipped inside tar layers
  • security: resolve bare-module TorchServe handler references like custom_handler to concrete archive members so malicious handler source is no longer skipped by static analysis
  • security: compare archive entry paths against the intended extraction root without following base-directory symlinks
  • security: stop loading .env files implicitly during JFrog helper import so untrusted working directories cannot rewrite proxy or auth-related environment variables
  • rules: preserve rule_code metadata through direct result aggregation and ensure dangerous advanced pickle globals emit explicit rule codes (with regression coverage)
  • rules: ignore unknown rule IDs in config files with warning logs, normalize rule-code casing in config parsing, and prevent invalid severity entries from being applied
  • telemetry: refresh the cached telemetry client when runtime context changes and lazily initialize PostHog when telemetry is re-enabled in-process
  • tests: add scanner literal rule_code registry-consistency coverage to catch unknown rule identifiers early
  • cloud: harden cache path handling to prevent sibling-prefix bypasses from escaping cache boundaries, avoid deleting out-of-cache metadata paths during cleanup, and clean temporary cloud download directories on failure
  • tests: unskip and restore cloud disk-space failure coverage; add regressions for cache boundary enforcement and temp-directory cleanup on download errors
  • security: harden pickle scanner stack resolution to correctly track STACK_GLOBAL and memoized REDUCE call targets, preventing decoy-string and BINGET bypasses
  • security: flag pickle EXT1/EXT2/EXT4 extension-registry call targets in REDUCE analysis to close EXT opcode bypasses
  • security: detect protocol 0/1 ASCII pickle signatures in generic file-format detection to prevent ZIP entry extension bypasses (e.g., malicious payload.txt)
  • security: harden protocol 0/1 pickle format detection with bounded opcode parsing to catch prefixed payloads (e.g., MARK/LIST before GLOBAL) while reducing plain-text false positives in ZIP entry scanning
  • security: keep opcode-level pickle analysis active when malformed streams trigger unicode/text parse errors after partial opcode extraction
  • security: treat joblib.load as always dangerous and remove it from pickle ML allowlist to block loader trampoline bypasses
  • security: tighten manifest trusted-domain matching to validate URL hostnames instead of substring matches
  • security: make .keras suspicious file extension checks case-insensitive to catch uppercase executable/script payloads
  • security: block unsafe in-process torch.load in WeightDistributionScanner by default unless explicitly opted in
  • fix: tighten metadata scanner suspicious URL matching to use exact hostname/subdomain checks and add focused regression coverage
  • fix: treat .nemo files as tar-compatible during file-type validation to avoid false extension/magic mismatch alerts
  • fix: pass XGBoost load-test file paths via subprocess argv instead of interpolating shell-quoted paths into python -c, preventing backslash escape corruption on Windows-style paths
  • security: reject absolute OCI layer references so .manifest files cannot scan host tarballs outside the OCI layout
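The cache-boundary and extraction-root entries above both concern path containment. As an illustration of why a string-prefix comparison is not enough, here is a minimal sketch; the helper name `is_inside` and the example paths are hypothetical and not taken from the ModelAudit code base. It normalizes paths lexically only, so a real check would also need to decide how symlinks are handled.

```python
import posixpath
from pathlib import PurePosixPath


def is_inside(root: str, candidate: str) -> bool:
    """Return True only if candidate is root itself or lies beneath it.

    Hypothetical helper for illustration; paths are compared component by
    component, so "/tmp/cache-evil" is not treated as inside "/tmp/cache".
    """
    # Collapse "." and ".." lexically; symlinks are NOT resolved here.
    norm_root = PurePosixPath(posixpath.normpath(root))
    norm_candidate = PurePosixPath(posixpath.normpath(candidate))
    return norm_candidate.is_relative_to(norm_root)


# A naive string-prefix check accepts a sibling directory:
print("/tmp/cache-evil/model.bin".startswith("/tmp/cache"))  # True
# The component-wise check rejects it:
print(is_inside("/tmp/cache", "/tmp/cache-evil/model.bin"))  # False
print(is_inside("/tmp/cache", "/tmp/cache/hf/model.bin"))    # True
print(is_inside("/tmp/cache", "/tmp/cache/../etc/passwd"))   # False
```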

Documentation

  • update README and user docs for the modelaudit metadata command, metadata safety guidance (--trust-loaders), and new NeMo format coverage
  • align maintainer/agent docs with current architecture and release workflow (metadata extractor component, dependency extras, and release-please + changelog guidance)

0.2.26 (2026-02-24)

Bug Fixes

  • ci: pin protoc version for vendored proto reproducibility (#548) (03e9d35)
  • cli: add --cache-dir and simplify defaults wording (#550) (b8701dd)
  • cli: fail fast when glob patterns match nothing (#519) (404104b)
  • deps: update dependency xgboost to >=3.2,<3.3 (#507) (4489e97)
  • enforce consistent scanner patterns across all scanners (#564) (dd6b8d2)
  • improve test suite reliability and safety (#565) (4bd04a7)
  • remove security anti-patterns from scanning infrastructure (#562) (d02cd0b)
  • security: close critical scanner and CI gating gaps (#553) (807a8aa)
  • security: resolve CodeQL alerts for workflow permissions and sensitive logging (#570) (d2dfc79)
  • security: resolve remaining audit findings (#4-#8) (#556) (7430436)
  • security: use URL hostname parsing instead of substring matching (#571) (b4d3696)
  • test: relax benchmark timing assertions for Windows CI (#569) (b06faac)
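To show the difference between substring matching and hostname parsing mentioned in #571, here is a minimal sketch using only the standard library. The function name `host_is_trusted` and the `TRUSTED` set are assumptions made for the example; the project's actual allowlist and matching rules may differ.

```python
from urllib.parse import urlsplit


def host_is_trusted(url: str, trusted_domains: set[str]) -> bool:
    """Accept a URL only if its parsed hostname is a trusted domain
    or a subdomain of one (hypothetical helper for illustration)."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in trusted_domains)


TRUSTED = {"huggingface.co"}  # example allowlist, not the project's list

for url in (
    "https://huggingface.co/org/model",
    "https://cdn.huggingface.co/file.bin",
    "https://huggingface.co.evil.example/file.bin",
    "https://evil.example/?next=huggingface.co",
):
    substring_match = any(d in url for d in TRUSTED)
    print(url, substring_match, host_is_trusted(url, TRUSTED))
```

The last two URLs pass the substring test but fail the hostname test, which is the class of bypass the fix addresses.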

Documentation

  • clarify README exit codes (#568) (e57a0de)
  • fix accuracy issues across AGENTS.md, README, and CONTRIBUTING (#566) (880e7a4)
  • open-source: add user trust docs batch (#534) (dd5e676)
  • readme: add cache management flag (#521) (33d74bd)
  • ship next-phase open-source readiness docs (#532) (c88035d)
  • trim README to essentials, fix inaccuracies (#517) (59c056c)

0.2.25 - 2026-02-12

Features

  • add binary patterns for native code loading (#499) (ef638f1)
  • add comprehensive Windows compatibility support (#474) (d62574e)
  • add detection for dangerous TensorFlow operations (#494) (6c4c0c9)
  • add detection for memo-based and extension registry pickle opcodes (#493) (72509f7)
  • add getattr-based evasion detection patterns (#500) (87ba295)
  • add Git LFS pointer detection (#488) (6413ae3)
  • add Keras subclassed model detection (#503) (d9e5663)
  • add lambda variadic argument validation (#501) (52a6622)
  • add PyTorch ZIP archive security controls (#502) (09ab087)
  • eliminate TensorFlow dependency with vendored protobuf stubs (#485) (56cec5e)
  • expand SUSPICIOUS_GLOBALS with process and memory modules (#495) (8637d2b)

Bug Fixes

  • add content-based CVE detection to SkopsScanner (#498) (89895cb)
  • add logging to critical exception handlers in pickle scanner (#492) (b6b06cb)
  • add logging to silent exception handlers in secrets detector (#491) (b59f8a4)
  • add security keywords to QueueEnqueueV2 TF op explanation (#511) (1d93483)
  • ci: ensure numpy compatibility job runs (#478) (7266160)
  • deps: bump pillow 12.1.0→12.1.1 and cryptography 46.0.4→46.0.5 (#513) (5b18d49)
  • deps: update dependency fickling to v0.1.7 [security] (#479) (292eb23)
  • improve Python version requirement UX (#508) (a44d8bb)
  • reduce false positive scan warnings for HuggingFace models (#514) (b545c11)
  • reduce pickle scanner false positives for BERT and standalone REDUCE opcodes (#510) (94c22d6)
  • remove duplicate whitelist downgrading in add_check() (#490) (a8c52bc)
  • remove variable shadowing for skip_file_types parameter (#489) (bcf99ea)
  • use deterministic data patterns in anomaly detector tests (#477) (df11759)

0.2.24 - 2025-12-23

Bug Fixes

  • deps: update dependency contourpy to <1.3.4 (#463) (16fb916)
  • deps: update dependency fickling to v0.1.6 [security] (#462) (9413ddc)
  • deps: update dependency xgboost to v3 (#469) (97adbbc)
  • resolve release-please CHANGELOG formatting race condition (#457) (4347b83)

0.2.23 - 2025-12-12

Documentation

  • consolidate agent guidance (#453) (a01ceff)
  • restructure AGENTS.md and CLAUDE.md following 2025 best practices (#451) (e87de51)

0.2.22 - 2025-12-10

Added

  • feat: add modelaudit debug command for troubleshooting - outputs comprehensive diagnostic information including version, platform, environment variables, authentication status, scanner availability, NumPy compatibility, cache status, and configuration in JSON or pretty-printed format; useful for bug reports and support interactions

0.2.21 - 2025-12-09

Fixed

  • fix: resolve UnicodeDecodeError when scanning PyTorch .pkl files saved with default ZIP serialization - torch.save() has used ZIP format by default since PyTorch 1.6 (_use_new_zipfile_serialization=True), but ModelAudit incorrectly routed these files to PickleScanner, which failed to parse the ZIP header. ZIP-format .pkl files are now routed to PyTorchZipScanner.
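The routing decision described in this entry can be made from the file's leading bytes rather than its extension. The sketch below is an illustration under assumptions: `pick_scanner` is a hypothetical function, and it returns the scanner names from the entry as plain strings instead of calling into ModelAudit.

```python
import io
import pickle
import zipfile

ZIP_MAGIC = b"PK\x03\x04"  # local file header signature of a ZIP archive


def pick_scanner(header: bytes) -> str:
    """Choose a scanner name from the first bytes of a .pkl file.

    Hypothetical routing helper: content decides, not the extension.
    """
    if header.startswith(ZIP_MAGIC):
        return "PyTorchZipScanner"
    return "PickleScanner"


# Build a small ZIP archive and a raw pickle in memory to compare headers.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("archive/data.pkl", pickle.dumps({"weights": [1, 2, 3]}))

zip_header = zip_buffer.getvalue()[:4]
raw_header = pickle.dumps({"weights": [1, 2, 3]})[:4]

print(pick_scanner(zip_header))  # PyTorchZipScanner
print(pick_scanner(raw_header))  # PickleScanner
```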

0.2.20 - 2025-12-01

Added

  • feat: detect cloud storage URLs in model configs (AWS S3, GCS, Azure Blob, HuggingFace Hub) - identifies external resource references that could indicate supply chain risks or data exfiltration vectors
  • feat: add URL allowlist security scanning to manifest scanner - uses 164 trusted domains to flag untrusted URLs in model configs as potential supply chain risks
  • feat: detect weak hash algorithms (MD5, SHA1) in model config files - scans manifest files for hash/checksum fields using cryptographically broken algorithms and reports WARNING with CWE-328 reference; SHA256/SHA512 usage is confirmed as strong
  • feat: add comprehensive analytics system with Promptfoo integration - opt-out telemetry for usage insights, respects PROMPTFOO_DISABLE_TELEMETRY and NO_ANALYTICS environment variables
  • feat: auto-enable progress display when output goes to file - shows spinner/progress when stdout is redirected to a file

Fixed

  • fix: resolve false positives in pickle and TFLite scanners - improved detection accuracy
  • fix: clean up tests for CI reliability - removed flaky tests and improved test isolation

0.2.19 - 2025-11-24

Fixed

  • fix: resolve Jinja2 SSTI false positives from bracket notation - refined obfuscation pattern to only match dunder attributes (["__class__"]) instead of legitimate dict access (["role"]), and fixed regex bug where |format\( matched any pipe character
  • fix: remove overly broad secret detection pattern - replaced generic [A-Za-z0-9]{20,} pattern with specific well-known token formats (GitHub, OpenAI, AWS, Slack) to eliminate false positives on URLs and model IDs
  • fix: resolve msgpack file type validation false positive - unified format name inconsistency where functions returned different values ("msgpack" vs "flax_msgpack"), causing validation failures on legitimate MessagePack files
  • fix: add HuggingFace training utilities to pickle safe globals - added safe Transformers, Accelerate, and TRL classes (HubStrategy, SchedulerType, DistributedType, DeepSpeedPlugin, DPOConfig, etc.) to reduce false positives on training checkpoints
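The Jinja2 entry above narrows bracket-notation matching to dunder attributes. A minimal sketch of that idea follows; the pattern `DUNDER_SUBSCRIPT` is written for this example and is not the scanner's actual regular expression.

```python
import re

# Matches subscript access whose key is a dunder name, e.g. ["__class__"],
# but not ordinary dictionary keys such as ["role"]. Illustrative pattern only.
DUNDER_SUBSCRIPT = re.compile(r"""\[\s*["']__\w+__["']\s*\]""")

suspicious = '{{ request["__class__"]["__mro__"] }}'
legitimate = '{{ message["role"] }}: {{ message["content"] }}'

print(bool(DUNDER_SUBSCRIPT.search(suspicious)))  # True
print(bool(DUNDER_SUBSCRIPT.search(legitimate)))  # False
```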

0.2.18 - 2025-11-20

Fixed

  • fix: exclude INFO/DEBUG checks from success rate calculation - success rate now only includes security-relevant checks (WARNING/CRITICAL), with informational checks (INFO/DEBUG) shown separately in "Failed Checks (non-critical)" section
  • fix: missing whitelist logic in validation checks - whitelist downgrading now correctly applies to validation result instantiations
  • fix: resolve PyTorch ZIP scanner hang on large models - improved memory-mapped file handling and timeout configuration
  • fix: additional severity downgrades - further reduced false positives across multiple scanners

Changed

  • chore: standardize on add_check() API - migrated all internal code from legacy add_issue() method to modern add_check() method for structured check reporting with explicit pass/fail status

0.2.17 - 2025-11-19

Fixed

  • fix: eliminate false positive WARNINGs on sklearn/joblib models (removed overly broad pattern matching)
    • Removed b"sklearn", b"NumpyArrayWrapper", and b"numpy_pickle" from binary pattern detection
    • These patterns flagged ALL legitimate sklearn/joblib models (100% false positive rate)
    • Regex CVE patterns still detect actual exploits requiring dangerous combinations
    • Reduces false positive WARNING rate by 77% (10 out of 13 WARNINGs eliminated)
  • fix: NEWOBJ/OBJ/INST opcodes now recognize safe ML classes (eliminates sklearn model false positives)
    • Applied same safety logic as REDUCE opcode: check if class is in ML_SAFE_GLOBALS allowlist
    • sklearn models like LogisticRegression now correctly identified as INFO instead of WARNING
    • Added support for nested sklearn modules (e.g., sklearn.linear_model._logistic)
    • Added joblib.numpy_pickle.NumpyArrayWrapper and dtype.dtype to safe class list
  • fix: handle joblib protocol mismatches gracefully (protocol 4 files using protocol 5 opcodes)
    • joblib files may declare protocol 4 but use protocol 5 opcodes like READONLY_BUFFER (0x0f)
    • Scanner now parses as much as possible before unknown opcodes, logs INFO instead of failing
    • Eliminates false positive "Invalid pickle format - unrecognized opcode" WARNING on joblib files
  • fix: accept ZIP magic bytes for .npz files (NumPy compressed format is ZIP by design)
    • .npz files ARE ZIP archives containing multiple .npy files (numpy.savez format)
    • Now accepts both "zip" and "numpy" header formats for .npz extension
    • Fixed case-sensitivity bug: MODEL.NPZ, model.Npz now handled correctly
  • fix: handle XML namespaces in PMML root element validation
    • PMML 4.x files with namespaces like {http://www.dmg.org/PMML-4_4}PMML now recognized
    • Strips namespace prefix before comparing tag name
  • fix: add validation to prevent TFLite scanner crashes on malformed files
    • Pre-validates magic bytes ("TFL3") before parsing
    • Prevents buffer overflow crashes: "unpack_from requires a buffer of at least X bytes"
    • Added security rationale ("why" field) to magic bytes check
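The PMML namespace fix above relies on the way `xml.etree.ElementTree` reports namespaced tags in `{uri}local` form. A minimal sketch of stripping that prefix before comparing the tag name is shown below; `local_name` is a hypothetical helper, and the sample document is made up for the example.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop a leading "{namespace}" from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


# Sample PMML 4.4 root element with a default namespace (illustrative input).
root = ET.fromstring('<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4"/>')

print(root.tag)                        # {http://www.dmg.org/PMML-4_4}PMML
print(local_name(root.tag))            # PMML
print(local_name(root.tag) == "PMML")  # True
```

Documents without a namespace pass through unchanged, so the same comparison covers both cases.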

0.2.16 - 2025-11-04

Added

  • feat: content hash generation for regular scan mode - all scans (not just streaming) now generate content_hash field for model deduplication and verification

Changed

  • refactor: rename --scan-and-delete flag to --stream for clarity - streaming mode is now invoked with the more intuitive --stream flag

0.2.15 - 2025-10-31

Added

  • feat: universal streaming scan-and-delete mode for all sources to minimize disk usage
    • New --scan-and-delete CLI flag works with ALL sources (not just HuggingFace):
      • HuggingFace models (hf:// or https://huggingface.co/)
      • Cloud storage (S3, GCS: s3://, gs://)
      • PyTorch Hub (https://pytorch.org/hub/)
      • Local directories
    • Files are downloaded/scanned one-by-one, then deleted immediately
    • Computes SHA256 hash for each file and aggregate content hash for deduplication
    • Adds content_hash field to scan results for identifying identical models
    • Ideal for CI/CD or constrained disk environments where downloading entire models (100GB+) isn't feasible
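The per-file SHA256 and aggregate content hash described above could be computed along the following lines. This is a sketch under assumptions: the function names are hypothetical, and combining the sorted per-file digests is one plausible aggregation scheme, not necessarily the one ModelAudit uses.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large models are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def aggregate_hash(file_digests: dict[str, str]) -> str:
    """Combine per-file digests into one content hash.

    Assumed scheme: hash the sorted digests, so the result does not depend
    on file names or on the order in which files were scanned.
    """
    combined = hashlib.sha256()
    for digest in sorted(file_digests.values()):
        combined.update(digest.encode("ascii"))
    return combined.hexdigest()


# Two "models" with the same file contents under different names and order.
digest_a = hashlib.sha256(b"weights").hexdigest()
digest_b = hashlib.sha256(b"config").hexdigest()
first = aggregate_hash({"model.bin": digest_a, "config.json": digest_b})
second = aggregate_hash({"cfg.json": digest_b, "pytorch_model.bin": digest_a})
print(first == second)  # True
```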

Changed

  • chore: move cloud storage dependencies (fsspec, s3fs, gcsfs) to default install - S3, GCS, and cloud storage now work without [cloud] extra

Fixed

  • fix: centralize MODEL_EXTENSIONS to ensure all scannable formats are downloaded from HuggingFace
    • Created single source of truth for model extensions (62+ formats including GGUF)
    • Previously: GGUF files relied on fallback download (inefficient, downloads all files)
    • Now: GGUF, JAX, Flax, NumPy and other formats are properly detected and selectively downloaded
    • Dynamically extracts extensions from scanner registry to stay in sync
  • fix: restore fallback behavior in streaming downloads to maintain parity with non-streaming mode

0.2.14 - 2025-10-23

Fixed

  • fix: eliminate false positives across URL detection, CVE checks, GGUF parsing, and secret detection (#412)
  • fix: improve shebang detection, fix fsspec usage, and resolve UnboundLocalError (#411)

0.2.13 - 2025-10-23

Added

  • feat: huggingface model whitelist (#409)

Fixed

  • fix: eliminate CVE-2025-32434 false positives for legitimate PyTorch models (#408)

0.2.12 - 2025-10-22

Fixed

  • fix: remove non-security format validation checks across scanners (#406)
  • fix: eliminate false positives in stack depth, GGUF limits, and builtins detection (#405)

0.2.11 - 2025-10-22

Fixed

  • fix: INFO and DEBUG severity checks no longer count as failures in success rate calculations

0.2.10 - 2025-10-22

Fixed

  • fix: eliminate false positive REDUCE warnings for safe ML framework operations (#398)
  • fix: eliminate ONNX custom domain and PyTorch pickle false positives (#400)
  • fix: eliminate false positive JIT/Script warnings on ONNX files (#399)

0.2.9 - 2025-10-21

Added

  • feat: add context-aware severity for PyTorch pickle models (#395)
    • Implement SafeTensors detection utility to identify safer format alternatives
    • Add import analysis to distinguish legitimate vs malicious pickle imports
    • Consolidate opcode warnings into single check with evidence counts
    • Add import_reference field to pickle scanner GLOBAL checks for analysis
    • Provide actionable recommendations (use SafeTensors format)

Changed

  • feat: rewrite PyTorch pickle severity logic with context-awareness (#395)
    • CRITICAL: malicious imports detected (os.system, subprocess, eval)
    • WARNING: legitimate imports + SafeTensors alternative available
    • INFO: legitimate imports + no SafeTensors alternative
    • Reduces false positives while maintaining security detection accuracy
    • Example: sentence-transformers/all-MiniLM-L6-v2 now shows WARNING (was CRITICAL)

0.2.8 - 2025-10-21

Added

  • feat: add skops scanner for CVE-2025-54412/54413/54886 detection (#392)
    • Implement dedicated skops scanner for .skops model files
    • Detect CVE-2025-54412 (OperatorFuncNode RCE vulnerability)
    • Detect CVE-2025-54413 (MethodNode dangerous attribute access)
    • Detect CVE-2025-54886 (Card.get_model silent joblib fallback)
    • Add ZIP format validation and archive bomb detection

Changed

  • refactor: remove non-security checks prone to false positives (#391)
    • Remove blacklist checks from manifest scanner
    • Remove model name policy checks from manifest scanner
    • Streamline XGBoost scanner by removing non-security validation checks
    • Reduce false positives in metadata scanner

Fixed

  • fix: resolve XGBoost UBJ crash and network scanner false positives (#392)
    • Fix UBJ format JSON serialization crash by sanitizing bytes objects to hex strings
    • Eliminate network scanner false positives for pickle/joblib ML models by adding ML context awareness
    • Add comprehensive XGBoost testing documentation with 25-model test corpus

0.2.7 - 2025-10-20

Fixed

  • fix: improve XGBoost scanner severity levels and reduce false positives (#389)
    • Handle string-encoded numeric values in XGBoost JSON models
    • Add deterministic JSON validation to prevent claiming non-XGBoost files
    • Implement tiered file size thresholds (INFO → WARNING) for large models
    • Downgrade metadata scanner generic secret patterns from WARNING to INFO
    • Reduce false positives for BibTeX citations and code examples in README files
  • fix: prevent ML confidence bypass and hash collision security exploits (#388)
    • Enable --verbose flag and accurate HuggingFace file sizes
    • Remove CoreML scanner and coremltools dependency
  • fix: enable advanced TorchScript vulnerability detection (#384)
    • Enable comprehensive detection for serialization injection, module manipulation, and bytecode injection patterns

Changed

  • refactor: reorganize codebase into logical module structure (#387)
    • Create detectors/ module for security detection logic
    • Improve maintainability and reduce import complexity
  • chore(deps): bump tj-actions/changed-files from v46 to v47 (#386)

0.2.6 - 2025-09-10

Added

  • feat: add comprehensive JFrog folder scanning support (#380)
  • feat: add comprehensive XGBoost model scanner with security analysis (#378)
  • feat: consolidate duplicate caching logic into unified decorator (#347)
  • test: improve test architecture with dependency mocking (#374)

Fixed

  • fix: exclude Python 3.13 from NumPy 1.x compatibility tests (#375)

0.2.5 - 2025-09-05

Added

  • feat: upgrade to CycloneDX v1.6 (ECMA-424) with enhanced ML-BOM support (#364)
  • feat: add 7-Zip archive scanning support (#344)
  • feat: re-enable check consolidation system (#353)
  • feat: integrate ty type checker and enhance type safety (#372)

Changed

  • BREAKING: drop Python 3.9 support, require Python 3.10+ minimum
  • feat: add Python 3.13 support
  • feat: consolidate CLI from 25 to 12 flags using smart detection (#359)
  • feat: enhance pickle static analysis with ML context awareness (#358)
  • feat: enhance check consolidation system with PII sanitization and performance improvements (#356)
  • docs: update AGENTS.md with exact CI compliance instructions (#357)
  • docs: rewrite README with professional technical content (#370)
  • feat: improve logging standards and consistency (#355)
  • chore(deps): bump the github-actions group with 2 updates (#362)
  • chore: update dependencies and modernize type annotations (#360)
  • chore: remove unnecessary files from root directory (#369)

Fixed

  • fix: handle GGUF tensor dictionaries in SBOM asset creation (#363)
  • fix: correct release dates in CHANGELOG.md (#354)
  • fix: resolve SBOM generation FileNotFoundError with URLs (#373)

0.2.4 - 2025-08-28

Added

  • feat: improve CVE-2025-32434 detection with density-based analysis (#351)
  • feat: implement graceful degradation and enhanced error handling (#343)
  • feat: improve PyTorch ZIP scanner maintainability by splitting scan() into smaller functions (#346)
  • feat: add SARIF output format support for integration with security tools and CI/CD pipelines (#349)
  • feat: optimize cache performance by reducing file system calls (#338)
  • feat: comprehensive task list update and critical CLI usability audit (#340)
  • feat: add cache management CLI commands mirroring promptfoo's pattern (#331)
  • feat: add comprehensive metadata security scanner and enhanced HuggingFace support (#335)
  • feat: add comprehensive CVE detection for pickle/joblib vulnerabilities (#326)
  • feat: add Jinja2 template injection scanner (#323)
  • feat: comprehensive deep Pydantic integration with advanced type safety (#322)
  • feat: optimize CI for faster feedback (#320)
  • feat: skip SafeTensors in WeightDistributionScanner for performance (#317)
  • feat: add Pydantic models for JSON export with type safety (#315)
  • feat: add support for multi-part archive suffixes (#307)
  • docs: add comprehensive CI optimization guide (#319)
  • docs: add Non-Interactive Commands guidance to AGENTS.md (#318)
  • docs: add comprehensive publishing instructions (#302)
  • test: speed up tests and CI runtime (#316)
  • test: cover Windows path extraction scenarios (#313)
  • feat: detect dangerous TensorFlow operations (#329)
  • feat: enhance pickle scanner with STACK_GLOBAL and memo tracking (#330)
  • feat: detect Windows and Unix OS module aliases to prevent system command execution via nt and posix

Changed

  • chore: organize root directory structure (#341)
  • chore: make ctrl+c immediately terminate if pressed twice (#314)

Fixed

  • fix: aggregate security checks per file instead of per chunk (#352)
  • fix: eliminate circular import between base.py and core.py (#342)
  • fix: default bytes_scanned in streaming operations (#312)
  • fix: validate directory file list before filtering (#311)
  • fix: tighten ONNX preview signature validation (#310)
  • fix: recurse cloud object size calculations (#309)
  • fix: handle missing author in HuggingFace model info (#308)
  • fix: handle PyTorch Hub URLs with multi-part extensions (#306)
  • fix: avoid duplicated sharded file paths (#305)
  • fix: handle None values in Keras H5 scanner to prevent TypeError (#303)

0.2.3 - 2025-08-21

Added

  • feat: increase default max_entry_size from 10GB to 100GB for large language models (#298)
  • feat: add support for 1TB+ model scanning (#293)
  • docs: improve models.md formatting and organization (#297)

Fixed

  • fix: improve cache file skip reporting to not count as failed checks (#300)
  • fix: eliminate ZIP entry read failures with robust null checking and streaming (#299)

0.2.2 - 2025-08-21

Added

  • feat: increase default scan timeout to 1 hour (#292)
  • feat: improve CLI output user experience with verbose summary (#290)
  • feat: add promptfoo authentication delegation system (#287)
  • feat: expand malicious model test corpus with 42+ new models (#286)
  • feat: streamline file format detection I/O (#285)
  • feat: add comprehensive progress tracking for large model scans (#281)
  • feat: raise large model thresholds to 10GB (#280)
  • feat: enable scanner-driven streaming analysis (#278)
  • feat: safely parse PyTorch ZIP weights (#268)
  • feat: add comprehensive authentication system with semgrep-inspired UX (#50)
  • docs: document security features and CLI options in README (#279)

Changed

  • perf: cache port regex patterns for network detector (#269)
  • refactor: reduce file handle usage in format detection (#283)

Fixed

  • fix: eliminate SafeTensors recursion errors with high default recursion limit (#295)
  • fix: add interrupt handling to ONNX scanner for graceful shutdown (#294)
  • fix: eliminate duplicate checks through content deduplication (#289)
  • fix: implement ML-context-aware stack depth limits to eliminate false positives (#284)
  • fix: optimize directory detection (#282)
  • fix: include license files in metadata scan (#277)
  • fix: validate cloud metadata before download (#276)
  • fix: handle async event loop in cloud download (#273)
  • fix: add pdiparams extension to cloud storage filter (#272)
  • fix: streamline magic byte detection (#271)
  • fix: close cloud storage filesystems (#267)
  • fix: flag critical scan errors (#266)
  • fix: finalize early scan file exits (#265)
  • fix: isolate network detector custom patterns (#264)
  • fix: warn when JFrog auth missing (#263)
  • fix: refine dangerous pattern detection check (#262)
  • fix: handle deeply nested SafeTensors headers (#244)

Removed

  • chore: remove outdated markdown documentation files (#296)

0.2.1 - 2025-08-15

Added

  • feat: enhance timeout configuration for progressive scanning (#252)
  • feat: add Keras ZIP scanner for new .keras format (#251)
  • feat: add enhanced TensorFlow SavedModel scanner for Lambda layer detection (#250)
  • feat: add compile() and eval() variants detection (#249)
  • feat: improve os/subprocess detection for command execution patterns (#247)
  • feat: add runpy module detection as critical security risk (#246)
  • feat: add importlib and runpy module detection as CRITICAL security issues (#245)
  • feat: add webbrowser module detection as CRITICAL security issue (#243)
  • feat: add record path and size validation checks (#242)
  • feat: enhance detection of dangerous builtin operators (#241)
  • feat: add network communication detection (#238)
  • feat: add JIT/Script code execution detection (#237)
  • feat: add embedded secrets detection (#236)
  • feat: add comprehensive security check tracking and reporting (#235)
  • feat: add JFrog integration helper (#230)
  • feat: add PyTorch Hub URL scanning (#228)
  • feat: add tar archive scanning (#227)
  • feat: add SPDX license checks (#223)
  • feat: add RAIL and BigScience license patterns (#221)
  • feat: expand DVC targets during directory scan (#215)
  • feat: adjust SBOM risk scoring (#212)
  • feat: add py_compile validation to reduce false positives (#206)
  • feat: add disk space checking before model downloads (#201)
  • feat: add interrupt handling for graceful scan termination (#196)
  • feat: add CI-friendly output mode with automatic TTY detection (#195)

Changed

  • perf: use bytearray for chunked file reads (#217)
  • chore: improve code professionalism and remove casual language (#258)
  • refactor: remove unreachable branches (#222)
  • refactor: remove type ignore comments (#211)

Fixed

  • fix: improve detection of evasive malicious models and optimize large file handling (#256)
  • fix: eliminate false positives and false negatives in model scanning (#253)
  • fix: improve PyTorch ZIP scanner detection for .bin files (#248)
  • fix: add dangerous pattern detection to embedded pickles in PyTorch models (#240)
  • fix: reduce false positives in multiple scanners (#229)
  • fix: cast sbom output string (#220)
  • fix: stream zip entries to temp file (#218)
  • fix: handle broken symlinks safely (#214)
  • fix: enforce UTF-8 file writes (#213)
  • fix: update PyTorch minimum version to address CVE-2025-32434 (#205)
  • fix: add __main__.py module and improve interrupt test reliability (#204)
  • fix: resolve linting and formatting issues (#203)
  • fix: return non-zero exit code when no files are scanned (#200)
  • fix: improve directory scanning with multiple enhancements (#194)
  • fix: add missing type annotations to scanner registry (#191)
  • fix: resolve CI timeout by running only explicitly marked slow/integration tests (#190)
  • fix: change false positive messages from INFO to DEBUG level (#189)

Security

  • fix: resolve PyTorch scanner pickle path context and version bump to 0.2.1 (#257)

0.2.0 - 2025-07-17

Added

  • feat: add scan command as default - improved UX with scan as the default command (#180)
  • feat: add TensorRT engine scanner - support for NVIDIA TensorRT optimized models (#174)
  • feat: add Core ML model scanner - support for Apple's Core ML .mlmodel format (#173)
  • feat: add PaddlePaddle model scanner - support for Baidu's PaddlePaddle framework models (#172)
  • feat: add ExecuTorch scanner - support for Meta's ExecuTorch mobile inference format (#171)
  • feat: add TensorFlow SavedModel weight analysis - deep analysis of TensorFlow model weights (#138)
  • ci: add GitHub Actions dependency caching - optimized CI pipeline performance (#183)

Fixed

  • fix: optimize CI test performance for large blob detection (#184)
  • fix: properly handle HuggingFace cache symlinks to avoid path traversal warnings (#178)

0.1.5 - 2025-06-20

Added

  • feat: add cloud storage support - Direct scanning from S3, GCS, and other cloud storage (#168)
  • feat: add JFrog Artifactory integration - Download and scan models from JFrog repositories (#167)
  • feat: add JAX/Flax model scanner - Enhanced support for JAX/Flax model formats (#166)
  • feat: add NumPy 2.x compatibility - Graceful fallback and compatibility layer (#163)
  • feat: add MLflow model integration - Native support for MLflow model registry scanning (#160)
  • feat: add DVC pointer support - Automatic resolution and scanning of DVC-tracked models (#159)
  • feat: add nested pickle payload detection - Advanced analysis for deeply embedded malicious code (#153)
  • feat: enhance SafeTensors scanner - Suspicious metadata and anomaly detection (#152)
  • feat: add HuggingFace Hub integration - Direct model scanning from HuggingFace Hub URLs (#144, #158)
  • feat: improve output formatting for better user experience (#143)
  • feat: add PythonOp detection in ONNX - Critical security check for custom Python operations (#140)
  • feat: add dangerous symlink detection - Identify malicious symbolic links in ZIP archives (#137)
  • feat: add TFLite model scanner - Support for TensorFlow Lite mobile models (#103)
  • feat: add asset inventory reporting - Comprehensive model asset discovery and cataloging (#102)
  • feat: add Flax msgpack scanner - Support for Flax models using MessagePack serialization (#99)
  • feat: add PMML model scanner - Support for Predictive Model Markup Language files (#98)
  • feat: add header-based format detection - Improved accuracy for model format identification (#72)
  • feat: add CycloneDX SBOM output - Generate Software Bill of Materials in standard format (#59)
  • feat: add OCI layer scanning - Security analysis of containerized model layers (#53)
  • test: add comprehensive test coverage for TFLite scanner (#165)
  • perf: achieve 2074x faster startup - Lazy loading optimization for scanner dependencies (#129)

Changed

  • perf: stop scanning when size limit reached for better performance (#139)

Fixed

  • fix: reduce HuggingFace model false positives (#164)
  • fix: reduce false positives for Windows executable detection in model files (#162)

0.1.4 - 2025-06-20

Added

  • feat: add binary pattern validation - Executable signature and pattern analysis (#134)
  • feat: refine import pattern detection - Enhanced detection of malicious imports (#133)
  • feat: centralize security patterns with validation system (#128)
  • feat: add unified scanner logging - Consistent logging across all scanner modules (#125)
  • feat: add magic byte-based file type validation - Improved format detection accuracy (#117)
  • feat: add centralized dangerous pattern definitions - Unified security rule management (#112)
  • feat: add scan configuration validation - Input validation and error handling (#107)
  • feat: add total size limit enforcement - Configurable scanning limits across all scanners (#106, #119)
  • feat: enhance dill and joblib serialization support - Advanced security scanning for scientific computing libraries (#55)
  • feat: add GGML format variants support for better compatibility (4c3d842)
  • test: organize comprehensive security test assets with CI optimization (#45)

0.1.3 - 2025-06-17

Added

  • feat: add security issue explanations - User-friendly 'why' explanations for detected threats (#92)
  • feat: add modern single-source version management - Streamlined release process (#91)
  • feat: add GGUF/GGML scanner - Support for llama.cpp and other quantized model formats (#66)
  • feat: add ONNX model scanner - Security analysis for Open Neural Network Exchange format (#62)
  • feat: add dill, joblib, and NumPy format support - Extended serialization format coverage (#60)
  • feat: add comprehensive GGUF/GGML security checks - Advanced threat detection for quantized models (#56)

Changed

  • chore: modernize pyproject configuration (#87)
  • chore: refine package build configuration (#82)

Fixed

  • fix: broaden ZIP signature detection (#95)
  • fix: synchronize version between pyproject.toml and __init__.py to 0.1.3 (#90)
  • fix: eliminate false positives in GPT-2 and HuggingFace models (#89)

0.1.2 - 2025-06-17

Added

  • feat: add Biome formatter integration - Code quality tooling for JSON and YAML files (#79)
  • feat: enable full scan for .bin files (#76)
  • feat: add zip-slip attack protection - Prevent directory traversal attacks in ZIP archives (#63)
  • feat: add SafeTensors scanner - Security analysis for Hugging Face's SafeTensors format (#61)
  • feat: add dill pickle support - Extended pickle format security scanning (#48)
  • feat: add CLI version command - Easy version identification for users (#44)
  • feat: add weight distribution anomaly detector - Advanced backdoor detection through statistical analysis (#32)
  • docs: optimize README and documentation for PyPI package distribution (#83)
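
As background for the zip-slip attack protection entry (#63), here is a minimal sketch of how archive member names can be checked before extraction. It is an illustration under assumptions, not the scanner's actual code; `unsafe_members` is a hypothetical helper name.

```python
import os
import zipfile


def unsafe_members(archive: zipfile.ZipFile, dest_dir: str) -> list[str]:
    """Return member names that would resolve outside `dest_dir`."""
    dest_root = os.path.realpath(dest_dir)
    flagged = []
    for info in archive.infolist():
        # Resolve the path the member would be written to, collapsing
        # any ".." components and absolute paths.
        target = os.path.realpath(os.path.join(dest_root, info.filename))
        try:
            inside = os.path.commonpath([dest_root, target]) == dest_root
        except ValueError:
            # Raised on Windows when the paths are on different drives.
            inside = False
        if not inside:
            flagged.append(info.filename)
    return flagged
```

The check compares resolved paths rather than searching the name for `..`, so it also catches absolute member names.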

Changed

  • chore: update biome configuration to v2.0.0 schema (#85)
  • chore: change errors → findings (#67)

Fixed

  • fix: reduce PyTorch pickle false positives (#78)
  • fix: log weight extraction failures (#75)
  • fix: log debug issues at debug level (#74)
  • fix: clarify missing data.pkl warning (#73)
  • fix: clarify missing dependency error messages (#71)
  • fix: change weight distribution warnings to info level (#69)
  • fix: correct duration calculation (#68)

0.1.1 - 2025-06-16

Added

  • feat: add multi-format .bin file support - Enhanced detection for various binary model formats (#57)
  • feat: add PR title validation - Development workflow improvements (#35)
  • feat: add manifest parser error handling - Better diagnostics for corrupted model metadata (#30)
  • feat: change output label of ERROR severity to CRITICAL (#25)

Changed

  • chore: replace Black, isort, flake8 with Ruff for faster linting and formatting (#24)

Fixed

  • fix: treat raw .pt files as unsupported (#40)
  • fix: avoid double counting bytes in zip scanner (#39)
  • fix: mark scan result unsuccessful on pickle open failure and test (#29)
  • fix: ignore debug issues in output status (#28)
  • fix: use supported color for debug output (#27)
  • fix: switch config keys to info and reduce false positives (#8)
  • fix: reduce false positives for ML model configurations (#3)

0.1.0 - 2025-03-08

Added

  • feat: add ZIP archive security analysis - Comprehensive scanning of compressed model packages (#15)
  • feat: add stack_global opcode detection - Critical security check for dangerous pickle operations (#7)
  • feat: add configurable exit codes - Standardized return codes for CI/CD integration (#6)
  • feat: add core pickle scanning engine - foundation for malicious code detection in Python pickles (f3b56a7)
  • docs: add AI development guidance - CLAUDE.md for AI-assisted development (#16)
  • ci: add GitHub Actions CI/CD - Automated testing and security validation (#4)
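
To illustrate what the stack_global opcode detection entry (#7) refers to, the sketch below walks a pickle's opcode stream with the standard library's `pickletools.genops` and reports the module and attribute names consumed by each `STACK_GLOBAL`. It is a simplified heuristic (it takes the two most recently pushed strings instead of simulating the stack or memo), and `SUSPICIOUS_MODULES`, `find_stack_globals`, and `is_suspicious` are hypothetical names, not the project's API.

```python
import pickletools

# Hypothetical denylist for illustration; not the project's rule set.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}

# Opcodes that push a text string onto the pickle stack.
STRING_OPCODES = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def find_stack_globals(data: bytes) -> list[tuple[str, str]]:
    """Return (module, name) pairs referenced through STACK_GLOBAL.

    The pickle is only parsed, never loaded, so no code is executed.
    """
    recent_strings: list[str] = []
    hits: list[tuple[str, str]] = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPCODES:
            recent_strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Simplification: assume the last two pushed strings are the
            # module and attribute names that STACK_GLOBAL pops.
            hits.append((recent_strings[-2], recent_strings[-1]))
    return hits


def is_suspicious(data: bytes) -> bool:
    """Flag pickles whose STACK_GLOBAL imports hit the denylist."""
    return any(
        module.split(".")[0] in SUSPICIOUS_MODULES
        for module, _name in find_stack_globals(data)
    )
```

Including `posix` and `nt` in the denylist matters because `os.system` is pickled under its platform-specific module name, not under `os`.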

Fixed

  • style: improve code formatting and documentation standards (#12, #23)
  • fix: improve core scanner functionality and comprehensive test coverage (#11)