SPDX fileTypes: classify shebang scripts as SOURCE (optionally +TEXT) #4640

@dorser

Description

What would you like to be added:
Improve SPDX 2.3 fileTypes classification so that files identified as scripts (e.g., files with a valid shebang #! header) are emitted with SOURCE, either in addition to TEXT or instead of it.

Today, Syft appears to classify shebang scripts (e.g., /usr/sbin/dpkg-preconfigure) as:

"fileTypes": ["TEXT"]

Since SPDX 2.3 does not define a SCRIPT type, the closest semantic match for executable scripts is SOURCE.

Proposed behavior:

If a file starts with a valid shebang (#!), emit:

"fileTypes": ["SOURCE", "TEXT"]
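A minimal sketch of the proposed rule, written in Python for illustration only. It is not Syft's implementation. The helper names (has_valid_shebang, with_shebang_source) are hypothetical, and the definition of a "valid" shebang used here (#! followed by optional spaces or tabs and an absolute interpreter path on the first line) is an assumption, since this issue does not define it.

```python
import re

# Assumption: a "valid" shebang is "#!" at byte 0, optional spaces/tabs,
# then an absolute interpreter path. The issue itself does not define this.
_SHEBANG = re.compile(rb"#![ \t]*/\S+")


def has_valid_shebang(head: bytes) -> bool:
    """Check only the first line of the file's leading bytes."""
    first_line = head.split(b"\n", 1)[0]
    return _SHEBANG.match(first_line) is not None


def with_shebang_source(head: bytes, file_types: list[str]) -> list[str]:
    """Prepend SOURCE to the existing fileTypes when a shebang is present.

    `file_types` is whatever the existing classifier produced, e.g. ["TEXT"].
    """
    if has_valid_shebang(head) and "SOURCE" not in file_types:
        return ["SOURCE", *file_types]
    return list(file_types)


# A Perl script such as /usr/sbin/dpkg-preconfigure would gain SOURCE,
# while an ordinary text file keeps its original classification.
script_types = with_shebang_source(b"#!/usr/bin/perl -w\nuse strict;\n", ["TEXT"])
plain_types = with_shebang_source(b"hello world\n", ["TEXT"])
```

Keeping TEXT alongside SOURCE matches the ["SOURCE", "TEXT"] form proposed above. Emitting SOURCE alone would be a one-line change in this sketch.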

Why is this needed:
SPDX 2.3 defines SOURCE as “human readable source code” and TEXT as generic text. Scripts (shell, Python, Perl, etc.) are executable source code, not arbitrary text.

Emitting only TEXT loses semantic information and makes it difficult for downstream consumers to distinguish between executable scripts and arbitrary text artifacts.

Tools performing policy enforcement, integrity validation, or runtime classification rely on SPDX metadata to differentiate executable artifacts from non-executable text files.
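To illustrate the consumer side, here is a rough sketch of how a downstream tool might separate the two groups once SOURCE is emitted. It assumes an SPDX 2.3 JSON document with a top-level "files" array whose entries carry "fileName" and "fileTypes". The function name split_by_source and the sample document (including the /etc/hostname entry) are hypothetical.

```python
def split_by_source(spdx_doc: dict) -> tuple[list[str], list[str]]:
    """Partition file names into SOURCE-tagged files and TEXT-only files."""
    source_files, text_only = [], []
    for entry in spdx_doc.get("files", []):
        types = entry.get("fileTypes", [])
        if "SOURCE" in types:
            source_files.append(entry["fileName"])
        elif "TEXT" in types:
            text_only.append(entry["fileName"])
    return source_files, text_only


# Hypothetical sample document, shaped like the proposed output.
sample_doc = {
    "files": [
        {"fileName": "/usr/sbin/dpkg-preconfigure", "fileTypes": ["SOURCE", "TEXT"]},
        {"fileName": "/etc/hostname", "fileTypes": ["TEXT"]},
    ]
}
source_files, text_only = split_by_source(sample_doc)
```

With today's output both entries would land in the TEXT-only list, which is the ambiguity described above. Note that SOURCE also covers non-script source files, so a consumer that needs executable scripts specifically would still need additional signals.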

Metadata

Labels: enhancement (New feature or request)
Status: Ready