Feature Request: Filtering Options for Scanning (Tags, Extensions, Paths, Size) #1205

Extend the **cdk-serverless-clamscan** construct with a `filter` property that limits which S3 objects are scanned, based on tags, file extensions, S3 paths, and object size. The logic operator should be configurable both for the overall filter and for tag matching, and each bucket should be able to have its own filter. Filters should also be configurable when adding buckets dynamically with the `addSourceBucket` method.

### Proposed `filter` Property
The `filter` property will be an object applied per bucket, with the following sections:

1. **Tags**: Key-value pairs the object must be tagged with, plus a configurable logic operator that determines whether any or all of the pairs must match.
2. **File Extensions**: The file types to scan.
3. **S3 Paths**: The S3 prefixes or paths to scan.
4. **Object Size**: Size bounds in bytes, so that only objects larger than, smaller than, or between the specified sizes are scanned.
5. **Logic Operator**: Defines the overall logic to combine the specified filters (default: ALL).
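
The sections above could be typed roughly as follows. This is only a sketch of the proposal: `LogicOperator`, `TagFilter`, `ObjectSizeFilter`, `ScanFilter`, and `withDefaultOperators` are hypothetical names that do not exist in cdk-serverless-clamscan, and the field names are taken from the configuration example in this request.

```typescript
// Hypothetical types for the proposed `filter` property (not part of the library).
type LogicOperator = 'ANY' | 'ALL';

interface TagFilter {
  /** Tag key-value pairs to compare against the object's tags. */
  criteria: Record<string, string>;
  /** How the tag pairs combine (proposed default: 'ANY'). */
  logicOperator?: LogicOperator;
}

interface ObjectSizeFilter {
  /** Scan only objects larger than this many bytes. */
  greaterThanBytes?: number;
  /** Scan only objects smaller than this many bytes. */
  lessThanBytes?: number;
}

interface ScanFilter {
  tags?: TagFilter;
  extensions?: string[];
  paths?: string[];
  objectSize?: ObjectSizeFilter;
  /** How the configured sections combine (proposed default: 'ALL'). */
  logicOperator?: LogicOperator;
}

/** Returns a copy of the filter with the proposed default operators filled in. */
function withDefaultOperators(filter: ScanFilter): ScanFilter {
  const resolved: ScanFilter = {
    ...filter,
    logicOperator: filter.logicOperator ?? 'ALL',
  };
  if (filter.tags) {
    resolved.tags = {
      ...filter.tags,
      logicOperator: filter.tags.logicOperator ?? 'ANY',
    };
  }
  return resolved;
}
```

Keeping every section optional means a bucket can configure only the sections it needs, such as `extensions` alone.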

### Configuration Example
The following example shows the `filter` property configured per bucket at construction time and when adding a bucket dynamically:
```typescript
new ServerlessClamscan(this, 'rClamscan', {
  buckets: [
    {
      bucket: bucket_1,
      filter: {
        tags: {
          criteria: { 
            "ScanRequired": "true",
            "Priority": "high"
          },
          logicOperator: 'ANY' // Can be 'ANY' or 'ALL' (default: ANY)
        },
        extensions: ['.mp4', '.jpeg', '.png'],
        paths: ['uploads/images/', 'uploads/videos/'],
        objectSize: {
          greaterThanBytes: 1024, // 1 KiB, optional
          lessThanBytes: 10485760 // 10 MiB, optional
        },
        logicOperator: 'ALL' // Can be 'ANY' or 'ALL' (default: ALL)
      }
    },
    {
      bucket: bucket_2,
      filter: {
        extensions: ['.exe', '.zip'],
        logicOperator: 'ALL' // Can be 'ANY' or 'ALL' (default: ALL)
      }
    }
  ]
});

// Adding a source bucket with filters dynamically
// (a different construct ID is used because IDs must be unique within a scope)
const sc = new ServerlessClamscan(this, 'rClamscanDynamic', { /* initial configuration */ });
sc.addSourceBucket(bucket_3, {
  filter: {
    tags: {
      criteria: { 
        "ScanRequired": "true"
      },
      logicOperator: 'ANY' // Can be 'ANY' or 'ALL' (default: ANY)
    },
    extensions: ['.docx', '.pdf'],
    paths: ['uploads/documents/'],
    objectSize: {
      lessThanBytes: 5242880 // 5 MiB, optional
    },
    logicOperator: 'ALL' // Can be 'ANY' or 'ALL' (default: ALL)
  }
});
```

### Scanning Behavior
- **Overall Logic Operator** (default: ALL): If set to ALL, only objects that meet every configured filter section are scanned. If set to ANY, an object that meets at least one configured section is scanned.
- **Tag Logic Operator** (default: ANY): If set to ANY, an object passes the tag filter when at least one of the specified tags matches. If set to ALL, every specified tag must match.
- **Object Size Conditions**: Users can specify `greaterThanBytes`, `lessThanBytes`, or both. As in the first example above, specifying both is intended to describe a size range.

The feature is **backward compatible**: if no filter is specified, all objects are scanned.
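
To make the intended semantics concrete, here is a minimal sketch of how the filter decision could be evaluated. It is not an implementation from the library: `shouldScan`, `combine`, `CandidateObject`, and `Filter` are hypothetical names. It also assumes that extensions are matched case-insensitively against the end of the key, that paths are matched as key prefixes, and that when both size bounds are set the object must satisfy both.

```typescript
// Hypothetical shapes mirroring the proposed `filter` property.
type Operator = 'ANY' | 'ALL';

interface CandidateObject {
  key: string;
  sizeBytes: number;
  tags: Record<string, string>;
}

interface Filter {
  tags?: { criteria: Record<string, string>; logicOperator?: Operator };
  extensions?: string[];
  paths?: string[];
  objectSize?: { greaterThanBytes?: number; lessThanBytes?: number };
  logicOperator?: Operator;
}

/** Combines individual check results with the given operator. */
function combine(results: boolean[], op: Operator): boolean {
  return op === 'ALL' ? results.every(Boolean) : results.some(Boolean);
}

/** Decides whether an object should be scanned under the proposed semantics. */
function shouldScan(obj: CandidateObject, filter?: Filter): boolean {
  if (!filter) return true; // backward compatible: no filter means scan everything

  const checks: boolean[] = [];

  if (filter.tags) {
    const tagResults = Object.entries(filter.tags.criteria).map(
      ([tagKey, tagValue]) => obj.tags[tagKey] === tagValue,
    );
    checks.push(combine(tagResults, filter.tags.logicOperator ?? 'ANY'));
  }
  if (filter.extensions) {
    // Assumption: extension matching ignores case.
    const key = obj.key.toLowerCase();
    checks.push(filter.extensions.some((ext) => key.endsWith(ext.toLowerCase())));
  }
  if (filter.paths) {
    // Assumption: a path matches when the object key starts with it.
    checks.push(filter.paths.some((prefix) => obj.key.startsWith(prefix)));
  }
  if (filter.objectSize) {
    // Assumption: both bounds are exclusive and must hold together.
    const { greaterThanBytes, lessThanBytes } = filter.objectSize;
    checks.push(
      (greaterThanBytes === undefined || obj.sizeBytes > greaterThanBytes) &&
        (lessThanBytes === undefined || obj.sizeBytes < lessThanBytes),
    );
  }

  // A filter object with no sections configured imposes no restriction.
  if (checks.length === 0) return true;
  return combine(checks, filter.logicOperator ?? 'ALL');
}
```

Only the sections that are actually configured take part in the overall ALL/ANY combination, so a filter that sets just `extensions` behaves the same under either operator.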

### Benefits
- **Cost Efficiency**: Lower Lambda invocation costs by skipping unnecessary scans.
- **Flexibility**: Multiple filters to meet diverse needs, all within a single, unified configuration.
- **Targeted Security**: Organizations can focus scanning on the paths where sensitive documents are uploaded.

Looking forward to your feedback and thank you for considering this feature request!
