
Conversation

@RagingRedRiot (Contributor) commented Nov 21, 2025

Rewrite Mimecast adapter with multi-API support

  • Add OAuth 2.0 token caching with automatic refresh
  • Implement concurrent fetching across 5 API endpoints (audit events, attachment, impersonation, URL, DLP)
  • Add configurable base URL, initial lookback, and worker concurrency
  • Improve rate limiting with Retry-After header support
  • Add graceful shutdown with proper context cancellation
  • Implement hash-based deduplication for logs without IDs
  • Handle nested log structures (e.g., attachmentLogs arrays)
  • Add per-API enable/disable on 403 Forbidden responses
  • Replace simple loop with semaphore-controlled fetch cycles

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Note

Replaces the Mimecast adapter with an OAuth-backed, concurrent multi-API fetcher, adds new config, robust retry/rate-limit handling, deduping, and graceful shutdown.

  • Adapter rewrite:
    • Switch to OAuth 2.0 with cached tokens and singleflight refresh; remove getAuthToken and baseURL const.
    • Concurrent fetching via errgroup across APIs: auditEvents, attachment, impersonation, url, dlp.
    • New fetch loop (queryInterval) with per-API state (since, active, per-API dedupe); a rough sketch of this state appears after the note.
  • Config:
    • Add base_url, initial_lookback, max_concurrent_workers; validate and set sane defaults.
  • HTTP/transport:
    • New tuned http.Client and header handling; page size increased to 500.
  • Resilience & rate limiting:
    • Handle 401 (token reset/retry), 403 (disable API), 429 with Retry-After, and 5xx with capped backoff; 1h retry deadline.
  • Data handling:
    • ApiResponse.Data now []map[string]interface{}; support nested log arrays (e.g., attachmentLogs, urlLogs).
    • Hash-based deduplication when no ID field; per-API timestamped dedupe culling.
  • Shutdown & shipping:
    • Context-driven graceful shutdown with closeOnce/fetchOnce, chFetchLoop; improved Close() waiting.
    • submitEvents streams with backpressure handling and cancellation on prolonged buffer full.

Written by Cursor Bugbot for commit 7407d8a. This will update automatically on new commits.
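For reference, a purely illustrative sketch of the per-API state described in the note (field names are assumptions, not the adapter's actual definitions; IsActive mirrors the accessor mentioned later in the thread):

import (
    "sync"
    "time"
)

// apiState sketches the per-API bookkeeping described above.
type apiState struct {
    mu     sync.Mutex
    name   string               // e.g. "auditEvents", "attachment", "dlp"
    since  time.Time            // high-water mark, advanced only after a successful cycle
    active bool                 // flipped to false when the API returns 403 Forbidden
    seen   map[string]time.Time // dedupe hashes, timestamped so stale entries can be culled
}

// IsActive reads the flag under the lock so fetch loops and the shutdown
// check don't race on it.
func (s *apiState) IsActive() bool {
    s.mu.Lock()
    defer s.mu.Unlock()
    return s.active
}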

@maximelb (Contributor) commented:

Code Review Findings

Critical Issues

1. Token Refresh Race Condition (client.go:198-215)

Multiple goroutines can simultaneously call refreshOAuthToken when a token expires. The code unlocks before calling refresh (line 211), allowing all waiting goroutines to proceed with token refresh, causing unnecessary API calls and potential rate limiting.

// Multiple goroutines pass this check simultaneously
if a.oauthToken != "" && time.Now().Before(a.tokenExpiry) {
    token := a.oauthToken
    a.tokenMu.Unlock()  // <-- Unlocked here
    return map[string]string{...}, nil
}
a.tokenMu.Unlock()  // <-- All goroutines unlock and call refresh
return a.refreshOAuthToken(ctx)  // <-- Multiple simultaneous calls

Fix: Use double-checked locking or sync.Once per refresh cycle to ensure only one goroutine refreshes the token.
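As a minimal sketch of the double-checked-locking variant (the adapter type, currentToken, and refreshOAuthToken here are illustrative stand-ins, not the real code):

import (
    "context"
    "errors"
    "sync"
    "time"
)

type adapter struct {
    tokenMu     sync.Mutex // guards oauthToken / tokenExpiry
    refreshMu   sync.Mutex // serializes refreshes so only one goroutine hits the token endpoint
    oauthToken  string
    tokenExpiry time.Time
}

// currentToken returns the cached token if it is still valid.
func (a *adapter) currentToken() (string, bool) {
    a.tokenMu.Lock()
    defer a.tokenMu.Unlock()
    if a.oauthToken != "" && time.Now().Before(a.tokenExpiry) {
        return a.oauthToken, true
    }
    return "", false
}

func (a *adapter) getToken(ctx context.Context) (string, error) {
    if tok, ok := a.currentToken(); ok {
        return tok, nil // fast path: cached token still valid
    }
    a.refreshMu.Lock()
    defer a.refreshMu.Unlock()
    // Double check: another goroutine may have refreshed while we waited.
    if tok, ok := a.currentToken(); ok {
        return tok, nil
    }
    return a.refreshOAuthToken(ctx)
}

// refreshOAuthToken stands in for the real method, which performs the OAuth
// exchange and stores the new token and expiry under tokenMu.
func (a *adapter) refreshOAuthToken(ctx context.Context) (string, error) {
    return "", errors.New("not implemented in this sketch")
}

A later commit in this PR takes the singleflight route instead; both approaches collapse concurrent refreshes into a single call.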


2. Negative Token Expiry Duration (client.go:261)

If ExpiresIn is less than 60, this produces a negative duration, so the token is treated as already expired:

a.tokenExpiry = time.Now().Add(time.Duration(tokenResp.ExpiresIn-60) * time.Second)

Fix: Use max(tokenResp.ExpiresIn-60, 0) or handle short-lived tokens differently.
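A sketch of one way to do that (tokenDeadline is a hypothetical helper): apply the 60-second skew, but never produce a deadline in the past.

import "time"

// tokenDeadline refreshes a little early (60s skew) but never yields a
// deadline in the past, even for very short-lived tokens.
func tokenDeadline(now time.Time, expiresInSec int64) time.Time {
    lifetime := time.Duration(expiresInSec) * time.Second
    const skew = 60 * time.Second
    switch {
    case lifetime > skew:
        lifetime -= skew
    case lifetime > 0:
        lifetime /= 2 // short-lived token: keep at least half of its lifetime
    default:
        lifetime = 0 // defensive: treat non-positive ExpiresIn as already expired
    }
    return now.Add(lifetime)
}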


3. Unstable Hash-Based Deduplication (client.go:890-900)

json.Marshal(logMap) produces non-deterministic output due to Go map iteration order being random. The same event can generate different hashes, causing duplicates to be sent:

jsonBytes, err := json.Marshal(logMap)  // Map order is random in Go
hash := sha256.Sum256(jsonBytes)

Fix: Sort keys before marshaling or use a deterministic serialization method.
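A sketch of explicit key-sorted hashing (hashLog is a hypothetical helper; the adapter's own function may differ):

import (
    "crypto/sha256"
    "encoding/hex"
    "encoding/json"
    "sort"
)

// hashLog fingerprints a log by hashing its fields in sorted key order.
// Nested values are JSON-encoded; encoding/json itself sorts map keys, so
// the nested encoding is deterministic as well.
func hashLog(logMap map[string]interface{}) (string, error) {
    keys := make([]string, 0, len(logMap))
    for k := range logMap {
        keys = append(keys, k)
    }
    sort.Strings(keys)

    h := sha256.New()
    for _, k := range keys {
        v, err := json.Marshal(logMap[k])
        if err != nil {
            return "", err
        }
        h.Write([]byte(k))
        h.Write([]byte{0}) // separator to avoid key/value ambiguity
        h.Write(v)
        h.Write([]byte{0})
    }
    return hex.EncodeToString(h.Sum(nil)), nil
}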


4. Context Mismatch (client.go:150)

The USP client is created with the parent context while the rest of the adapter runs on the derived child context; cancelling the child (e.g., during adapter shutdown) therefore does not stop the USP client:

ctxChild, cancel := context.WithCancel(ctx)
a.uspClient, err = uspclient.NewClient(ctx, conf.ClientOptions)  // Should use ctxChild

Fix: Use ctxChild when creating the USP client.


5. Negative Retry-After Duration (client.go:652)

If retryAfterTime is in the past, time.Until() returns a negative duration, so the sleep is effectively skipped and the code can spin in a tight retry loop:

retryUntilTime := time.Until(retryAfterTime).Seconds()  // Can be negative
if err := sleepContext(a.ctx, time.Duration(retryUntilTime)*time.Second)

Fix: Add validation: if retryAfterTime.Before(time.Now()) { continue }
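Sketched as a small helper (retryAfterDelay is a made-up name), clamping to a minimum one-second wait so a stale header cannot cause a tight loop:

import "time"

// retryAfterDelay converts a parsed Retry-After target time into a sleep
// duration, never returning less than one second even if the header points
// to the past.
func retryAfterDelay(retryAfterTime time.Time) time.Duration {
    d := time.Until(retryAfterTime)
    if d < time.Second {
        return time.Second
    }
    return d
}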


High Priority Issues

6. No Validation for MaxConcurrentWorkers (client.go:130-132)

Accepts any positive value. Setting to 100,000 creates 100,000 goroutines:

if c.MaxConcurrentWorkers == 0 {
    c.MaxConcurrentWorkers = 10
}
// No upper bound check

Fix: Add reasonable upper limit (e.g., 100).


7. Retryable Errors Not Retried (client.go:732-740)

The Mimecast API returns Retryable: true for errors that should be retried, but the code treats all API errors as fatal:

errorMessages = append(errorMessages, fmt.Sprintf("%s: %s (retryable: %v)", 
    errDetail.Code, errDetail.Message, errDetail.Retryable))
}
// Returns error without checking Retryable field
return nil, fmt.Errorf("mimecast api errors: %v", errorMessages)

Fix: Check Retryable field and retry with backoff for retryable errors.
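A sketch of checking the flag before giving up (the apiError shape is an assumption based on the excerpt, not Mimecast's documented schema):

// apiError mirrors the fields referenced above; illustrative only.
type apiError struct {
    Code      string
    Message   string
    Retryable bool
}

// allRetryable reports whether every returned error is marked retryable,
// in which case the caller can back off and reissue the request instead of
// treating the response as fatal.
func allRetryable(errs []apiError) bool {
    if len(errs) == 0 {
        return false
    }
    for _, e := range errs {
        if !e.Retryable {
            return false
        }
    }
    return true
}

Responses where every error is retryable could then reuse the same capped-backoff path already used for 429 and 5xx.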


8. Inconsistent 5XX Error Handling (client.go:704-712 vs 714-720)

5XX errors return nil, nil (no error), while other non-200 statuses return an error. This prevents 5XX errors from being logged via the error channel in fetchApi:483:

if status >= 500 && status < 600 {
    return nil, nil  // Silent failure, no error propagated
}
if status != http.StatusOK {
    return allItems, err  // Error propagated
}

Fix: Return error for 5XX to ensure proper error tracking, or document this design choice.


Medium Priority Issues

11. Semaphore Hardcoded (client.go:370)

shipperSem is fixed at 2, ignoring the MaxConcurrentWorkers config:

shipperSem := make(chan struct{}, 2)  // Always 2, regardless of config

Fix: Make this configurable or document why it's separate from worker concurrency.


12. Unused Variable (client.go:418, 438-439)

count is incremented but never used; it is dead code.

count := 0
// ...
count += len(events)

Fix: Remove or use for metrics/logging.


13. Redundant querySucceeded Flag (client.go:531, 823, 828)

The flag is only set to true at line 823 and checked at line 828, so it is always true at the check point:

var querySucceeded bool
// ... lots of code ...
querySucceeded = true  // Line 823
if querySucceeded {    // Line 828 - always true here

Fix: Remove flag and simplify logic.

@maximelb (Contributor) left a comment:

Also getting the robot to post some relevant findings from its review as comments to the PR.

@maximelb (Contributor):

@RagingRedRiot Let me know if you prefer we pick up the PR from here and make mods vs you doing it.

@RagingRedRiot (Contributor, Author):

@maximelb I'm capable of making the updates, but I'm not protective of being the one to do them.

@RagingRedRiot (Contributor, Author):

This isn't actually true.

  1. No Validation for MaxConcurrentWorkers (client.go:130-132)
    Accepts any positive value. Setting to 100,000 creates 100,000 goroutines:

if c.MaxConcurrentWorkers == 0 {
    c.MaxConcurrentWorkers = 10
}
// No upper bound check

Fix: Add reasonable upper limit (e.g., 100).

MaxConcurrentWorkers is only an upper bound on concurrent goroutines, enforced with a semaphore; it doesn't actually spawn that many goroutines. The only impact of a large value is a small amount of memory consumed by the larger channel backing the semaphore.
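A generic sketch of that semaphore pattern (not the adapter's actual loop): the buffered channel's capacity is the concurrency ceiling, and only goroutines that have acquired a slot exist at any moment.

import "sync"

// runWithLimit runs tasks with at most "limit" executing concurrently.
// The semaphore is just a buffered channel; a large capacity costs a bit of
// memory for the channel, not "limit" live goroutines.
func runWithLimit(limit int, tasks []func()) {
    sem := make(chan struct{}, limit)
    var wg sync.WaitGroup
    for _, task := range tasks {
        task := task
        sem <- struct{}{} // acquire: blocks once "limit" tasks are in flight
        wg.Add(1)
        go func() {
            defer wg.Done()
            defer func() { <-sem }() // release the slot
            task()
        }()
    }
    wg.Wait()
}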

@RagingRedRiot (Contributor, Author):

  1. Inconsistent 5XX Error Handling (client.go:704-712 vs 714-720)
    5XX errors return nil, nil (no error), other non-200 return error. This prevents 5XX errors from being logged via the error channel in fetchApi:483:

if status >= 500 && status < 600 {
    return nil, nil  // Silent failure, no error propagated
}
if status != http.StatusOK {
    return allItems, err  // Error propagated
}

Fix: Return error for 5XX to ensure proper error tracking, or document this design choice.

The code does log 5XX errors via a.conf.ClientOptions.OnError(err). However, it intentionally returns nil error to the caller to prevent treating transient server errors as fatal:

if status >= 500 && status < 600 {
    err := fmt.Errorf("mimecast server error: %d\nRESPONSE: %s", status, string(respBody))
    a.conf.ClientOptions.OnError(err)
    // We don't want this to be handled like an error
    // The hope is these errors are temporary
    if len(allItems) > 0 {
        return allItems, nil
    }
    return nil, nil
}

  • 5XX errors are typically transient (server overload, temporary outage)
  • Logging via OnError ensures visibility and monitoring
  • Returning nil error prevents the adapter from shutting down or treating the API as permanently failed
  • The next fetch cycle (30 seconds later) will retry automatically
  • If partial data was collected (allItems), it's still returned and shipped

@RagingRedRiot (Contributor, Author):

@maximelb I believe I have addressed all code review findings.

@maximelb (Contributor):

/gcbrun

- Fix error variable shadowing bug where err from io.ReadAll was being
  shadowed by err from strconv.Atoi/http.ParseTime
- Fix mutex contention by not holding tokenMu during HTTP calls in
  refreshOAuthToken
- Fix silent error ignore in submitEvents for non-ErrorBufferFull errors
- Fix potential deadlock by using context cancellation instead of
  calling Close() from within fetch loop goroutines
- Fix tight loop when Retry-After time has already passed by adding
  minimum 1 second sleep
- Fix 5xx errors being swallowed - now properly returns error so
  api.since won't be updated and data won't be lost
- Fix struct tag alignment inconsistencies in MimecastConfig
- Fix generateLogHash to use JSON marshaling for deterministic hashing
  of complex/nested values
- Add shutdown check in submitEvents loop

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@cursor bot left a comment:

This PR is being reviewed by Cursor Bugbot.

When there was no Retry-After header or it couldn't be parsed,
retryAfterTime remained the zero value. The condition
retryAfterTime.Before(time.Now()) was always true for the zero value
(year 0001 is before current time), causing the code to incorrectly
enter the "time already passed" branch (1s wait) instead of the
"no header" branch (60s wait).

Fix by checking !retryAfterTime.IsZero() before the Before check
and restructure the conditions for clarity.

Also added comment documenting that InitialLookback defaults to zero.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
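Roughly the restructured decision described in that commit, as a sketch (retryWait is a hypothetical helper):

import "time"

// retryWait picks the sleep before retrying a 429 based on the parsed
// Retry-After time; a zero value means the header was missing or unparsable.
func retryWait(retryAfterTime time.Time) time.Duration {
    switch {
    case retryAfterTime.IsZero():
        return 60 * time.Second // no usable Retry-After header
    case !retryAfterTime.After(time.Now()):
        return time.Second // header points to the past: minimal wait, no tight loop
    default:
        return time.Until(retryAfterTime)
    }
}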
@maximelb (Contributor):

/gcbrun

- Replace complex nested goroutine structure with errgroup for cleaner
  concurrency control and automatic cancellation propagation
- Fix data race in shouldShutdown() by using api.IsActive() instead of
  direct field access
- Fix token refresh race condition with double-checked locking
- Fix Retry-After duration truncation by using time.Duration directly

The refactored RunFetchLoop is ~75 lines shorter and eliminates:
- 3 levels of nested goroutines
- 4 coordination channels (cycleSem, shipperSem, shipCh, shipDone)
- Multiple early exit paths that could leak goroutines

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
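Roughly the errgroup shape described in the commit above, with illustrative names:

import (
    "context"

    "golang.org/x/sync/errgroup"
)

// runFetchCycle launches one fetch per API; the derived context is cancelled
// as soon as any fetch fails, so the remaining goroutines stop promptly and
// Wait returns the first error.
func runFetchCycle(ctx context.Context, apis []string, fetch func(context.Context, string) error) error {
    g, gctx := errgroup.WithContext(ctx)
    for _, api := range apis {
        api := api // capture loop variable (pre-Go 1.22 loop semantics)
        g.Go(func() error {
            return fetch(gctx, api)
        })
    }
    return g.Wait()
}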
- Remove unused MaxConcurrentShippers config field
- Remove unused AuditLog type
- Add 1-hour retry deadline for 429 rate limiting
- Add 5XX retry with exponential backoff (30s-5m), 1h max
- Use singleflight for token refresh to prevent thundering herd
- Extend dedupe cleanup window from 60s to 1 hour
- Fix minor style issues (indentation, blank lines)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
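A sketch of the singleflight approach mentioned above, reusing the illustrative adapter, currentToken, and refreshOAuthToken names from the double-checked-locking sketch earlier in the thread:

import (
    "context"

    "golang.org/x/sync/singleflight"
)

// refreshGroup coalesces concurrent token refreshes: callers that all miss
// the cached token share a single in-flight refresh per key. (In real code
// this would more likely live as a field on the adapter.)
var refreshGroup singleflight.Group

func (a *adapter) getTokenSF(ctx context.Context) (string, error) {
    if tok, ok := a.currentToken(); ok {
        return tok, nil
    }
    v, err, _ := refreshGroup.Do("oauth-token", func() (interface{}, error) {
        tok, err := a.refreshOAuthToken(ctx)
        return tok, err
    })
    if err != nil {
        return "", err
    }
    return v.(string), nil
}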
- Increase Close() timeout from 10s to 2min to allow in-flight HTTP
  requests (60s timeout) and Ship() calls to complete gracefully
- Reset retry counters after each successful page fetch so each page
  gets a fresh retry budget

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Inline review comment on the following snippet from submitEvents:

    }
    // Handle non-ErrorBufferFull errors
    a.conf.ClientOptions.OnError(fmt.Errorf("Ship(): %v", err))
}

Bug: Non-buffer-full shipping errors don't trigger shutdown

When uspClient.Ship returns an error that isn't ErrorBufferFull, the error is logged on line 926 but the loop continues processing subsequent events. Other adapters in the codebase stop and signal shutdown when any non-recoverable ship error occurs. This allows the adapter to silently drop messages that fail to ship due to unexpected errors, continuing operation in a potentially broken state instead of failing cleanly.
