fix: Multiple improvements - Frontend build, Mobile releases, DC error handling#94

Merged
hagen-p merged 7 commits into main from fix/frontend-react-query-error
Jan 27, 2026

@hagen-p hagen-p commented Jan 26, 2026

Summary

This PR includes several improvements and fixes:

  1. Frontend build error fix (React Query v5 compatibility)
  2. Mobile app release automation via GitHub Actions
  3. Collector configuration to handle user navigation errors gracefully

Changes

1. Frontend Build Fix

Problem: Build fails with a TypeScript error because React Query v5 removed the onError callback from useQuery options
Solution: Replaced the removed onError callback with a try-catch pattern inside queryFn
Files: src/frontend/providers/Ad.provider.tsx

2. Mobile Release Automation

Added: GitHub Actions workflow for building and releasing mobile apps
Features:

  • Manual trigger with version number input
  • Builds iOS and Android apps automatically
  • Creates GitHub releases with artifacts
  • Auto-detects repository (works on both fork and origin)
  • Commits version changes back to repo

Files:

  • .github/workflows/mobile-release.yml (new)
  • build-saucelabs.sh (updated for auto-detection)

3. DC Error Handling (v1.5.0 Collector Config)

Problem: User navigation causing error alerts in APM

  • When users navigate away, Envoy marks spans with error=true
  • Creates false-positive errors in monitoring dashboards
  • Inflates error rates with normal user behavior

Solution: New OTel Collector transform processor

  • Detects DC (Downstream Connection) and canceled spans
  • Sets error=false and status=OK
  • Renames spans for clarity:
    • ingress (user navigated away)
    • router frontend egress (backend interaction terminated due to client disconnect)

Files: kubernetes/splunk-astronomy-shop-1.5.0-values.yaml (new)

Impact:

  • ✅ Error rates reflect actual application issues
  • ✅ User navigation treated as normal behavior
  • ✅ Clear span names show what happened
  • ✅ All diagnostic info preserved (response_flags, canceled attributes)

Testing

Frontend Build

  • ✅ TypeScript compilation passes
  • ✅ Ad loading works with graceful degradation
  • ✅ Errors handled silently

Mobile Release Action

  • ✅ Workflow triggers manually
  • ✅ Builds complete successfully
  • ✅ Releases created on correct repository
  • ✅ Auto-detection works on fork and origin

Collector DC Handling

  • ✅ Deployed to dev-astronomy cluster
  • ✅ Ingress DC errors no longer marked as errors
  • ✅ Egress cancellations no longer marked as errors
  • ✅ Span names updated correctly
  • ✅ APM dashboards show reduced error rates

Deployment Notes

Frontend

Deploy as normal - fix is backward compatible

Mobile Releases

Use GitHub Actions UI:

  1. Go to Actions → "Mobile App Release"
  2. Click "Run workflow"
  3. Enter version (e.g., 1.2.3)

Collector Config

Deploy v1.5.0 config:

helm upgrade splunk-otel-collector <chart> \
  -f kubernetes/splunk-astronomy-shop-1.5.0-values.yaml

Leave v1.4.0 unchanged for backward compatibility.

Breaking Changes

None - all changes are backward compatible

Related Issues

  • Fixes frontend build error with React Query v5
  • Addresses false-positive DC errors in APM traces
  • Adds mobile release automation capability

React Query v5 removed onError callback from useQuery options.
Moved error handling inside queryFn using try-catch pattern.

When /api/data times out or fails:
- Catch the error silently
- Log message for debugging
- Return empty array (page works without ads)
- No error state in React Query

This maintains the graceful degradation behavior while fixing the build error.
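
The try-catch pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in Ad.provider.tsx: the Ad shape, the withEmptyFallback wrapper, and the listAds helper mentioned in the usage comment are hypothetical names introduced for the example.

```typescript
// Hypothetical shape of an ad record; the real type in Ad.provider.tsx may differ.
interface Ad {
  redirectUrl: string;
  text: string;
}

// Wraps an ad-loading function so that failures resolve to an empty list
// instead of rejecting. A queryFn built this way never puts React Query
// into its error state.
function withEmptyFallback(loadAds: () => Promise<Ad[]>): () => Promise<Ad[]> {
  return async () => {
    try {
      return await loadAds();
    } catch (err) {
      // Log for debugging only; the page keeps rendering without ads.
      console.warn('Ad request failed, continuing without ads:', err);
      return [];
    }
  };
}

// Usage sketch (assumes @tanstack/react-query v5 and a hypothetical listAds helper):
//   useQuery({ queryKey: ['ads'], queryFn: withEmptyFallback(listAds) });
```

Because the wrapper always resolves, callers see an empty array on timeout or failure, which matches the graceful degradation the commit describes.
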
- Created mobile-release.yml workflow with manual trigger
- Accepts version number as input parameter
- Builds iOS and Android apps using build-saucelabs.sh
- Creates GitHub release on origin repository
- Commits version changes back to main branch
- Uploads build artifacts with 90-day retention

Updated build-saucelabs.sh:
- Now respects GITHUB_REPO environment variable
- Prefers origin remote over fork for auto-detection
- Ensures releases are pushed to origin (splunk/opentelemetry-demo)

Usage:
1. Go to Actions tab in GitHub
2. Select 'Mobile App Release' workflow
3. Click 'Run workflow'
4. Enter version number (e.g., 1.2.3)
5. Wait for build to complete (~15-20 minutes)
6. Release will be created at github.com/splunk/opentelemetry-demo/releases

Changes:
- Script now uses GITHUB_REPOSITORY env var (automatically set by GitHub Actions)
- When running locally, auto-detects from 'origin' remote
- If running on fork: release goes to fork
- If running on origin: release goes to origin
- No manual configuration needed

How it works:
1. GitHub Actions: GITHUB_REPOSITORY = repo where workflow runs
2. Local execution: Detects from 'origin' remote URL
3. Result: Release always goes to the correct repository

Examples:
- Run workflow on hagen-p/opentelemetry-demo-splunk → release goes to fork
- Run workflow on splunk/opentelemetry-demo → release goes to origin
- Run locally from fork clone → release goes to fork
- Run locally from origin clone → release goes to origin
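
The detection order above can be modeled as a small pure function. The actual logic lives in build-saucelabs.sh (a shell script); the TypeScript below is only a sketch of the same decision, and resolveReleaseRepo, its parameters, and the URL-parsing regex are assumptions made for illustration.

```typescript
// Returns "owner/name" for the repository a release should target.
// env: environment variables; originUrl: the URL of the 'origin' remote, if known.
function resolveReleaseRepo(
  env: Record<string, string | undefined>,
  originUrl?: string,
): string | undefined {
  // 1. In GitHub Actions, GITHUB_REPOSITORY names the repo where the workflow runs.
  const fromEnv = env['GITHUB_REPOSITORY'];
  if (fromEnv) return fromEnv;

  // 2. Locally, derive owner/name from the origin remote (HTTPS or SSH form).
  const match = originUrl?.trim().match(/github\.com[:/]([^/]+)\/(.+?)(?:\.git)?\/?$/);
  return match ? `${match[1]}/${match[2]}` : undefined;
}
```

Checking the environment variable first means a workflow run on a fork releases to the fork, while a local run falls back to whatever the clone's origin remote points at.
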
Created new splunk-astronomy-shop-1.5.0-values.yaml with DC error fix.
Left 1.4.0 unchanged for backward compatibility.

Changes in v1.5.0:
- Added transform/dc_not_error processor
- Treats 'DC' (Downstream Connection closed) as normal user behavior
- Sets status.code to STATUS_CODE_UNSET for DC errors
- Sets error attribute to 'false' for DC errors
- Added processor to traces pipeline after filter/drop_flagd

Usage:
  helm upgrade astronomy-shop <chart> \
    -f kubernetes/splunk-astronomy-shop-1.5.0-values.yaml

Impact:
- User navigation no longer counted as errors in APM
- Error rates reflect actual application issues
- DC flag still present in spans for debugging

Extended transform/dc_not_error processor to handle both sides:

Ingress (Client → Proxy):
- ✅ Already fixed: response_flags='DC' → error='false'

Egress (Proxy → Frontend):
- ✅ Now fixed: canceled='true' → error='false'

Problem:
When a client disconnects (DC on ingress), Envoy also cancels the
outbound request to the backend. This created TWO errors per request:
1. Ingress DC error (now fixed)
2. Egress cancellation error (now fixed)

Solution:
Added handling for canceled='true' attribute on egress spans.
Now both ingress AND egress are treated as normal user behavior.

Impact:
- Eliminates ALL errors from user navigation events
- Error rates now reflect only real application issues
- Both DC and canceled flags still visible for debugging

Updated transform processor to rename spans for better visibility:

Before:
- ingress (error=true, response_flags='DC')
- router frontend egress (error=true, canceled=true)

After:
- ingress (user navigated away) (error=false, response_flags='DC')
- router frontend egress (backend interaction terminated - client disconnect) (error=false, canceled=true)

Benefits:
- ✅ Clear what happened at a glance
- ✅ No longer counted as errors
- ✅ All diagnostic info preserved (response_flags, canceled attributes)
- ✅ Easy to distinguish from real errors
- ✅ Better UX in APM trace view

The renamed spans clearly indicate this is normal user behavior,
not an application error requiring investigation.

Changed from:
  'backend interaction terminated - client disconnect'

To:
  'backend interaction terminated due to client disconnect'

Clearer causal relationship in the span name.
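
To make the combined effect of the ingress, egress, and rename commits concrete, the rule can be modeled in a few lines. The real change is an OTel Collector transform processor configured in YAML; the TypeScript below only simulates its intended outcome. The Span interface and normalizeDisconnectSpan are hypothetical, and the status follows the commit message (STATUS_CODE_UNSET), whereas the PR summary mentions status=OK.

```typescript
// Minimal span model for illustration; real OTLP spans carry many more fields.
interface Span {
  name: string;
  statusCode: 'UNSET' | 'OK' | 'ERROR';
  attributes: Record<string, string>;
}

// Rewrites spans caused by a client disconnect so they no longer count as errors.
function normalizeDisconnectSpan(span: Span): Span {
  const attrs = span.attributes;
  const isIngressDc = attrs['response_flags'] === 'DC';
  const isEgressCancel = attrs['canceled'] === 'true';
  // Anything else, including real errors, passes through untouched.
  if (!isIngressDc && !isEgressCancel) return span;

  const reason = isIngressDc
    ? 'user navigated away'
    : 'backend interaction terminated due to client disconnect';
  return {
    name: `${span.name} (${reason})`,
    statusCode: 'UNSET', // assumption: follows the commit message, not the summary
    // response_flags and canceled are kept so the cause stays visible for debugging.
    attributes: { ...attrs, error: 'false' },
  };
}
```

Only the name, status, and error attribute change; the diagnostic attributes survive, so a renamed span can still be told apart from a real error.
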
@hagen-p hagen-p merged commit b556fa4 into main Jan 27, 2026
5 checks passed
@hagen-p hagen-p deleted the fix/frontend-react-query-error branch January 27, 2026 09:35