[LFXV2-1084] Add NATS KV Event Processing to Survey Service#4

Merged
andrest50 merged 11 commits into main from andrest50/events-processing
Mar 3, 2026
@andrest50

Summary

Implements NATS KV bucket event processing for the survey service to automatically sync survey and survey response data from the v1 system to the v2 system. This enables real-time data synchronization, search indexing, and access control updates.

Ticket

https://linuxfoundation.atlassian.net/browse/LFXV2-1084

Implementation Details

Architecture

  • Watches the v1-objects KV bucket for keys matching itx-surveys:* and itx-survey-responses:*
  • Uses JetStream consumer with DeliverLastPerSubjectPolicy for latest version processing
  • Runs in the same binary as the HTTP API (background goroutine)
  • Event processing enabled by default, configurable via EVENT_PROCESSING_ENABLED
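
The prefix-based routing described above could be sketched roughly as follows. This is a minimal illustration, not the code in kv_handler.go: the function name `routeKey`, the returned labels, and the sample keys (including the `.` separator) are assumptions made for the example. Only the two prefixes come from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// Key prefixes named in the PR description.
const (
	surveyPrefix         = "itx-surveys"
	surveyResponsePrefix = "itx-survey-responses"
)

// routeKey is a hypothetical helper that decides which handler a KV key
// belongs to. Keys with any other prefix are ignored.
func routeKey(key string) string {
	switch {
	case strings.HasPrefix(key, surveyResponsePrefix):
		return "survey_response"
	case strings.HasPrefix(key, surveyPrefix):
		return "survey"
	default:
		return "ignored"
	}
}

func main() {
	// Sample keys are made up for illustration.
	keys := []string{"itx-surveys.abc", "itx-survey-responses.xyz", "itx-meetings.1"}
	for _, k := range keys {
		fmt.Printf("%s -> %s\n", k, routeKey(k))
	}
}
```

Because the bucket is shared with other v1 object types, the default branch matters: unrelated keys should be acknowledged and dropped rather than treated as errors.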

Data Transformation

  • Transforms v1 DynamoDB format (string fields) → v2 format (proper types)
  • Converts string integers to actual integers (nps_value, total_responses, etc.)
  • Maps v1 SFIDs → v2 UUIDs for committees and projects via IDMapper service
  • Preserves SurveyMonkey answers without transformation

Publishing

Publishes transformed data to two downstream services:

  1. Indexer Service (lfx.index.survey, lfx.index.survey_response)
    • Enables search functionality with parent references and access control metadata
  2. FGA-Sync Service (lfx.fga-sync.update_access, lfx.fga-sync.delete_access)
    • Updates Fine-Grained Authorization tuples and manages permissions

Deduplication

  • Tracks processed events in v1-mappings KV bucket
  • Distinguishes between CREATE and UPDATE operations
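
A rough sketch of how a mappings lookup can distinguish CREATE from UPDATE is shown below. An in-memory map stands in for the v1-mappings KV bucket, and the type `mappingStore`, its method names, and the sample key are all hypothetical. The real handlers may record mappings at a different point in the flow.

```go
package main

import "fmt"

// mappingStore is a hypothetical in-memory stand-in for the v1-mappings
// KV bucket. A key is present once the object has been processed.
type mappingStore map[string]struct{}

// operationFor reports "update" when the key was processed before and
// "create" otherwise.
func (s mappingStore) operationFor(key string) string {
	if _, seen := s[key]; seen {
		return "update"
	}
	return "create"
}

// markProcessed records the key. In this sketch it is called only after
// publishing succeeds, so a failed publish is retried as a create.
func (s mappingStore) markProcessed(key string) { s[key] = struct{}{} }

// forget removes the key, as a delete operation would.
func (s mappingStore) forget(key string) { delete(s, key) }

func main() {
	store := mappingStore{}
	key := "survey.abc123" // made-up mapping key

	fmt.Println(store.operationFor(key)) // first sighting
	store.markProcessed(key)
	fmt.Println(store.operationFor(key)) // seen before
	store.forget(key)
	fmt.Println(store.operationFor(key)) // mapping cleaned up
}
```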

Error Handling

  • Transient errors (NAK for retry): NATS timeouts, IDMapper unavailable, network failures
  • Permanent errors (ACK and skip): Invalid JSON, missing required fields
  • Delivers each message at most 3 times (MaxDeliver=3), with a 30-second acknowledgement wait (AckWait=30s) before redelivery
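
The ACK/NAK split above can be illustrated with a small classifier. This is a sketch under assumptions: the sentinel `errTransient`, the `action` type, and the `process` and `decide` functions are invented for the example and do not reflect the error types in internal/domain/errors.go.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errTransient is a hypothetical sentinel wrapped by retryable failures.
var errTransient = errors.New("transient failure")

type action string

const (
	actionAck action = "ack" // done, or permanently broken: do not redeliver
	actionNak action = "nak" // ask JetStream to redeliver
)

// decide maps a processing error to the acknowledgement to send.
func decide(err error) action {
	if err != nil && errors.Is(err, errTransient) {
		return actionNak
	}
	return actionAck
}

// process is a toy handler: bad JSON and missing fields are permanent,
// an unavailable ID mapper is transient.
func process(payload []byte, mapperUp bool) error {
	var doc struct {
		ID string `json:"id"`
	}
	if err := json.Unmarshal(payload, &doc); err != nil {
		return fmt.Errorf("invalid JSON: %w", err)
	}
	if doc.ID == "" {
		return errors.New("missing required field: id")
	}
	if !mapperUp {
		return fmt.Errorf("id mapper unavailable: %w", errTransient)
	}
	return nil
}

func main() {
	fmt.Println(decide(process([]byte(`{not json`), true)))
	fmt.Println(decide(process([]byte(`{"id":"s1"}`), false)))
	fmt.Println(decide(process([]byte(`{"id":"s1"}`), true)))
}
```

Acknowledging permanent failures keeps a malformed record from consuming all delivery attempts and blocking newer messages behind it.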

Files Changed

New Files

  • cmd/survey-api/eventing/event_processor.go - Lifecycle management
  • cmd/survey-api/eventing/kv_handler.go - Event routing by key prefix
  • cmd/survey-api/eventing/survey_event_handler.go - Survey transformation logic
  • cmd/survey-api/eventing/survey_response_event_handler.go - Response transformation logic
  • internal/domain/event_models.go - v2 data models
  • internal/domain/event_publisher.go - Publisher interface
  • internal/infrastructure/eventing/event_config.go - Configuration
  • internal/infrastructure/eventing/nats_publisher.go - NATS publishing implementation
  • pkg/utils/string.go - String utility functions
  • docs/event-processing.md - Comprehensive documentation
  • scripts/delete_survey_documents.sh - Cleanup utility script

Modified Files

  • cmd/survey-api/main.go - Event processor initialization and shutdown
  • README.md - Documentation updates
  • go.mod - Added indexer service dependency

Configuration

New environment variables:

  • EVENT_PROCESSING_ENABLED - Enable/disable event processing (default: true)
  • EVENT_CONSUMER_NAME - JetStream consumer name (default: survey-service-kv-consumer)
  • EVENT_STREAM_NAME - JetStream stream name (default: KV_v1-objects)
  • EVENT_FILTER_SUBJECT - NATS subject filter (default: $KV.v1-objects.>)
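
One way to load these variables with their documented defaults is sketched below. The `EventConfig` struct, its field names, and the helper functions are illustrative and are not taken from event_config.go. The fallback to `true` on an unparsable EVENT_PROCESSING_ENABLED value is also an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// EventConfig holds the four settings listed above (hypothetical shape).
type EventConfig struct {
	Enabled       bool
	ConsumerName  string
	StreamName    string
	FilterSubject string
}

// lookupFunc matches os.LookupEnv so tests can inject their own source.
type lookupFunc func(key string) (string, bool)

// valueOr returns the variable's value, or def when unset or empty.
func valueOr(lookup lookupFunc, key, def string) string {
	if v, ok := lookup(key); ok && v != "" {
		return v
	}
	return def
}

// loadEventConfig applies the defaults documented in this PR.
func loadEventConfig(lookup lookupFunc) EventConfig {
	enabled, err := strconv.ParseBool(valueOr(lookup, "EVENT_PROCESSING_ENABLED", "true"))
	if err != nil {
		enabled = true // assumption: unparsable values fall back to the default
	}
	return EventConfig{
		Enabled:       enabled,
		ConsumerName:  valueOr(lookup, "EVENT_CONSUMER_NAME", "survey-service-kv-consumer"),
		StreamName:    valueOr(lookup, "EVENT_STREAM_NAME", "KV_v1-objects"),
		FilterSubject: valueOr(lookup, "EVENT_FILTER_SUBJECT", "$KV.v1-objects.>"),
	}
}

func main() {
	fmt.Printf("%+v\n", loadEventConfig(os.LookupEnv))
}
```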

Reference Implementation

Follows the exact pattern from voting service PR #8: linuxfoundation/lfx-v2-voting-service#8

Uses function-based handlers (not struct-based methods) to match the voting service architecture.

Documentation

  • Added docs/event-processing.md with architecture, configuration, operations, and troubleshooting
  • Updated README.md with event processing feature and configuration
  • Included operational procedures and monitoring guidance

Checklist

  • Event processor starts and connects to NATS KV bucket
  • Survey events transformed correctly from v1 → v2 format
  • Committee/project IDs mapped via IDMapper
  • String integers converted to proper ints
  • Events published to both indexer and FGA-sync
  • Deduplication working via v1-mappings bucket
  • Error handling matches voting service pattern
  • Graceful shutdown without data loss
  • Documentation complete and accurate

Implement complete event processing infrastructure to consume v1 survey and
survey_response data from NATS KV buckets, transform to v2 format, and publish
to indexer and FGA-sync services. This mirrors the pattern from voting service
PR #8.

Implementation details:

Event Processing Infrastructure:
- EventProcessor manages NATS JetStream consumer lifecycle with Start/Stop
- Watches v1-objects KV bucket with consumer pattern (DeliverLastPerSubject)
- Routes events by key prefix (itx-surveys, itx-survey-responses)
- Graceful shutdown with proper context cancellation

Data Transformation Layer:
- Converts v1 DynamoDB string fields to proper v2 types (ints, booleans)
- Maps v1 SFIDs to v2 UUIDs via IDMapper for committees and projects
- Processes committee arrays with deduplication of project references
- Preserves SurveyMonkey question/answer data without transformation
- Handles string-to-int conversions with error logging

Publishing Strategy:
- Dual publishing to indexer service and FGA-sync service
- Indexer messages include IndexingConfig with parent refs and access control
- FGA messages include committee/project references for access control
- Determines create vs update by checking v1-mappings KV bucket
- Delete operations publish to both services and clean up mappings

Error Handling:
- Transient errors (NATS timeout, connection issues) trigger NAK for retry
- Permanent errors (invalid JSON, missing required fields) trigger ACK to skip
- Mapping failures log warnings but don't block processing
- MaxDeliver=3, AckWait=30s, MaxAckPending=1000
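
The "log a warning but keep going" behaviour for mapping failures, together with the deduplication of project references mentioned earlier, might look roughly like this. The `idMapper` function type, `mapProjectIDs`, and `staticMapper` are hypothetical names. The real IDMapper integration is a service call, which is replaced here by a map lookup.

```go
package main

import (
	"fmt"
	"log"
)

// idMapper is a hypothetical lookup from a v1 SFID to a v2 UUID.
type idMapper func(sfid string) (string, error)

// staticMapper builds an idMapper backed by a fixed table, for demos.
func staticMapper(table map[string]string) idMapper {
	return func(sfid string) (string, error) {
		if uid, ok := table[sfid]; ok {
			return uid, nil
		}
		return "", fmt.Errorf("no mapping for %q", sfid)
	}
}

// mapProjectIDs converts SFIDs to UUIDs, skipping (and logging) any that
// cannot be mapped and dropping duplicate results. Order is preserved.
func mapProjectIDs(sfids []string, lookup idMapper) []string {
	uids := make([]string, 0, len(sfids))
	seen := make(map[string]struct{}, len(sfids))
	for _, sfid := range sfids {
		uid, err := lookup(sfid)
		if err != nil {
			log.Printf("warning: skipping project reference: %v", err)
			continue
		}
		if _, dup := seen[uid]; dup {
			continue
		}
		seen[uid] = struct{}{}
		uids = append(uids, uid)
	}
	return uids
}

func main() {
	lookup := staticMapper(map[string]string{"a01": "uuid-1", "a02": "uuid-2"})
	fmt.Println(mapProjectIDs([]string{"a01", "a02", "a01", "missing"}, lookup))
}
```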

Configuration:
- EVENT_PROCESSING_ENABLED=true (default enabled)
- EVENT_CONSUMER_NAME=survey-service-kv-consumer
- EVENT_STREAM_NAME=KV_v1-objects
- EVENT_FILTER_SUBJECT=$KV.v1-objects.>
- Runs in same binary as HTTP API, starts in background goroutine

Files created:
- cmd/survey-api/eventing/event_processor.go
- cmd/survey-api/eventing/kv_handler.go
- cmd/survey-api/eventing/survey_event_handler.go
- cmd/survey-api/eventing/survey_response_event_handler.go
- internal/domain/event_models.go
- internal/domain/event_publisher.go
- internal/infrastructure/eventing/event_config.go
- internal/infrastructure/eventing/nats_publisher.go
- pkg/utils/string.go

Files modified:
- cmd/survey-api/main.go - Event processor integration
- go.mod/go.sum - Added indexer-service dependency
- internal/domain/errors.go - Added error classifications
- internal/service/survey_service.go - Minor adjustments
- pkg/models/itx/models.go - Model updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
Create detailed documentation for the NATS KV bucket event processing feature
and update README to reference it.

Documentation includes:
- Architecture overview with diagrams
- Event flow and data transformation details
- Configuration reference for all environment variables
- Error handling strategies (transient vs permanent)
- Operations guide (monitoring, troubleshooting, lifecycle)
- Deduplication mechanism explanation
- Performance considerations and tuning
- Development guide with code structure
- Integration with IDMapper, Indexer, and FGA-Sync services

README updates:
- Added Event Processing to features list
- Added event processing environment variables to configuration section
- Updated project structure to show eventing packages
- Added reference link to event-processing.md

This ensures developers understand what event processing does, how to configure
it, and how to troubleshoot issues.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
Add a cleanup utility script that deletes all survey and survey_response
documents from the OpenSearch index. This is useful for cleaning up test
data or resetting the system during development and testing.

The script:
- Counts existing survey and survey_response documents
- Prompts for confirmation before deletion
- Deletes documents by object_type using delete_by_query
- Verifies cleanup by counting remaining documents
- Provides clear output at each step

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
Copilot AI review requested due to automatic review settings February 13, 2026 18:51
Copilot AI left a comment

Pull request overview

This pull request implements NATS KV bucket event processing for the survey service to enable real-time synchronization of survey and survey response data from v1 (DynamoDB) to v2 (indexer and FGA services). The implementation follows the architectural pattern established in the voting service PR #8.

Changes:

  • Added event processing infrastructure with NATS JetStream consumer for KV bucket watching
  • Implemented data transformation logic from v1 (string-based DynamoDB format) to v2 (properly typed format)
  • Added ID mapping integration for converting v1 SFIDs to v2 UUIDs
  • Integrated with indexer service for search functionality and FGA-sync service for access control
  • Added comprehensive documentation and operational utilities

Reviewed changes

Copilot reviewed 14 out of 18 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| cmd/survey-api/eventing/event_processor.go | Event processor lifecycle management with NATS connection and JetStream consumer |
| cmd/survey-api/eventing/kv_handler.go | Routes KV events to appropriate handlers based on key prefix and operation |
| cmd/survey-api/eventing/survey_event_handler.go | Transforms v1 survey data to v2 format, handles ID mapping and validation |
| cmd/survey-api/eventing/survey_response_event_handler.go | Transforms v1 survey response data with similar conversion logic |
| internal/domain/event_models.go | Defines v2 data models for surveys and responses with proper types |
| internal/domain/event_publisher.go | Publisher interface abstraction for event publishing |
| internal/infrastructure/eventing/event_config.go | Configuration structure for event processor |
| internal/infrastructure/eventing/nats_publisher.go | NATS publisher implementation for indexer and FGA-sync |
| cmd/survey-api/main.go | Initializes event processor on startup and handles graceful shutdown |
| pkg/utils/string.go | Custom string utility (reinvents stdlib) |
| go.mod | Adds indexer service dependency |
| docs/event-processing.md | Comprehensive event processing documentation |
| README.md | Updates with event processing feature documentation |
| scripts/delete_survey_documents.sh | Utility script for cleaning up test data from OpenSearch |
| pkg/models/itx/models.go | Whitespace/alignment formatting cleanup |
| internal/service/survey_service.go | Minor formatting adjustments |
| internal/domain/errors.go | Formatting cleanup |

- Add config validation call after loadConfig() to fail fast on missing credentials
- Replace os.Exit() in goroutines with shutdown channel for graceful cleanup
- Add dependency checks for jq and curl in cleanup script
- Add HTTP status code validation for all curl requests in script
- Fix duplicate cmd/survey-api entry in README project structure

Generated with Claude Code: https://claude.com/claude-code

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
Add blank lines after headers and before lists for better readability
and proper markdown rendering.

Generated with Claude Code: https://claude.com/claude-code

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
…ssing

Add custom UnmarshalJSON methods to SurveyDBRaw, SurveyCommitteeDBRaw, and
SurveyResponseDBRaw structs to handle flexible type casting for numeric fields.
This allows the code to accept string, int, or float64 inputs from DynamoDB and
properly convert them to int types.

Changes:
- Add UnmarshalJSON to SurveyDBRaw for 11 numeric fields
- Add UnmarshalJSON to SurveyCommitteeDBRaw for 10 numeric fields
- Add UnmarshalJSON to SurveyResponseDBRaw for 2 numeric fields
- Remove ~150 lines of manual strconv.Atoi conversion code
- Update struct field types from string to int for proper typing

This follows the same pattern as lfx-v1-sync-helper PR #43.
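
The flexible numeric decoding described in this commit can be approximated with a small custom type. Note the swap: the commit puts UnmarshalJSON on the raw structs themselves, whereas this sketch uses a per-field type called `flexInt`, which is a hypothetical name. The `surveyRaw` struct is likewise a cut-down stand-in for SurveyDBRaw, and treating an empty string or null as 0 is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// flexInt accepts a JSON number, a numeric string, or null.
type flexInt int

func (f *flexInt) UnmarshalJSON(b []byte) error {
	var v any
	if err := json.Unmarshal(b, &v); err != nil {
		return err
	}
	switch t := v.(type) {
	case nil:
		*f = 0 // assumption: null means "not set"
	case float64:
		*f = flexInt(t) // fractional parts are truncated
	case string:
		if t == "" {
			*f = 0 // assumption: empty string means "not set"
			return nil
		}
		n, err := strconv.Atoi(t)
		if err != nil {
			return fmt.Errorf("flexInt: %q is not an integer: %w", t, err)
		}
		*f = flexInt(n)
	default:
		return fmt.Errorf("flexInt: unsupported JSON type %T", v)
	}
	return nil
}

// surveyRaw is an illustrative subset of a v1 survey record.
type surveyRaw struct {
	ID             string  `json:"id"`
	TotalResponses flexInt `json:"total_responses"`
	NPSValue       flexInt `json:"nps_value"`
}

// parseSurvey decodes one record.
func parseSurvey(data []byte) (surveyRaw, error) {
	var s surveyRaw
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	s, err := parseSurvey([]byte(`{"id":"s1","total_responses":"12","nps_value":7}`))
	fmt.Println(s.TotalResponses, s.NPSValue, err)
}
```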

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
…ependency

Implement parent survey existence check before processing survey responses, with
exponential backoff retry logic when parent surveys are not yet found in mappings.
This prevents survey responses from being skipped when they arrive before their
parent surveys due to timing differences in the data pipeline.

Changes:
- Add parent survey existence check in handleSurveyResponseUpdate
- Return true to trigger NAK when parent survey not found in v1-mappings
- Implement exponential backoff in kvMessageHandler (2s → 10s → 20s delays)
- Use msg.NakWithDelay() for retry with calculated backoff delays
- Log retry attempts with delay duration for debugging

This follows the same pattern as lfx-v1-sync-helper PR #45.
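
The delay schedule above (2s, then 10s, then 20s) could be computed from the delivery count along these lines. This is a sketch: the function `backoffDelay`, the table `retryDelays`, and the choice to index by a 1-based delivery count are assumptions. The resulting duration would then be passed to msg.NakWithDelay() as the commit describes.

```go
package main

import (
	"fmt"
	"time"
)

// retryDelays holds the delays listed in the commit message.
var retryDelays = []time.Duration{
	2 * time.Second,
	10 * time.Second,
	20 * time.Second,
}

// backoffDelay returns the NAK delay for a 1-based delivery attempt.
// Attempts beyond the table reuse the last (longest) delay.
func backoffDelay(numDelivered uint64) time.Duration {
	if numDelivered == 0 {
		return retryDelays[0]
	}
	idx := int(numDelivered) - 1
	if idx >= len(retryDelays) {
		idx = len(retryDelays) - 1
	}
	return retryDelays[idx]
}

func main() {
	for attempt := uint64(1); attempt <= 4; attempt++ {
		fmt.Printf("attempt %d: retry in %s\n", attempt, backoffDelay(attempt))
	}
}
```

A lookup table keeps the schedule explicit, which suits the listed delays since they do not follow a single fixed multiplier.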

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
1. Replace custom utils.Contains with strings.Contains
   - The custom implementation was slower than the optimized standard
     library version which uses assembly on many platforms
   - Removed dependency on pkg/utils package

2. Refactor deduplication logic with helper function
   - Created appendIfNotExists helper using slices.Contains
   - Eliminated duplicate code in sendSurveyIndexerMessage and
     sendSurveyAccessMessage for project UID deduplication
   - Improves maintainability and consistency
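
A helper like the appendIfNotExists mentioned above might look like the following. The exact signature in the PR is not shown, so the string-slice form here is an assumption. `slices.Contains` is in the standard library from Go 1.21.

```go
package main

import (
	"fmt"
	"slices"
)

// appendIfNotExists appends item only when it is not already present.
// Signature assumed for illustration; the PR's helper may differ.
func appendIfNotExists(items []string, item string) []string {
	if slices.Contains(items, item) {
		return items
	}
	return append(items, item)
}

func main() {
	var projectUIDs []string
	for _, uid := range []string{"p1", "p2", "p1"} {
		projectUIDs = appendIfNotExists(projectUIDs, uid)
	}
	fmt.Println(projectUIDs)
}
```

The linear scan is fine for the short lists of project UIDs attached to one survey. A map-based set would be the usual choice for large inputs.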

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: Andres Tobon <andrest2455@gmail.com>
@mauriciozanettisalomao mauriciozanettisalomao left a comment

Basically, the same comments as on linuxfoundation/lfx-v2-voting-service#8 apply here.

Good job! 🚀 💪

@andrest50 andrest50 merged commit b949237 into main Mar 3, 2026
4 of 5 checks passed
@andrest50 andrest50 deleted the andrest50/events-processing branch March 3, 2026 21:51