Repository: Stefsek/AWS-TicketManagementSystem
This project provides an end-to-end ticket management system built with the AWS Cloud Development Kit (CDK) in Python. It is designed to:
- Ingest incoming support tickets via Kinesis Data Stream
- Orchestrate processing with Step Functions
- Analyze sentiment using Amazon Comprehend
- Generate an initial AI response message via a Bedrock LLM (in a Lambda)
- Persist metadata in DynamoDB and full JSON in S3
- Send notifications with SNS
- ETL all data into Amazon Redshift via an AWS Glue job
- Monitor failures with CloudWatch Alarms
Every resource is defined in the CDK stack (ticket_management_system/stack.py) and uses RemovalPolicy.DESTROY for convenient teardown in a development or thesis environment. Configuration (Redshift JDBC URL, credentials, networking) is read from a local .env file.
Below is a simplified flow of how a ticket travels through the system:
[TicketGenerator/main.py]
|
v
+------------------------------+
| Kinesis Data Stream |
+------------------------------+
|
v
+------------------------------+
| TriggerSFN Lambda |
+------------------------------+
|
v
+------------------------------+
| Step Functions State Machine|
+------------------------------+
/ | \
v v v
+-----------+ +------------+ +-------------+
|Comprehend| |ResponseGen | | S3Writer |
| Detect | | Lambda | | Lambda |
+-----------+ +------------+ +-------------+
| | |
v v v
+-----------+ +------------+ +-------------+
| SNS Topic | | DynamoDB | | S3 Bucket |
+-----------+ +------------+ +-------------+
| | |
v v v
[Email Subs] (metadata) (JSON files)
|
v
+------------------------------+
| High Priority Alert → SNS |
+------------------------------+
|
v
+------------------------------+
| Glue ETL Job → Redshift |
+------------------------------+
|
v
+------------------------------+
| CloudWatch Alarm on Fail |
+------------------------------+
Each step below corresponds to a CDK method in stack.py.
- Purpose: Ingest raw ticket events with high throughput and durability.
- Config: Single shard, 24‑hour retention, auto-destroy on stack deletion.
- Code: `ticket_management_system/lambdas/TriggerSFN/handler.py`
- Role Permissions: Read from Kinesis & start Step Functions executions.
- Behavior: Filters records for `eventName: TicketSubmitted` and starts the state machine with the ticket payload.
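The filtering logic can be sketched as below. This is an illustrative sketch, not the actual handler: the function names and the injectable `sfn_client` parameter are assumptions for testability; the real code lives in `handler.py`.

```python
import base64
import json
import os

def extract_ticket_payloads(event):
    """Return decoded payloads of Kinesis records whose eventName is TicketSubmitted.

    Kinesis delivers record data base64-encoded; non-matching records are skipped.
    """
    payloads = []
    for record in event.get("Records", []):
        body = json.loads(base64.b64decode(record["kinesis"]["data"]))
        if body.get("eventName") == "TicketSubmitted":
            payloads.append(body)
    return payloads

def handler(event, context, sfn_client=None):
    """Start one state machine execution per submitted ticket."""
    if sfn_client is None:
        import boto3  # only needed at runtime inside Lambda
        sfn_client = boto3.client("stepfunctions")
    for payload in extract_ticket_payloads(event):
        sfn_client.start_execution(
            stateMachineArn=os.environ["STATE_MACHINE_ARN"],
            input=json.dumps(payload),
        )
```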
- Definition: JSON in `ticket_management_system/state_machine/state_machine.json`
- Steps:
- DetectSentiment (Comprehend)
- ResponseGenerator Lambda
- WriteMetadata (DynamoDB + SNS)
- S3Writer Lambda
- High-Priority Alerts: If the generated priority is HIGH, the workflow immediately publishes an alert to the SNS topic before continuing the normal flow.
- Role: Permissions for Comprehend, Lambda Invoke, DynamoDB write, SNS publish.
- Layer: Shared dependencies at `ticket_management_system/lambda_layers/ResponseGenerator`.
- Code: `ticket_management_system/lambdas/ResponseGenerator/handler.py`
- Role Permissions: CloudWatch Logs & `bedrock:InvokeModel`.
- Function: Formats the prompt, calls the Bedrock LLM, and returns the customer response, priority, and reasoning.
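A minimal sketch of that flow, assuming a Nova-style messages schema and illustrative ticket field names (the real prompt and parsing live in the handler and its layer):

```python
import json

def build_prompt(ticket):
    """Format a support prompt; the ticket field names here are assumptions."""
    return (
        f"A customer reports an issue with {ticket['product']}.\n"
        f"Subject: {ticket['subject']}\nDescription: {ticket['description']}\n"
        'Answer with JSON: {"response_text": "...", "priority": '
        '"LOW|MEDIUM|HIGH", "reasoning": "..."}'
    )

def parse_llm_output(text):
    """Extract (response_text, priority, reasoning) from the model's JSON answer."""
    data = json.loads(text)
    return data["response_text"], data["priority"], data["reasoning"]

def generate_response(ticket, model_id="us.amazon.nova-pro-v1:0"):
    """Call Bedrock at runtime (requires Bedrock access)."""
    import boto3
    client = boto3.client("bedrock-runtime")
    result = client.invoke_model(
        modelId=model_id,
        body=json.dumps({"messages": [
            {"role": "user", "content": [{"text": build_prompt(ticket)}]}]}),
    )
    answer = json.loads(result["body"].read())
    return parse_llm_output(answer["output"]["message"]["content"][0]["text"])
```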
- Name: `ThesisTicketsTable`
- Partition Key: `ticket_id` (String)
- Tracks: Ticket metadata (ID, status, timestamps).
- Name: `ThesisTicketNotificationsTopic`
- Subscription: Email (configured address) for immediate notifications.
- Code: `ticket_management_system/lambdas/S3Writer/handler.py`
- Role Permissions: Write to the S3 bucket.
- Behavior: Receives the full ticket plus LLM and sentiment output, transforms it to flat JSON, and stores it under `tickets/YYYY/MM/DD/ticket_<ID>.json`.
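The date-partitioned key convention above can be sketched as a small helper (the function name is illustrative; the real logic is in the handler):

```python
from datetime import datetime, timezone

def s3_key_for_ticket(ticket_id, submitted_at=None):
    """Build the object key tickets/YYYY/MM/DD/ticket_<ID>.json.

    Partitioning by date keeps the bucket cheap to scan for the Glue job.
    """
    ts = submitted_at or datetime.now(timezone.utc)
    return f"tickets/{ts:%Y/%m/%d}/ticket_{ticket_id}.json"
```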
- Code: `ticket_management_system/lambdas/TriggerSFN/handler.py`
- Role Permissions: Start the state machine.
- Behavior: Receives a ticket with the expected event name and triggers the state machine.
- Name: `thesis-tickets-bucket`
- Config: Auto-delete objects on stack destroy.
- Purpose: Long-term storage of processed ticket JSON.
- Glue Connection: JDBC → Redshift using `.env` variables.
- Glue Script: `ticket_management_system/glue_scripts/ticket_processing_job.py`
  - Extract: Read JSON from S3
  - Transform: Cast schema & validate no nulls
  - Load: COPY into Redshift
- Schedule: An EventBridge rule triggers the job every 2 hours.
- IAM: S3 read/write, Redshift credentials, Glue service role.
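The "cast schema & validate no nulls" step can be illustrated in plain Python (the actual job uses Glue/PySpark; the schema excerpt and function name here are assumptions):

```python
# Excerpt of the expected schema; the real job covers all Redshift columns.
EXPECTED_TYPES = {
    "ticket_id": str,
    "sentiment_score_positive": float,
    "priority": str,
}

def validate_row(row, expected=EXPECTED_TYPES):
    """Cast each required field to its expected type and reject nulls,
    mirroring the transform step before the COPY into Redshift."""
    cleaned = {}
    for field, typ in expected.items():
        value = row.get(field)
        if value is None:
            raise ValueError(f"null value in required field {field!r}")
        cleaned[field] = typ(value)
    return cleaned
```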
- Metric: `AWS/States` `ExecutionsFailed` for the state machine.
- Threshold: >0 failures in 1 minute.
- Action: Publish to SNS topic.
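An equivalent alarm could be expressed as `cloudwatch.put_metric_alarm` arguments (the stack actually defines it in CDK; the alarm name below is illustrative):

```python
def failure_alarm_params(state_machine_arn, topic_arn):
    """Arguments for cloudwatch.put_metric_alarm matching the thresholds above."""
    return {
        "AlarmName": "TicketStateMachineFailures",  # illustrative name
        "Namespace": "AWS/States",
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "Statistic": "Sum",
        "Period": 60,               # 1-minute window
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",  # i.e. >0 failures
        "AlarmActions": [topic_arn],
    }
```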
- Purpose: Generate realistic dummy tickets with LangChain + Bedrock, then push to Kinesis.
⚠️ Requires AWS Bedrock access with permissions to invoke Nova models. If Bedrock is unavailable, you can substitute another LLM provider (e.g., OpenAI GPT): update the code accordingly and supply the necessary API key.
- Key Steps:
  - Use the `issue_scenarios` dictionary to pick a product and issue type.
  - Use `TicketGeneratorOutputParser` to format the JSON ticket.
  - Send `put_record` to `kinesis-stream` with payload:

```json
{ "eventName": "TicketSubmitted", "ticketId": "TKT-...", "submittedAt": "ISO...", "data": {...} }
```
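The `put_record` envelope above can be assembled like this (a sketch; the real generator builds a much richer `data` payload via LangChain + Bedrock):

```python
import json
from datetime import datetime, timezone

def make_kinesis_record(ticket_id, data):
    """Build arguments for kinesis.put_record carrying the ticket envelope."""
    payload = {
        "eventName": "TicketSubmitted",
        "ticketId": ticket_id,
        "submittedAt": datetime.now(timezone.utc).isoformat(),
        "data": data,
    }
    return {
        "Data": json.dumps(payload).encode(),
        "PartitionKey": ticket_id,  # same ticket always lands on the same shard
    }
```

At runtime this would be passed straight to the client, e.g. `kinesis.put_record(StreamName="kinesis-stream", **make_kinesis_record("TKT-001", ticket))`.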
-
Purpose: Evaluate the quality of LLM-generated ticket responses against strict AWS Support standards.
⚠️ Requires AWS Bedrock access with permissions to invoke Nova models.- If Bedrock is unavailable, you can substitute another LLM provider (e.g., OpenAI GPT) — update the code accordingly and supply the necessary API key.
- Inputs: Processed tickets stored in `ProcessedTickets/processed_tickets000.json`.
- Evaluation Criteria (booleans only):
- contextual_relevance – Response explicitly acknowledges the AWS service/problem in the ticket.
- technical_accuracy – All AWS details are factually correct, safe, and applicable.
- professional_tone – Tone is formal, polite, and consistent with AWS support standards.
- actionable_guidance – Response provides 2–3 specific, executable troubleshooting steps.
- Workflow:
  - Load processed tickets (JSON lines).
  - Build the strict evaluation prompt (`Prompts/ticket_response_evaluator_prompts.py`).
  - Run the evaluation with AWS Bedrock (`us.amazon.nova-pro-v1:0`).
  - Parse output using the schema in `Schemas/ticket_response_evaluator_output_parser.py`.
  - Save evaluations to `ticket_evaluations.pkl`.
- Example Output:

```json
{ "output": { "contextual_relevance": true, "technical_accuracy": true, "professional_tone": true, "actionable_guidance": false } }
```
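Once evaluations are collected, per-criterion pass rates can be summarized with a small helper (the function name is an assumption; the repo stores results in `ticket_evaluations.pkl`):

```python
def criterion_pass_rates(evaluations):
    """Fraction of evaluated tickets passing each boolean criterion."""
    counts, total = {}, 0
    for ev in evaluations:
        total += 1
        for criterion, passed in ev["output"].items():
            counts[criterion] = counts.get(criterion, 0) + bool(passed)
    return {c: n / total for c, n in counts.items()}
```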
- ``: Bootstraps the stack in your AWS account/region.
- ``: Configuration for CDK commands.
Before deploying this stack, make sure the following AWS resources already exist in your account/region:
- Amazon Redshift Cluster
  - A provisioned Redshift cluster to host your data warehouse.
  - Note its JDBC endpoint (for `REDSHIFT_JDBC_CONNECTION_URL`) and cluster ARN (`REDSHIFT_ARN`).
- Database, Schema & Table in Redshift
  - Create the target database (e.g. `data`).
  - Create or grant privileges on the schema (e.g. `demo_workspace`).
  - Create the empty table (e.g. `processed_tickets`) with columns matching the Glue schema. A SQL script is provided in the `sql` folder (`sql/create_processed_tickets.sql`) containing:

```sql
CREATE TABLE IF NOT EXISTS ${REDSHIFT_SCHEMA}.${REDSHIFT_TABLE} (
    ticket_id character varying(50) NOT NULL ENCODE lzo,
    submitted_at timestamp without time zone NOT NULL ENCODE az64,
    customer_first_name character varying(50) ENCODE lzo,
    customer_last_name character varying(50) ENCODE lzo,
    customer_full_name character varying(50) ENCODE lzo,
    customer_email character varying(50) ENCODE lzo,
    product character varying(50) ENCODE lzo,
    issue_type character varying(50) ENCODE lzo,
    subject character varying(500) ENCODE lzo,
    description character varying(5000) ENCODE lzo,
    response_text character varying(5000) ENCODE lzo,
    sentiment character varying(20) ENCODE lzo,
    sentiment_score_mixed double precision ENCODE raw,
    sentiment_score_negative double precision ENCODE raw,
    sentiment_score_neutral double precision ENCODE raw,
    sentiment_score_positive double precision ENCODE raw,
    priority character varying(20) ENCODE lzo,
    priority_reasoning character varying(5000) ENCODE lzo,
    processed_at timestamp without time zone ENCODE az64,
    PRIMARY KEY (ticket_id)
) DISTSTYLE AUTO;
```
- VPC Networking
  - At least one subnet (ID for `REDSHIFT_SUBNET_ID`) in the cluster's VPC.
  - A security group (ID for `REDSHIFT_SECURITY_GROUP_ID`) allowing inbound JDBC traffic.
  - Ensure your subnet's AZ matches the `AVAILABILITY_ZONE` used by the cluster.
- IAM Permissions
  - Your AWS user or role must be able to:
    - Create and manage all CDK resources (Lambda, Kinesis, Glue, etc.).
    - Read/write to the existing Redshift cluster via Glue's IAM role.
- Email Addresses for Notifications
  - Any valid email(s) that should receive SNS alerts on workflow failures.
- AWS Bedrock Access
  - Ensure your AWS principal has permissions to invoke AWS Bedrock LLM models (e.g., `nova-pro-v1`) and that Bedrock is enabled in your account.
- AWS CLI with proper IAM rights
  - Install the AWS CLI (v2) and configure it using `aws configure` or environment variables.
  - Create an AWS CLI named profile with your Access Key ID and Secret Access Key, ensuring it has sufficient IAM permissions to deploy and manage the resources in this project.
  - For detailed setup, follow the AWS CLI Quickstart Guide: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html
- Node.js (v16+) & AWS CDK v2
  - Install Node.js version 16 or higher.
  - Install the AWS CDK v2 globally (`npm install -g aws-cdk`).
  - This provides the `cdk` command for synthesizing and deploying the infrastructure.
- Python 3.11 & Virtual Environment
  - Ensure Python 3.11 is installed on your system.
  - Create and activate a virtual environment (`python3.11 -m venv .venv && source .venv/bin/activate`).
  - Install Python dependencies with `pip install -r requirements.txt`.
- Docker (optional, for local Lambda testing)
  - If you want to test Lambdas locally (`cdk synth`, `cdk deploy --watch`), install Docker.
  - CDK can use Docker to build and emulate Lambda runtimes, ensuring compatibility with AWS.
Once these prerequisites are in place, continue with the setup steps below.
Environment Variables (`.env`):

```shell
# Unique project/resource name prefix used throughout the CDK stack
PROJECT_NAME=<YOUR_PROJECT_NAME>

# Redshift connection URL (JDBC)
REDSHIFT_JDBC_CONNECTION_URL=<YOUR_REDSHIFT_JDBC_URL>

# Redshift cluster ARN for Glue authentication
REDSHIFT_ARN=<YOUR_REDSHIFT_CLUSTER_ARN>

# Credentials to log in to Redshift
REDSHIFT_USERNAME=<YOUR_REDSHIFT_USERNAME>
REDSHIFT_PASSWORD=<YOUR_REDSHIFT_PASSWORD>

# Target database, schema, and table names in Redshift
REDSHIFT_DATABASE=<REDSHIFT_DATABASE_NAME>
REDSHIFT_SCHEMA=<REDSHIFT_SCHEMA_NAME>
REDSHIFT_TABLE=<REDSHIFT_TABLE_NAME>

# Networking details for Redshift VPC connectivity
REDSHIFT_SUBNET_ID=<YOUR_SUBNET_ID>
REDSHIFT_SECURITY_GROUP_ID=<YOUR_SECURITY_GROUP_ID>
AVAILABILITY_ZONE=<YOUR_AWS_AZ>

# Comma-separated list of notification email addresses for SNS alerts
NOTIFICATION_EMAILS=<EMAIL_ADDRESS_1>,<EMAIL_ADDRESS_2>

# AWS region where resources will be deployed
AWS_REGION=<YOUR_AWS_REGION>
```

Each variable explained:
- `REDSHIFT_JDBC_CONNECTION_URL`: JDBC endpoint used by Glue to connect and load data into Redshift.
- `REDSHIFT_ARN`: Amazon Resource Name for your Redshift cluster; needed for Glue to retrieve temporary credentials.
- `REDSHIFT_USERNAME` / `REDSHIFT_PASSWORD`: Authentication details for Redshift; Glue and CDK use these when establishing the connection.
- `REDSHIFT_DATABASE` / `SCHEMA` / `TABLE`: Specify where processed tickets should be loaded in Redshift.
- `REDSHIFT_SUBNET_ID` / `REDSHIFT_SECURITY_GROUP_ID` / `AVAILABILITY_ZONE`: Network settings ensuring Glue jobs can reach Redshift inside a VPC.
- `NOTIFICATION_EMAILS`: Defines who will receive SNS notifications on Step Function failures or other alerts.
- `AWS_REGION`: Tells CDK and the Lambdas which AWS region to provision and target services in.
- `PROJECT_NAME`: Unique project/resource name prefix used by the CDK stack to consistently name all AWS resources (streams, functions, tables, buckets, etc.).
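A minimal `.env` parser with fail-fast validation can look like the sketch below (the stack more likely uses a library such as python-dotenv; this only illustrates the KEY=VALUE format and the idea of checking required variables up front):

```python
def load_env(text, required=("PROJECT_NAME", "AWS_REGION")):
    """Parse KEY=VALUE lines (blank lines and # comments ignored) and
    raise early when a required variable is missing."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    missing = [k for k in required if k not in env]
    if missing:
        raise KeyError(f"missing required .env variables: {missing}")
    return env
```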
Important: Never commit real credentials or ARNs to Git. Use the placeholders above in your local `.env`, and add `.env` to your `.gitignore` to keep them safe.

```shell
git clone ...
cd AWS-TicketManagementSystem
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
npm install -g aws-cdk
cdk bootstrap aws:///
cdk deploy
```
- Generate tickets: `TicketGenerator/main.py`
- Monitor: Kinesis, State Machine, and Lambda logs in CloudWatch
- Verify: DynamoDB table entries & S3 JSON files
- Check ETL: Glue job runs and data appears in Redshift
- Alarm: Force a Step Function error to test the SNS email; notifications are also sent for HIGH-priority tickets
`cdk destroy`

All AWS resources will be removed, including data in DynamoDB, S3, Glue, etc.