Transaction Reconciliation Engine

A transaction reconciliation engine that detects discrepancies between internal system records and external exchange reports.

It is designed for memory-efficient, high-throughput processing, fault tolerance, and least-privilege infrastructure-as-code deployment, and is built on Node.js streaming APIs in a serverless architecture.

🏗️ Architecture

The system is deployed entirely on serverless AWS services (Lambda behind API Gateway), so capacity scales with demand and cost tracks actual usage.

🚀 Advanced Engineering Concepts

1. Streaming Ingestion & Backpressure Management

Parsers that load an entire CSV file into memory risk out-of-memory (OOM) crashes on large inputs. This engine instead consumes native Node.js streams with for-await iteration: it reads one chunk at a time, normalizes records on the fly, and releases each chunk once it has been processed, so memory use stays roughly constant regardless of file size. Because the consumer pulls data at its own pace, the stream is paused whenever processing falls behind, which provides backpressure without extra bookkeeping. This lets the engine process multi-gigabyte transaction datasets within the memory limits of an AWS Lambda function.
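
The following is a minimal sketch of that for-await pattern, not the engine's actual parser. The names `readLines`, `streamTransactions`, and `NormalizedRow` are hypothetical, and the CSV layout (a header row followed by `transactionId,amount,timestamp` with no quoted fields) is an assumption made for illustration.

```typescript
// Hypothetical normalized row shape; the real DTOs live in packages/types.
interface NormalizedRow {
  transactionId: string;
  amount: number;
  timestamp: Date;
}

// Splits incoming text chunks into lines. Only the current partial line is
// buffered, so memory use does not grow with the size of the input.
async function* readLines(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  let buffered = "";
  for await (const chunk of chunks) {
    buffered += chunk;
    let newline = buffered.indexOf("\n");
    while (newline !== -1) {
      yield buffered.slice(0, newline).replace(/\r$/, "");
      buffered = buffered.slice(newline + 1);
      newline = buffered.indexOf("\n");
    }
  }
  if (buffered.length > 0) {
    yield buffered.replace(/\r$/, "");
  }
}

// Yields one normalized row at a time. The consumer pulls rows, so the
// upstream source is not read faster than rows are processed.
// Assumes a header row and simple comma-separated fields (no quoting).
async function* streamTransactions(
  chunks: AsyncIterable<string>
): AsyncGenerator<NormalizedRow> {
  let isHeader = true;
  for await (const line of readLines(chunks)) {
    if (isHeader) {
      isHeader = false;
      continue;
    }
    if (line.trim() === "") {
      continue;
    }
    const [transactionId = "", amount = "", timestamp = ""] = line
      .split(",")
      .map((field) => field.trim());
    yield {
      transactionId,
      amount: Number(amount),
      timestamp: new Date(timestamp),
    };
  }
}
```

In Node.js, a stream opened with `createReadStream(path, { encoding: "utf8" })` from `node:fs` can be passed directly as `chunks`, because readable streams are async iterable.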

2. Multi-Pass Heuristic Matching Algorithm

Instead of naive O(N^2) cross-referencing, the matching engine runs three deterministic, hash-based passes:

- Pass 1 (Exact Match): Average-case O(1) hash-map lookups on a composite key (TransactionID + Amount + Date).
- Pass 2 (Fuzzy Date Match): Sliding-window temporal matching for transactions whose two records fall on either side of midnight.
- Pass 3 (Orphan Detection): Processes the remainder and flags each unmatched record as MISSING_IN_INTERNAL or MISSING_IN_EXCHANGE.
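
The three passes can be sketched as follows. This is an illustration under stated assumptions rather than the engine's implementation: `Txn`, `PassResult`, and `reconcile` are hypothetical names, dates are compared as UTC calendar days, Pass 2 is approximated with a fixed tolerance window (24 hours by default, an arbitrary choice) instead of a true sliding window, and fuzzy-date hits are reported as MATCHED.

```typescript
interface Txn {
  transactionId: string;
  amount: number;
  timestamp: Date;
}

type PassStatus = "MATCHED" | "MISSING_IN_INTERNAL" | "MISSING_IN_EXCHANGE";

interface PassResult {
  status: PassStatus;
  internalTransactionId?: string;
  exchangeTransactionId?: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// UTC calendar day, e.g. "2024-01-01" (assumption: both sources use UTC).
const dayOf = (d: Date): string => d.toISOString().slice(0, 10);
const exactKey = (t: Txn): string => `${t.transactionId}|${t.amount}|${dayOf(t.timestamp)}`;
const looseKey = (t: Txn): string => `${t.transactionId}|${t.amount}`;

// Assumes keys are unique within each source; duplicates would overwrite.
function reconcile(internal: Txn[], exchange: Txn[], windowMs: number = DAY_MS): PassResult[] {
  const results: PassResult[] = [];

  // Pass 1: exact match on ID + amount + day via a hash map.
  const internalByExact = new Map<string, Txn>();
  for (const t of internal) internalByExact.set(exactKey(t), t);

  const exchangeLeft: Txn[] = [];
  for (const e of exchange) {
    const key = exactKey(e);
    const hit = internalByExact.get(key);
    if (hit) {
      internalByExact.delete(key);
      results.push({
        status: "MATCHED",
        internalTransactionId: hit.transactionId,
        exchangeTransactionId: e.transactionId,
      });
    } else {
      exchangeLeft.push(e);
    }
  }

  // Pass 2: same ID + amount, timestamps within the tolerance window.
  const internalByLoose = new Map<string, Txn>();
  internalByExact.forEach((t) => internalByLoose.set(looseKey(t), t));

  for (const e of exchangeLeft) {
    const key = looseKey(e);
    const candidate = internalByLoose.get(key);
    const closeEnough =
      candidate !== undefined &&
      Math.abs(candidate.timestamp.getTime() - e.timestamp.getTime()) <= windowMs;
    if (candidate && closeEnough) {
      internalByLoose.delete(key);
      results.push({
        status: "MATCHED",
        internalTransactionId: candidate.transactionId,
        exchangeTransactionId: e.transactionId,
      });
    } else {
      // Pass 3a: exchange record with no internal counterpart.
      results.push({ status: "MISSING_IN_INTERNAL", exchangeTransactionId: e.transactionId });
    }
  }

  // Pass 3b: internal records that no exchange record claimed.
  internalByLoose.forEach((t) => {
    results.push({ status: "MISSING_IN_EXCHANGE", internalTransactionId: t.transactionId });
  });

  return results;
}
```

Each record is inserted into and removed from a map at most once per pass, so the total work grows linearly with the number of records rather than quadratically.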

3. CI/CD with Layered Turborepo Caching

The GitHub Actions workflows (ci.yml, cd.yml) use Turborepo to cache linting, type-checking, and test results. Turborepo hashes each task's inputs (source files, dependencies, and task configuration) and skips any task whose hash matches a previously cached run. We also consolidated the CI and CD steps into one build-and-deploy job, cutting deployment times by over 60%.

4. Zero-Trust "Least Privilege" Cloud Security

Infrastructure provisioning is controlled through granular IAM policies. The Serverless deployment runs under an IAM role that is explicitly restricted to:

- iam:PassRole on specific, resource-constrained roles only.
- Deployment buckets matching a fixed name pattern (reconciliation-engine-deploy-*).
- Scoped cloudformation, lambda, and apigateway ARNs.
- TagResource calls that pass strict tag validation for organizational compliance.
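
To make the restrictions above concrete, here is a hedged sketch of what such a policy document could look like, written as a TypeScript object so it can be checked in code. It is not the repository's actual policy: the account ID, region, role and stack name patterns, the selected actions, and the `Project` tag key are all placeholders chosen for illustration. Only the bucket pattern comes from the list above.

```typescript
interface PolicyStatement {
  Sid: string;
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string[];
  Condition?: Record<string, Record<string, string>>;
}

interface PolicyDocument {
  Version: string;
  Statement: PolicyStatement[];
}

// Placeholders, not real values.
const ACCOUNT = "123456789012";
const REGION = "us-east-1";

const stackArn = `arn:aws:cloudformation:${REGION}:${ACCOUNT}:stack/reconciliation-engine-*/*`;
const functionArn = `arn:aws:lambda:${REGION}:${ACCOUNT}:function:reconciliation-engine-*`;

const deployPolicy: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [
    {
      // Roles may only be passed to Lambda, and only roles matching the pattern.
      Sid: "PassRoleToLambdaOnly",
      Effect: "Allow",
      Action: ["iam:PassRole"],
      Resource: [`arn:aws:iam::${ACCOUNT}:role/reconciliation-engine-*`],
      Condition: { StringEquals: { "iam:PassedToService": "lambda.amazonaws.com" } },
    },
    {
      Sid: "DeploymentBucketOnly",
      Effect: "Allow",
      Action: ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
      Resource: [
        "arn:aws:s3:::reconciliation-engine-deploy-*",
        "arn:aws:s3:::reconciliation-engine-deploy-*/*",
      ],
    },
    {
      Sid: "ScopedStacks",
      Effect: "Allow",
      Action: ["cloudformation:DescribeStacks", "cloudformation:CreateStack", "cloudformation:UpdateStack"],
      Resource: [stackArn],
    },
    {
      Sid: "ScopedFunctionsAndApis",
      Effect: "Allow",
      Action: ["lambda:CreateFunction", "lambda:UpdateFunctionCode", "apigateway:GET", "apigateway:POST", "apigateway:PATCH"],
      Resource: [functionArn, `arn:aws:apigateway:${REGION}::/restapis`, `arn:aws:apigateway:${REGION}::/restapis/*`],
    },
    {
      // Tagging is allowed only when the request carries the expected tag value.
      Sid: "TaggingRequiresProjectTag",
      Effect: "Allow",
      Action: ["lambda:TagResource", "cloudformation:TagResource"],
      Resource: [functionArn, stackArn],
      Condition: { StringEquals: { "aws:RequestTag/Project": "reconciliation-engine" } },
    },
  ],
};

// Returns the Sids of statements that grant access to every resource ("*").
function findWildcardStatements(policy: PolicyDocument): string[] {
  return policy.Statement
    .filter((s) => s.Effect === "Allow" && s.Resource.some((r) => r === "*"))
    .map((s) => s.Sid);
}
```

A check like `findWildcardStatements` could run in CI to catch a statement that is accidentally widened to all resources.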

🗄️ Database Schema & Indexing Strategy

The engine persists its data in MongoDB. Each collection is indexed for the reporting and aggregation queries that run against it most often.

Collections

1. `ReconciliationRun`

Tracks the metadata, status, and aggregate statistics of a single reconciliation run.

{
  _id: ObjectId,
  status: 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED',
  startTime: Date,
  endTime: Date,
  summary: {
    totalInternalRecords: Number,
    totalExchangeRecords: Number,
    matchedCount: Number,
    discrepancyCount: Number,
  }
}

Indexes:
- { status: 1, startTime: -1 } (Optimized for dashboard polling)

2. `Transaction`

Stores the normalized, atomic transactions from both sources.

{
  _id: ObjectId,
  runId: ObjectId,            // Ref: ReconciliationRun
  source: 'INTERNAL' | 'EXCHANGE',
  transactionId: String,
  amount: Number,
  currency: String,
  timestamp: Date,
  rawMetadata: Object         // Schema-less blob for audit trails
}

Indexes:
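- { runId: 1, transactionId: 1 } (Compound unique index to prevent duplicate ingestion per run; because both sources share this collection, this requires that INTERNAL and EXCHANGE records in the same run never carry the same transactionId)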
- { runId: 1, source: 1 } (Optimized for filtering by source during matching phase)

3. `ReconciliationResult`

The discrepancy ledger. Each document records one matching outcome for a run: a match, a discrepancy, or a record missing from one side.

{
  _id: ObjectId,
  runId: ObjectId,            // Ref: ReconciliationRun
  status: 'MATCHED' | 'DISCREPANCY' | 'MISSING_IN_INTERNAL' | 'MISSING_IN_EXCHANGE',
  internalTransactionId: String,  // (Optional)
  exchangeTransactionId: String,  // (Optional)
  discrepancyReasons: [String],   // e.g., ["AMOUNT_MISMATCH", "DATE_MISMATCH"]
  deltaAmount: Number
}

Indexes:
- { runId: 1, status: 1 } (Critical for fast report generation and pagination)
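
As a small illustration of how these collections relate, the sketch below mirrors the ReconciliationResult shape as a TypeScript type and derives the matchedCount and discrepancyCount fields of a run summary from a list of results. The names `ResultDoc` and `countOutcomes` are hypothetical, runId is typed as a string to avoid a driver dependency, and counting every non-MATCHED status as a discrepancy is an assumption.

```typescript
type ResultStatus =
  | "MATCHED"
  | "DISCREPANCY"
  | "MISSING_IN_INTERNAL"
  | "MISSING_IN_EXCHANGE";

// Mirrors the ReconciliationResult document; runId is an ObjectId in MongoDB.
interface ResultDoc {
  runId: string;
  status: ResultStatus;
  internalTransactionId?: string;
  exchangeTransactionId?: string;
  discrepancyReasons: string[];
  deltaAmount: number;
}

interface OutcomeCounts {
  matchedCount: number;
  discrepancyCount: number;
}

// Assumption: anything that is not MATCHED counts toward discrepancyCount.
function countOutcomes(results: ResultDoc[]): OutcomeCounts {
  let matchedCount = 0;
  let discrepancyCount = 0;
  for (const result of results) {
    if (result.status === "MATCHED") {
      matchedCount += 1;
    } else {
      discrepancyCount += 1;
    }
  }
  return { matchedCount, discrepancyCount };
}
```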

Aggregation Pipelines

Discrepancy reports are generated with native MongoDB aggregation pipelines rather than by mapping documents in application memory:

- $match: Filters by runId and excludes results whose status is 'MATCHED'.
- $group: Groups by discrepancyReasons to produce the counts behind the health dashboards.
- $lookup: (Optional) Left-joins the Transaction collection to hydrate raw audit data into the final payload.
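
The stages above might be assembled roughly as follows. This is a sketch, not the project's pipeline: the builder names are hypothetical, the collection name `transactions` is an assumption (Mongoose's default pluralization of `Transaction`), and the added `$unwind` and `$sort` stages are choices made here so that each reason is counted individually. Pipelines are plain arrays of stage objects, so no driver import is needed to build them.

```typescript
type Stage = Record<string, unknown>;

// Counts non-matched results per discrepancy reason for one run.
function buildReasonSummaryPipeline(runId: unknown): Stage[] {
  return [
    { $match: { runId, status: { $ne: "MATCHED" } } },
    // discrepancyReasons is an array; unwind it so each reason is grouped
    // separately. Results with no reasons are kept and grouped under null.
    { $unwind: { path: "$discrepancyReasons", preserveNullAndEmptyArrays: true } },
    {
      $group: {
        _id: "$discrepancyReasons",
        count: { $sum: 1 },
        totalDelta: { $sum: "$deltaAmount" },
      },
    },
    { $sort: { count: -1 } },
  ];
}

// Returns non-matched results with the internal transaction joined in.
function buildAuditDetailPipeline(runId: unknown): Stage[] {
  return [
    { $match: { runId, status: { $ne: "MATCHED" } } },
    {
      $lookup: {
        from: "transactions", // assumed collection name
        let: { txId: "$internalTransactionId", run: "$runId" },
        pipeline: [
          {
            $match: {
              $expr: {
                $and: [
                  { $eq: ["$transactionId", "$$txId"] },
                  { $eq: ["$runId", "$$run"] },
                  { $eq: ["$source", "INTERNAL"] },
                ],
              },
            },
          },
        ],
        as: "internalTransaction",
      },
    },
  ];
}
```

With Mongoose, either array could be passed to a model's `aggregate()` method. The lookup is constrained by runId as well as transactionId so that it does not join transactions from other runs.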

🛠️ Repository Structure (Monorepo)

reconciliation-engine/
├── apps/
│   └── api/                # Express.js REST API & Serverless Framework config
├── packages/
│   ├── engine/             # Core reconciliation logic & streaming parsers
│   ├── database/           # Mongoose models, schemas, and DB connection logic
│   ├── logger/             # Structured Pino logging utility
│   └── types/              # Shared TypeScript definitions (DTOs)
├── .github/
│   └── workflows/          # CI/CD GitHub Actions pipelines
├── turbo.json              # Turborepo task runner configuration
└── package.json            # Root workspace definitions (pnpm)

🚦 Local Development

Install Dependencies:
```
pnpm install
```
Start Infrastructure (MongoDB):
```
docker-compose up -d
```
Run Development Server:
```
pnpm turbo run dev
```
Run Test Suite:
```
pnpm turbo run test
```

Designed and engineered for financial-grade data integrity and uncompromised cloud security.
