Name	Name	Last commit message	Last commit date
parent directory ..
PHASE_ONE_COMPLETE.md	PHASE_ONE_COMPLETE.md
README.md	README.md
active-config.ts	active-config.ts
analyze.ts	analyze.ts
graph-builder.ts	graph-builder.ts
index.ts	index.ts
package.json	package.json
schema-analyzer.ts	schema-analyzer.ts
tsconfig.json	tsconfig.json
types.ts	types.ts

Semantic Graph Extractor

Overview

The Semantic Graph Extractor is a TypeScript tool that analyzes OpenAPI specifications with semantic type annotations to build operation dependency graphs. This is Phase One of a test generation project that will ultimately help generate comprehensive test scenarios for APIs.

What it does

Parses OpenAPI Specifications: Reads and analyzes rest-api.domain.yaml which contains semantic type annotations using x-semantic-type extensions
Extracts Semantic Types: Identifies all semantic types (like ProcessInstanceKey, UserTaskKey, etc.) and their relationships
Analyzes Operations: Examines all API operations to understand their input parameters and response schemas
Builds Dependency Graph: Creates a graph showing how operations depend on each other through shared semantic types
Generates Analysis: Produces detailed reports about the API structure and dependencies

Key Findings

From the Camunda OpenAPI specification analysis:

144 operations across the API
18 semantic types that create relationships between operations
6,291 dependencies between operations
122 entry points (operations that can be called without dependencies)
Average of 43.69 dependencies per operation

Most Important Semantic Types

ProcessInstanceKey (3,720 dependencies) - Central to process management
ProcessDefinitionKey (1,449 dependencies) - Key for process templates
ElementInstanceKey (1,027 dependencies) - Individual process elements

Architecture

Core Components

index.ts: Main extraction logic and entry point
types.ts: TypeScript definitions for OpenAPI and graph structures
schema-analyzer.ts: Analyzes OpenAPI schemas and extracts semantic information
graph-builder.ts: Builds dependency graphs and performs analysis
analyze.ts: Generates human-readable analysis reports

Output Format

The dependency graph is saved as JSON with the following structure:

{
  "operations": [...],           // Array of all API operations
  "semanticTypes": [...],        // Array of all semantic types
  "edges": [...],               // Array of dependency relationships
  "metadata": {                 // Extraction metadata
    "extractedAt": "2025-08-13T...",
    "totalOperations": 144,
    "totalSemanticTypes": 18,
    "totalDependencies": 6291
  }
}

Dependency Types

Dependencies are classified by strength:

REQUIRED: Target operation cannot be called without source
OPTIONAL: Target operation benefits from source but can work without it
CONDITIONAL: Dependency depends on specific conditions

Usage

Prerequisites

cd api-test
npm install

Extract Dependency Graph

npm run extract-graph

This will:

Parse rest-api.domain.yaml
Extract semantic types and operations
Build the dependency graph
Save to generated/<config>/graph/operation-dependency-graph.json

Generate Analysis Report

npm run analyze-graph

This will:

Load the previously generated graph
Perform analysis to find entry points, sinks, clusters
Generate generated/<config>/graph/dependency-graph-analysis.md
Display key statistics in the console

Example Dependencies

Here are some example dependencies discovered:

Process Creation → Process Querying: createProcessInstance produces ProcessInstanceKey that can be used by searchProcessInstances
Job Activation → Job Completion: activateJobs produces JobKey that is required by completeJob
Element Navigation: Various operations produce ElementInstanceKey that enables drilling down into specific process elements

Graph Analysis Features

The tool identifies several important graph characteristics:

Entry Points

Operations that don't depend on others (122 found):

getTopology, getLicense, getAuthentication
activateJobs, createTenant, createUser
Search operations that can work without specific filters

Sink Operations

Operations that don't produce outputs used by others (111 found):

Status/info operations like getTopology
Completion operations like completeJob
Create operations for standalone entities

Operation Clusters

Groups of operations with mutual dependencies:

Main workflow cluster: Search operations, process creation, document handling
Decision cluster: Decision definition and evaluation operations

Future Phases

This semantic graph extractor is designed to support the following phases:

Phase Two: API Coverage Parser

Build a complete map of the API to reason about test coverage

Phase Three: Test Path Generator

Generate test scenarios that traverse the dependency graph to achieve maximum coverage

Technical Notes

Graph Format Rationale

JSON was chosen for the dependency graph format because:

Readability: Easy to inspect and debug
Interoperability: Can be consumed by various tools
Completeness: Preserves all extracted information
Future-proof: Easy to extend with additional metadata

Semantic Type Detection

The tool looks for x-semantic-type annotations in:

Schema definitions (components/schemas)
Parameter schemas
Request body schemas
Response schemas
Nested object properties

Performance Considerations

The tool processes large OpenAPI specs efficiently
Graph construction is O(n²) in operations but optimized for practical sizes
Memory usage scales linearly with spec complexity

Example Output Structure

Operation Record

{
  "operationId": "createProcessInstance",
  "method": "POST",
  "path": "/process-definitions/{processDefinitionKey}/instances",
  "parameters": [
    {
      "name": "processDefinitionKey",
      "location": "path",
      "semanticType": "ProcessDefinitionKey",
      "required": true
    }
  ],
  "responseSemanticTypes": {
    "200": [
      {
        "semanticType": "ProcessInstanceKey",
        "fieldPath": "processInstanceKey",
        "required": true
      }
    ]
  }
}

Dependency Edge

{
  "sourceOperationId": "createProcessInstance",
  "targetOperationId": "searchProcessInstances", 
  "semanticType": "ProcessInstanceKey",
  "sourceFieldPath": "processInstanceKey",
  "targetFieldPath": "query.processInstanceKey",
  "strength": "optional",
  "description": "searchProcessInstances can filter by ProcessInstanceKey from createProcessInstance"
}

This foundation enables sophisticated test generation strategies in future phases by providing a complete understanding of how API operations relate to each other through their data dependencies.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Semantic Graph Extractor

Overview

What it does

Key Findings

Most Important Semantic Types

Architecture

Core Components

Output Format

Dependency Types

Usage

Prerequisites

Extract Dependency Graph

Generate Analysis Report

Example Dependencies

Graph Analysis Features

Entry Points

Sink Operations

Operation Clusters

Future Phases

Phase Two: API Coverage Parser

Phase Three: Test Path Generator

Technical Notes

Graph Format Rationale

Semantic Type Detection

Performance Considerations

Example Output Structure

Operation Record

Dependency Edge

FilesExpand file tree

semantic-graph-extractor

Directory actions

More options

Directory actions

More options

Latest commit

History

semantic-graph-extractor

Folders and files

parent directory

README.md

Semantic Graph Extractor

Overview

What it does

Key Findings

Most Important Semantic Types

Architecture

Core Components

Output Format

Dependency Types

Usage

Prerequisites

Extract Dependency Graph

Generate Analysis Report

Example Dependencies

Graph Analysis Features

Entry Points

Sink Operations

Operation Clusters

Future Phases

Phase Two: API Coverage Parser

Phase Three: Test Path Generator

Technical Notes

Graph Format Rationale

Semantic Type Detection

Performance Considerations

Example Output Structure

Operation Record

Dependency Edge