Gin Extractor Enhancement: Response Schemas, Embedded Structs, Route Filtering & Type Inference #68

@spencercjh

Background

Spec-forge's Gin extractor uses Go AST static analysis to generate OpenAPI specs. Comparing its output against swaggo/swag (annotation-based) output for a real project (openapi) revealed several significant gaps. This issue tracks phased improvements to close them.

Gap Analysis

| # | Gap | Severity | Root Cause |
|---|-----|----------|------------|
| 1 | Response schemas mostly missing (show "Success") | High | Cannot penetrate done() helper to extract actual response types |
| 2 | Schema missing embedded struct fields | High | schema_extractor.go:261 skips embedded fields (continue) |
| 3 | Extra non-business routes (/swagger/{any}) | Medium | No route filtering in AST parser |
| 4 | All parameters typed as string | Medium | No type inference for c.Query/c.Param |
| 5 | Missing description, tags, summary | Medium | Needs LLM enricher or annotations |
| 6 | Response wrapper (Rsp) struct incomplete | Medium | Embedded struct not expanded |
| 7 | Missing example/enum/default/maxLength constraints | Low | AST cannot infer business constraints |
| 8 | operationId includes package prefix (apis.CreateProject) | Low | Format cleanup needed |

Implementation Plan

Phase 1: Embedded Struct Expansion (P0)

Problem: schema_extractor.go:261 skips embedded fields with continue. This causes Rsp (which embeds BaseRsp) to lose code, msg, _cost, _err fields.

Solution: Implement Go field promotion rules — recursively expand embedded struct fields into the parent schema.

  • Pointer embedding (*BaseRsp) — dereference and process
  • Cross-package embedding (msgs.BaseRsp) — resolve via imports
  • Multi-level embedding — recursive expansion
  • Cycle detection — use existing visited map
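
As a rough illustration of the expansion described above, here is a minimal sketch using go/ast. It is not the code in schema_extractor.go: the sample source, `collectStructs`, `embeddedName`, and `promotedFields` are hypothetical names, cross-package resolution is reduced to taking the selector's type name, and Go's depth-based shadowing rules are ignored.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sampleSrc is a hypothetical stand-in for the real response types.
const sampleSrc = `package msgs

type BaseRsp struct {
	Code int
	Msg  string
}

type Rsp struct {
	*BaseRsp
	Data any
}
`

// collectStructs indexes every struct type declared in the file by name.
func collectStructs(file *ast.File) map[string]*ast.StructType {
	structs := map[string]*ast.StructType{}
	ast.Inspect(file, func(n ast.Node) bool {
		if ts, ok := n.(*ast.TypeSpec); ok {
			if st, ok := ts.Type.(*ast.StructType); ok {
				structs[ts.Name.Name] = st
			}
		}
		return true
	})
	return structs
}

// embeddedName returns the type name of an embedded field, unwrapping
// a pointer (*BaseRsp) or a package qualifier (msgs.BaseRsp).
func embeddedName(expr ast.Expr) string {
	switch t := expr.(type) {
	case *ast.StarExpr:
		return embeddedName(t.X)
	case *ast.SelectorExpr:
		return t.Sel.Name
	case *ast.Ident:
		return t.Name
	}
	return ""
}

// promotedFields lists a struct's field names, expanding embedded structs
// recursively. The visited map stops the recursion on cycles.
func promotedFields(name string, structs map[string]*ast.StructType, visited map[string]bool) []string {
	st, ok := structs[name]
	if !ok || visited[name] {
		return nil
	}
	visited[name] = true
	var fields []string
	for _, f := range st.Fields.List {
		if len(f.Names) == 0 { // embedded field: expand instead of skipping
			fields = append(fields, promotedFields(embeddedName(f.Type), structs, visited)...)
			continue
		}
		for _, n := range f.Names {
			fields = append(fields, n.Name)
		}
	}
	return fields
}

// rspFields parses the sample source and expands the Rsp struct.
func rspFields() []string {
	file, err := parser.ParseFile(token.NewFileSet(), "msgs.go", sampleSrc, 0)
	if err != nil {
		panic(err)
	}
	return promotedFields("Rsp", collectStructs(file), map[string]bool{})
}

func main() {
	fmt.Println(rspFields()) // prints [Code Msg Data]
}
```

One caveat of this sketch: a single shared visited map also suppresses a struct that is legitimately embedded through two different branches, so a real implementation may want to track the current expansion path instead.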

Phase 2: Response Schema Extraction via Helper Pattern (P0)

Problem: ~99% of handlers in real projects use a done(c, data, err) helper that internally calls c.JSON(http.StatusOK, rsp). The analyzer cannot see through this.

Solution: Two-layer approach:

  1. Recognize common helper patterns — detect done(c, data, err), done(c, err), done(c, data) calls and extract data type from arguments
  2. Cross-function call tracing — for non-standard helpers, trace into function bodies to find c.JSON calls and map parameters

Built-in helper list: done, response, respond, writeJSON, sendJSON.
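
The first layer (recognizing helper calls by name) could look roughly like the sketch below. It only locates the data argument expression; resolving that expression to a schema would still need type information. The handler source, `responseArgs`, and the "a lone argument named err is an error" heuristic are assumptions made for illustration, not the extractor's actual behavior.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// handlerSrc is a hypothetical pair of handlers using the done() helper.
const handlerSrc = `package apis

func GetProject(c *gin.Context) {
	project, err := svc.Get(c.Param("project"))
	done(c, project, err)
}

func DeleteProject(c *gin.Context) {
	err := svc.Delete(c.Param("project"))
	done(c, err)
}
`

// helperNames mirrors the built-in helper list from this issue.
var helperNames = map[string]bool{
	"done": true, "response": true, "respond": true,
	"writeJSON": true, "sendJSON": true,
}

// responseArgs maps each handler name to the expression passed as the data
// argument of a recognized helper call ("" when only an error is passed).
func responseArgs(src string) (map[string]string, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "handlers.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := map[string]string{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			id, ok := call.Fun.(*ast.Ident)
			if !ok || !helperNames[id.Name] || len(call.Args) < 2 {
				return true
			}
			data := call.Args[1]
			// Assumed heuristic: with two arguments, an identifier named
			// "err" means done(c, err) rather than done(c, data).
			if arg, isIdent := data.(*ast.Ident); isIdent && len(call.Args) == 2 && arg.Name == "err" {
				out[fn.Name.Name] = ""
				return true
			}
			out[fn.Name.Name] = types.ExprString(data)
			return true
		})
	}
	return out, nil
}

func main() {
	args, err := responseArgs(handlerSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println(args)
}
```

Distinguishing done(c, err) from done(c, data) by variable name is fragile; with go/types available, checking whether the argument implements the error interface would be more reliable.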

Phase 3: Route Filtering (P1)

Problem: Non-business routes (/swagger/{any}, /docs/*, /debug/*) are extracted.

Solution: Default exclude prefixes + configurable filtering.

Default excludes: /swagger, /docs, /debug, /static, /public, /favicon.ico
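
A minimal sketch of the default prefix filter, assuming routes are plain path strings. `isExcluded` and `filterRoutes` are hypothetical names, and matching on a path-segment boundary (so that /docsearch is not caught by /docs) is a design assumption rather than something this issue specifies.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultExcludes mirrors the default exclude list from this issue.
var defaultExcludes = []string{"/swagger", "/docs", "/debug", "/static", "/public", "/favicon.ico"}

// isExcluded reports whether route equals a prefix or sits below it.
// Matching on a segment boundary keeps "/docsearch" when "/docs" is excluded.
func isExcluded(route string, prefixes []string) bool {
	for _, p := range prefixes {
		if route == p || strings.HasPrefix(route, p+"/") {
			return true
		}
	}
	return false
}

// filterRoutes returns the routes that survive the exclude list.
func filterRoutes(routes, prefixes []string) []string {
	var kept []string
	for _, r := range routes {
		if !isExcluded(r, prefixes) {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	routes := []string{"/swagger/{any}", "/api/v1/projects", "/docsearch", "/favicon.ico"}
	fmt.Println(filterRoutes(routes, defaultExcludes)) // prints [/api/v1/projects /docsearch]
}
```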

Phase 4: Parameter Type Inference (P1)

Problem: All parameters read via c.Query/c.Param/c.GetHeader are typed as string.

Solution: Multi-strategy inference:

| Strategy | Pattern | Inferred Type |
|----------|---------|---------------|
| strconv.Atoi | offset, _ := strconv.Atoi(c.Query("offset")) | integer |
| strconv.ParseBool | verbose, _ := strconv.ParseBool(c.Query("verbose")) | boolean |
| strconv.ParseFloat | score, _ := strconv.ParseFloat(c.Query("score"), 64) | number |
| Binding struct | c.ShouldBindQuery(&req) where req.Offset is int | integer |
| Context methods | c.GetInt("limit") | integer |
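
The strconv rows of the table can be sketched as a single AST pass that looks for a strconv conversion wrapped directly around a parameter accessor. This covers only the inline form shown in the table (not values stored in a variable first, and not the binding-struct or context-method strategies). `inferParamTypes` and the sample handler are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// listSrc is a hypothetical handler using the patterns from the table.
const listSrc = `package apis

func List(c *gin.Context) {
	offset, _ := strconv.Atoi(c.Query("offset"))
	verbose, _ := strconv.ParseBool(c.Query("verbose"))
	score, _ := strconv.ParseFloat(c.Query("score"), 64)
	name := c.Query("name")
	_, _, _, _ = offset, verbose, score, name
}
`

// converters maps strconv functions to OpenAPI types.
var converters = map[string]string{
	"Atoi": "integer", "ParseInt": "integer",
	"ParseBool": "boolean", "ParseFloat": "number",
}

// accessors are the gin.Context methods treated as parameter reads.
var accessors = map[string]bool{"Query": true, "Param": true, "GetHeader": true}

// inferParamTypes returns parameter name -> inferred type for every
// strconv.X(c.Query("name"), ...) call. Parameters that are not wrapped
// are absent from the map and would keep the default type, string.
func inferParamTypes(src string) (map[string]string, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "handler.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := map[string]string{}
	ast.Inspect(file, func(n ast.Node) bool {
		outer, ok := n.(*ast.CallExpr)
		if !ok || len(outer.Args) == 0 {
			return true
		}
		sel, ok := outer.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		pkg, ok := sel.X.(*ast.Ident)
		if !ok || pkg.Name != "strconv" {
			return true
		}
		typ, ok := converters[sel.Sel.Name]
		if !ok {
			return true
		}
		inner, ok := outer.Args[0].(*ast.CallExpr)
		if !ok || len(inner.Args) == 0 {
			return true
		}
		acc, ok := inner.Fun.(*ast.SelectorExpr)
		if !ok || !accessors[acc.Sel.Name] {
			return true
		}
		lit, ok := inner.Args[0].(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return true
		}
		if name, err := strconv.Unquote(lit.Value); err == nil {
			out[name] = typ
		}
		return true
	})
	return out, nil
}

func main() {
	types, err := inferParamTypes(listSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println(types)
}
```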

Phase 5: Tags + Summary Cleanup (P2)

  • Auto-generate tags from route prefix (/api/v1/projects/* → tag: projects)
  • Clean operationId (strip package prefix)
  • Generate summary from operationId (CamelCase → space-separated)
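
The three cleanup steps above are simple string transforms. The sketch below shows one possible reading of them; the function names are hypothetical, and the rule for picking a tag (skip "api", version segments such as v1, and path parameters, then take the first remaining segment) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// cleanOperationID strips a package prefix: "apis.CreateProject" -> "CreateProject".
func cleanOperationID(id string) string {
	if i := strings.LastIndex(id, "."); i >= 0 {
		return id[i+1:]
	}
	return id
}

// summaryFromOperationID splits CamelCase into words and keeps acronyms
// together: "GetHTTPStatus" -> "Get HTTP Status".
func summaryFromOperationID(id string) string {
	runes := []rune(id)
	var b strings.Builder
	for i, r := range runes {
		if i > 0 && unicode.IsUpper(r) {
			prevLower := unicode.IsLower(runes[i-1])
			nextLower := i+1 < len(runes) && unicode.IsLower(runes[i+1])
			if prevLower || (unicode.IsUpper(runes[i-1]) && nextLower) {
				b.WriteByte(' ')
			}
		}
		b.WriteRune(r)
	}
	return b.String()
}

// isVersion reports whether seg looks like v1, v2, ...
func isVersion(seg string) bool {
	if len(seg) < 2 || seg[0] != 'v' {
		return false
	}
	for _, r := range seg[1:] {
		if !unicode.IsDigit(r) {
			return false
		}
	}
	return true
}

// tagFromRoute returns the first path segment that is not "api", a version,
// a path parameter, or a wildcard.
func tagFromRoute(route string) string {
	for _, seg := range strings.Split(strings.Trim(route, "/"), "/") {
		if seg == "" || seg == "api" || isVersion(seg) ||
			strings.HasPrefix(seg, "{") || strings.HasPrefix(seg, ":") || strings.HasPrefix(seg, "*") {
			continue
		}
		return seg
	}
	return ""
}

func main() {
	id := cleanOperationID("apis.CreateProject")
	fmt.Println(id)                                   // prints CreateProject
	fmt.Println(summaryFromOperationID(id))           // prints Create Project
	fmt.Println(tagFromRoute("/api/v1/projects/*"))   // prints projects
}
```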

Phase 6: Binding Tag Enhancement (P2)

Extract OpenAPI constraints from Go struct tags:

| Binding Tag | OpenAPI Mapping |
|-------------|-----------------|
| binding:"required" | required: true |
| binding:"min=3" | minLength: 3 |
| binding:"max=16" | maxLength: 16 |
| binding:"oneof=a b c" | enum: [a, b, c] |
| binding:"email" | format: email |
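
A minimal sketch of the mapping in the table, assuming the field is a string (for numeric fields, min and max in Gin's validator compare the value rather than the length, so they would map to minimum/maximum instead). `Constraints` and `parseBinding` are hypothetical names; the tag value is parsed with the standard library's reflect.StructTag.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// Constraints holds the OpenAPI keywords derived from a binding tag.
// Hypothetical type for this sketch; a string field is assumed.
type Constraints struct {
	Required  bool
	MinLength *int
	MaxLength *int
	Enum      []string
	Format    string
}

// parseBinding reads the binding key of a raw struct tag (without the
// surrounding backticks) and maps each rule it recognizes.
func parseBinding(tag string) Constraints {
	var c Constraints
	for _, rule := range strings.Split(reflect.StructTag(tag).Get("binding"), ",") {
		key, val, _ := strings.Cut(rule, "=")
		switch key {
		case "required":
			c.Required = true
		case "min":
			if n, err := strconv.Atoi(val); err == nil {
				c.MinLength = &n
			}
		case "max":
			if n, err := strconv.Atoi(val); err == nil {
				c.MaxLength = &n
			}
		case "oneof":
			c.Enum = strings.Fields(val) // values are space-separated
		case "email":
			c.Format = "email"
		}
	}
	return c
}

func main() {
	c := parseBinding(`json:"name" binding:"required,min=3,max=16"`)
	fmt.Println(c.Required, *c.MinLength, *c.MaxLength) // prints true 3 16
	fmt.Println(parseBinding(`binding:"oneof=a b c"`).Enum) // prints [a b c]
}
```

Unrecognized rules are ignored here, which keeps the extractor tolerant of validator tags that have no OpenAPI equivalent.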

Phase 7: Configurable Route Exclusion (P3)

Allow users to exclude specific routes via CLI flag or config:

gin:
  excludeRoutes:
    - "/api/v1/projects/{project}/rjob*"
  excludePrefixes:
    - "/swagger"
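
The issue does not define the wildcard semantics of excludeRoutes. One possible interpretation, sketched below, is to treat each entry as a glob and match it with the standard library's path.Match, where * matches within a single path segment and braces are literal characters. `matchesExcludeRoute` is a hypothetical name.

```go
package main

import (
	"fmt"
	"path"
)

// matchesExcludeRoute reports whether route matches any glob pattern.
// With path.Match, "*" does not cross "/" and "{project}" is matched
// literally, so patterns are compared against the OpenAPI-style path.
func matchesExcludeRoute(route string, patterns []string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, route); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{"/api/v1/projects/{project}/rjob*"}
	fmt.Println(matchesExcludeRoute("/api/v1/projects/{project}/rjobs", patterns)) // prints true
	fmt.Println(matchesExcludeRoute("/api/v1/projects/{project}/jobs", patterns))  // prints false
}
```

Under this interpretation a nested route such as /api/v1/projects/{project}/rjobs/{id} would not match; if the intent is to exclude everything below the pattern, a prefix comparison on the part before the * would be needed instead.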

Success Metrics

After all phases, compare against swaggo output:

  • Schema completeness: 13 → ~15 schemas
  • Response schema coverage: ~5% → >80%
  • Invalid routes: 1 → 0
  • Parameter type accuracy: 0% (all string) → >60%

Known Limitations (AST cannot solve)

These require annotations or the LLM enricher:

  • example values — need @Param ... example: "foo" annotations
  • business-level required — e.g., whether tenant is required
  • Complex response schemas (allOf) — e.g., Ping returns Rsp + Pong composite
  • Descriptions — need LLM enricher or annotations

Reference

Design doc: docs/plans/2026-04-02-gin-extractor-enhancement.md
