Description
The main issue #6186 is still blocked because the schema-first efforts in the OTEL Collector are not progressing. I wonder if we could instead use Go's AST library to navigate the hierarchy of known config structs and extract the comments and other metadata needed for the docs and/or config examples.
There are various blog posts showing examples of using AST.
The tool could have just a hardcoded list of starting configuration structs, both from the Jaeger and the OTEL code bases, e.g. cmd/jaeger/internal/extension/jaegerquery/config.go.
The prototype is available in draft PR #7064.
Rough outline of the milestones:
- add a new subcommand to jaeger-v2 to generate config schema (done in Auto-generate jaeger-v2 config docs via AST #7064)
- collect config objects from OTEL component factories (done in Auto-generate jaeger-v2 config docs via AST #7064)
- use reflection on those objects to determine additional structs from field types and embedded structs (partially done in Auto-generate jaeger-v2 config docs via AST #7064)
- use "golang.org/x/tools/go/packages" to parse the packages containing the structs to get access to other metadata like comments (partially done in Auto-generate jaeger-v2 config docs via AST #7064)
- transform collected data into JSON Schema output (partially done in Auto-generate jaeger-v2 config docs via AST #7064)
- run 3rd party tools to convert JSON schema into HTML documentation (done in Auto-generate jaeger-v2 config docs via AST #7064)
- enhance Jaeger docs to take the output of the last step and include it in the website as part of the release process
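To make the comment-extraction milestone concrete, here is a minimal sketch of pulling field comments out of a struct definition. It uses the standard library `go/parser` on an in-memory source string instead of `golang.org/x/tools/go/packages`, purely to stay self-contained; the `QueryConfig` struct and the `fieldComments` helper are hypothetical and are not taken from #7064.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// exampleSrc stands in for a real config file. QueryConfig and its fields
// are hypothetical and exist only for this sketch.
const exampleSrc = `package example

// QueryConfig configures a hypothetical query extension.
type QueryConfig struct {
	// BasePath is the URL prefix for the UI.
	BasePath string
	MaxTraces int // MaxTraces caps the number of returned traces.
}
`

// fieldComments parses Go source and returns a map from "Struct.Field"
// to the comment attached to that field.
func fieldComments(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	out := map[string]string{}
	ast.Inspect(file, func(n ast.Node) bool {
		ts, ok := n.(*ast.TypeSpec)
		if !ok {
			return true
		}
		st, ok := ts.Type.(*ast.StructType)
		if !ok {
			return true
		}
		for _, f := range st.Fields.List {
			// Prefer the doc comment above the field and fall back to
			// the trailing line comment. Text() is safe on a nil group.
			text := f.Doc.Text()
			if text == "" {
				text = f.Comment.Text()
			}
			// Embedded fields have no names and are skipped here.
			for _, name := range f.Names {
				out[ts.Name.Name+"."+name.Name] = strings.TrimSpace(text)
			}
		}
		return true
	})
	return out, nil
}

func main() {
	comments, err := fieldComments(exampleSrc)
	if err != nil {
		panic(err)
	}
	for _, key := range []string{"QueryConfig.BasePath", "QueryConfig.MaxTraces"} {
		fmt.Printf("%s: %s\n", key, comments[key])
	}
}
```

A real tool would obtain the `*ast.File` values from the loaded packages rather than from a string, but the traversal over `ast.TypeSpec` and `ast.Field` would look much the same.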
This is another outline of the task from Gemini:
Feature: Generate JSON Schema with Comments and Defaults
Goal: Implement a tool or function that generates JSON schema for a collection of Go objects, incorporating comments as descriptions and using the current field values as defaults.
Implementation Outline:
I. Initialization and Package Loading:
- Input:
  - A slice or map of Go objects to generate schemas for.
  - The package paths where the types of these objects are defined.
- Load Packages:
  - Utilize the `"golang.org/x/tools/go/packages"` library to load the specified Go packages.
  - Configure `packages.Config` to include necessary information for parsing comments and type structures (e.g., `NeedTypes`, `NeedSyntax`, `NeedName`, `NeedImports`, `NeedDeps`, `NeedFiles`, `NeedCompiledGoFiles`, `NeedExportFile`, `NeedModule`).
- Type Information:
  - For each input Go object, obtain its `reflect.Type` using the `reflect` package for runtime inspection.
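One way to connect the two inputs above is to derive the package paths from the objects themselves, since `reflect.Type` already reports where a named type is defined. The sketch below shows that idea using only the standard library; `sampleConfig` and `packagePaths` are hypothetical names, and the resulting paths are what one would presumably hand to the package loader.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// sampleConfig is a hypothetical config struct used only for this sketch.
type sampleConfig struct {
	Endpoint string
}

// packagePaths returns the sorted, de-duplicated import paths of the
// types of the given objects, looking through pointers. Unnamed and
// predeclared types (int, string, ...) have no package path and are skipped.
func packagePaths(objs ...any) []string {
	seen := map[string]bool{}
	for _, o := range objs {
		t := reflect.TypeOf(o)
		if t == nil {
			continue // untyped nil carries no type information
		}
		for t.Kind() == reflect.Pointer {
			t = t.Elem()
		}
		if p := t.PkgPath(); p != "" {
			seen[p] = true
		}
	}
	paths := make([]string, 0, len(seen))
	for p := range seen {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	return paths
}

func main() {
	// reflect.Value is a named stdlib type, so its path "reflect" is reported;
	// the plain int contributes nothing.
	fmt.Println(packagePaths(&sampleConfig{}, reflect.Value{}, 42))
}
```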
II. Reflecting and Parsing Types:
- Iterate Through Objects: Loop through each Go object in the input collection.
- Get `reflect.Type` and `reflect.Value`:
  - Obtain the `reflect.Type` to analyze the structure.
  - Obtain the `reflect.Value` to access the current field values for defaults.
- Find Corresponding `ast.TypeSpec`:
  - For the `reflect.Type`, locate the corresponding `ast.TypeSpec` within the parsed packages (`pkg.Syntax`).
  - This will involve traversing the syntax trees and matching the `ast.TypeSpec.Name.Name` with the Go type's name.
  - Handle potential complexities like embedded types and type aliases.
- Extract Field Information: For each field of the `reflect.Type`:
  - Get the field name (`field.Name`).
  - Get the field type (`field.Type`).
  - Extract struct tags (`field.Tag`), specifically looking for the `json` tag to determine the JSON property name and `omitempty`.
  - Get the current value of the field from the `reflect.Value` (`Value.Field(i)`).
- Extract Comment Information:
  - Locate the corresponding `ast.Field` in the `ast.TypeSpec`.
  - Extract the associated comment from `ast.Field.Doc` or `ast.Field.Comment`.
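The reflection half of this step could be sketched roughly as follows. The `fieldInfo` struct, `sampleConfig`, and `collectFields` are hypothetical names chosen for illustration; the sketch reads the `json` tag as the outline describes and skips unexported fields, which is only one possible policy.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// fieldInfo is a trimmed-down, hypothetical record of one struct field.
type fieldInfo struct {
	Name      string
	JSONName  string
	OmitEmpty bool
	Value     any // current value, later usable as the schema default
}

// sampleConfig is a hypothetical config struct used only for this sketch.
type sampleConfig struct {
	Endpoint string `json:"endpoint"`
	Timeout  int    `json:"timeout,omitempty"`
	Debug    bool
	secret   string
}

// collectFields returns one fieldInfo per exported field of obj, which
// may be a struct or a pointer to a struct.
func collectFields(obj any) []fieldInfo {
	v := reflect.Indirect(reflect.ValueOf(obj))
	if v.Kind() != reflect.Struct {
		return nil
	}
	t := v.Type()
	var out []fieldInfo
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if !f.IsExported() {
			continue // Interface() would panic on unexported fields
		}
		name, opts, _ := strings.Cut(f.Tag.Get("json"), ",")
		if name == "-" {
			continue // field explicitly excluded from JSON
		}
		if name == "" {
			name = f.Name // no tag: fall back to the Go field name
		}
		out = append(out, fieldInfo{
			Name:      f.Name,
			JSONName:  name,
			OmitEmpty: strings.Contains(opts, "omitempty"),
			Value:     v.Field(i).Interface(),
		})
	}
	return out
}

func main() {
	cfg := &sampleConfig{Endpoint: "localhost:4317", Timeout: 5}
	for _, f := range collectFields(cfg) {
		fmt.Printf("%s -> %q omitempty=%v default=%v\n", f.Name, f.JSONName, f.OmitEmpty, f.Value)
	}
}
```

OTEL Collector config structs generally carry `mapstructure` tags rather than `json` tags, so the tag key passed to `Tag.Get` would likely need to change for real configs.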
III. Building the JSON Schema:
- Schema Structure:
  - Define a structure for the generated JSON schema, likely using the `"definitions"` section for type schemas and a top-level schema referencing these definitions.
- Type Mapping:
  - Create a mapping between Go types (from `reflect.Type`) and their corresponding JSON schema types (e.g., `string`, `integer`, `boolean`, `array`, `object`).
  - Handle basic types, slices, maps, and nested structs.
- Schema Properties: For each Go field, create a property in the JSON schema:
  - `type`: Mapped from the Go field type.
  - `description`: The extracted Go field comment.
  - `default`: The current value of the Go field (serialized appropriately for JSON schema).
  - Potentially include other keywords like `format`, `nullable`, and constraints based on struct tags.
- Handling Nested Objects:
  - If a field is another Go object, recursively process its type and add a `$ref` to its definition in the `"definitions"` section.
- Handling Slices and Maps:
  - For slice and map types, define the `items` or `additionalProperties` schema, referencing the schema of the element/value type.
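A minimal sketch of the type mapping described above might look like this. The `schema` type, `schemaFor`, `schemaOf`, and `sampleServer` are hypothetical names; the mapping is deliberately naive (for example, it treats `[]byte` as an array of integers and assumes every struct is a named type), so it illustrates the shape of the recursion rather than a complete mapping.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// schema is a loosely typed JSON Schema fragment.
type schema map[string]any

// sampleServer is a hypothetical config struct used only for this sketch.
type sampleServer struct {
	Endpoint string
}

// schemaFor maps a Go type to a JSON Schema fragment. Structs become a
// $ref into "definitions"; slices and maps recurse into the element type.
func schemaFor(t reflect.Type) schema {
	switch t.Kind() {
	case reflect.Pointer:
		return schemaFor(t.Elem())
	case reflect.String:
		return schema{"type": "string"}
	case reflect.Bool:
		return schema{"type": "boolean"}
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return schema{"type": "integer"}
	case reflect.Float32, reflect.Float64:
		return schema{"type": "number"}
	case reflect.Slice, reflect.Array:
		return schema{"type": "array", "items": schemaFor(t.Elem())}
	case reflect.Map:
		return schema{"type": "object", "additionalProperties": schemaFor(t.Elem())}
	case reflect.Struct:
		// Assumes a named struct; anonymous structs would need inlining.
		return schema{"$ref": "#/definitions/" + t.Name()}
	default:
		return schema{} // an empty schema accepts any value
	}
}

// schemaOf is a convenience wrapper that starts from a value.
func schemaOf(v any) schema {
	t := reflect.TypeOf(v)
	if t == nil {
		return schema{}
	}
	return schemaFor(t)
}

func main() {
	for _, v := range []any{"", 0, []string{}, map[string][]int{}, &sampleServer{}} {
		out, err := json.Marshal(schemaOf(v))
		if err != nil {
			panic(err)
		}
		fmt.Printf("%T => %s\n", v, out)
	}
}
```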
IV. Data Structures:
- `TypeCache` (Map: `reflect.Type` -> `*ast.TypeSpec`): Caches the mapping between `reflect.Type` and its `ast.TypeSpec` to avoid redundant lookups.
- `SchemaDefinitions` (Map: `string` -> `map[string]interface{}`): Stores the generated JSON schema definitions for each Go type, keyed by the type name.
- `ProcessedTypes` (Set: `reflect.Type`): Tracks already processed Go types to prevent infinite recursion with nested or circular dependencies.
- `FieldInfo` (Struct): Holds intermediate information about each field:

      type FieldInfo struct {
          Name     string
          JSONName string
          Type     reflect.Type
          Value    reflect.Value
          Comment  string
          Tags     reflect.StructTag
      }

- `PackageInfo` (Struct): Stores information about a loaded Go package, including a mapping of type names to their `ast.TypeSpec`:

      type PackageInfo struct {
          Package   *packages.Package
          TypeSpecs map[string]*ast.TypeSpec
      }
V. Output:
- Root Schema: Construct the final JSON schema object, including the `$schema` and the `"definitions"` section. The root schema might also define properties for the top-level object(s).
- Serialization: Serialize the JSON schema structure into a JSON string using `encoding/json`.
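The output step could be sketched as below. The choice of JSON Schema draft-07 (which uses the `"definitions"` keyword), the root-level `$ref`, and the `rootSchema` and `serialize` helpers are all assumptions made for illustration; the `Config` definition and its default value are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rootSchema wraps the collected definitions in a root document that
// points at the top-level type. Draft-07 is an assumption of this sketch.
func rootSchema(rootType string, defs map[string]any) map[string]any {
	return map[string]any{
		"$schema":     "http://json-schema.org/draft-07/schema#",
		"$ref":        "#/definitions/" + rootType,
		"definitions": defs,
	}
}

// serialize renders the schema as indented JSON.
func serialize(s map[string]any) (string, error) {
	out, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// A hand-written definition standing in for generated output.
	defs := map[string]any{
		"Config": map[string]any{
			"type": "object",
			"properties": map[string]any{
				"endpoint": map[string]any{
					"type":        "string",
					"description": "Endpoint is the address to listen on.",
					"default":     "localhost:4317",
				},
			},
		},
	}
	out, err := serialize(rootSchema("Config", defs))
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because `encoding/json` sorts map keys when marshaling, the generated file is stable across runs, which helps if the schema is checked in or diffed during the release process.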
Key Considerations and Challenges:
- Handling embedded types correctly.
- Managing type aliases.
- Detecting and handling circular dependencies between types.
- Deciding how to handle unexported fields.
- Mapping custom Go types to appropriate JSON schema types.
- Implementing robust error handling.
- Optimizing performance for large and complex type structures.
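Two of these challenges, embedded types and circular dependencies, can be illustrated with a small reflection walk that keeps a set of visited types. This is a sketch under simplifying assumptions: the `node`, `tlsSettings`, and `server` types and the `collectStructs` and `reachableStructs` helpers are hypothetical, and embedded structs are simply visited like any other field rather than having their fields flattened into the parent.

```go
package main

import (
	"fmt"
	"reflect"
)

// Hypothetical types: node refers to itself through a slice of pointers,
// and server embeds tlsSettings.
type node struct {
	Name     string
	Children []*node
}

type tlsSettings struct {
	CertFile string
}

type server struct {
	tlsSettings
	Root node
}

// collectStructs appends the name of every struct type reachable from t
// to out, visiting each type at most once so that cycles terminate.
func collectStructs(t reflect.Type, seen map[reflect.Type]bool, out *[]string) {
	// Look through pointers, slices, arrays, and maps to the element type.
	for k := t.Kind(); k == reflect.Pointer || k == reflect.Slice ||
		k == reflect.Array || k == reflect.Map; k = t.Kind() {
		t = t.Elem()
	}
	if t.Kind() != reflect.Struct || seen[t] {
		return
	}
	seen[t] = true
	*out = append(*out, t.Name())
	for i := 0; i < t.NumField(); i++ {
		// Embedded fields (Anonymous == true) are ordinary entries in the
		// field list, so they are traversed here as well.
		collectStructs(t.Field(i).Type, seen, out)
	}
}

// reachableStructs is a convenience wrapper that starts from a value.
func reachableStructs(v any) []string {
	var out []string
	if t := reflect.TypeOf(v); t != nil {
		collectStructs(t, map[reflect.Type]bool{}, &out)
	}
	return out
}

func main() {
	fmt.Println(reachableStructs(server{}))
}
```

The `seen` map plays the role of the `ProcessedTypes` set from the data structures section: without it, the self-referential `node` type would recurse forever.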