Data Generator API Reference

The data generator module provides the interface and base implementation for all data generators. Generators are responsible for producing test data values. The processor calls generators based on GeneratorDirective entries created from the spreadsheet.

import {
  DataGeneratorInterface,
  DataGeneratorBase,
  DataGeneratorRegistry,
  GeneratorFaker
} from 'nanook-table'

Generator Lifecycle

The processor manages generators through a well-defined lifecycle:

1. loadStore()                    -- called once at startup for each registered generator
2. For each test case:
   a. generate()                  -- produce data for a GeneratorDirective
   b. createPostProcessDirectives()  -- optionally return additional directives
   c. postProcess()               -- called for each post-process directive
3. saveStore()                    -- called once at shutdown for each registered generator

Between test cases, clearContext() may be called to reset per-run state while preserving the store.
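
The call order above can be sketched as a small driver loop. This is a minimal illustration under stated assumptions, not the processor's actual implementation: `LifecycleDirective`, `LifecycleTestcase`, `MinimalGenerator`, `TracingGenerator`, and `runLifecycle` are hypothetical stand-ins defined here so the example runs without the library.

```typescript
// Hypothetical, simplified stand-ins for the library types (assumptions for illustration).
type LifecycleDirective = { generatorName: string }
type LifecycleTestcase = Record<string, unknown>

interface MinimalGenerator {
  loadStore(): Promise<void>
  saveStore(): Promise<void>
  generate(id: string, testcase: LifecycleTestcase, directive: LifecycleDirective): Promise<unknown>
  createPostProcessDirectives(
    id: string,
    testcase: LifecycleTestcase,
    directive: LifecycleDirective
  ): Promise<LifecycleDirective[]>
  postProcess(id: string, testcase: LifecycleTestcase, directive: LifecycleDirective): Promise<void>
}

// Records every lifecycle call so the resulting order can be inspected.
class TracingGenerator implements MinimalGenerator {
  readonly calls: string[] = []

  async loadStore(): Promise<void> {
    this.calls.push('loadStore')
  }

  async saveStore(): Promise<void> {
    this.calls.push('saveStore')
  }

  async generate(id: string): Promise<unknown> {
    this.calls.push(`generate:${id}`)
    return `value-${id}`
  }

  async createPostProcessDirectives(
    id: string,
    _testcase: LifecycleTestcase,
    directive: LifecycleDirective
  ): Promise<LifecycleDirective[]> {
    this.calls.push(`createPostProcessDirectives:${id}`)
    return [directive] // request exactly one post-process call
  }

  async postProcess(id: string): Promise<void> {
    this.calls.push(`postProcess:${id}`)
  }
}

async function runLifecycle(generator: MinimalGenerator, instanceIds: string[]): Promise<void> {
  const directive: LifecycleDirective = { generatorName: 'tracing' }
  await generator.loadStore() // step 1: once at startup
  for (const id of instanceIds) {
    const testcase: LifecycleTestcase = {}
    testcase['value'] = await generator.generate(id, testcase, directive) // step 2a
    const followUps = await generator.createPostProcessDirectives(id, testcase, directive) // step 2b
    for (const followUp of followUps) {
      await generator.postProcess(id, testcase, followUp) // step 2c
    }
  }
  await generator.saveStore() // step 3: once at shutdown
}
```

Running `runLifecycle(new TracingGenerator(), ['tc1', 'tc2'])` and then reading `calls` shows `loadStore` first, the three per-test-case calls for each ID in order, and `saveStore` last.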


DataGeneratorInterface

Abstract interface that all data generators must implement. Defines the contract between the processor and any generator.

Constructor

new DataGeneratorInterface(options: {
  logger: LoggerInterface
  serviceRegistry?: DataGeneratorRegistry
  unique?: boolean
  maxUniqueTries?: number
  varDir?: string
  useStore?: boolean
})
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| logger | LoggerInterface | required | Logger instance for diagnostic output |
| serviceRegistry | DataGeneratorRegistry | undefined | Registry providing access to other generators. Allows generators to compose with each other |
| unique | boolean | true | When true, the generator should return unique values. The definition of "unique" is generator-specific |
| maxUniqueTries | number | 100 | Maximum attempts to generate a unique value before throwing an error |
| varDir | string | undefined | Directory path for reading/writing persistent store files |
| useStore | boolean | false | Whether the generator should persist data between runs |

Properties

| Property | Type | Description |
| --- | --- | --- |
| logger | LoggerInterface | The logger instance |
| serviceRegistry | DataGeneratorRegistry | The registry of all available generators |
| unique | boolean | Whether uniqueness is enforced |
| maxUniqueTries | number | Maximum uniqueness retry count |
| uniqueSet | Set<string> | Stores previously generated values for uniqueness checks |
| instanceData | Map<string, unknown> | Maps instance IDs to previously generated data. Ensures the same instance ID returns the same value |
| varDir | string | Store directory path |
| useStore | boolean | Whether the store is active |
| name | string | The name under which this generator is registered. Set automatically by the registry |

Methods

async loadStore(): Promise<void>

Loads previously persisted data from the store file. Called once by the processor before any generation begins. Implementations that do not use a store can leave this as a no-op.

async saveStore(): Promise<void>

Persists the current store data to a file. Called once by the processor after all generation is complete.

getGenerator(generatorName: string): DataGeneratorInterface

Retrieves another generator from the service registry by name. Throws an error if the generator is not found. This enables generators to delegate to or compose with other generators.

// Inside a custom generator
const faker = this.getGenerator('GeneratorFaker')

clearContext(): void

Resets the uniqueSet and instanceData. Called between independent generation runs to clear per-run state without affecting the persistent store.
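
A minimal sketch of what this reset amounts to, assuming the per-run state is simply emptied. `ContextSketch` is a hypothetical stand-in that mirrors only the state named in this reference (the `store` property belongs to DataGeneratorBase); the library's own implementation may differ.

```typescript
// Hypothetical stand-in that mirrors only the state named in this reference.
class ContextSketch {
  readonly uniqueSet = new Set<string>()
  readonly instanceData = new Map<string, unknown>()
  readonly store: Record<string, unknown> = {}

  clearContext(): void {
    this.uniqueSet.clear() // forget which values were already handed out
    this.instanceData.clear() // forget the instance ID -> value mapping
    // `store` is deliberately left alone so persisted data survives the reset
  }
}

const contextDemo = new ContextSketch()
contextDemo.uniqueSet.add('alice@example.com')
contextDemo.instanceData.set('tc-1', 'alice@example.com')
contextDemo.store['lastSequence'] = 41

contextDemo.clearContext()
// Per-run state is now empty, while store.lastSequence is still 41.
```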

async generate(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<unknown>

Generates a value for the given directive. This is the primary generation method.

| Parameter | Type | Description |
| --- | --- | --- |
| instanceId | string | A unique ID for this test case instance. The same instance ID should yield the same data |
| testcase | TestcaseData | The test case data object being built. Contains data already generated by other generators |
| generatorDirective | GeneratorDirective | The directive describing what to generate, including generator name and config |

Returns: The generated data, or undefined if the generator cannot produce data yet (e.g., because it depends on data from another generator that has not run yet). The processor will retry generators that return undefined.
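
The retry behavior can be pictured as repeated passes over the generators that are still pending. How the processor actually schedules retries is not specified in this reference; the multi-pass loop, the `RetryTask` type, `runWithRetry`, and the `maxPasses` limit below are all assumptions made for illustration.

```typescript
// Hypothetical task shape: run() returns a value, or undefined if its inputs are not ready yet.
type RetryTask = {
  field: string
  run: (testcase: Record<string, unknown>) => Promise<unknown>
}

async function runWithRetry(tasks: RetryTask[], maxPasses = 10): Promise<Record<string, unknown>> {
  const testcase: Record<string, unknown> = {}
  let pending = tasks
  for (let pass = 0; pass < maxPasses && pending.length > 0; pass++) {
    const stillPending: RetryTask[] = []
    for (const task of pending) {
      const value = await task.run(testcase)
      if (value === undefined) {
        stillPending.push(task) // dependency not available yet, try again in the next pass
      } else {
        testcase[task.field] = value
      }
    }
    if (stillPending.length === pending.length) {
      throw new Error('No progress: remaining generators wait for data that is never produced')
    }
    pending = stillPending
  }
  if (pending.length > 0) {
    throw new Error('Retry limit reached')
  }
  return testcase
}

const demoTasks: RetryTask[] = [
  {
    // Depends on firstName, so it returns undefined during the first pass.
    field: 'greeting',
    run: async (tc) => (tc['firstName'] === undefined ? undefined : `Hello ${String(tc['firstName'])}`)
  },
  { field: 'firstName', run: async () => 'Ada' }
]
```

With `demoTasks`, the greeting task is skipped in the first pass and succeeds in the second, once `firstName` is present. The no-progress check keeps a circular dependency from looping until the pass limit.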

async createPostProcessDirectives(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<GeneratorDirective[]>

Called after generate() returns successfully. Returns an array of additional directives for post-processing. Each returned directive will cause a later call to postProcess().

This is useful when a generator needs to perform additional work after all primary generators have completed.

async postProcess(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<void>

Called for each directive returned by createPostProcessDirectives(), after all primary generation is complete. Post-processing can modify the testcase data object directly and does not return a value.
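
As a sketch of this two-step flow, the example below derives a field from values that only exist after primary generation. `PostDirective`, `PostTestcase`, `DisplayNameSketch`, `runPostProcessDemo`, and the `targetField` config key are hypothetical; the real GeneratorDirective and TestcaseData shapes may differ.

```typescript
// Hypothetical, simplified shapes; the real GeneratorDirective and TestcaseData may differ.
type PostDirective = { generatorName: string; config: { targetField: string } }
type PostTestcase = Record<string, unknown>

class DisplayNameSketch {
  async createPostProcessDirectives(): Promise<PostDirective[]> {
    // Ask for one follow-up call once primary generation has finished.
    return [{ generatorName: 'displayName', config: { targetField: 'displayName' } }]
  }

  async postProcess(_instanceId: string, testcase: PostTestcase, directive: PostDirective): Promise<void> {
    const first = String(testcase['firstName'])
    const last = String(testcase['lastName'])
    // Mutate the test case in place; nothing is returned.
    testcase[directive.config.targetField] = `${first} ${last}`
  }
}

async function runPostProcessDemo(): Promise<PostTestcase> {
  const generator = new DisplayNameSketch()
  // Pretend the primary generators already filled these fields.
  const testcase: PostTestcase = { firstName: 'Ada', lastName: 'Lovelace' }
  for (const directive of await generator.createPostProcessDirectives()) {
    await generator.postProcess('tc-1', testcase, directive)
  }
  return testcase
}
```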


DataGeneratorBase

Base implementation of DataGeneratorInterface. Provides store loading/saving, instance ID management, and the uniqueness mechanism. Most custom generators should extend this class rather than implementing the interface directly.

Inherited Behavior

  • Instance ID caching: If generate() is called with an instance ID that has already been used, the previously generated value is returned without calling _doGenerate() again.
  • Uniqueness enforcement: When unique is true, the base class retries _doGenerate() up to maxUniqueTries times until a value is produced that is not already in uniqueSet.
  • Store persistence: loadStore() reads from and saveStore() writes to a JSON file at ${varDir}/${storeFileName}.
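
The first two behaviors can be illustrated with a miniature base class. This is a sketch under stated assumptions, not the library's source: `BaseSketch` and `TwoColorSketch` are hypothetical, and comparing values by their JSON text is an assumption about how uniqueness might be checked.

```typescript
// Hypothetical miniature of the inherited behavior; not the library's source.
class BaseSketch {
  readonly uniqueSet = new Set<string>()
  readonly instanceData = new Map<string, unknown>()
  unique = true
  maxUniqueTries = 100

  // Subclasses override this, mirroring _doGenerate() in DataGeneratorBase.
  protected async _doGenerate(_instanceId: string): Promise<unknown> {
    throw new Error('_doGenerate() must be overridden')
  }

  async generate(instanceId: string): Promise<unknown> {
    // Instance ID caching: a known ID gets its earlier value back.
    if (this.instanceData.has(instanceId)) {
      return this.instanceData.get(instanceId)
    }
    // Uniqueness enforcement: retry until an unseen value appears or the limit is hit.
    for (let attempt = 0; attempt < this.maxUniqueTries; attempt++) {
      const value = await this._doGenerate(instanceId)
      const key = JSON.stringify(value) // assumption: values are compared by their JSON text
      if (!this.unique || !this.uniqueSet.has(key)) {
        this.uniqueSet.add(key)
        this.instanceData.set(instanceId, value)
        return value
      }
    }
    throw new Error(`No unique value after ${this.maxUniqueTries} tries`)
  }
}

// Alternates between two values, so a third unique value can never be produced.
class TwoColorSketch extends BaseSketch {
  private calls = 0

  protected async _doGenerate(): Promise<unknown> {
    this.calls += 1
    return this.calls % 2 === 1 ? 'red' : 'blue'
  }
}
```

With `TwoColorSketch`, two different instance IDs receive the two available values, a repeated instance ID gets its cached value back, and a third new ID fails once `maxUniqueTries` is exhausted.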

Additional Properties

| Property | Type | Description |
| --- | --- | --- |
| storeName | string | The base name used for the store file. Defaults to the generator name |
| store | Record<string, unknown> | The data object that is persisted. Generators can store arbitrary data here |
| storeFileName | string | Computed file name for the store (read-only). Derived from storeName |

Methods

_doGenerate(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<unknown>

Override this method in subclasses. This is where the actual data generation logic goes. The base class generate() method handles instance ID caching and uniqueness; _doGenerate() is only called when new data is actually needed.

import { DataGeneratorBase } from 'nanook-table'
import type { GeneratorDirective, TestcaseData } from 'nanook-table'

class GeneratorTimestamp extends DataGeneratorBase {
  async _doGenerate(
    instanceId: string,
    testcase: TestcaseData,
    generatorDirective: GeneratorDirective
  ): Promise<string> {
    return new Date().toISOString()
  }
}

getStoreData(): Record<string, unknown>

Returns the data as it would be written to the store. Useful for inspecting store state without saving to disk.

Creating a Custom Generator

import {
  DataGeneratorBase,
  DataGeneratorRegistry,
  LoggerMemory
} from 'nanook-table'
import type { GeneratorDirective, TestcaseData } from 'nanook-table'

class GeneratorCounter extends DataGeneratorBase {
  private counter = 0

  async _doGenerate(
    instanceId: string,
    testcase: TestcaseData,
    generatorDirective: GeneratorDirective
  ): Promise<number> {
    this.counter += 1
    return this.counter
  }
}

// Register the generator
const logger = new LoggerMemory()
const registry = new DataGeneratorRegistry()
const counter = new GeneratorCounter({ logger, serviceRegistry: registry })
registry.registerGenerator('counter', counter)

DataGeneratorRegistry

A registry that stores generator instances by name. The processor uses the registry to look up generators when executing GeneratorDirective entries. Generators can also use it to access other generators for composition.

Methods

registerGenerator(name: string, generator: DataGeneratorInterface): void

Registers a generator under the given name. Also sets the name property on the generator instance.

const logger = new LoggerMemory() // any LoggerInterface implementation works
const registry = new DataGeneratorRegistry()
const faker = new GeneratorFaker({ logger })
registry.registerGenerator('GeneratorFaker', faker)

getGenerator(name: string): DataGeneratorInterface

Returns the generator registered under the given name. Throws an error if no generator with that name exists.

const faker = registry.getGenerator('GeneratorFaker')

async loadStore(): Promise<void>

Calls loadStore() on every registered generator. The processor calls this once at startup.

async saveStore(): Promise<void>

Calls saveStore() on every registered generator. The processor calls this once at shutdown.
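
The four registry methods can be sketched with a Map keyed by name. `RegistrySketch` and `NamedGeneratorSketch` are hypothetical stand-ins written for this illustration; whether the real registry runs the store calls sequentially, as here, is an assumption.

```typescript
// Hypothetical minimal generator shape, limited to what the registry touches.
interface NamedGeneratorSketch {
  name?: string
  loadStore(): Promise<void>
  saveStore(): Promise<void>
}

class RegistrySketch {
  private readonly generators = new Map<string, NamedGeneratorSketch>()

  registerGenerator(name: string, generator: NamedGeneratorSketch): void {
    generator.name = name // the registry assigns the name to the instance
    this.generators.set(name, generator)
  }

  getGenerator(name: string): NamedGeneratorSketch {
    const generator = this.generators.get(name)
    if (generator === undefined) {
      throw new Error(`Generator '${name}' is not registered`)
    }
    return generator
  }

  async loadStore(): Promise<void> {
    // Assumption: generators are loaded one after another.
    for (const generator of Array.from(this.generators.values())) {
      await generator.loadStore()
    }
  }

  async saveStore(): Promise<void> {
    for (const generator of Array.from(this.generators.values())) {
      await generator.saveStore()
    }
  }
}
```

Throwing from `getGenerator()` rather than returning undefined surfaces a misspelled generator name at the lookup site instead of at a later call.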


GeneratorFaker

A built-in generator that uses @faker-js/faker to produce data. The Faker method to call is specified in the config property of the GeneratorDirective.

Usage in Spreadsheets

In the generator column of your equivalence class table, use:

gen:GeneratorFaker:{"method": "person.firstName"}
gen:GeneratorFaker:{"method": "internet.email"}
gen:GeneratorFaker:{"method": "number.int", "args": [{"min": 1, "max": 100}]}
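
Judging from the three cells above, a directive has the form gen:<generator name>:<JSON config>. The sketch below splits such a cell into its parts. `parseGeneratorCell` and `ParsedGeneratorCell` are hypothetical names, and the library's own parser may accept other forms that this sketch rejects.

```typescript
// Hypothetical result shape for a parsed spreadsheet cell.
type ParsedGeneratorCell = { generatorName: string; config: unknown }

// Splits 'gen:<name>:<json>' at the first two colons only,
// because the JSON part contains colons of its own.
function parseGeneratorCell(cell: string): ParsedGeneratorCell {
  const text = cell.trim()
  if (text.slice(0, 4) !== 'gen:') {
    throw new Error(`Not a generator directive: ${text}`)
  }
  const rest = text.slice(4)
  const separator = rest.indexOf(':')
  if (separator <= 0) {
    throw new Error(`Missing generator name or config: ${text}`)
  }
  return {
    generatorName: rest.slice(0, separator),
    config: JSON.parse(rest.slice(separator + 1))
  }
}

const parsedCell = parseGeneratorCell(
  'gen:GeneratorFaker:{"method": "number.int", "args": [{"min": 1, "max": 100}]}'
)
// parsedCell.generatorName is 'GeneratorFaker'; parsedCell.config holds method and args.
```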

Configuration

The config object in the directive supports:

| Key | Type | Description |
| --- | --- | --- |
| method | string | The Faker method path, e.g., 'person.firstName', 'internet.email', 'number.int' |
| args | unknown[] | Optional array of arguments passed to the Faker method |
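
One way to turn such a config into a call is to walk the dotted path segment by segment and invoke the function found at the end. This reference does not show how GeneratorFaker resolves the path internally, so `callByPath`, `FakerPathConfig`, and the `fakerStandIn` object (used in place of the real Faker instance to keep the example self-contained and deterministic) are assumptions.

```typescript
// Hypothetical config shape matching the two keys in the table above.
type FakerPathConfig = { method: string; args?: unknown[] }

// Walks a dotted path such as 'person.firstName' and calls the function found there.
function callByPath(root: Record<string, unknown>, config: FakerPathConfig): unknown {
  let owner: unknown = undefined
  let current: unknown = root
  for (const segment of config.method.split('.')) {
    if (typeof current !== 'object' || current === null || !(segment in current)) {
      throw new Error(`Unknown method path '${config.method}'`)
    }
    owner = current
    current = (current as Record<string, unknown>)[segment]
  }
  if (typeof current !== 'function') {
    throw new Error(`'${config.method}' is not a function`)
  }
  // Call with the owning module as `this` so methods that rely on it keep working.
  return current.apply(owner, config.args ?? [])
}

// Deterministic stand-in with the same nesting as the paths used in this section.
const fakerStandIn = {
  person: { firstName: () => 'Ada' },
  number: { int: (options: { min: number; max: number }) => options.min }
}

const demoConfig: FakerPathConfig = JSON.parse(
  '{"method": "number.int", "args": [{"min": 1, "max": 100}]}'
)
const demoValue = callByPath(fakerStandIn, demoConfig)
```

Passing the real Faker instance as `root` should work the same way in principle, but that has not been verified against GeneratorFaker itself.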

Properties

| Property | Type | Description |
| --- | --- | --- |
| logger | LoggerInterface | The logger instance |
| unique | boolean | Whether generated values must be unique. Defaults to false for GeneratorFaker |

Example

import {
  GeneratorFaker,
  DataGeneratorRegistry,
  LoggerMemory
} from 'nanook-table'

const logger = new LoggerMemory()
const registry = new DataGeneratorRegistry()
const faker = new GeneratorFaker({ logger, serviceRegistry: registry })
registry.registerGenerator('GeneratorFaker', faker)

The createDefaultGeneratorRegistry() factory function in the processor module creates a registry with GeneratorFaker already registered.