The data generator module provides the interface and base implementation for all data generators. Generators are responsible for producing test data values. The processor calls generators based on GeneratorDirective entries created from the spreadsheet.
```typescript
import {
  DataGeneratorInterface,
  DataGeneratorBase,
  DataGeneratorRegistry,
  GeneratorFaker
} from 'nanook-table'
```

The processor manages generators through a well-defined lifecycle:
1. loadStore() -- called once at startup for each registered generator
2. For each test case:
   a. generate() -- produce data for a GeneratorDirective
   b. createPostProcessDirectives() -- optionally return additional directives
   c. postProcess() -- called for each post-process directive
3. saveStore() -- called once at shutdown for each registered generator
Between test cases, clearContext() may be called to reset per-run state while preserving the store.
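The call order in the lifecycle can be sketched as a small driver loop. This is an illustrative stand-in, not the nanook-table processor: `LifecycleGenerator`, `runLifecycle`, and the simplified `Testcase` and `Directive` shapes (including `fieldName`) are hypothetical, and writing the generated value into `testcase.data` is an assumption made only to keep the example concrete.

```typescript
// Hypothetical, simplified shapes; the real TestcaseData and
// GeneratorDirective types are richer than this.
type Testcase = { data: Record<string, unknown> }
type Directive = { generatorName: string; fieldName: string }

// Only the lifecycle methods named in the list above.
interface LifecycleGenerator {
  loadStore(): Promise<void>
  saveStore(): Promise<void>
  generate(instanceId: string, testcase: Testcase, directive: Directive): Promise<unknown>
  createPostProcessDirectives(
    instanceId: string,
    testcase: Testcase,
    directive: Directive
  ): Promise<Directive[]>
  postProcess(instanceId: string, testcase: Testcase, directive: Directive): Promise<void>
}

type Job = { instanceId: string; testcase: Testcase; directives: Directive[] }

function lookup(generators: Map<string, LifecycleGenerator>, name: string): LifecycleGenerator {
  const generator = generators.get(name)
  if (generator === undefined) throw new Error(`Unknown generator: ${name}`)
  return generator
}

async function runLifecycle(
  generators: Map<string, LifecycleGenerator>,
  jobs: Job[]
): Promise<void> {
  // 1. Load every store once at startup.
  for (const generator of generators.values()) await generator.loadStore()

  for (const { instanceId, testcase, directives } of jobs) {
    const followUps: Directive[] = []
    // 2a and 2b. Primary generation, collecting post-process directives.
    for (const directive of directives) {
      const generator = lookup(generators, directive.generatorName)
      testcase.data[directive.fieldName] = await generator.generate(instanceId, testcase, directive)
      followUps.push(...(await generator.createPostProcessDirectives(instanceId, testcase, directive)))
    }
    // 2c. Post-processing runs once primary generation for the test case is done.
    for (const directive of followUps) {
      await lookup(generators, directive.generatorName).postProcess(instanceId, testcase, directive)
    }
  }

  // 3. Save every store once at shutdown.
  for (const generator of generators.values()) await generator.saveStore()
}
```

A generator that records its calls would see them in the order loadStore, generate, createPostProcessDirectives, postProcess, saveStore for a single test case with one directive.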
Abstract interface that all data generators must implement. Defines the contract between the processor and any generator.
```typescript
new DataGeneratorInterface(options: {
  logger: LoggerInterface
  serviceRegistry?: DataGeneratorRegistry
  unique?: boolean
  maxUniqueTries?: number
  varDir?: string
  useStore?: boolean
})
```

| Option | Type | Default | Description |
|---|---|---|---|
| logger | LoggerInterface | required | Logger instance for diagnostic output |
| serviceRegistry | DataGeneratorRegistry | undefined | Registry providing access to other generators. Allows generators to compose with each other |
| unique | boolean | true | When true, the generator should return unique values. The definition of "unique" is generator-specific |
| maxUniqueTries | number | 100 | Maximum attempts to generate a unique value before throwing an error |
| varDir | string | undefined | Directory path for reading/writing persistent store files |
| useStore | boolean | false | Whether the generator should persist data between runs |

| Property | Type | Description |
|---|---|---|
| logger | LoggerInterface | The logger instance |
| serviceRegistry | DataGeneratorRegistry | The registry of all available generators |
| unique | boolean | Whether uniqueness is enforced |
| maxUniqueTries | number | Maximum uniqueness retry count |
| uniqueSet | Set<string> | Stores previously generated values for uniqueness checks |
| instanceData | Map<string, unknown> | Maps instance IDs to previously generated data. Ensures the same instance ID returns the same value |
| varDir | string | Store directory path |
| useStore | boolean | Whether the store is active |
| name | string | The name under which this generator is registered. Set automatically by the registry |

Loads previously persisted data from the store file. Called once by the processor before any generation begins. Implementations that do not use a store can leave this as a no-op.
Persists the current store data to a file. Called once by the processor after all generation is complete.
Retrieves another generator from the service registry by name. Throws an error if the generator is not found. This enables generators to delegate to or compose with other generators.
```typescript
// Inside a custom generator
const faker = this.getGenerator('GeneratorFaker')
```

Resets the uniqueSet and instanceData. Called between independent generation runs to clear per-run state without affecting the persistent store.
```typescript
async generate(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<unknown>
```

Generates a value for the given directive. This is the primary generation method.

| Parameter | Type | Description |
|---|---|---|
| instanceId | string | A unique ID for this test case instance. The same instance ID should yield the same data |
| testcase | TestcaseData | The test case data object being built. Contains data already generated by other generators |
| generatorDirective | GeneratorDirective | The directive describing what to generate, including generator name and config |

Returns: The generated data, or undefined if the generator cannot produce data yet (e.g., because it depends on data from another generator that has not run yet). The processor will retry generators that return undefined.
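One way to picture the retry behaviour is a multi-pass loop that keeps every step that returned undefined for a later pass. How the processor actually schedules retries is not described here, so the pass-based strategy, the `PendingStep` type, and `generateWithRetry` are all assumptions made for illustration.

```typescript
// Hypothetical stand-ins for illustration; not the nanook-table types.
type Testcase = { data: Record<string, unknown> }
type PendingStep = {
  fieldName: string
  generate: (testcase: Testcase) => Promise<unknown>
}

// Runs the steps in passes until every step has produced a value.
// A step that returns undefined is kept for the next pass.
async function generateWithRetry(testcase: Testcase, steps: PendingStep[]): Promise<void> {
  let pending = steps
  while (pending.length > 0) {
    const stillPending: PendingStep[] = []
    for (const step of pending) {
      const value = await step.generate(testcase)
      if (value === undefined) {
        stillPending.push(step) // a dependency is not available yet
      } else {
        testcase.data[step.fieldName] = value
      }
    }
    // A full pass without progress means the dependencies can never be met.
    if (stillPending.length === pending.length) {
      const names = stillPending.map((step) => step.fieldName).join(', ')
      throw new Error(`Unresolvable steps: ${names}`)
    }
    pending = stillPending
  }
}

// Usage: "email" depends on "firstName", which is listed after it.
const emailStep: PendingStep = {
  fieldName: 'email',
  generate: async (testcase) => {
    const firstName = testcase.data['firstName']
    return typeof firstName === 'string' ? `${firstName.toLowerCase()}@example.com` : undefined
  }
}
const firstNameStep: PendingStep = { fieldName: 'firstName', generate: async () => 'Ada' }
const retryTestcase: Testcase = { data: {} }
const retryDone = generateWithRetry(retryTestcase, [emailStep, firstNameStep])
```

In the first pass the email step returns undefined and is deferred; in the second pass the first name is present and the email can be built.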
```typescript
async createPostProcessDirectives(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<GeneratorDirective[]>
```

Called after generate() returns successfully. Returns an array of additional directives for post-processing. Each returned directive will cause a later call to postProcess().
This is useful when a generator needs to perform additional work after all primary generators have completed.
```typescript
async postProcess(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<void>
```

Called for each directive returned by createPostProcessDirectives(), after all primary generation is complete. Post-processing can modify the testcase data object directly and does not return a value.
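From the generator's side, the two post-processing hooks might be used as in the following sketch. It does not extend the real base class; `fullNameGenerator` and the simplified `Testcase` and `Directive` shapes are hypothetical, and the field names are invented for the example.

```typescript
// Hypothetical simplified shapes for illustration only.
type Testcase = { data: Record<string, unknown> }
type Directive = { generatorName: string; config: Record<string, unknown> }

// A generator-like object whose final value depends on fields that other
// generators fill in, so the real work is deferred to postProcess().
const fullNameGenerator = {
  async generate(_instanceId: string, _testcase: Testcase, _directive: Directive): Promise<unknown> {
    return '' // placeholder, replaced during post-processing
  },

  async createPostProcessDirectives(
    _instanceId: string,
    _testcase: Testcase,
    directive: Directive
  ): Promise<Directive[]> {
    // Ask for one postProcess() call, reusing the original directive.
    return [directive]
  },

  async postProcess(_instanceId: string, testcase: Testcase, _directive: Directive): Promise<void> {
    // Modifies the test case in place and returns nothing.
    const firstName = testcase.data['firstName']
    const lastName = testcase.data['lastName']
    testcase.data['fullName'] = `${String(firstName)} ${String(lastName)}`
  }
}

// Drives the three calls in the documented order for one test case.
async function demoPostProcess(): Promise<Testcase> {
  const testcase: Testcase = { data: { firstName: 'Ada', lastName: 'Lovelace' } }
  const directive: Directive = { generatorName: 'fullName', config: {} }
  testcase.data['fullName'] = await fullNameGenerator.generate('tc-1', testcase, directive)
  const followUps = await fullNameGenerator.createPostProcessDirectives('tc-1', testcase, directive)
  for (const followUp of followUps) {
    await fullNameGenerator.postProcess('tc-1', testcase, followUp)
  }
  return testcase
}
```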
Base implementation of DataGeneratorInterface. Provides store loading/saving, instance ID management, and the uniqueness mechanism. Most custom generators should extend this class rather than implementing the interface directly.
- Instance ID caching: If generate() is called with an instance ID that has already been used, the previously generated value is returned without calling _doGenerate() again.
- Uniqueness enforcement: When unique is true, the base class retries _doGenerate() up to maxUniqueTries times until a value is produced that is not already in uniqueSet.
- Store persistence: loadStore() reads from and saveStore() writes to a JSON file at ${varDir}/${storeFileName}.

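The first two behaviours can be sketched together. This is a stand-in written from the description above, not the library's source: `SketchGeneratorBase` and `CyclingGenerator` are hypothetical, the method signatures are trimmed to the instance ID, and comparing values by their JSON representation is an assumption about how uniqueSet keys are formed.

```typescript
// Illustrative stand-in for the caching and uniqueness behaviour.
abstract class SketchGeneratorBase {
  protected uniqueSet = new Set<string>()
  protected instanceData = new Map<string, unknown>()
  protected unique: boolean
  protected maxUniqueTries: number

  constructor(unique = true, maxUniqueTries = 100) {
    this.unique = unique
    this.maxUniqueTries = maxUniqueTries
  }

  // Subclasses supply the actual value, as with _doGenerate().
  protected abstract _doGenerate(instanceId: string): Promise<unknown>

  async generate(instanceId: string): Promise<unknown> {
    // Instance ID caching: a known ID returns the stored value.
    if (this.instanceData.has(instanceId)) return this.instanceData.get(instanceId)

    for (let attempt = 0; attempt < this.maxUniqueTries; attempt++) {
      const value = await this._doGenerate(instanceId)
      if (value === undefined) return undefined // no data available yet
      // Assumption: values are compared by their JSON representation.
      const key = JSON.stringify(value)
      if (!this.unique || !this.uniqueSet.has(key)) {
        this.uniqueSet.add(key)
        this.instanceData.set(instanceId, value)
        return value
      }
    }
    throw new Error(`No unique value after ${this.maxUniqueTries} tries`)
  }
}

// Usage: a generator that produces a duplicate on its second call.
class CyclingGenerator extends SketchGeneratorBase {
  private index = 0
  private values = ['red', 'red', 'green']

  protected async _doGenerate(): Promise<unknown> {
    const value = this.values[this.index % this.values.length]
    this.index += 1
    return value
  }
}
```

With this subclass, a second instance ID skips the duplicate 'red' and receives 'green', while asking again for the first instance ID returns the cached 'red' without another call to _doGenerate().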
| Property | Type | Description |
|---|---|---|
| storeName | string | The base name used for the store file. Defaults to the generator name |
| store | Record<string, unknown> | The data object that is persisted. Generators can store arbitrary data here |
| storeFileName | string | Computed file name for the store (read-only). Derived from storeName |

```typescript
_doGenerate(instanceId: string, testcase: TestcaseData, generatorDirective: GeneratorDirective): Promise<unknown>
```

Override this method in subclasses. This is where the actual data generation logic goes. The base class generate() method handles instance ID caching and uniqueness; _doGenerate() is only called when new data is actually needed.
```typescript
import { DataGeneratorBase } from 'nanook-table'
import type { GeneratorDirective, TestcaseData } from 'nanook-table'

class GeneratorTimestamp extends DataGeneratorBase {
  // Returns the current time as an ISO 8601 string
  async _doGenerate(
    instanceId: string,
    testcase: TestcaseData,
    generatorDirective: GeneratorDirective
  ): Promise<string> {
    return new Date().toISOString()
  }
}
```

Returns the data as it would be written to the store. Useful for inspecting store state without saving to disk.
```typescript
import {
  DataGeneratorBase,
  DataGeneratorRegistry,
  LoggerMemory
} from 'nanook-table'
import type { GeneratorDirective, TestcaseData } from 'nanook-table'

class GeneratorCounter extends DataGeneratorBase {
  private counter = 0

  async _doGenerate(
    instanceId: string,
    testcase: TestcaseData,
    generatorDirective: GeneratorDirective
  ): Promise<number> {
    this.counter += 1
    return this.counter
  }
}

// Register the generator
const logger = new LoggerMemory()
const registry = new DataGeneratorRegistry()
const counter = new GeneratorCounter({ logger, serviceRegistry: registry })
registry.registerGenerator('counter', counter)
```

A registry that stores generator instances by name. The processor uses the registry to look up generators when executing GeneratorDirective entries. Generators can also use it to access other generators for composition.
Registers a generator under the given name. Also sets the name property on the generator instance.
```typescript
const registry = new DataGeneratorRegistry()
const faker = new GeneratorFaker({ logger })
registry.registerGenerator('GeneratorFaker', faker)
```

Returns the generator registered under the given name. Throws an error if no generator with that name exists.
```typescript
const faker = registry.getGenerator('GeneratorFaker')
```

Calls loadStore() on every registered generator. The processor calls this once at startup.
Calls saveStore() on every registered generator. The processor calls this once at shutdown.
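The registry behaviour described above fits in a few lines around a Map. The sketch below is a hypothetical stand-in, not the DataGeneratorRegistry source: `SketchRegistry` and `NamedGenerator` are invented names, and `loadAllStores` and `saveAllStores` are placeholder names for the two bulk operations.

```typescript
// Hypothetical minimal generator shape: only what the registry touches.
interface NamedGenerator {
  name?: string
  loadStore(): Promise<void>
  saveStore(): Promise<void>
}

class SketchRegistry {
  private generators = new Map<string, NamedGenerator>()

  registerGenerator(name: string, generator: NamedGenerator): void {
    generator.name = name // the registry sets the name on the instance
    this.generators.set(name, generator)
  }

  getGenerator(name: string): NamedGenerator {
    const generator = this.generators.get(name)
    if (generator === undefined) {
      throw new Error(`No generator registered as "${name}"`)
    }
    return generator
  }

  // Placeholder name: calls loadStore() on every registered generator.
  async loadAllStores(): Promise<void> {
    for (const generator of this.generators.values()) await generator.loadStore()
  }

  // Placeholder name: calls saveStore() on every registered generator.
  async saveAllStores(): Promise<void> {
    for (const generator of this.generators.values()) await generator.saveStore()
  }
}
```

Throwing from getGenerator() rather than returning undefined means a misspelled generator name fails at the lookup instead of surfacing later as a missing value.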
A built-in generator that uses @faker-js/faker to produce data. The Faker method to call is specified in the config property of the GeneratorDirective.
In the generator column of your equivalence class table, use:
```text
gen:GeneratorFaker:{"method": "person.firstName"}
gen:GeneratorFaker:{"method": "internet.email"}
gen:GeneratorFaker:{"method": "number.int", "args": [{"min": 1, "max": 100}]}
```

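The cell values above follow the shape `gen:<generator name>:<JSON config>`. A parser for that shape might look like the sketch below; `parseGeneratorCell` and `ParsedDirective` are hypothetical names, the real parsing lives elsewhere in the library and may differ, and treating a missing config part as an empty object is an assumption.

```typescript
// Hypothetical result shape; the real GeneratorDirective has more fields.
type ParsedDirective = { generatorName: string; config: Record<string, unknown> }

function parseGeneratorCell(cell: string): ParsedDirective {
  const text = cell.trim()
  if (!text.startsWith('gen:')) throw new Error(`Not a generator cell: ${text}`)

  // Split at the first colon after the prefix; the JSON may contain colons.
  const rest = text.slice('gen:'.length)
  const separator = rest.indexOf(':')
  const generatorName = separator === -1 ? rest : rest.slice(0, separator)
  if (generatorName === '') throw new Error(`Missing generator name: ${text}`)

  // Assumption: a cell without a config part means an empty config.
  const configText = separator === -1 ? '' : rest.slice(separator + 1)
  const config: Record<string, unknown> = configText === '' ? {} : JSON.parse(configText)
  return { generatorName, config }
}

// Usage with one of the cells shown above.
const parsedCell = parseGeneratorCell(
  'gen:GeneratorFaker:{"method": "number.int", "args": [{"min": 1, "max": 100}]}'
)
```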
The config object in the directive supports:
| Key | Type | Description |
|---|---|---|
| method | string | The Faker method path, e.g., 'person.firstName', 'internet.email', 'number.int' |
| args | unknown[] | Optional array of arguments passed to the Faker method |

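Resolving a dotted method path and applying the args array can be sketched as follows. To stay self-contained the example uses a small stand-in object (`standInFaker`) instead of importing @faker-js/faker, and `callByPath` is a hypothetical helper; GeneratorFaker's actual lookup may be implemented differently.

```typescript
// Walks a dotted path such as 'person.firstName' and calls the function found there.
function callByPath(root: unknown, path: string, args: unknown[] = []): unknown {
  let owner: unknown = undefined
  let current: unknown = root
  for (const segment of path.split('.')) {
    if (typeof current !== 'object' || current === null || !(segment in current)) {
      throw new Error(`Unknown method path: ${path}`)
    }
    owner = current
    current = (current as Record<string, unknown>)[segment]
  }
  if (typeof current !== 'function') throw new Error(`Not a function: ${path}`)
  // Call with the owning module as `this`, as a normal method call would.
  return current.apply(owner, args)
}

// Stand-in with the same nesting as the method paths in the table;
// its return values are fixed so the example is deterministic.
const standInFaker = {
  person: { firstName: () => 'Ada' },
  number: { int: (options: { min: number; max: number }) => options.min }
}

const firstName = callByPath(standInFaker, 'person.firstName')
const lowest = callByPath(standInFaker, 'number.int', [{ min: 1, max: 100 }])
```

Spreading args through apply() keeps the config format independent of how many parameters a given method takes.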
| Property | Type | Description |
|---|---|---|
| logger | LoggerInterface | The logger instance |
| unique | boolean | Whether generated values must be unique. Default is false for GeneratorFaker |

```typescript
import {
  GeneratorFaker,
  DataGeneratorRegistry,
  LoggerMemory
} from 'nanook-table'

const logger = new LoggerMemory()
const registry = new DataGeneratorRegistry()
const faker = new GeneratorFaker({ logger, serviceRegistry: registry })
registry.registerGenerator('GeneratorFaker', faker)
```

The createDefaultGeneratorRegistry() factory function in the processor module creates a registry with GeneratorFaker already registered.