Add TS support for synthetic data, simulation parity, and prompt optimization #2798

Open
theanuragg wants to merge 3 commits into confident-ai:main from theanuragg:feat/ts-deepeval-2734
@theanuragg

This PR advances the TypeScript SDK toward the post–July 1st parity goals from #2734 by surfacing synthetic data generation, simulation parity, and prompt optimization capabilities in TS.

Concretely, it:

  • Implements a Synthesizer API for synthetic data generation in TypeScript, allowing users to generate goldens from documents, contexts, scratch, or existing goldens, with evolution and filtration configs aligned with the Python implementation.
  • Extends the existing simulation surface so that conversation/simulation workflows available in Python are reachable from TypeScript, keeping types and APIs consistent with the current /typescript design.
  • Introduces a PromptOptimizer for prompt optimization in TS, providing a metric-driven, iterative optimization loop that mirrors Python’s behavior while feeling natural for TypeScript users.

These changes are wired into the public TS exports and documented, so TS users can now:

  • Generate and evolve synthetic datasets in their existing DeepEval flows.
  • Use simulation features from TypeScript without dropping back to Python.
  • Optimize prompts directly from TS using their evaluation metrics as feedback.

This PR focuses on API completeness and parity; follow‑ups can refine algorithms and add more examples once these surfaces are battle‑tested.

@vercel

vercel Bot commented Jun 24, 2026

@theanuragg is attempting to deploy a commit to the Confident AI Team on Vercel.

A member of the Team first needs to authorize it.

…rt local instantiation, and modularize Synthesizer evolution templates.
@penguine-ip

Contributor

@theanuragg thanks for the PR. However, it is nearly impossible to review as is. If you could break this down into scoped changes, I'm happy to take a look.

@theanuragg theanuragg force-pushed the feat/ts-deepeval-2734 branch from 66ddf40 to d9c13c8 Compare June 25, 2026 12:12
@theanuragg

Author

> @theanuragg thanks for the PR. However, it is nearly impossible to review as is. If you could break this down into scoped changes, I'm happy to take a look.

The diff looks a lot bigger than the logical changes because Prettier reformatted a bunch of existing files. The actual code changes are scoped into a few areas:

  • New assertTest API: src/evaluate/assert_test.ts adds a standalone assertTest() that calls metric.measure() directly (bypassing the full evaluate() pipeline and Confident AI posting). It returns a typed TestResult and throws an AssertionError with per-metric details when any metric fails; runMetric() is invoked with showIndicator = false so it doesn't affect CLI output.

  • Prompt usage without API key: src/prompt/index.ts now exposes Prompt.fromText() so you can construct prompts from raw text without requiring a Confident AI API key. The API client is lazily initialized on first pull()/push(), so local evaluation and optimization flows work without remote calls.

  • CLI test runner refactor: src/cli/index.ts switches the Jest runner to child_process.spawn with multi-path binary resolution (tries jest / npx jest on PATH, then common node_modules locations). I also added a test command and cleaned up the console prefixes, but behavior is otherwise the same.

  • Synthesizer templates + engine:

    ▫ src/synthesizer/templates.ts adds Nunjucks-compiled evolution templates mirroring the Python side (EvolutionTemplate, PromptEvolutionTemplate, SynthesizerPromptTemplate).

    ▫ src/synthesizer/synthesizer.ts wires those templates into a template-based evolution engine with evolutionMap / promptEvolutionMap, qualifyInput() (with retry), and LLMTestCase generation via structured output. Evolution sampling matches Python's weighting and renormalizes the remaining weights when MultiContext is excluded for no-context runs.

  • Prompt optimizer cleanup: src/optimizer/prompt-optimizer.ts now imports LLMTestCase at the top level instead of using require(), constructs prompts via Prompt.fromText() instead of mutating private fields, and uses a small resetMetric() helper that calls metric.reset() when available.

  • OpenAI embeddings cost fix: src/models/embedding-models/openai-embedding-model.ts corrects batch cost accounting by splitting total cost across the batch (perItemCost = cost != null ? cost / batchSize : null) instead of charging each item the full batch cost.
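
To make the embeddings cost fix concrete, here is a minimal sketch of the per-item split. The names splitBatchCost and EmbeddingResult are illustrative assumptions, not the identifiers used in the PR; only the perItemCost expression comes from the change itself.

```typescript
// Hypothetical result shape; the real type in openai-embedding-model.ts may differ.
interface EmbeddingResult {
  embedding: number[];
  cost: number | null;
}

// Spread one batch-level cost evenly across the items in the batch,
// instead of attributing the full batch cost to every item.
function splitBatchCost(
  embeddings: number[][],
  cost: number | null,
): EmbeddingResult[] {
  const batchSize = embeddings.length;
  if (batchSize === 0) return []; // avoid dividing by zero on an empty batch
  const perItemCost = cost != null ? cost / batchSize : null;
  return embeddings.map((embedding) => ({ embedding, cost: perItemCost }));
}

// Example: a batch of four embeddings that cost 0.002 in total.
const results = splitBatchCost([[0.1], [0.2], [0.3], [0.4]], 0.002);
const summedCost = results.reduce((sum, r) => sum + (r.cost ?? 0), 0);
```

With this split, summing the per-item costs gives back the batch total rather than batchSize times the total, and an unknown cost stays null for every item.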

I also made small plumbing updates and minor fixes across the files I worked on.
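
For the evolution sampling point above, the renormalization can be sketched roughly as follows. This is an illustration under assumptions: the evolution names and equal weights are placeholders, and renormalize and sampleEvolution are hypothetical helpers rather than the functions in src/synthesizer/synthesizer.ts.

```typescript
type Weights = { [name: string]: number };

// Placeholder weights for illustration; the actual evolutionMap weights may differ.
const DEFAULT_WEIGHTS: Weights = {
  Reasoning: 0.25,
  MultiContext: 0.25,
  Concretizing: 0.25,
  Comparative: 0.25,
};

// Drop excluded evolutions and rescale the rest so they sum to 1 again.
function renormalize(weights: Weights, excluded: string[] = []): Weights {
  const kept = Object.keys(weights).filter((name) => excluded.indexOf(name) === -1);
  const total = kept.reduce((sum, name) => sum + weights[name], 0);
  if (total <= 0) throw new Error("no evolutions left to sample from");
  const out: Weights = {};
  for (const name of kept) out[name] = weights[name] / total;
  return out;
}

// Weighted pick: walk the cumulative distribution until the draw is used up.
function sampleEvolution(weights: Weights, random: () => number = Math.random): string {
  const names = Object.keys(weights);
  if (names.length === 0) throw new Error("no evolutions to sample from");
  let remaining = random();
  for (const name of names) {
    remaining -= weights[name];
    if (remaining < 0) return name;
  }
  return names[names.length - 1]; // guard against floating-point drift
}

// A no-context run excludes MultiContext before sampling.
const hasContext = false;
const activeWeights = renormalize(DEFAULT_WEIGHTS, hasContext ? [] : ["MultiContext"]);
const picked = sampleEvolution(activeWeights);
```

Without the rescaling step, the probability mass that belonged to MultiContext would be lost and the remaining evolutions would be sampled against weights that no longer sum to 1.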

@A-Vamshi A-Vamshi mentioned this pull request Jul 1, 2026
6 tasks