Description
🚀 Feature Request
Demo of the solution:
Screen.Recording.2025-04-17.at.12.46.19.AM.1.mp4
Problem:
The current Playwright Codegen captures only minimal information while recording user actions. It primarily logs just what's needed to replay actions, which limits its utility in real-world test automation where:
- DOM attributes change frequently (e.g., dynamic IDs)
- Custom locator strategies are needed
- Test validation logic is missing
- Reusability and modularity are crucial
Proposed Solution:
Introduce a new enhanced Codegen workflow that records richer metadata from the browser and supports AI-based script enhancement in two stages using LLMs (Large Language Models). The process would work like this:
1. Enhanced Metadata Recording from Browser
The browser should record additional data for each user interaction, for example:
- JS Path
- Outer HTML
- XPath
- Full XPath
- All element attributes
This richer data enables smarter decision-making during code generation.
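As a rough sketch, the per-action metadata could be captured in a structure like the one below. The field and class names are illustrative assumptions, not part of the Playwright codebase:

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ActionMetadata:
    """Hypothetical record of one recorded user interaction."""
    action: str                      # e.g. "click", "fill"
    selector: str                    # the locator Codegen would emit today
    js_path: str = ""                # e.g. document.querySelector("#login")
    outer_html: str = ""             # outerHTML of the target element
    xpath: str = ""                  # short XPath
    full_xpath: str = ""             # absolute XPath from the document root
    attributes: dict = field(default_factory=dict)  # all element attributes

    def to_json(self) -> str:
        return json.dumps(asdict(self))

# Example: a click on a login button whose id is dynamically generated
meta = ActionMetadata(
    action="click",
    selector="#login-btn-4821",
    xpath="//button[text()='Log in']",
    full_xpath="/html/body/div[1]/form/button",
    attributes={"id": "login-btn-4821", "class": "btn primary", "type": "submit"},
)
print(meta.to_json())
```

With every attribute and alternative path recorded, a later enhancement pass can choose or combine locators instead of being stuck with the single selector emitted at record time.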
2. Action-Level Enhancement (LLM-1)
After generating each action, Codegen passes it to an LLM (with customizable prompts by the user) to improve the quality and resilience of the code. Example use cases:
- Adding fallback locators
- Adding conditional waits
- Improving readability
Example LLM-1 Prompt:
“Add a fallback locator for each action: if the primary locator fails for 1 minute, use a fallback.”
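To illustrate the kind of output this prompt might produce, here is a deterministic sketch of the rewrite: the recorded primary selector is tried first, and the recorded XPath is used as a fallback on timeout. The helper name and the emitted code shape are assumptions for illustration, not Playwright APIs:

```python
def add_fallback(method: str, primary: str, fallback: str,
                 timeout_ms: int = 60_000) -> str:
    """Rewrite a single Codegen action so that, if the primary locator
    times out, a recorded fallback locator (e.g. an XPath) is tried.
    Mimics what an LLM-1 pass might emit; purely illustrative."""
    return (
        "try:\n"
        f"    page.locator({primary!r}).{method}(timeout={timeout_ms})\n"
        "except TimeoutError:\n"
        f"    page.locator({fallback!r}).{method}()\n"
    )

print(add_fallback("click", "#login-btn-4821", "//button[text()='Log in']"))
```

Because the prompt is user-supplied, the same hook could instead add conditional waits or rename steps for readability.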
3. Script-Level Enhancement (LLM-2)
Once the browser is closed, the full recorded script is passed to another LLM for full-script enhancement. This can:
- Extract hardcoded values as variables
- Modularize repetitive patterns
- Add validations or reusability patterns
Example LLM-2 Prompt:
“Make all input texts variables. If any extracted text is reused, make sure the same variable is used. Modularize the script.”
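A deterministic approximation of the variable-extraction part of this prompt is sketched below (a real LLM pass would additionally choose meaningful variable names). The function and variable naming scheme are assumptions for illustration:

```python
import re

def extract_fill_variables(script: str) -> str:
    """Hoist every string literal passed to fill() into a top-level
    variable, reusing the same variable when the same text recurs.
    Illustrates one transformation an LLM-2 pass might perform."""
    values: dict[str, str] = {}  # literal text -> variable name

    def replace(match: re.Match) -> str:
        text = match.group(1)
        if text not in values:
            values[text] = f"input_{len(values) + 1}"
        return f".fill({values[text]})"

    body = re.sub(r'\.fill\("([^"]*)"\)', replace, script)
    header = "".join(f'{name} = "{text}"\n' for text, name in values.items())
    return header + "\n" + body

script = (
    'page.get_by_label("User").fill("alice")\n'
    'page.get_by_label("Confirm user").fill("alice")\n'
    'page.get_by_label("City").fill("Berlin")\n'
)
print(extract_fill_variables(script))
```

Because both occurrences of "alice" map to the same variable, a later edit to the test data only has to change one line, which is exactly the consistency the prompt asks for.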
I have started working on this. The code is not yet up to standard, but the results look promising.
Here is the link to my repo: https://github.com/ArpitSureka/playwright. The code is not always on the main branch; refer to the most recently updated branch.
Example
No response
Motivation
At my company, we’re building tests for a business portal. The code generated by Codegen often fails because attributes like id change frequently.
With LLM-1, we can guide how locators should be created — for example, avoiding unstable attributes and preferring more reliable patterns. Additionally, Codegen doesn't handle validations — such as ensuring that text shown on one page reappears correctly on another.
LLM-2 can handle such cases by turning input text into variables and reusing them across the script to ensure consistency and improve maintainability.