Merged
@@ -0,0 +1,184 @@
package cc.unitmesh.agent.artifact

/**
* PEP 723 Inline Script Metadata Parser & Generator.
*
* Parses and generates PEP 723 compliant inline metadata blocks in Python scripts.
*
* PEP 723 format example:
* ```python
* # /// script
* # requires-python = ">=3.11"
* # dependencies = [
* # "requests>=2.28.0",
* # "pandas>=1.5.0",
* # ]
* # ///
* ```
*
* @see <a href="https://peps.python.org/pep-0723/">PEP 723</a>
*/
object PEP723Parser {

/**
* Parsed result from a PEP 723 metadata block.
*/
data class PEP723Metadata(
/** Required Python version constraint, e.g. ">=3.11" */
val requiresPython: String? = null,
/** List of dependency specifiers, e.g. ["requests>=2.28.0", "pandas>=1.5.0"] */
val dependencies: List<String> = emptyList(),
/** AutoDev Unit custom metadata embedded in [tool.autodev-unit] */
val autodevContext: Map<String, String> = emptyMap(),
/** The raw text of the entire metadata block (including comment prefixes) */
val rawBlock: String? = null
)

// ---- Parsing ----

private val PEP723_BLOCK_PATTERN = Regex(
"""#\s*///\s*script\s*\n(.*?)#\s*///""",

These regexes hardcode \n, so scripts with CRLF (\r\n) may fail to match/strip/inject the PEP 723 block (and related sections). Consider allowing optional \r in newline parts to be robust across platforms.

Severity: medium

Other Locations
  • mpp-core/src/commonMain/kotlin/cc/unitmesh/agent/artifact/PEP723Parser.kt:56


RegexOption.DOT_MATCHES_ALL
)
Comment on lines +39 to +41
Copilot AI Mar 8, 2026

PEP723_BLOCK_PATTERN hard-codes \n line endings, so scripts with CRLF (\r\n) may fail to match and metadata won't be parsed/replaced/stripped correctly. Updating the regex to tolerate \r?\n (and ideally using multiline anchors) will make parsing more robust across platforms/editors.
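
A minimal sketch of the CRLF-tolerant pattern this comment asks for. The names `crlfTolerantBlockPattern` and `findMetadataBody` are hypothetical and not part of the PR. The sketch assumes JVM regex semantics, where `^` and `$` in multiline mode treat `\r\n` as a single line terminator.

```kotlin
// Hypothetical variant of PEP723_BLOCK_PATTERN that accepts LF and CRLF line endings.
// MULTILINE anchors both markers to whole lines; [ \t]* is used instead of \s*
// so that the whitespace match cannot swallow a line break.
val crlfTolerantBlockPattern = Regex(
    """^#[ \t]*///[ \t]*script[ \t]*\r?\n(.*?)^#[ \t]*///[ \t]*$""",
    setOf(RegexOption.DOT_MATCHES_ALL, RegexOption.MULTILINE)
)

// Returns the commented metadata lines between the markers, or null if no block exists.
fun findMetadataBody(script: String): String? =
    crlfTolerantBlockPattern.find(script)?.groupValues?.get(1)
```

Anchoring the closing marker to a full line also keeps a stray `# ///` inside a longer comment from ending the block early.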


private val REQUIRES_PYTHON_PATTERN = Regex(
"""requires-python\s*=\s*"([^"]*)""""
)

private val DEPENDENCIES_PATTERN = Regex(
"""dependencies\s*=\s*\[(.*?)\]""",
Copilot AI Mar 8, 2026

DEPENDENCIES_PATTERN uses \[(.*?)\] to capture the array contents, which will terminate at the first ] character. Valid dependency specifiers can contain ] (extras), e.g. requests[socks]>=2.0, which would cause truncated parsing. Consider parsing the dependencies section line-by-line until the closing # ] line (or use a minimal TOML array parser) instead of a raw bracket regex.

Suggested change
- """dependencies\s*=\s*\[(.*?)\]""",
+ """dependencies\s*=\s*\[(.*)]""",

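
The line-by-line alternative mentioned in the comment could look roughly like the sketch below. `parseDependencyLines` is a hypothetical helper, not code from the PR, and it assumes the multi-line array layout from the PEP 723 example (one quoted item per line, closing `]` on its own line). A single-line array would need extra handling.

```kotlin
// Sketch: collect dependency specifiers line by line instead of with a bracket regex,
// so extras such as "requests[socks]>=2.0" are not cut off at the inner ']'.
fun parseDependencyLines(metadataBlock: String): List<String> {
    val arrayStart = Regex("""dependencies\s*=\s*\[""")
    val deps = mutableListOf<String>()
    var inArray = false
    for (rawLine in metadataBlock.lines()) {
        // Drop the leading "#" comment prefix and surrounding whitespace.
        val line = rawLine.trim().removePrefix("#").trim()
        when {
            !inArray && arrayStart.matches(line) -> inArray = true
            inArray && line == "]" -> return deps
            inArray -> {
                val item = line.removeSuffix(",").trim()
                // Accept an item only if it is wrapped in a matching pair of quotes.
                if (item.length >= 2 && item.first() == item.last() && item.first() in "\"'") {
                    deps += item.substring(1, item.length - 1)
                }
            }
        }
    }
    return deps
}
```

Because `String.lines()` splits on `\n`, `\r\n`, and `\r`, this approach also sidesteps the CRLF concern raised for the block pattern.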
RegexOption.DOT_MATCHES_ALL
)

private val DEP_ITEM_PATTERN = Regex("""["']([^"']+)["']""")

private val AUTODEV_SECTION_PATTERN = Regex(
"""\[tool\.autodev-unit\]\s*\n(.*?)(?=#\s*///|\[tool\.|$)""",
RegexOption.DOT_MATCHES_ALL
)

private val AUTODEV_KV_PATTERN = Regex(
"""#\s*(\S+)\s*=\s*"([^"]*)""""
)
Comment on lines +43 to +59
Copilot AI Mar 8, 2026

The parser currently only recognizes double-quoted TOML strings (e.g., requires-python = "...", # key = "..."). TOML also allows single-quoted strings, so valid PEP 723 metadata may be silently ignored. Consider extending REQUIRES_PYTHON_PATTERN / AUTODEV_KV_PATTERN to support both quote styles (or using a small TOML parser on the de-commented block).
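
One way to accept both quote styles, shown as a sketch rather than the PR's implementation: capture the opening quote and require the same character to close the string. `requiresPythonAnyQuote` and `requiresPythonOf` are hypothetical names, and escape sequences inside double-quoted TOML strings are not handled here.

```kotlin
// Sketch: accept TOML basic ("...") and literal ('...') strings for requires-python.
// The backreference \1 forces the closing quote to match the opening one.
val requiresPythonAnyQuote = Regex("""requires-python\s*=\s*(["'])(.*?)\1""")

// Returns the version constraint, or null when the key is absent.
fun requiresPythonOf(metadataBlock: String): String? =
    requiresPythonAnyQuote.find(metadataBlock)?.groupValues?.get(2)
```

The same capture-and-backreference idea would apply to `AUTODEV_KV_PATTERN`.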


/**
* Parse PEP 723 inline metadata from a Python script.
*
* @param pythonContent Full text of the Python script.
* @return Parsed metadata, or a default empty metadata if no block is found.
*/
fun parse(pythonContent: String): PEP723Metadata {
val blockMatch = PEP723_BLOCK_PATTERN.find(pythonContent)
?: return PEP723Metadata()

val metadataBlock = blockMatch.groupValues[1]
val rawBlock = blockMatch.value

// Parse requires-python
val requiresPython = REQUIRES_PYTHON_PATTERN.find(metadataBlock)?.groupValues?.get(1)

// Parse dependencies
val dependencies = parseDependencies(metadataBlock)

// Parse [tool.autodev-unit] section
val autodevContext = parseAutodevContext(metadataBlock)

return PEP723Metadata(
requiresPython = requiresPython,
dependencies = dependencies,
autodevContext = autodevContext,
rawBlock = rawBlock
)
}

/**
* Extract only the dependency list from a Python script (convenience method).
*/
fun parseDependencies(pythonContent: String): List<String> {
val depsMatch = DEPENDENCIES_PATTERN.find(pythonContent) ?: return emptyList()
val depsContent = depsMatch.groupValues[1]

return DEP_ITEM_PATTERN.findAll(depsContent)
.map { it.groupValues[1] }
.toList()
}
Comment on lines +91 to +101
Copilot AI Mar 8, 2026

parseDependencies() scans the entire input for dependencies = [...], so it can accidentally pick up a Python variable/string named dependencies outside the PEP 723 header. Since callers pass full script text (e.g., executor/agent), this can produce incorrect dependency installs. Consider first extracting the PEP 723 block with PEP723_BLOCK_PATTERN (or reusing parse()) and only searching within that block.
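
A sketch of the scoping fix described here, under the assumption that the three patterns stay as they are in the diff. The top-level names below are stand-ins for the private members of `PEP723Parser`, and `parseDependenciesInBlock` is hypothetical.

```kotlin
// Stand-ins for the private patterns in PEP723Parser (copied from the diff).
val blockPatternCopy = Regex("""#\s*///\s*script\s*\n(.*?)#\s*///""", RegexOption.DOT_MATCHES_ALL)
val depsPatternCopy = Regex("""dependencies\s*=\s*\[(.*?)\]""", RegexOption.DOT_MATCHES_ALL)
val depItemPatternCopy = Regex("""["']([^"']+)["']""")

// Sketch: look for the dependency array only inside the PEP 723 block,
// so a Python variable named `dependencies` in the script body is ignored.
fun parseDependenciesInBlock(pythonContent: String): List<String> {
    val block = blockPatternCopy.find(pythonContent)?.groupValues?.get(1) ?: return emptyList()
    val deps = depsPatternCopy.find(block)?.groupValues?.get(1) ?: return emptyList()
    return depItemPatternCopy.findAll(deps).map { it.groupValues[1] }.toList()
}
```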


private fun parseAutodevContext(metadataBlock: String): Map<String, String> {
val sectionMatch = AUTODEV_SECTION_PATTERN.find(metadataBlock)
?: return emptyMap()

val sectionContent = sectionMatch.groupValues[1]
return AUTODEV_KV_PATTERN.findAll(sectionContent)
.associate { it.groupValues[1] to it.groupValues[2] }
}

// ---- Generation ----

/**
* Generate a PEP 723 metadata header block.
*
* @param dependencies List of dependency specifiers (e.g. "requests>=2.28.0").
* @param requiresPython Python version constraint (default ">=3.11").
* @param autodevContext Optional AutoDev Unit context key-value pairs to embed.
* @return The generated metadata block as a string, ready to prepend to a script.
*/
fun generate(
dependencies: List<String> = emptyList(),
requiresPython: String = ">=3.11",
autodevContext: Map<String, String> = emptyMap()
): String = buildString {
appendLine("# /// script")
appendLine("# requires-python = \"$requiresPython\"")

if (dependencies.isNotEmpty()) {
appendLine("# dependencies = [")
dependencies.forEach { dep ->
appendLine("# \"$dep\",")
}
appendLine("# ]")
}

if (autodevContext.isNotEmpty()) {
appendLine("# [tool.autodev-unit]")
autodevContext.forEach { (key, value) ->
appendLine("# $key = \"$value\"")
}
}

appendLine("# ///")
}

/**
* Inject or replace a PEP 723 metadata block in a Python script.
*
* If the script already contains a PEP 723 block, it is replaced.
* Otherwise the new block is prepended.
*
* @param pythonContent The original script content.
* @param dependencies Dependency list.
* @param requiresPython Python version constraint.
* @param autodevContext Optional AutoDev context map.
* @return The script with the metadata block injected/replaced.
*/
fun injectMetadata(
pythonContent: String,
dependencies: List<String> = emptyList(),
requiresPython: String = ">=3.11",
autodevContext: Map<String, String> = emptyMap()
): String {
val newBlock = generate(dependencies, requiresPython, autodevContext)

return if (PEP723_BLOCK_PATTERN.containsMatchIn(pythonContent)) {
PEP723_BLOCK_PATTERN.replace(pythonContent, newBlock.trimEnd())
} else {
newBlock + "\n" + pythonContent
}
}

/**
* Strip the PEP 723 metadata block from a Python script, returning only the code body.
*/
fun stripMetadata(pythonContent: String): String {
return PEP723_BLOCK_PATTERN.replace(pythonContent, "").trimStart('\n')
Copilot AI Mar 8, 2026

stripMetadata() only trims leading \n after removing the block; if the script uses CRLF the result can start with a stray \r. Consider trimming both \r and \n (or calling trimStart() without args if that's acceptable here).

Suggested change
- return PEP723_BLOCK_PATTERN.replace(pythonContent, "").trimStart('\n')
+ return PEP723_BLOCK_PATTERN.replace(pythonContent, "").trimStart('\r', '\n')

}
}
@@ -0,0 +1,213 @@
package cc.unitmesh.agent.subagent

import cc.unitmesh.agent.artifact.ArtifactContext
import cc.unitmesh.agent.artifact.ConversationMessage
import cc.unitmesh.agent.artifact.ModelInfo
import cc.unitmesh.agent.artifact.PEP723Parser
import cc.unitmesh.agent.core.SubAgent
import cc.unitmesh.agent.model.AgentDefinition
import cc.unitmesh.agent.model.PromptConfig
import cc.unitmesh.agent.model.RunConfig
import cc.unitmesh.agent.tool.ToolResult
import cc.unitmesh.llm.LLMService
import cc.unitmesh.devins.llm.Message
import cc.unitmesh.devins.llm.MessageRole
import cc.unitmesh.llm.ModelConfig
import kotlinx.serialization.Serializable

/**
* PythonArtifactAgent – Sub-agent responsible for generating
* complete, self-contained Python scripts with PEP 723 inline metadata.
*
* The generated scripts follow the AutoDev Artifact convention and include
* dependency declarations so that they can be executed with `uv run` or
* after a simple `pip install`.
*
* @see <a href="https://github.com/phodal/auto-dev/issues/526">Issue #526</a>
*/
class PythonArtifactAgent(
private val llmService: LLMService
) : SubAgent<PythonArtifactInput, ToolResult.AgentResult>(
AgentDefinition(
name = "PythonArtifactAgent",
displayName = "Python Artifact Agent",
description = "Generates self-contained Python scripts with PEP 723 metadata for the AutoDev Unit system",
promptConfig = PromptConfig(
systemPrompt = SYSTEM_PROMPT,
queryTemplate = null,
initialMessages = emptyList()
),
modelConfig = ModelConfig.default(),
runConfig = RunConfig(
maxTurns = 1,
maxTimeMinutes = 5,
terminateOnError = true
)
)
) {

override fun validateInput(input: Map<String, Any>): PythonArtifactInput {
val prompt = input["prompt"] as? String
?: throw IllegalArgumentException("'prompt' is required")
val dependencies = (input["dependencies"] as? List<*>)
?.filterIsInstance<String>()
?: emptyList()

return PythonArtifactInput(
prompt = prompt,
dependencies = dependencies,
requiresPython = input["requiresPython"] as? String ?: ">=3.11"
)
}

override suspend fun execute(
input: PythonArtifactInput,
onProgress: (String) -> Unit
): ToolResult.AgentResult {
onProgress("🐍 Generating Python script...")

val responseBuilder = StringBuilder()

val historyMessages = listOf(
Message(role = MessageRole.SYSTEM, content = SYSTEM_PROMPT)
)

return try {
llmService.streamPrompt(
userPrompt = buildUserPrompt(input),
historyMessages = historyMessages,
compileDevIns = false
).collect { chunk ->
responseBuilder.append(chunk)
onProgress(chunk)
}

val rawResponse = responseBuilder.toString()
val scriptContent = extractPythonCode(rawResponse)

if (scriptContent.isNullOrBlank()) {
return ToolResult.AgentResult(
success = false,
content = "Failed to extract Python code from LLM response."
)
}

// Validate PEP 723 metadata is present; inject if missing
val meta = PEP723Parser.parse(scriptContent)
val finalScript = if (meta.rawBlock == null && input.dependencies.isNotEmpty()) {

This only injects PEP 723 metadata when input.dependencies is non-empty, so an empty-deps script could be returned without any header if the LLM omits it (despite the system rule that every script must start with metadata). Consider injecting a minimal block (at least requires-python) even when deps are empty, or aligning the rule/behavior.

Severity: medium


PEP723Parser.injectMetadata(
pythonContent = scriptContent,
dependencies = input.dependencies,
requiresPython = input.requiresPython
)
} else {
scriptContent
}
Comment on lines +95 to +105
Copilot AI Mar 8, 2026

The comment says “inject if missing”, but the code only injects a PEP 723 header when the input dependency list is non-empty. If the LLM omits the header and dependencies is empty, the resulting artifact can violate the agent’s own SYSTEM_PROMPT rule (“Every script MUST begin with an inline metadata block”). Consider injecting the header whenever meta.rawBlock == null (even with an empty dependency list) so requires-python is always present.
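
The behavior requested here can be sketched in isolation as follows. `ensureHeader` is a hypothetical stand-in for the `meta.rawBlock == null` branch combined with `PEP723Parser.injectMetadata`; it is not the PR's code, and its block detection is deliberately simplified to a check for the opening marker.

```kotlin
// Sketch: prepend a minimal PEP 723 header whenever the script has none,
// even if the dependency list is empty, so requires-python is always present.
fun ensureHeader(
    script: String,
    dependencies: List<String>,
    requiresPython: String = ">=3.11"
): String {
    // Simplified detection: only looks for the opening "# /// script" marker line.
    val hasBlock = Regex("""#\s*///\s*script\s*\r?\n""").containsMatchIn(script)
    if (hasBlock) return script

    val header = buildString {
        appendLine("# /// script")
        appendLine("# requires-python = \"$requiresPython\"")
        if (dependencies.isNotEmpty()) {
            appendLine("# dependencies = [")
            dependencies.forEach { appendLine("#   \"$it\",") }
            appendLine("# ]")
        }
        appendLine("# ///")
    }
    return header + "\n" + script
}
```

With this shape, the condition in `execute` would no longer need to consult `input.dependencies` at all.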


onProgress("\n✅ Python script generated successfully.")

ToolResult.AgentResult(
success = true,
content = finalScript,
metadata = mapOf(
"type" to "python",
"dependencies" to PEP723Parser.parseDependencies(finalScript).joinToString(","),
"requiresPython" to (PEP723Parser.parse(finalScript).requiresPython ?: ">=3.11")
)
)
} catch (e: Exception) {
ToolResult.AgentResult(
success = false,
content = "Generation failed: ${e.message}"
)
}
}

override fun formatOutput(output: ToolResult.AgentResult): String = output.content

// ---- helpers ----

private fun buildUserPrompt(input: PythonArtifactInput): String = buildString {
appendLine(input.prompt)
if (input.dependencies.isNotEmpty()) {
appendLine()
appendLine("Required dependencies: ${input.dependencies.joinToString(", ")}")
}
}

/**
* Extract the Python code block from an LLM response.
* Supports fenced code blocks (```python ... ```) and raw artifact XML.
*/
private fun extractPythonCode(response: String): String? {
// Try autodev-artifact XML tag first
val artifactPattern = Regex(
"""<autodev-artifact[^>]*type="application/autodev\.artifacts\.python"[^>]*>(.*?)</autodev-artifact>""",
RegexOption.DOT_MATCHES_ALL
)
artifactPattern.find(response)?.let { return it.groupValues[1].trim() }

// Try fenced python code block
val fencedPattern = Regex(
"""```python\s*\n(.*?)```""",
Copilot AI Mar 8, 2026

extractPythonCode() only matches fenced code blocks that use a \n newline after ```python. Responses that use CRLF (\r\n) or put code immediately after the language tag can fail extraction. Consider making the regex more flexible (e.g., allow `\r?\n` and optional whitespace) so code extraction is reliable across platforms/LLM formatting.

Suggested change
- """```python\s*\n(.*?)```""",
+ // Matches ```python, optional trailing text, optional CRLF/newline, then captures everything up to the next ``` or end of string
+ """```python[^\n\r]*\r?\n?(.*?)(```|$)""",

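
A narrower alternative, offered only as a sketch: tolerate CRLF and trailing spaces after the language tag, but still require a closing fence instead of also accepting end of input. `fence`, `fencedPython`, and `extractFenced` are hypothetical names. The fence marker is built with `repeat` purely so the example does not contain a literal fence.

```kotlin
// Three backticks, built at runtime to keep a literal fence out of this example.
val fence = "`".repeat(3)

// Sketch: language tag, optional spaces or tabs, LF or CRLF, then the code
// lazily captured up to the next closing fence.
val fencedPython = Regex(
    fence + """python[ \t]*\r?\n(.*?)""" + fence,
    RegexOption.DOT_MATCHES_ALL
)

// Returns the trimmed code inside the first fenced python block, or null if none is found.
fun extractFenced(response: String): String? =
    fencedPython.find(response)?.groupValues?.get(1)?.trim()
```

Requiring the closing fence means a truncated response falls through to the later fallback checks rather than being returned as a partial script.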
RegexOption.DOT_MATCHES_ALL
)
fencedPattern.find(response)?.let { return it.groupValues[1].trim() }

// Fallback: if the whole response looks like Python code
if (response.trimStart().startsWith("#") || response.trimStart().startsWith("import ") || response.trimStart().startsWith("from ")) {
return response.trim()
}

return null
}

companion object {
/**
* System prompt guiding the LLM to generate PEP 723 compliant Python scripts.
*/
const val SYSTEM_PROMPT = """You are an expert Python developer specializing in creating self-contained, executable Python scripts.

## Rules

1. **PEP 723 Metadata** – Every script MUST begin with an inline metadata block:
```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "some-package>=1.0",
# ]
# ///
```

2. **Self-Contained** – The script must run independently. All logic resides in a single file.

3. **Main Guard** – Always include:
```python
if __name__ == "__main__":
main()
```

4. **Clear Output** – Use `print()` to provide meaningful output to stdout.

5. **Error Handling** – Include basic try/except blocks for I/O, network, or file operations.

6. **No External Config** – Avoid reading from external config files. Use environment variables via `os.environ.get()` when necessary.

7. **Output Format** – Wrap the script in `<autodev-artifact identifier="..." type="application/autodev.artifacts.python" title="...">` tags.
"""
}
}

/**
* Input for PythonArtifactAgent
*/
@Serializable
data class PythonArtifactInput(
/** Natural-language description of what the script should do */
val prompt: String,
/** Pre-declared dependencies (may be empty) */
val dependencies: List<String> = emptyList(),
/** Python version constraint */
val requiresPython: String = ">=3.11"
)