
fm-agent generates wrong source file path if proj_dir is not the same as repo root #6

@jschwe

Description

2026-04-24 17:21:01,290 [INFO] Removed duplicate file 'form_urlencoded/src/lib.rs' from phase 2 module 'form_urlencoded'
2026-04-24 17:21:01,290 [INFO] Removing phase 2: no source files remain after deduplication
2026-04-24 17:21:01,291 [WARNING] Source file not found: /workspaces/rust-url/form_urlencoded/form_urlencoded/src/lib.rs
2026-04-24 17:21:01,291 [ERROR] Nothing was extracted — check phases.json source_files paths.
2026-04-24 17:21:01,292 [INFO] Total time: 176.28 seconds

Main issue:

2026-04-24 17:21:01,291 [WARNING] Source file not found: /workspaces/rust-url/form_urlencoded/form_urlencoded/src/lib.rs

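fm-agent generates a wrong path that contains form_urlencoded twice. The agent apparently wrote the source file path relative to the repository root (form_urlencoded/src/lib.rs), and that path was then joined onto proj_dir. It looks like the markdown instructions in workflow_setup_extract.md should tell the agent to generate source file paths relative to proj_dir (the path passed to main.py).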
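The doubled path can be reproduced with plain path joining. The sketch below is only an illustration of the suspected cause, not fm-agent's actual code: `resolve_source_file` is a hypothetical helper, and the two input values are taken from the log lines above.

```python
from pathlib import PurePosixPath


def resolve_source_file(proj_dir: PurePosixPath, listed: str) -> PurePosixPath:
    """Hypothetical helper: map a phases.json entry to a path under proj_dir."""
    rel = PurePosixPath(listed)
    # If the entry starts with the project directory's own name, assume it was
    # written relative to the repo root and drop that first component.
    if rel.parts and rel.parts[0] == proj_dir.name:
        rel = PurePosixPath(*rel.parts[1:])
    return proj_dir / rel


proj_dir = PurePosixPath("/workspaces/rust-url/form_urlencoded")
listed = "form_urlencoded/src/lib.rs"  # entry as it appears in the log

naive = proj_dir / listed                      # plain join: the path from the warning
fixed = resolve_source_file(proj_dir, listed)  # leading duplicate stripped

print(naive)
print(fixed)
```

Stripping the leading component is a heuristic: it would misfire on a project that really has a nested directory with the same name as proj_dir, so checking which candidate exists on disk would be safer in practice.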

Full log

> Sisyphus - Ultraworker · google/gemini-3-pro-preview
Error: No endpoints found for google/gemini-3-pro-preview.
> Sisyphus - Ultraworker · anthropic/claude-sonnet-4.6

# Todos
[ ] Scan codebase structure and source files
[ ] Write fm_agent/phases.json
[ ] Write fm_agent/spec_prompts/domain_context/engine_overview.txt
[ ] Write fm_agent/spec_prompts/domain_context/phase_NN_types.txt for each phase
[ ] Verify all files exist

I detect **implementation** intent - create structured output files describing the codebase. My approach: scan codebase structure → write phases.json → write domain context files.

→ Read ..
→ Read .
→ Read src
→ Read Cargo.toml
→ Read fm_agent
→ Read src/lib.rs
→ Read ../percent_encoding
→ Read ../percent_encoding/src
→ Read SPEC.md

# Todos
[x] Scan codebase structure and source files
[ ] Write fm_agent/phases.json
[ ] Write fm_agent/spec_prompts/domain_context/engine_overview.txt
[ ] Write fm_agent/spec_prompts/domain_context/phase_NN_types.txt for each phase
[ ] Verify all files exist

Now I have sufficient context. The `form_urlencoded` crate is a single-module library with one source file. It has two main functional phases: **parsing** and **serializing**. Let me now write the output files.

$ mkdir -p /workspaces/rust-url/form_urlencoded/fm_agent/spec_prompts/domain_context
(no output)

← Write fm_agent/phases.json
Wrote file successfully.

# Todos
[x] Scan codebase structure and source files
[x] Write fm_agent/phases.json
[ ] Write fm_agent/spec_prompts/domain_context/engine_overview.txt
[ ] Write fm_agent/spec_prompts/domain_context/phase_NN_types.txt for each phase
[ ] Verify all files exist

← Write fm_agent/spec_prompts/domain_context/engine_overview.txt
Wrote file successfully.

# Todos
[x] Scan codebase structure and source files
[x] Write fm_agent/phases.json
[x] Write fm_agent/spec_prompts/domain_context/engine_overview.txt
[ ] Write fm_agent/spec_prompts/domain_context/phase_NN_types.txt for each phase
[ ] Verify all files exist

← Write fm_agent/spec_prompts/domain_context/phase_01_types.txt
Wrote file successfully.

← Write fm_agent/spec_prompts/domain_context/phase_02_types.txt
Wrote file successfully.

# Todos
[x] Scan codebase structure and source files
[x] Write fm_agent/phases.json
[x] Write fm_agent/spec_prompts/domain_context/engine_overview.txt
[x] Write fm_agent/spec_prompts/domain_context/phase_NN_types.txt for each phase
[ ] Verify all files exist

$ ls /workspaces/rust-url/form_urlencoded/fm_agent/ && echo "---" && ls /workspaces/rust-url/form_urlencoded/fm_agent/spec_prompts/domain_context/
fm_agent.log phases.json spec_prompts workflow_setup_extract.md
---
engine_overview.txt phase_01_types.txt phase_02_types.txt

# Todos
[x] Scan codebase structure and source files
[x] Write fm_agent/phases.json
[x] Write fm_agent/spec_prompts/domain_context/engine_overview.txt
[x] Write fm_agent/spec_prompts/domain_context/phase_NN_types.txt for each phase
[x] Verify all files exist

All files verified. ✓ All 4 required files created:
  • fm_agent/phases.json — 2 phases: Parsing (phase 1) and Serialization (phase 2), both mapping to form_urlencoded/src/lib.rs. They have no dependency on each other.

  • fm_agent/spec_prompts/domain_context/engine_overview.txt — Describes the overall architecture: two independent pipelines (parse/serialize), encoding conventions (+↔space, %XX percent-encoding, unchanged byte set), invariants, and the percent_encoding dependency.

  • fm_agent/spec_prompts/domain_context/phase_01_types.txt — Covers the parsing pipeline: Parse<'a>, ParseIntoOwned<'a>, Cow<str> item types, replace_plus, decode_utf8_lossy, all lazy-decoding invariants, and the exact 6-step decode algorithm.

  • fm_agent/spec_prompts/domain_context/phase_02_types.txt — Covers the serialization pipeline: ByteSerialize<'a>, Target trait, Serializer<'a, T>, EncodingOverride, all internal helpers, the unchanged-byte set, and all 10 serialization invariants including the unsafe justification.
    2026-04-24 17:21:01,290 [INFO] Removed duplicate file 'form_urlencoded/src/lib.rs' from phase 2 module 'form_urlencoded'
    2026-04-24 17:21:01,290 [INFO] Removing phase 2: no source files remain after deduplication
    2026-04-24 17:21:01,291 [WARNING] Source file not found: /workspaces/rust-url/form_urlencoded/form_urlencoded/src/lib.rs
    2026-04-24 17:21:01,291 [ERROR] Nothing was extracted — check phases.json source_files paths.
    2026-04-24 17:21:01,292 [INFO] Total time: 176.28 seconds
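
The two deduplication messages are consistent with a rule where each source file stays only in the first phase that lists it, and a phase left with no files is dropped. Since both generated phases mapped to the same lib.rs, phase 2 would end up empty. The following is a guess at that behaviour inferred from the log, not fm-agent's implementation; the `phase` key and the function name are assumptions, and only `source_files` is named in the error message.

```python
def dedupe_phases(phases):
    """Keep each source file only in the first phase listing it (assumed rule)."""
    seen = set()
    kept = []
    for phase in phases:
        unique = []
        for path in phase["source_files"]:
            if path not in seen:
                seen.add(path)
                unique.append(path)
        # A phase with no remaining files is dropped entirely.
        if unique:
            kept.append({**phase, "source_files": unique})
    return kept


# Both phases point at the same file, as in the agent's summary above.
phases = [
    {"phase": 1, "source_files": ["form_urlencoded/src/lib.rs"]},
    {"phase": 2, "source_files": ["form_urlencoded/src/lib.rs"]},
]
remaining = dedupe_phases(phases)
print([p["phase"] for p in remaining])
```

Under this assumed rule, only phase 1 survives, so the single wrong path in phase 1 is enough to make the whole extraction come up empty.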

Background:

I tried out fm-agent on a small Rust crate with a single lib.rs file, specifically the form_urlencoded crate from https://github.com/servo/rust-url.
Invocation: python main.py /workspaces/rust-url/form_urlencoded
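
For anyone reproducing this, a small pre-flight check can show which phases.json entries do not resolve under proj_dir before running the extraction. This is a sketch under assumptions: the schema of phases.json is not shown in this report, so the code only relies on the `source_files` key named in the error message and walks the JSON generically. The function names are hypothetical, and the fm_agent/phases.json location is taken from the log.

```python
import json
from pathlib import Path


def collect_source_files(node):
    """Yield every string in any list stored under a 'source_files' key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "source_files" and isinstance(value, list):
                yield from (v for v in value if isinstance(v, str))
            else:
                yield from collect_source_files(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_source_files(item)


def missing_source_files(proj_dir, phases_doc):
    """Return the listed paths that are not files relative to proj_dir."""
    root = Path(proj_dir)
    return [p for p in collect_source_files(phases_doc) if not (root / p).is_file()]


def load_phases(proj_dir):
    """Read <proj_dir>/fm_agent/phases.json (location as seen in the log)."""
    return json.loads((Path(proj_dir) / "fm_agent" / "phases.json").read_text())


# Example use with the proj_dir from the invocation above:
# proj = "/workspaces/rust-url/form_urlencoded"
# print(missing_source_files(proj, load_phases(proj)))
```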

Changes to fm-agent:

  • Edited FM-Agent/md/workflow_spec_step4_batch.md to append a project-specific directive:

Project-specific ground truth

Before generating specs for any function in this project:

  1. If a SPEC.md (or any *.spec.md) file exists at the repository root or alongside the source file being spec'd, read it first and treat its normative text as ground truth for [SPEC] generation.
  2. For the percent_encoding crate specifically, the authoritative specification is the WHATWG URL Standard, section Percent-encoded bytes. Prefer the anchors #percent-encode, #percent-decode, #string-percent-decode, #string-percent-encode-after-encoding, #utf-8-percent-encode, and the named percent-encode sets (#c0-control-percent-encode-set, #fragment-percent-encode-set, #query-percent-encode-set, #special-query-percent-encode-set, #path-percent-encode-set, #userinfo-percent-encode-set, #component-percent-encode-set, #application-x-www-form-urlencoded-percent-encode-set).
  3. In every [SPEC] block you emit for this project, cite the spec anchor that governs the function's post-condition (e.g. // Per https://url.spec.whatwg.org/#percent-decode step 2.2: %-byte with non-hex-digit follower is passed through unchanged).
  4. When the implementation's behavior cannot be reconciled with the cited spec clause, record the discrepancy as a bug candidate in the [SPEC] pre/post-conditions — do not paper over it by matching the spec to the code. The bug-validator downstream will confirm.
  5. RFC 3986 is not the governing spec for this crate. Do not cite it as ground truth unless the code under inspection explicitly references it.

Before invoking fm-agent, I had Claude Code generate a Markdown version of the relevant spec from its verbatim HTML text.
